Data processing
The Process tab is used to process data sets and,
optionally, to attempt structure solution by molecular
replacement. Data autoindexing and integration are carried out with
LABELIT and XDS. The data are analyzed with POINTLESS and XTRIAGE, scaled and merged with AIMLESS
(previously with SCALA), and transformed to amplitudes with TRUNCATE. A free R-set is also generated using 5% of the reflections.
If a PDB file is supplied, MOLREP is run, and the best
solution is refined with REFMAC5. Selected statistics and graphical
output from these programs are displayed on the results pages. Input
scripts and full log files are written to disk, under the
/data/$USER/webice/process directory. These files can also be
inspected via the Web-Ice application.
Since June 2017, Web-Ice has automatically processed the
data from runs in the Blu-Ice Collect tab, provided that at least 5
images are collected. Older processed data are stored in a separate
directory, which can also be browsed from Web-Ice.
See the introduction to Web-Ice for the appropriate
software references.
When the Process tab is selected, the user is shown the User
Datasets page (it will be empty if Web-Ice processing has never been
used before). Once a data set has been processed, this page will list
the user-supplied dataset name, creation time, the name of the first
image in the set and other information about the job status. The Delete link can be used to
clear the results from the /data/''user''/webice directory and delete
the dataset name; the images will not be deleted.
Figure 18:
Process tab page showing a summary of all processed datasets.
The processing run results can be browsed by clicking on the
Dataset Name of interest.
The runs can be sorted in ascending or descending alphabetical
order or by the time of
creation. Clicking Update reloads the page and displays any new
results generated since the last visit to the page.
Once data processing has been started, the run parameters cannot be
modified. In order to change processing parameters, a new run must be
started from scratch.
After approximately one week, old processed datasets are moved
to a different directory. The list of these datasets can be displayed
by clicking on the Old Processed Datasets link.
Clicking on the New Dataset link in the Process navigation
toolbar opens the data processing input form. If
a form has been filled in previously for another run, the form will be pre-populated
with the same values.
Figure 19:
New run form page.
- The Dataset Name is a compulsory field. The name must be a single word of
alphanumeric characters and it must be unique: a name used for another processing
run will be rejected.
- The full path to the first image in the data set is also a
compulsory item. The Browse button to the right of the input box
can be used for locating the images in the user's data
directories. The Hide button can be used to remove the file list.
- The full path of the last image can also be provided. By
default, all the data with the same root name as the first image will
be processed. The last image can be used to omit the last images
in the data set, for example, if they show signs of radiation damage.
- By default, all the reflections in the maximum circle inscribed
in the detector will be processed. To process data out to the
corners of the detector, or only the data concentrated in the middle of the
detector, enter the appropriate resolution cutoff value in the Resolution input box.
- The I/Sigma box can be used to apply a resolution cutoff
based on the Mn(I/sd(I)) value from SCALA. If this option is
chosen and the Mn(I/sd(I)) in the highest resolution bin is smaller
than the specified I/Sigma, the approximate resolution is calculated
based on the desired cutoff and the higher resolution shells are
omitted from subsequent data processing.
- If a model is given, it will be used to search for a solution
by molecular replacement.
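The I/Sigma rule in the list above can be sketched in a few lines of Python. This is only an illustration, not the Web-Ice implementation: the function name `estimate_resolution_cutoff`, the shell representation and the use of linear interpolation between neighbouring shells are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# One resolution shell: (d_min in Angstrom, Mn(I/sd(I)) in that shell).
Shell = Tuple[float, float]


def estimate_resolution_cutoff(
    shells: List[Shell], min_i_over_sigma: float
) -> Optional[float]:
    """Return an approximate d_min cutoff, or None if no cutoff is needed.

    `shells` must be ordered from low to high resolution (decreasing d_min).
    Hypothetical helper; the interpolation scheme is an assumption.
    """
    if not shells:
        raise ValueError("no resolution shells supplied")
    # Follow the rule in the text: only act when the highest resolution
    # bin falls below the requested I/Sigma.
    if shells[-1][1] >= min_i_over_sigma:
        return None
    # Find the first shell whose Mn(I/sd(I)) drops below the cutoff.
    for index, (d_min, i_sig) in enumerate(shells):
        if i_sig < min_i_over_sigma:
            if index == 0:
                # Even the lowest resolution shell is below the cutoff.
                return d_min
            prev_d, prev_i_sig = shells[index - 1]
            # Linear interpolation between the two neighbouring shells.
            fraction = (prev_i_sig - min_i_over_sigma) / (prev_i_sig - i_sig)
            return prev_d - fraction * (prev_d - d_min)
    return None


if __name__ == "__main__":
    example = [(4.0, 20.0), (3.0, 10.0), (2.0, 2.0), (1.5, 0.5)]
    print(estimate_resolution_cutoff(example, 6.0))
```

A value obtained this way would be entered in the Resolution input box on reprocessing; shells beyond it are the ones that would be omitted.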
Once the dataset form has been filled in, processing is
initiated by clicking on the Create Run button. The browser will
display the results page, which shows the
status of the job and links to summarized results and plots from all the
different programs.
Figure 20:
Results page during run time.
If automated data processing is enabled at a given beamline,
Blu-Ice will automatically start a data processing run when each data
collection run is finished, for the purpose of providing quick
feedback about the data quality.
The processing run will be given a unique
name, based on the root name of the collected images, and displayed in the
User Datasets list. Clicking on the run name will show a summary
of the input and links to the summarized results and plots. Note that
automatic data processing runs with no resolution or I/sigmaI
cutoffs. The scaling and merging statistics in the Summary page
will provide an estimate of the resolution limit of the data, which
may then be applied on manual reprocessing.
The Logs page links to a time-stamped
log listing the programs that have been run, and the log
file. The Log can be refreshed while the processing is running by clicking on the Update Log button.
Hint: In the log, search for the ISigma asymptotic (ISa) value reported by
the XDS CORRECT job (CORRECT.LP). Usable data sets tend to have an ISa above
10. The higher this number, the better the data are.
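As a rough aid for the hint above, the following Python sketch pulls ISa values out of the text of a CORRECT.LP file. It assumes that the error-model table in CORRECT.LP has a header row containing only `a`, `b` and `ISa`, with the values on the next non-empty line; check this against your own log before relying on it. The function name `find_isa_values` is hypothetical.

```python
from typing import List


def find_isa_values(correct_lp_text: str) -> List[float]:
    """Collect ISa values from the text of an XDS CORRECT.LP file.

    Assumed layout: a header row "a  b  ISa" followed by a row holding
    three numbers, the last of which is ISa.
    """
    values: List[float] = []
    lines = correct_lp_text.splitlines()
    for index, line in enumerate(lines):
        if line.split() != ["a", "b", "ISa"]:
            continue
        # Look at the first non-empty line after the header.
        for candidate in lines[index + 1:]:
            fields = candidate.split()
            if not fields:
                continue
            if len(fields) == 3:
                try:
                    values.append(float(fields[2]))
                except ValueError:
                    pass  # not a numeric row; ignore this table
            break
    return values


if __name__ == "__main__":
    sample = (
        "     a        b          ISa\n"
        " 4.036E+00  2.413E-04   32.04\n"
    )
    for isa in find_isa_values(sample):
        verdict = "above 10" if isa > 10 else "10 or below"
        print(f"ISa = {isa} ({verdict})")
```

To use it on real output, read the CORRECT.LP file into a string and pass it to `find_isa_values`; if several tables are found, the last value is normally the one of interest.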
The Error Log lists fatal errors during execution of a
program or script. It can be useful to trace the ultimate cause of a
problem during processing.
Figure 21:
Data processing log.
The Summary page displays a summary of the data statistics
after scaling and merging, analysis with XTRIAGE, space group
and unit cell determination with POINTLESS and refinement (only if a
model has been supplied).
Figure 22:
Final scaling statistics displayed in the Summary sub-tab.
Hint: In the scaling statistics summary, look at the lowest
resolution bin numbers, since an overoptimistic sample-to-detector
distance setting during data collection can result in very poor high
resolution and overall statistics. Usable data sets tend to have R factors
lower than 0.05, Mn(I/sd(I)) higher than 10, a correlation coefficient
close to 1 and completeness between 95 and 100% in this resolution
bin. Also, if you expect an anomalous signal and the program detects
none, you probably need higher quality diffraction or a higher multiplicity.
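The rules of thumb in this hint can be written down as a small check. The sketch below is illustrative only: the `ShellStats` class and `check_low_resolution_shell` function are hypothetical names, and the 0.99 threshold used for "close to 1" is an assumption, since the text gives no number for the correlation coefficient.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ShellStats:
    """Statistics for the lowest resolution bin (values typed in by hand)."""

    r_factor: float
    mean_i_over_sigma: float
    correlation: float
    completeness_percent: float


def check_low_resolution_shell(
    stats: ShellStats, min_correlation: float = 0.99
) -> List[str]:
    """Return a list of warnings; an empty list means all checks passed."""
    warnings: List[str] = []
    if stats.r_factor >= 0.05:
        warnings.append("R factor is not below 0.05")
    if stats.mean_i_over_sigma <= 10:
        warnings.append("Mn(I/sd(I)) is not above 10")
    if stats.correlation < min_correlation:  # assumed threshold
        warnings.append("correlation coefficient is not close to 1")
    if not 95.0 <= stats.completeness_percent <= 100.0:
        warnings.append("completeness is outside 95-100%")
    return warnings


if __name__ == "__main__":
    shell = ShellStats(0.03, 25.0, 0.998, 99.1)
    print(check_low_resolution_shell(shell) or "lowest resolution bin looks usable")
```

These thresholds are tendencies rather than hard limits, so a warning here is a prompt to look more closely at the data, not a verdict.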
This page lists the L and Z test plots from XTRIAGE, useful to
determine if the crystal is twinned and, if a model has been provided,
the change in R-factor after some cycles of refinement.
Figure 23:
Example of plot extracted from the program log files.
The Details link can be used to browse the command files,
output files and full logs from the processing job, stored under the
/data/''user''/webice/process/''DatasetName'' directory. The DataProcessing_0
directory contains subdirectories for each program used for
processing. Each of these subdirectories, in turn, contains the
command file (''run.csh''), output files and logs (''stdout.txt'') for
each program run.
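For users who prefer to inspect these files from a terminal rather than through the Details link, the layout described above can be walked with a short Python script. This is a sketch based only on the file names mentioned in the text (''run.csh'' and ''stdout.txt''); the function name `list_program_files` is hypothetical, and the path to pass in is your own DataProcessing_0 directory.

```python
from pathlib import Path
from typing import Dict


def list_program_files(data_processing_dir: Path) -> Dict[str, Dict[str, bool]]:
    """Report, per program subdirectory, whether run.csh and stdout.txt exist."""
    report: Dict[str, Dict[str, bool]] = {}
    subdirs = sorted(p for p in data_processing_dir.iterdir() if p.is_dir())
    for subdir in subdirs:
        report[subdir.name] = {
            "run.csh": (subdir / "run.csh").is_file(),
            "stdout.txt": (subdir / "stdout.txt").is_file(),
        }
    return report


if __name__ == "__main__":
    import sys

    # Usage: python list_files.py <path to DataProcessing_0>
    if len(sys.argv) == 2:
        for program, files in list_program_files(Path(sys.argv[1])).items():
            present = [name for name, found in files.items() if found]
            print(f"{program}: {', '.join(present) or 'no command file or log'}")
```

The script only reads directory listings and does not modify any processing results.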
Figure 24:
Subdirectory list under DataProcessing_0 in the Details sub-tab.
Other directories, less interesting to the user, are the scratch
directory, which contains intermediate output files; plots, containing
the data displayed in the browser ''Plots'' page; and _metadata,
with input and results related to the processing software framework.
The command files written to the webice/process/ subdirectories can
be used for reprocessing the data set with different input
parameters. If you are not familiar with the programs used for
processing, consult the documentation available online in the
software pages or consult the beamline support person.