ARP_WATERS (CCP4: Deprecated Program)

NAME

ARP_WATERS (ARP/wARP v5.0) - Automated Refinement Procedure for refining protein structures.

SYNOPSIS

arp_waters XYZIN foo_in.brk MAPIN1 foo_2fofc.map MAPIN2 foo_fofc.map XYZOUT foo_out.brk
[Keyworded input]

IDENTIFICATION

Automated Refinement Procedure

Version 5.0

User Guide


DESCRIPTION

This CCP4 distribution is not the full distribution of the ARP/wARP suite, and includes only the programs arp_waters (which is actually version 5.0 of the arp_warp program), prepform, prepshel and t_shift, and the script arp_waters_plots.sh (renamed from arp_warp_plots.sh).

The complete ARP/wARP package contains additional automated scripts and alpha versions of new programs (for automated building of protein structures in electron density maps; see "Automated protein model building combined with iterative structure refinement" Perrakis, A., Morris, R.J.H. and Lamzin, V.S., Nature Struct. Biol. 6 (1999) 458-463), and is freely available to academic users from the ARP/wARP homepage, http://www.arp-warp.org. Industrial users are asked to contact the authors for a license agreement.

The version of ARP distributed by CCP4 also contains minor changes which enable the writing of "summary tags" into the program output - see the libhtml documentation for details of these tags (and how to suppress them!). Please note that these changes do not in any way affect the running of the program, and are purely cosmetic.

In addition this version of ARP is substantially older than the current version distributed by EMBL, and is retained only for the purposes of adding waters (hence the change of name). Details of the current ARP/wARP suite (including how to get it) can be found at the ARP/wARP homepage, http://www.arp-warp.org/.

Introduction

The Automated Refinement Procedure, ARP_WATERS, is a program package for protein structure refinement. It iteratively combines reciprocal space structure factor refinement with updating of the model in real space. The latter attempts to mimic and automate a typically time-consuming model rebuilding session at the graphics. The real space update is based on identifying and removing poorly defined atoms and adding potential new sites. This utilises some general shape properties of the electron density syntheses as well as stereochemical criteria.

ARP_WATERS (actually ARP/wARP version 5.0) can be used in the following ways:

1.
Refinement of MR solutions
2.
Improvement of MAD and M(S)IR(AS) phases
3.
Averaging of multiple refinements
4.
Automatic tracing of the density map and model building (not available in CCP4 version)
5.
Building of the solvent structure
6.
Ab initio structure determination for metalloproteins at very high resolution

For a more detailed description of the ARP see the references given below.

The ARP/wARP procedure requires the use of reciprocal space refinement, density map calculation and the ARP/wARP software itself. The least-squares minimisation can be done with the CCP4 programs PROTIN / REFMAC with an optional additional scaling (e.g. using RSTATS). Use of other programs for least-squares minimisation, e.g. SHELXL, requires additional conversion to the CCP4 format which is provided within the ARP_WATERS package. Density map calculations are carried out with the CCP4 programs FFT and MAPMASK.


Author information

Users are requested to report any bugs or suggested changes to the authors.

Victor S. Lamzin
EMBL Hamburg Outstation,
c/o DESY, Notkestrasse 85,
22603 Hamburg, Germany
Tel. +49-40-89902-121, Fax +49-40-89902-149,
E-mail victor@embl-hamburg.de

Anastassis Perrakis
EMBL Grenoble Outstation,
c/o ILL, Avenue des Martyrs, B.P. 156,
38042 Grenoble CEDEX 9, France
Tel. +33-476-207632, Fax +33-476-207199,
E-mail perrakis@embl-grenoble.fr

References

Any application of ARP_WATERS should actually refer to ARP/wARP version 5.0, and should cite a relevant publication from the list below:

1
V. S. Lamzin and K. S. Wilson.
Automated refinement of protein models.
Acta Cryst., D49:129-149, 1993.

2
V. S. Lamzin and K. S. Wilson.
Automated refinement for protein crystallography.
Methods in Enzymology, 277:269-305, 1997.

3
A. Perrakis, T. K. Sixma, K.S. Wilson, and V. S. Lamzin.
wARP: improvement and extension of crystallographic phases by weighted averaging of multiple refined dummy atomic models.
Acta Cryst., D53:448-455, 1997.

4
D. Pignol, C. Gaboriaud, J. C. Fontecilla-Camps, V. S. Lamzin, and K. S. Wilson.
How to escape from model bias with a high resolution native data set - structure determination of the PcpA-S6 subunit III.
Acta Cryst., D52:345-355, 1996.

5
E. J. Asselt van, A. Perrakis, K. H. Kalk, and V. S. Lamzin.
Accelerated X-ray structure elucidation of a 36 kDa muramidase/transglycosylase using wARP.
Acta Cryst., D54:58-73, 1998.

Acknowledgements

The authors are especially grateful to:


Using ARP_WATERS

Applications

The areas of application of ARP_WATERS (actually ARP/wARP Version 5.0) include:

1.
Refinement of MR solutions
If the initial model (a Molecular Replacement solution) needs to be substantially improved then unrestrained xyzB reciprocal space refinement may be carried out with ARP/wARP performing updating of the whole model. Resolution of the data should be 2.0 Å or higher. The output is a set of ARP atoms (the ARP model). The (3F_o-2F_c / 2mF_o-DF_c, alpha_c) map should be calculated from the ARP model and analysed carefully (yes, it's graphics time). The initial or the ARP model is then rebuilt to fit this map. Very often, if the X-ray resolution is high enough and the initial model is not completely wrong, the ARP atoms are located at approximately the true protein atom positions even in the case of unrestrained refinement, so they can quite happily be used as guides for rebuilding.

Please note that for difficult cases approaches such as the one described for application #4 might work better, even when starting from a molecular replacement solution.

2.
Improvement of MIR(AS) phases
ARP/wARP can be used to build a protein-like model consisting of a set of non-connected atoms (free atoms model) into a MIR map. This model is then refined as described above for #1.

3.
Averaging of multiple refinements
ARP/wARP can be used to prepare models and command scripts for several independent refinement runs as described for #1 and #2. The results are then processed in such a way that each reflection is given a weighted average phase, alphawARP, and a figure of merit, FOMwARP. The results, especially at modest resolution, are better than those from a single ARP/wARP refinement. The (F_o, alphawARP, FOMwARP) map is then calculated and should be inspected. Resolution of the data should be 2.3 Å or higher.

4.
Automatic tracing of the density map and model building
This is not available as part of the CCP4 distribution of ARP/wARP. Please visit the ARP/wARP homepage at http://www.arp-warp.org to obtain the full distribution from the authors.

5.
Building of the solvent structure
If the initial model is more or less correct, i.e. has an R factor of about 30% or less, and essentially only the solvent needs to be improved, restrained (standard) reciprocal space refinement is carried out with ARP/wARP performing automatic adjustment of the solvent structure. Resolution of the data should be 2.5 Å or higher. The output is the protein model with the solvent molecules transformed with symmetry operations to lie close around the protein. The (3F_o-2F_c / 2mF_o-DF_c, alpha_c) and (F_o-F_c / mF_o-DF_c, alpha_c) maps should be inspected.

6.
Ab initio structure determination for metalloproteins
ARP/wARP was successfully applied to the small, 52 amino acid protein rubredoxin. This structure could be solved ab initio. The success was clearly due to the presence of the FeS4 cluster in the protein. The positions as derived from the Patterson synthesis were used as a starting model. This initial model gave an R factor of 53% at 0.92 Å resolution. The resulting ARP model gave an R factor of 16% and a map correlation to the final model map of 90%. Subsequently a successful solution was also obtained with X-ray data truncated to 1.6 Å.

Model and Data Requirements

Quality of initial model

As the ARP/wARP real space update of the model is carried out on the basis of electron density maps calculated with model phases, the starting model for the refinement should be reasonable. The higher the resolution of the native dataset the less reasonable the starting model can be: if you have 1 Å data for a metalloprotein, a reasonable model is the metal itself.

Quality of X-ray data

The data normally should be of high resolution. Unrestrained xyzB refinement with ARP/wARP at lower resolution can potentially lead to a poorer quality density map. The X-ray data should be complete, especially in the low resolution range (5 Å and lower). If the low resolution strong data are systematically incomplete (e.g. missing or overloaded reflections), the density map, even in the case of a good model, is usually discontinuous and is inconsistent with the model. Because ARP/wARP involves updating on the basis of density maps, such discontinuity can lead to incorrect interpretation of the density and as a result to slow convergence or even non-interpretable maps.

In general, the number of X-ray reflections should be at least 6 times higher than the number of atoms in the model.
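The rule of thumb above can be checked before a run is started. The following is only a minimal sketch: the function name enough_reflections and the example counts are illustrative assumptions and are not part of ARP_WATERS or CCP4.

```shell
# Hypothetical helper (not part of ARP_WATERS): succeeds when the number of
# reflections is at least 6 times the number of atoms in the model.
# $1 = number of reflections, $2 = number of atoms
enough_reflections() {
    nrefl=$1
    natoms=$2
    [ "$nrefl" -ge $((6 * natoms)) ]
}

# Example figures, made up for illustration only.
if enough_reflections 15000 2000; then
    echo "ratio OK"
else
    echo "too few reflections for the model size"
fi
```

Both counts have to be taken from your own data and model, e.g. from the reflection file summary and the coordinate file.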

Limitations

As ARP/wARP runs in conjunction with programs of the CCP4 suite all limitations of the latter remain. ARP/wARP itself is limited to:

1.
The CCP4 conventions should be set up before running ARP/wARP

2.
Density maps and reflection MTZ files in the CCP4 format

3.
Maximum map section size is 400,000 points. The maximum number of map sections is 1,000. The maximum number of atoms in extended real space asymmetric unit is 250,000

4.
Only acentric space groups (typical for proteins) and P1 are supported

5.
ARP/wARP operates with coordinate files in the standard PDB format

Automated Scripts

The full distribution of ARP/wARP contains a number of automated scripts which are designed to help avoid mistakes and generally improve the user-friendliness of the programs. These scripts are not provided with the CCP4 distribution of ARP/wARP (which is in any case substantially older than the current release of ARP/wARP), so if you want to use them you will need to obtain the full distribution from the ARP/wARP homepage at http://www.arp-warp.org/.

Supplementary Use of ARP_WATERS

After restrained refinement is complete and before using the graphics it is worth knowing which parts of the model should be corrected.
ARP_WATERS can be used for this purpose.

arp_waters XYZIN input.BRK MAPIN1 3Fo-2Fc.MAP XYZOUT temp << eof
MODE UPDATE ALLATOMS
CELL number number number number number number
SYMMETRY number/string
RESOLUTION number number
REMOVE ATOMS 50 CUTSIGMA 1.0
END
eof

The output of this job will contain a list of the 50 worst atoms (from ARP/wARP's point of view), i.e. those which agree least with the electron density. These atoms should be inspected first. The input MAPIN1 should be the (3F_o-2F_c / 2mF_o-DF_c, alpha_c) map.

Updating Old Command Files

If you have a working command file from a previous release just change the ARP part to look like this:

arp_waters XYZIN input_coordinates MAPIN1 3Fo-2Fc_map_file \
MAPIN2 Fo-Fc_map_file_name XYZOUT output_coordinates << eof
MODE UPDATE ALLATOMS/WATERS
[CELL cell parameters]
[REFINE waters/allatoms]
SYMM spacegroup
RESOLUTION resmin resmax
FIND ATOMS number CHAIN string CUTSIGMA number/AUTO
REMOVE ATOMS number CUTSIGMA number [MERGE number] [KEEP ZEROOCC]
END
eof

Keyworded input to ARP_WATERS

The ARP_WATERS input is keyworded. For example to give the cell parameters to the program we use the keyword CELL followed by the actual numbers, for instance CELL 40.86 52.34 87.69 90 90 90

An input card may also be followed by a number of subkeywords (this should become clear on further reading). The first keyword in a file MUST BE MODE and the last one MUST BE END. Other keywords may appear in any desired order. The order of the subkeywords has no restrictions.

Different ARP/wARP modes require different input files and different keywords. Examples are given below. The slash symbol (/) separates alternative subkeywords. Only the first four characters of each keyword or subkeyword (except END) are needed to identify it.

The available keywords are:

MODE, CELL, SYMMETRY, RESOLUTION, FIND, REMOVE, REFINE, MIRBUILD, SHAKEMODEL, LABIN, LABOUT, END

The Keywords

MODE Must be the first keyword.

update allatoms/waters initialises the update mode. allatoms indicates that both protein and water atoms from the model will be considered for update. waters indicates that only water atoms (residue name HOH or WAT) will be updated. Metals will be treated as non-water atoms. The distance constraints for the addition of new atoms are: the shortest distance between a new atom and any of the existing atoms is 1.0 Å (allatoms), or 2.3 Å to any O or N of the existing atoms (waters); the longest distance is 3.3 Å in both cases. The distance constraint for removal is 3.5 Å or longer to any of the existing atoms. Partially occupied atoms will not be used for merging; their occupancy is accounted for in removal. These atoms are nevertheless used as seeds (parent atoms) for the new atom search.

mirbuild initialises the mirbuild mode. The pseudo protein set of atoms will be placed into the input density map. The distance constraints are 1.1 to 1.8 Å between the atoms.

shakemodel light/allatoms initialises the shaking mode for a shock-like modification of the current model. light indicates that only atoms with atomic number 8 (oxygen) or lower will be treated. allatoms indicates application to any atom in the model regardless of their type.

reflaver initialises the mode of weighted averaging of structure factors obtained from multiple refinements of several slightly different models.

CELL Cell parameters a, b, c, alpha, beta, gamma in Å and degrees. This keyword is optional for MODE update allatoms/waters and shakemodel light/allatoms and is obligatory for MODE mirbuild.
SYMMetry The crystal symmetry. Can be given either as a space group name or number (e.g. P212121 or 19). Obligatory for MODE update allatoms/waters, mirbuild and shakemodel light/allatoms.
RESOlution Resolution of the X-ray data (Rmin, Rmax). Obligatory for MODE update allatoms/waters, mirbuild and reflaver.
FIND The addition of new sites in MODE update allatoms/waters.

After atoms you should give the number of atoms to add. At the end of refinement (it may take 20 to 50 cycles) the model should contain all atoms. The target number of atoms in the final model can be estimated by multiplying the number of protein atoms by 1.2; the 20% extra corresponds both to ordered water molecules and to weaker, slightly disordered, ones which are important for the pseudo solvent continuum. The number of atoms allowed to be added in each cycle depends on the resolution. A simple empirical guide is that the maximum number to add is N x 0.08 / d^3, where N is the current number of atoms and d is the highest resolution in Å. Thus at a resolution of 1.8 Å and a coordinate file of 2,000 atoms the maximum number to be added is 27. New atoms will be automatically assigned a temperature factor on the basis of the density height.
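The two estimates in this paragraph are easy to script. The sketch below is illustrative only: the function names max_add_per_cycle and target_atoms are hypothetical, and rounding the per-cycle limit down is an assumption (it agrees with the worked example of 27 atoms at 1.8 Å for 2,000 atoms).

```shell
# Hypothetical helpers, not part of ARP_WATERS.

# Empirical per-cycle limit for FIND: N x 0.08 / d^3, rounded down (assumption).
# $1 = current number of atoms N, $2 = highest resolution d in Angstrom
max_add_per_cycle() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%d\n", int(n * 0.08 / (d * d * d)) }'
}

# Target size of the final model: number of protein atoms x 1.2, rounded.
# $1 = number of protein atoms
target_atoms() {
    awk -v n="$1" 'BEGIN { printf "%d\n", int(n * 1.2 + 0.5) }'
}

max_add_per_cycle 2000 1.8   # prints 27
target_atoms 2000            # prints 2400
```

The first value can be used directly as the number after FIND atoms; the second only indicates roughly how large the model should be at the end of refinement.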

The string after chain is the chain identifier for new atoms. All new atoms will have this chain identifier and be numbered sequentially.

The subkeyword after cutsigma can be either the number or auto. The number is a MAPIN2 density cutoff. Atoms will be looked for in density above cutsigma times r.m.s. density. A value of 3 to 4 is typical. The statistically significant density threshold can be defined automatically if auto is used. This can be used for MODE update waters as it prevents too many extra atoms being added. However it may not work satisfactorily if the resolution is lower than 1.5 Å or the model is too far from being finally refined.

REMOve Removal of atoms in MODE update allatoms/waters. The removal of atoms influences the success of refinement to a much greater extent than addition of new atoms and should certainly be used.

The number after atoms is the maximum number of atoms to reject at each cycle. A value of about 25 to 100% of the number of atoms to be added is recommended. The actual number will be defined by the program.

The number after cutsigma gives the MAPIN1 density cutoff. Atoms will be considered for rejection only if they are located in density below cutsigma times r.m.s. density. A value around 1 is recommended.

The number following the merge keyword is the shortest distance between two atoms if they are to be merged. Partially occupied atoms are not used for merging. The keyword is optional. Any pair closer than this will be inspected. In the case of a water-water pair the atom with the higher temperature factor will be rejected and the second assigned the weighted average xyz and 1/B parameters. If any water appears to be within the merging distance of a non-water (protein or metal) atom, it will be removed. A merging distance of 0.6 Å is the default for MODE update allatoms; a value of 2.2 Å is recommended for MODE update waters, where the default is no merging.

keep zeroocc is an optional keyword. Default is to remove atoms with zero occupancy from the PDB file.
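The REMOVE recommendations above (a number of atoms between 25 and 100% of the FIND number, and a cutsigma of about 1) can be turned into a data card with a few lines of shell. This is a sketch under stated assumptions: remove_card is a hypothetical helper, the default fraction of 0.5 is an arbitrary choice inside the recommended range, and the result is rounded down.

```shell
# Hypothetical helper, not part of ARP_WATERS: print a REMOVE card.
# $1 = number of atoms given to FIND
# $2 = fraction of that number to allow for removal (0.25 to 1.0, default 0.5)
remove_card() {
    awk -v nadd="$1" -v frac="${2:-0.5}" 'BEGIN {
        if (frac < 0.25 || frac > 1.0) {
            print "fraction should lie between 0.25 and 1.0" > "/dev/stderr"
            exit 1
        }
        # CUTSIGMA 1.0 follows the "around 1" recommendation in the text.
        printf "REMOVE ATOMS %d CUTSIGMA 1.0\n", int(nadd * frac)
    }'
}

remove_card 27   # prints: REMOVE ATOMS 13 CUTSIGMA 1.0
```

The MERGE and KEEP ZEROOCC subkeywords are deliberately left out and can be appended by hand where needed.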

REFIne This initialises the sphericity based real space refinement of individual atoms. The keyword is optional in MODE update allatoms/waters.

The subkeyword can be either allatoms (all atoms will be refined - not recommended unless the resolution is about 1.0 Å) or waters (strongly recommended for MODE update waters, especially if the resolution is higher than 2.0 Å).

MIRBuild Obligatory keyword for MODE mirbuild.

The number after atoms indicates the approximate number of atoms to be placed into the MIR(AS) MAPIN2 map. It should correspond to the total number of atoms expected to be in the model. The number after models specifies how many different models can be output. It may be 1, 2 or 3. These different models are subsequently used for multiple refinement and weighted averaging.

SHAKemodel Obligatory keyword for MODE shakemodel. There are four optional subkeywords.

The number after bexcl is the highest temperature factor cutoff. Atoms with higher temperature factors will be excluded from the PDB file.

The two numbers after breset define the low and high limits for truncation of atomic temperature factors.

The number after randomise defines the r.m.s. uniform random shift in Å to be applied to the coordinate set.

The three numbers after shift define the systematic shift in Å along the crystallographic axes to be applied to the coordinate set.

LABIn Obligatory keyword for MODE reflaver. Input MTZ file labels for structure factors from multiple refinements have to be given, e.g. FP=FP SIGFP=SIGFP FC1=FC1 PHIC1=PHIC1 etc. The maximum number of FCx/PHICx is 8. free is optional.
LABOut Obligatory keyword for MODE reflaver. Output MTZ file labels for weighted average structure factors, phases and figures of merit should be provided.
END Must be the last data card terminating input to ARP/wARP.
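For MODE reflaver the LABIN card grows with the number of refinements being averaged. The sketch below builds such a card following the FCx/PHICx pattern and the limit of 8 given above. It is an illustration only: reflaver_labin is a hypothetical helper, and it assumes that the column labels in your MTZ file are literally FP, SIGFP, FC1, PHIC1 and so on, which must be checked against the actual file.

```shell
# Hypothetical helper, not part of ARP_WATERS: print a LABIN card for
# $1 refined models, assuming MTZ labels FP, SIGFP, FC1/PHIC1 ... FCn/PHICn.
reflaver_labin() {
    nmodels=$1
    if [ "$nmodels" -lt 1 ] || [ "$nmodels" -gt 8 ]; then
        echo "number of models must be between 1 and 8" >&2
        return 1
    fi
    line="LABIN FP=FP SIGFP=SIGFP"
    i=1
    while [ "$i" -le "$nmodels" ]; do
        line="$line FC$i=FC$i PHIC$i=PHIC$i"
        i=$((i + 1))
    done
    echo "$line"
}

reflaver_labin 3
```

The optional free label is not generated here and can be added by hand.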


On-line help

The ARP/wARP input pre-processor gives warnings or error messages if something is wrong. These should be carefully checked. It is also advisable to check ARP/wARP input prior to submitting a long refinement job.

Here are a few examples of how the on-line commands can be used. To start just type 'arp_waters' and then the keyword you are interested in.

arp_waters
END
Input must start with the keyword MODE

arp_waters
MODE
Keyword MODE must be followed by 1 field(s)

Expected format:

MODE update waters/allatoms
MODE mirbuild
MODE shakemodel light/allatoms
MODE reflaver

arp_waters
MODE UPDATE WATERS
Optional keywords:
CELL cell parameters
REFINE waters/allatoms

Required keywords:
SYMM spacegroup
RESOLUTION resmin resmax
FIND ATOMS number CHAIN string CUTSIGMA number/AUTO
and/or REMOVE ATOMS number CUTSIGMA number [MERGE number] [KEEP ZEROOCC]
END (must be the last keyword)

arp_waters
MODE UPDATE WATERS
CELL
An error message:

This Data Card in not understood
Keyword CELL must be followed by 6 field(s)

Expected format:

CELL a b c alpha beta gamma

arp_waters
MODE UPDATE WATERS
CELL 30 45 37 90 90 90 A
This Data Card in not understood
CELL 30 45 37 90 90 90 A
Cannot accept field shown by arrows:
CELL 30 45 37 90 90 90 ==>A<==

arp_waters
MODE UPDATE WATERS
CELL 30 45 37 90 90 90
SYMM 4
RESOLUTION 20 1.5
FIND ATOMS 10 CHAIN W CUTSIGMA 3.0
REMOVE ATOMS 10 CUTSIGMA 1.0
END
Asymmetric unit limits 1/1   1/2   1/1

Comments: Space group 4 P21
Comments: Cell parameters 30.000 45.000 37.000 90.000 90.000 90.000
Comments: Remove 10 old atoms if below 1.0 sigma in MAPIN1
Comments: Analyse waters only for removal
- WARNING - This is not a standard use of ARP
- use of MERGE data card is advisable

Comments: Look for 10 new atoms in MAPIN2
Above threshold of 3.0 sigma
- WARNING - This is not a standard use of ARP
- use of CUTSIGMA AUTO option is recommended
- assuming that MAPIN2 is Fo-Fc map

Comments: New atoms will not be put closer than 2.30 to existing atoms
Comments: New atoms will be selected if there is N or O exists within 3.30
Comments: New atoms will not be put closer than 2.30 to each other
Comments: New atoms will have B-factors assigned on the basis of MAPIN2
- density hight as expected for resolution range 1.50 20.00
- MAPIN2 is assumed to be Fo-Fc map in absolute scale
Comments: New atoms will have chain name W

- No real space refinement will be made
- WARNING - This is not a standard use of ARP
- real space refinement of waters is advisable

So ARP/wARP actually accepts the command file input and the program only gives comments and warnings (if everything else is formally correct). It will also make additional checks during the run.


Monitoring and Troubleshooting

Input Processing

ARP/wARP checks that the input cell parameters agree with those in the coordinate and map file headers. ARP/wARP does not check whether the cell parameters are meaningful at all, i.e. it will accept CELL 67.1 82.2 79.9 102.2 98.9 100.3 together with SYMM P212121.
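Because this consistency is not checked by the program, a small test in the command script can catch the most obvious slips. The sketch below covers one case only, an orthorhombic space group such as P212121, where all three cell angles must be 90 degrees. The function name cell_is_orthorhombic is hypothetical and the test is not part of ARP_WATERS.

```shell
# Hypothetical helper, not part of ARP_WATERS: succeeds when all three cell
# angles are exactly 90 degrees, as required for an orthorhombic space group.
# Arguments: a b c alpha beta gamma
cell_is_orthorhombic() {
    awk -v al="$4" -v be="$5" -v ga="$6" 'BEGIN {
        ok = (al == 90 && be == 90 && ga == 90)
        exit (ok ? 0 : 1)
    }'
}

# The cell quoted in the text, which ARP/wARP would accept with SYMM P212121:
if cell_is_orthorhombic 67.1 82.2 79.9 102.2 98.9 100.3; then
    echo "cell angles consistent with an orthorhombic space group"
else
    echo "cell angles are not all 90 degrees: check CELL against SYMM"
fi
```

Other crystal systems need their own conditions on cell lengths and angles; this sketch does not attempt them.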

ARP/wARP checks whether the orthogonalisation matrix derived from CELL is consistent with the matrix written at the top of the coordinate file.

ARP/wARP will refuse to accept a negative value for the number of atoms to update, but does not check whether these numbers are too high, i.e. whether they are consistent with the formula given above.

ARP/wARP does not check whether the input MAPIN1 is indeed a (3F_o-2F_c / 2mF_o-DF_c, alpha_c) map or if MAPIN2 is really a (F_o-F_c / mF_o-DF_c, alpha_c) map.

ARP/wARP does not check the input coordinate file in terms of proper connectivity, residue and atom names, etc.

Output

ARP/wARP outputs several useful quantities. These are: the number of atoms merged, the number of atoms removed, the sphericity functions indicating whether atoms are well shaped - a value of about 0.05 to 0.10 (the lower the better) is reasonable, the result of improvement of the sphericity function if sphere-based real space refinement is used, the statistically significant threshold in difference density (if FIND cutsigma auto is provided) for addition of new atoms, the number of atoms added.

The auto option attempts to be objective in adding atoms. The actual number of atoms removed depends on both the REMOVE cutsigma value and the atoms number. If too little removal is requested while the structure is being reshuffled, not enough new atoms will be found. If the requested number for removal is too high (but still satisfies the formula given above), more new atoms will be found. A situation where in each cycle ARP/wARP removes fewer than about 2-3 atoms (for a typical structure of 1,000 to 3,000 atoms), finds the same number of new ones, and the R factor does not change indicates that convergence has been achieved. There is no reason to run millions of cycles. Usually refinement essentially converges after 10 to 20 cycles. However, if the density is still getting better the number of cycles can be increased to 50 or even 100.

Viewing ARP_WATERS Log Files

It is important to monitor the ARP/wARP output. In general look at log files. All ARP log files can be formatted for viewing all kinds of interesting graphs with CCP4 program xloggraph by running 'arp_waters_plots.sh log_file_name'.

Checking Convergence

Several parameters can be used as convergence criteria. The first criterion is map quality. A map with coefficients (3F_o-2F_c/2mF_o-DF_c, alpha_c) is calculated from the last ARP model. The crystallographic R factor is a reasonable quantity to monitor.

What to do if the R factor stays at the values around 30%:
(Check with something like grep 'all_R' logs/1_arp_1.log) If, for example, after 5 or 10 cycles R has dropped to 28-34% and stayed there for the next 10 cycles without any tendency to drop further, you may be in trouble. Try changing from the Fast to the Slow protocol or vice versa, try introducing phase restraints, change advanced parameters, panic, cry, etc.! We are working on more sensible suggestions all the time, so as a last resort contact us! Your feedback is needed and appreciated!

Crashing Scripts

Usually CCP4 defines the environment variable MANPATH as an addition to the existing MANPATH. During execution of remote shells MANPATH does not exist, and this crashes remote scripts! Copy the ccp4.setup file to your local directory, remove the line containing setenv MANPATH, and then set ccp4init to that file.

Please also check (and change if necessary) the line setenv CCP4_OPEN NEW to setenv CCP4_OPEN UNKNOWN.
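Both edits to the setup file can be made in one pass with sed. This is a sketch only: patch_ccp4_setup is a hypothetical helper, the paths in the usage line are placeholders, and it assumes that each setenv statement sits on a line of its own. Inspect the resulting file before pointing ccp4init at it, since the layout of ccp4.setup differs between installations.

```shell
# Hypothetical helper, not part of CCP4: write a patched copy of a ccp4.setup
# file with every "setenv MANPATH" line removed and "setenv CCP4_OPEN NEW"
# changed to "setenv CCP4_OPEN UNKNOWN".
# $1 = original setup file, $2 = patched copy to write
patch_ccp4_setup() {
    sed -e '/setenv[[:space:]][[:space:]]*MANPATH/d' \
        -e 's/setenv[[:space:]][[:space:]]*CCP4_OPEN[[:space:]][[:space:]]*NEW/setenv CCP4_OPEN UNKNOWN/' \
        "$1" > "$2"
}

# Usage (placeholder paths):
#   patch_ccp4_setup /path/to/ccp4.setup ./ccp4.setup.local
```

The original file is left untouched, so the change is easy to undo.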


Examples

A typical set of ARP/wARP commands for applications #1, 2, 5 and 6 (unrestrained or restrained refinement for MR, MIR, ab initio solutions or building of solvent structure) could look something like this:

arp_waters XYZIN input_coordinates MAPIN1 3Fo-2Fc_map_file \
MAPIN2 Fo-Fc_map_file XYZOUT output_coordinates << eof
MODE update allatoms/waters
[CELL cell parameters]
[REFINE waters/allatoms]
SYMM spacegroup
RESOLUTION resmin resmax
FIND atoms number chain string cutsigma number/auto
REMOVE atoms number cutsigma number [merge number] [keep zeroocc]
END
eof

Keywords FIND and REMOVE are half optional; by that we mean that at least one of them must be given. Both MAPIN1 (3Fo-2Fc / 2mFo-DFc, alpha_c) and MAPIN2 (Fo-Fc / mFo-DFc, alpha_c) maps must be provided.

Another typical set of ARP/wARP commands, this time for application #2 (filling the MIR(AS) map with a set of pseudo protein atoms for further unrestrained refinement or multiple refinements):

arp_waters MAPIN2 Fo-Fc_map_file XYZOUT1/2/3 output_coordinates << eof
MODE mirbuild
CELL cell parameters
SYMM spacegroup
RESOLUTION resmin resmax
MIRBUILD atoms number models number
END
eof

Input MAPIN2 is the available starting map. Several models for multiple refinements are output to XYZOUT1/XYZOUT2/XYZOUT3.

Yet another typical set of ARP/wARP commands, now for application #3 (obtaining different independent models for multiple refinement):

arp_waters XYZIN input_file XYZOUT output_file << eof
MODE shakemodel light/allatoms
[CELL cell parameters]
SYMM spacegroup
SHAKEMODEL [ bexcl n1 ] [ breset n1 n2 ] [ randomise x ] [ shift x y z ]
END
eof

And another typical set of ARP/wARP commands, again for application #3 (averaging of multiple refinements of different independent models):

arp_waters HKLIN mul_ref_Fs HKLOUT nice_output << eof
MODE reflaver
RESOLUTION resmin resmax
LABIN input labels for
FP  SIGFP  [FREE]  FCx  PHICx
LABOUT output labels for
FCAVER  PHAVER  FOMAVER
END
eof



ARP_WATERS and SHELXL

SHELXL is part of the SHELX-97 program package and should be obtained directly from the author, George M. Sheldrick, Göttingen University SHELX homepage.

The most common use of ARP/wARP with SHELXL (SHELX-97) is for restrained refinement with individual atomic anisotropic displacement parameters (as provided by SHELXL) combined with updating of the solvent structure by ARP/wARP. This application is limited by the fact that individual atomic anisotropic displacement parameters can be refined only if the resolution of the X-ray data is higher than 1.5 Å, ideally approaching atomic resolution (1.2 Å).

There are currently no automated scripts for this application. An old-style command shell script is given in the $CEXAM/unix/non-runnable directory (arp_waters_shelx.com). The script includes iterative runs of the following programs:

1.
SHELXL (SHELX-97) for restrained anisotropic refinement
Some recommendations for the shelx.ins file:
CGLS 2. Use of more cycles within SHELXL lowers the ARP_WATERS contribution
CELL, LATT/SYMM and SHEL should be consistent with cell, symm and resol in the script
WPDB -1
LIST 3
ISOR and CONN should include O1 > last - as the number of waters changes with each cycle
See the SHELX-97 Manual for further details.

2.
PREPFORM (ARP/wARP Suite) for conversion of SHELXL files

3.
F2MTZ (CCP4) for conversion to the CCP4 MTZ format
Column label assignments should be edited if necessary

4.
CAD (CCP4) for sorting the MTZ file
Column label assignments should be edited if necessary

5.
FFT (CCP4) for map calculation
One map is calculated with coefficients 3Fo-2Fc, another with Fo-Fc
Column label assignments should be edited if necessary

6.
EXTEND (CCP4) for map extension

7.
ARP_WATERS (ARP/wARP Suite) for solvent update
The maximum number of atoms to add and to remove should not exceed the value of 0.08 x N / dmax^3, where N is the current number of atoms in the model and dmax is the high resolution limit.

8.
PREPSHEL (ARP/wARP Suite) for back conversion to SHELXL format

When writing a shell script take care to define the following variables at the top of the file: name (root file name), last (starting file number), cycles (number of refinement cycles), count, title, resol (resolution limits), cell (cell parameters), grid (grid for map calculation), xyzlim (boundaries for real space asymmetric unit for ARP_WATERS), symm (space group number) and sfsg (space group for map calculation).


Simple toxd example script found in $CEXAM/unix/runnable/

  • arp_waters.exam (Example of finding waters.)

Comprehensive example scripts found in $CEXAM/unix/non-runnable/

  • arp_waters_refmac.com
  • arp_waters_sfall.com
  • arp_waters_shelx.com

SEE ALSO

protin refmac fft mapmask