REVISE (CCP4: Supported Program)

NAME

revise - estimates FM using MAD data, where FM is an optimised value of the normalised anomalous scattering.

SYNOPSIS

revise hklin foo_in.mtz hklout foo_out.mtz
[Keyworded input]

DESCRIPTION

REVISE is for MAD data only.

When anomalous data are collected at several wavelengths, as in a MAD experiment, the relative scale of the data can be affected by, for example, differing absorption effects or changes in beam intensity. Because MAD data must be extremely accurate, these fluctuations can decide whether the data succeed or fail in solving the structure. If the data can be re-scaled to smooth out or remove the fluctuations, it may be easier to determine the positions of the anomalous scatterers. This is what REVISE aims to do.

REVISE is used to modify MAD data in order to estimate FM, the normalised anomalous scattering magnitude as defined in Equation (2). The difference between FM and the normalised structure factor E is that FM contains a scale factor and a temperature factor. FM can be used to calculate anomalous Patterson maps or for input to Direct Methods to locate the positions of anomalous scatterers.

The program is based on two features which only exist in MAD data. For each reflection, one can write these equations:

  [(FPHn(-))**2 - (FPHn(+))**2]/f"n = constant              (1)

  [(FH'n)**2 + (FH"n)**2]/[(f'n)**2 + (f"n)**2] = (FM)**2   (2)

where:

 FPHn(+) - total structure factor for h,k,l.
 FPHn(-) - total structure factor for -h,-k,-l.
 FH'n - real part of anomalous scattering structure factor.
 FH"n - imaginary part of anomalous scattering structure factor.
 n - wavelength n.
 f'n - real component of the anomalous scattering factor.
 f"n - imaginary component of the anomalous scattering factor.
The constant in Equation (1) and FM in Equation (2) should both be independent of wavelength.
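
As a rough illustration of Equation (1), the Python sketch below evaluates its left-hand side at each wavelength for a single reflection. The function name and the amplitudes are made up for this example and are not part of REVISE; the amplitudes are constructed so that the equation holds exactly.

```python
def equation1_values(f_pos, f_neg, fdp):
    """Return [(FPHn(-))**2 - (FPHn(+))**2] / f"n for each wavelength n."""
    return [(neg * neg - pos * pos) / f2
            for pos, neg, f2 in zip(f_pos, f_neg, fdp)]


# Made-up amplitudes for one reflection at three wavelengths, constructed so
# that Equation (1) gives the same constant (100) at every wavelength.
fdp = [3.285, 2.058, 3.663]        # f" for each wavelength
f_neg = [500.0, 498.0, 503.0]      # FPHn(-)
f_pos = [(neg * neg - 100.0 * f2) ** 0.5 for neg, f2 in zip(f_neg, fdp)]  # FPHn(+)

values = equation1_values(f_pos, f_neg, fdp)
spread = max(values) - min(values)  # near zero when Equation (1) is satisfied
print(values, spread)
```

With real data the spread would not be zero; a large spread suggests wavelength-dependent fluctuations of the kind the program tries to reduce.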

FPHn(+) and FPHn(-) can be modified to satisfy Equation (1), minimising any fluctuations. Equation (2) is used to estimate the total anomalous scattering for each reflection [reference 1]. The program uses trial and error to find a suitable range and aims to minimise the differences in FM between the wavelengths. A figure of merit indicates the minimum point, and the program then takes the average of the FM values from the individual wavelengths as the final value of FM. In general, using FM rather than anomalous differences can lead to better results when determining the positions of the anomalous scatterers.
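
The sketch below shows how Equation (2) yields one FM estimate per wavelength and how those estimates can be averaged. It is only an illustration: the function names are hypothetical, the FH'n and FH"n values are synthetic (generated from an assumed FM of 12), and the trial-and-error search and figure of merit used by the program are not reproduced.

```python
import math


def fm_from_equation2(fh_real, fh_imag, fpr, fdp):
    """Estimate FM at one wavelength from Equation (2)."""
    return math.sqrt((fh_real ** 2 + fh_imag ** 2) / (fpr ** 2 + fdp ** 2))


def average_fm(estimates):
    """Average the per-wavelength FM estimates into a single value."""
    return sum(estimates) / len(estimates)


# (f', f") pairs; FH'n and FH"n are synthetic, generated from an assumed FM of 12.
scattering = [(-1.622, 3.285), (-8.198, 2.058), (-6.203, 3.663)]
assumed_fm = 12.0
estimates = [fm_from_equation2(abs(fpr) * assumed_fm, fdp * assumed_fm, fpr, fdp)
             for fpr, fdp in scattering]
final_fm = average_fm(estimates)
print(estimates, final_fm)
```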

The program can handle up to ten wavelength anomalous scattering data sets; at least two are required. REVISE does not require native protein data (FP); if FP is present, it is simply copied from the input file to the output file.

KEYWORDED INPUT

The various data control lines are identified by keywords. Only the first 4 characters of a keyword are significant. The keywords can be in any order, except END (if present) which must be the last. Numbers and characters in "[ ]" are optional. The compulsory keywords are LABIN and WAVE. The available keywords are:
END, EXCL, LABIN, LABOUT, RESO, TITLE, WAVE
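
The four-character rule can be pictured with a small Python sketch. The helper name is hypothetical, and upper-case input is assumed because that is what the examples in this document use; the program's own parser may behave differently in other respects.

```python
KEYWORDS = ["END", "EXCL", "LABIN", "LABOUT", "RESO", "TITLE", "WAVE"]


def match_keyword(word):
    """Return the full keyword whose first four characters match `word`, else None."""
    key = word[:4]
    for keyword in KEYWORDS:
        if keyword[:4] == key:
            return keyword
    return None


print(match_keyword("LABI"), match_keyword("LABOUT"), match_keyword("END"))
```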

TITLE <title>

This is a character title which replaces the old title in the MTZ file.

RESO <rmax>

<rmax> is the high-resolution limit in Angstroms. If this keyword is absent, all reflections in the file are used.

LABIN <program label>=<file label>...

This COMPULSORY keyword defines which entries in the MTZ file are to be used in the calculation. The following <program label> can be assigned:
  FP        SIGFP        DP        SIGDP
  FPH1(+)   SIGFPH1(+)   FPH1(-)   SIGFPH1(-)
  FPH1      SIGFPH1      DPH1      SIGDPH1
  FPH2(+)   SIGFPH2(+)   FPH2(-)   SIGFPH2(-)
  FPH2      SIGFPH2      DPH2      SIGDPH2
...........
  FPH10(+)  SIGFPH10(+)  FPH10(-)  SIGFPH10(-)
  FPH10     SIGFPH10     DPH10     SIGDPH10
Example:
  LABI -
  FPH1(+)=FP1 SIGFPH1(+)=SFP1 FPH1(-)=FN1 SIGFPH1(-)=SFN1 -
  FPH2(+)=FP2 SIGFPH2(+)=SFP2 FPH2(-)=FN2 SIGFPH2(-)=SFN2 -
  FPH3(+)=FP3 SIGFPH3(+)=SFP3 FPH3(-)=FN3 SIGFPH3(-)=SFN3
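
For files with many wavelengths, the LABIN assignments can be generated rather than typed. The following Python sketch assumes the file labels follow the FPn/SFPn/FNn/SFNn pattern of the example above; the function name is hypothetical and the labels in a real MTZ file will generally differ.

```python
def labin_lines(n_sets):
    """Build a LABIN keyword for n_sets wavelengths using FPn/SFPn/FNn/SFNn file labels."""
    if not 2 <= n_sets <= 10:
        raise ValueError("REVISE handles between two and ten data sets")
    rows = [
        f"FPH{n}(+)=FP{n} SIGFPH{n}(+)=SFP{n} FPH{n}(-)=FN{n} SIGFPH{n}(-)=SFN{n}"
        for n in range(1, n_sets + 1)
    ]
    # A trailing "-" continues the keyword on the next line, as in the example above.
    return "LABI -\n" + " -\n".join(rows)


print(labin_lines(3))
```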

LABOUT <program label>=<file label>...

This keyword allows the user to assign their own labels to the extra entries created in the output file. All labels specified in LABIN will automatically be in the output file. The following <program label> can be assigned:
  FPHM1(+)  SIGFPHM1(+)  FPHM1(-)  SIGFPHM1(-)
  FPHM1     SIGFPHM1     DPHM1     SIGDPHM1
  FPHM2(+)  SIGFPHM2(+)  FPHM2(-)  SIGFPHM2(-)
  FPHM2     SIGFPHM2     DPHM2     SIGDPHM2
....................
  FPHM10(+) SIGFPHM10(+) FPHM10(-) SIGFPHM10(-)
  FPHM10    SIGFPHM10    DPHM10    SIGDPHM10
  FM        SIGFM
Example:
  LABO -
  FPHM1(+)=FP1_mod  SIGFPHM1(+)=SFP1_mod - 
  FPHM1(-)=FN1_mod  SIGFPHM1(-)=SFN1_mod - 
  FPHM2(+)=FP2_mod  SIGFPHM2(+)=SFP2_mod -
  FPHM2(-)=FN2_mod  SIGFPHM2(-)=SFN2_mod -  
  FPHM3(+)=FP3_mod  SIGFPHM3(+)=SFP3_mod -  
  FPHM3(-)=FN3_mod  SIGFPHM3(-)=SFN3_mod -
  FM=FM_RE SIGFM=SFM_RE

WAVE <No. of data set> [LAM <wavelength>] [FPR <f'>] [FDP <f">]

The COMPULSORY keyword WAVE specifies the wavelength at which each data set was collected (WAVE 1..... WAVE 10). It is essential to specify f' (FPR) and f" (FDP) at each of these wavelengths. The values do not need to be very accurate, but they must at least be of the correct order of magnitude.
Example:

  WAVE 1 LAM 0.9000  FPR -1.622  FDP 3.285
  WAVE 2 LAM 0.9795  FPR -8.198  FDP 2.058
  WAVE 3 LAM 0.9809  FPR -6.203  FDP 3.663
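
When preparing input programmatically, it can help to check WAVE lines before running the program. The Python sketch below reads one line of the form shown above into a dictionary; the function name is hypothetical and this is not the parser REVISE itself uses.

```python
def parse_wave(line):
    """Split a WAVE line into its data set number and LAM/FPR/FDP values."""
    tokens = line.split()
    if not tokens or tokens[0][:4] != "WAVE":
        raise ValueError("not a WAVE line")
    record = {"set": int(tokens[1])}
    if not 1 <= record["set"] <= 10:
        raise ValueError("data set number must be between 1 and 10")
    # The remaining tokens come in (name, value) pairs such as LAM 0.9000.
    for name, value in zip(tokens[2::2], tokens[3::2]):
        record[name] = float(value)
    return record


print(parse_wave("WAVE 1 LAM 0.9000  FPR -1.622  FDP 3.285"))
```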

EXCL [RISO <riso>] [RANO <rano>] [SIGM <sigm>]

This optional keyword allows the criteria for excluding data to be set.

If [|DISO|/FPH] > <riso> then rejection occurs. Default: <riso> = 0.10 (10%);
If [|DANO|/FPH] > <rano> then rejection occurs. Default: <rano> = 0.50 (50%);
If [|FPH|/SIGFPH] < <sigm> then rejection occurs. Default: <sigm> = 0.0.

Example:

EXCL RISO 0.15 RANO 0.40 SIGM 3.0
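
The three exclusion tests can be written as a short Python sketch. The function name is hypothetical, DISO and DANO are taken as given inputs because this document does not spell out how they are computed, and FPH and SIGFPH are assumed to be positive.

```python
def is_rejected(fph, sigfph, diso, dano, riso=0.10, rano=0.50, sigm=0.0):
    """Apply the three EXCL tests to one reflection; True means the reflection is rejected."""
    if abs(diso) / fph > riso:      # |DISO|/FPH too large
        return True
    if abs(dano) / fph > rano:      # |DANO|/FPH too large
        return True
    if abs(fph) / sigfph < sigm:    # reflection too weak relative to its sigma
        return True
    return False


# Default limits, then the limits from the example line above.
print(is_rejected(100.0, 5.0, 12.0, 10.0))
print(is_rejected(100.0, 5.0, 12.0, 10.0, riso=0.15, rano=0.40, sigm=3.0))
```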

END

This states that the end of input has been reached. If present, this must be the last keyword.

INPUT AND OUTPUT FILES

The input files are the keyword file and a standard MTZ reflection data file.
Input:
HKLIN input data file (MTZ).
Output:
HKLOUT output data file (MTZ).

Here are the definitions for each label:

 Name           Item

 H, K, L        Miller indices.

 FP             F value for native protein.
 SIGFP          Sigma(FP).
 DP             Anomalous difference for native data.
 SIGDP          Sigma(DP).

 FPHn(+)        FPH(h,k,l) for wavelength 'n'.
 SIGFPHn(+)     Sigma(FPHn(+)).
 FPHn(-)        FPH(-h,-k,-l) for wavelength 'n'.
 SIGFPHn(-)     Sigma(FPHn(-)).

 FPHn           FPHn = 0.5 * (FPHn(+) + FPHn(-)).
 SIGFPHn        Sigma(FPHn).
 DPHn           DPHn = FPHn(+) - FPHn(-).
 SIGDPHn        Sigma(DPHn).

 FPHMn(+)       Modified FPH(h,k,l) for wavelength 'n'.
 SIGFPHMn(+)    Sigma(FPHMn(+)).
 FPHMn(-)       Modified FPH(-h,-k,-l) for wavelength 'n'.
 SIGFPHMn(-)    Sigma(FPHMn(-)).

 FPHMn          FPHMn = 0.5 * (FPHMn(+) + FPHMn(-)).
 SIGFPHMn       Sigma(FPHMn).
 DPHMn          DPHMn = FPHMn(+) - FPHMn(-).
 SIGDPHMn       Sigma(DPHMn).

 FM             anomalous contributions after applying REVISE.
 SIGFM          Sigma(FM).
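
The relations between the (+)/(-) columns and the mean/difference columns defined above can be checked with a few lines of Python. The function names are hypothetical, and the sigma columns are left out because their formulae are not given here.

```python
def mean_and_difference(f_pos, f_neg):
    """Return (FPHn, DPHn) from FPHn(+) and FPHn(-)."""
    return 0.5 * (f_pos + f_neg), f_pos - f_neg


def friedel_pair(f_mean, d_ano):
    """Invert the relations above: return (FPHn(+), FPHn(-)) from FPHn and DPHn."""
    return f_mean + 0.5 * d_ano, f_mean - 0.5 * d_ano


f_mean, d_ano = mean_and_difference(102.0, 98.0)
print(f_mean, d_ano, friedel_pair(f_mean, d_ano))
```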

PRINTER OUTPUT

The log output starts with details of the input keyword data lines, followed by information from the input MTZ file. If any illegal input is found in the keyword data lines, an error message is printed and the program stops.

Statistics of the ratio

[(FPHn(-))**2 - (FPHn(+))**2] / [(FPHm(-))**2 - (FPHm(+))**2]

between data set n and data set m are then printed in 10 resolution ranges, both before and after the revise procedure. It follows from Equation (1) that this ratio should equal f"n/f"m, and the revise procedure will tend to ensure this. Distributions of the ratios before the revise procedure are also given as XLOGGRAPH plots, and these can be used as a guide to data quality.
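
To make the comparison concrete, the Python sketch below computes the ratio for one reflection and the value f"n/f"m it should approach. The function names and the amplitudes are made up for illustration; the f" values are those from the WAVE example.

```python
def observed_ratio(f_pos_n, f_neg_n, f_pos_m, f_neg_m):
    """Ratio of the Equation (1) numerators for data sets n and m."""
    return (f_neg_n ** 2 - f_pos_n ** 2) / (f_neg_m ** 2 - f_pos_m ** 2)


def expected_ratio(fdp_n, fdp_m):
    """Value the ratio should take according to Equation (1): f"n / f"m."""
    return fdp_n / fdp_m


target = expected_ratio(3.285, 2.058)            # data sets 1 and 2 of the WAVE example
ratio = observed_ratio(8.0, 10.0, 4.0, 5.0)      # made-up amplitudes
print(target, ratio)
```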

Details of the output file are printed at the end of log file.

REFERENCE

  1. Fan Hai-fu, Woolfson, M.M. & Yao Jia-xing (1993). Proc. R. Soc. Lond. A 442, 13-32.

AUTHORS

Yao Jia-xing and Eleanor Dodson.

EXAMPLES

revise \
hklin $HOME/test.mtz \
hklout $SCRATCH/test-revise.mtz \
<< eof
TITLE   testing revise
LABI -
FPH1=FSe1 SIGFPH1=SIGFSe1 DPH1=DSe1 SIGDPH1=SIGDSe1 -
FPH2=FSe2 SIGFPH2=SIGFSe2 DPH2=DSe2 SIGDPH2=SIGDSe2 -
FPH3=FSe3 SIGFPH3=SIGFSe3 DPH3=DSe3 SIGDPH3=SIGDSe3


LABO -
FPHM1=FSe1_mod SIGFPHM1=SIGFSe1_mod DPHM1=DSe1_mod -
SIGDPHM1=SIGDSe1_mod FPHM2=FSe2_mod SIGFPHM2=SIGFSe2_mod -
DPHM2=DSe2_mod SIGDPHM2=SIGDSe2_mod FPHM3=FSe3_mod -
SIGFPHM3=SIGFSe3_mod DPHM3=DSe3_mod SIGDPHM3=SIGDSe3_mod -
FM=FM_RE SIGFM=SFM_RE


WAVE 1 LAM 0.9000  FPR -1.622  FDP 3.285
WAVE 2 LAM 0.9795  FPR -8.198  FDP 2.058
WAVE 3 LAM 0.9809  FPR -6.203  FDP 3.663

EXCL RISO 0.15 RANO 0.40 SIGM 3.0

END
eof
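
The same job could also be launched from Python instead of a shell script. The sketch below is a minimal wrapper under stated assumptions: the helper names are hypothetical, the `revise` executable is assumed to be on the PATH, and the keyword text is passed on standard input just as the here-document does above. Shell variables such as $HOME are not expanded by `subprocess`, so full paths should be supplied.

```python
import subprocess


def revise_command(hklin, hklout, executable="revise"):
    """Return the argument list matching the SYNOPSIS: revise hklin ... hklout ..."""
    return [executable, "hklin", hklin, "hklout", hklout]


def run_revise(hklin, hklout, keywords, executable="revise"):
    """Run REVISE with the keyword input supplied on standard input."""
    return subprocess.run(
        revise_command(hklin, hklout, executable),
        input=keywords,        # the text that would sit between "<< eof" and "eof"
        text=True,
        capture_output=True,   # the log output is returned in .stdout
        check=False,
    )


print(revise_command("test.mtz", "test-revise.mtz"))
```

Calling `run_revise("test.mtz", "test-revise.mtz", keyword_text)` would then return a `CompletedProcess` whose `stdout` holds the log output.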