MTZUTILS (CCP4: Supported Program)

NAME

mtzutils - Reflection data files utility program

SYNOPSIS

mtzutils hklin[1] foo_in_i.mtz [ hklin2 foo_in_j.mtz ] hklout foo.mtz
[Keyworded input]

DESCRIPTION

The MTZ utility program MTZUTILS is provided for the purpose of creating a new re-arranged or edited MTZ reflection data file from one or two existing files. Consider also CAD, which has similar functions and should be used if you are not sure your input files are in the same region of reciprocal space.

KEYWORDED INPUT

The program reads one or two input reflection data files and creates a single output reflection data file. The basic functions are selected using the file control option specification keywords. The available keywords and options are:
(i) General options
AXIS, CELL, COLUMN, HEADER, HISTORY, ONEFILE, RESOLUTION, RUN | GO | END, RZONE, SCALE, SORT, SYMMETRY, TITLE
(ii) File Control Options
CONCAT, EXCLUDE, INCLUDE, MERGE, ONEFILE, UNIQUE

RUN | GO | END

Terminate the keyworded input and start processing. [Optional, but advisable.]

SYMMETRY

Space group name or number. Replace Symmetry information in the output MTZ file.

Note that MTZUTILS should not be used to change the spacegroup of multirecord MTZ files, since it will not update the reflection indices or symmetry flags appropriately for the new spacegroup - use REINDEX instead.

SORT_ORDER h,k,l

Input a combination of the strings H, K, and L. This is stored but not currently used.

CELL <a> <b> <c> [ <alpha> <beta> <gamma> ]

Replace cell information in the output MTZ file (alpha, beta, gamma default to 90.00). This will update the cell dimensions for all datasets in the output file. If you want finer control, then use the CAD program, or the corresponding CCP4i interface "Edit MTZ Datasets".

HISTORY <string>

Add <string> to the history stack.
History lines are written to the output file in the following order, until MAXHIS lines have been reached:
  1. Key_Word History (the strings given with this keyword)
  2. File_Number_1 History
  3. File_Number_2 History

TITLE [ <file> ] NOCHANGE | REPLACE <title> | ADD <title>

Edit MTZ Titles. Examples:
   TITLE 1 NOCHANGE 
   TITLE 2 NOCHANGE
   TITLE NOCHANGE            # ==> from File_Number_1
   TITLE REPLACE string    # ==> from File_Number_1
   TITLE 1 REPLACE string  # ==> from File_Number_1
   TITLE 2 REPLACE string  # ==> from File_Number_2
   TITLE ADD string        # ==> from File_Number_1
   TITLE 1 ADD string      # ==> from File_Number_1
   TITLE 2 ADD string      # ==> from File_Number_2

COLUMN_LABELS [ <file> ] <program label>=<file label>...

Edit column label names. Examples:

  COLUMN_LABELS Tom=Huey Dick=Dewey Harry=Luey     
                       # ==> from File_Number_1
  COLUMN_LABELS 1 Tom=Huey Dick=Dewey Harry=Luey   
                       # ==> from File_Number_1
  COLUMN_LABELS 2 Tom=Huey Dick=Dewey Harry=Luey   
                       # ==> from File_Number_2

For the MERGE option NO column editing is allowed.

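Column editing is carried out FIRST, before the INCLUDE and EXCLUDE assignments are processed, so those keywords must refer to the edited labels. For example, with a set of files with column labels:

      File_1    H K L A B
      File_2    H K L A C

   use keywords such as: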
       COLU 2 A=D
       INCLUDE 1 A B
       INCLUDE 2 D C
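
The order of operations in this example can be sketched in Python. This is only an illustration and not MTZUTILS code: the helper names edit_labels and include are hypothetical, and the order of the columns in the output list is an assumption.

```python
def edit_labels(labels, renames):
    """Apply COLUMN_LABELS-style renames (old label -> new label)."""
    return [renames.get(label, label) for label in labels]


def include(labels, wanted):
    """Return the wanted labels, failing if any is absent from the file."""
    missing = [label for label in wanted if label not in labels]
    if missing:
        raise ValueError("labels not found: " + " ".join(missing))
    return list(wanted)


file1 = ["H", "K", "L", "A", "B"]
# COLU 2 A=D is applied before INCLUDE 2 is looked at.
file2 = edit_labels(["H", "K", "L", "A", "C"], {"A": "D"})

output = ["H", "K", "L"] + include(file1, ["A", "B"]) + include(file2, ["D", "C"])
print(output)  # ['H', 'K', 'L', 'A', 'B', 'D', 'C']
```

Because the rename happens first, INCLUDE 2 A would fail in this sketch: file 2 no longer has a column called A.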

HEADER NONE | BRIEF | HIST | ALL

Controls printing of MTZ header information. The option is one of the following:
NONE (default)
no header output
BRIEF
brief header output
HIST
brief, with MTZ history
ALL
full header output from MTZ reads

RESOLUTION <limit1> <limit2>

Minimum and maximum resolution limits for the output file.
The two limits may be given in either order, either as resmin resmax or as smin smax.

AXIS <zone>

Output reflection file restricted to given zone(s)
Use one or more of: H00, 0K0, 00L, HH0, -HH0, HHH, HK0, 0KL, H0L, HHL

RZONE <rzone>

To select a zone, RZONE must be followed by 5 integers: the coefficients of h, k and l, then the modulus and the remainder, e.g.

              +h +k +l = 3n

==>      RZONE 1  1  1   3    0

and
                     l = 2n + 1

==>      RZONE 0  0  1   2    1
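
The condition expressed by the five integers can be written out as a short Python sketch. This is an illustration of the arithmetic in the two examples above, not code from MTZUTILS; the function name in_rzone and the argument names n1, n2, n3, m and r are hypothetical.

```python
def in_rzone(h, k, l, n1, n2, n3, m, r):
    """True if n1*h + n2*k + n3*l = m*n + r for some integer n."""
    return (n1 * h + n2 * k + n3 * l - r) % m == 0


# RZONE 1 1 1 3 0  selects h + k + l = 3n
print(in_rzone(1, 1, 1, 1, 1, 1, 3, 0))   # True
print(in_rzone(1, 0, 0, 1, 1, 1, 3, 0))   # False

# RZONE 0 0 1 2 1  selects l = 2n + 1
print(in_rzone(4, 2, -3, 0, 0, 1, 2, 1))  # True
```

Python's % operator returns a non-negative result for a positive modulus, so negative indices are handled without special cases.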

SCALE ...

Scale columns by multiplying their values by an input scale factor.
Input may be in one of the following styles:
SCALE ALL J scale_value OR SCALE ALL I scale_value
Scales all intensities in the file (column type = J) by the value 'scale_value'.
SCALE ALL F scale_value
As above, but scales all columns of type F.
SCALE ALL D -1
Reverses the sign of all anomalous values in the MTZ file.
SCALE label_a1 ... label_an scale_value1 ... label_n1 ... label_nn scale_valuen
Applies each scale value to the labels that precede it.

SCALE in MTZUTILS is useful if you wish to scale columns in multi-record MTZ files.

Warning: For the SCALE ALL F and SCALE ALL J options the program also attempts to scale any associated sigma values, anomalous differences and their sigmas (if present). If the labels are not in a standard format then the program may try to scale the wrong columns. In this case you may need to scale specific column labels, as in the final style above.

The SCALE input may only be used with the ONEFILE option.
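
The label form of the keyword, in which one or more labels are followed by the scale value that applies to them, can be illustrated with a small Python sketch. It is not the parser used by MTZUTILS: the function name parse_scale is hypothetical, and the sketch assumes that no column label is itself a valid number.

```python
def parse_scale(tokens):
    """Map each label to the scale value that follows its group of labels."""
    scales = {}
    pending = []  # labels read since the last scale value
    for token in tokens:
        try:
            value = float(token)
        except ValueError:
            pending.append(token)  # not a number, so treat it as a label
            continue
        for label in pending:
            scales[label] = value
        pending = []
    if pending:
        raise ValueError("labels without a scale value: " + " ".join(pending))
    return scales


# Arguments of: SCALE F1 SIGF1 1.5 F2 2.0
print(parse_scale("F1 SIGF1 1.5 F2 2.0".split()))
# {'F1': 1.5, 'SIGF1': 1.5, 'F2': 2.0}
```

Here F2 is scaled without a sigma column, which is the case that makes the program print a warning and continue.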

ONEFILE

This is compulsory if only one file is being used with the EXCLUDE/INCLUDE options and you have assigned HKLIN2. If HKLIN2 isn't assigned, ONEFILE will be assumed.

FILE CONTROL OPTION SPECIFICATION

These options enable the user to select a particular task. Available options are INCLUDE, EXCLUDE, UNIQUE, CONCAT and MERGE. Options INCLUDE and EXCLUDE are specific to a single input file, while options UNIQUE, CONCAT and MERGE apply to both input files. Option ONEFILE is required if there is just a single input file. As only two input files are allowed, the file specification options INCLUDE and EXCLUDE cannot be used with options UNIQUE, MERGE or CONCAT.

If there is only one input file then the ONEFILE option is needed after the keywords for the first input file and succeeding keywords (except the cell dimensions, if required) are omitted.

When operating on two files, data items that are not defined in a record will be set to the missing number flag. For instance, when using the MERGE option, the columns unique to file 2 will not be defined when writing out a record from file 1, and vice versa. Note that multi-record files should not contain missing reflections, thus the output file could not be used as a normal multi-record file.

(a) File control option INCLUDE

This option applies to one of the input files. The keyword INCLUDE is followed by a list of column labels of the data items to be copied to the output file. Column labels for h, k and l should not be given among these labels. If any of the labels requested for inclusion is not found in the input file, the job will be aborted. Title editing and column label editing instructions are allowed.

(b) File control option EXCLUDE

This option applies to one of the input files. The keyword EXCLUDE is followed by a list of column labels of the data items to be excluded when creating the output file. Column labels corresponding to h, k and l should not appear among these labels, as they are taken care of automatically. The program aborts if any requested label is not found among the edited column labels of the file. Title editing and column label editing keywords are allowed.

Examples:


   EXCLUDE Tom Dick Harry     # ==> from File_Number_1
   EXCLUDE 1 Tom Dick Harry   # ==> from File_Number_1
   EXCLUDE 2 Tom Dick Harry   # ==> from File_Number_2

   INCLUDE Tom Dick Harry     # ==> from File_Number_1
   INCLUDE 1 Tom Dick Harry   # ==> from File_Number_1
   INCLUDE 2 Tom Dick Harry   # ==> from File_Number_2
   INCLUDE                    # ==> ALL columns from File_Number_1
   INCLUDE ALL                # ==> ALL columns from File_Number_1
   INCLUDE 1 ALL              # ==> ALL columns from File_Number_1
   INCLUDE 2 ALL              # ==> ALL columns from File_Number_2

(c) File control option ONEFILE

If the option INCLUDE or EXCLUDE is used when only one input file is required, the file control option ONEFILE should be used to indicate to the program that only one input file exists.

(d) File control option UNIQUE

This file control option specifies that each column of the two input files with a unique label is to be copied to the output file and that whenever a particular reflection appears in both the input files, the data should be merged into a single record of the output file. Note that unique columns are recognised from the edited labels of the input files. If a column label is found in both the input files then the data value from the first file is copied to the output record unless it is the distinguished missing value, in which case the value from the second file (HKLIN2) will be copied to the output.
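
The rule for choosing a value when a label occurs in both files can be sketched in Python. This illustrates the rule described above and is not MTZUTILS code: the function name unique_merge and the dictionary layout are hypothetical, and NaN is used as a stand-in for the missing number flag.

```python
import math

MISSING = float("nan")  # stand-in for the missing number flag (assumption)


def is_missing(value):
    return isinstance(value, float) and math.isnan(value)


def unique_merge(file1, file2):
    """Each file maps (h, k, l) -> {label: value}; returns one record per hkl."""
    labels = []
    for records in (file1, file2):
        for row in records.values():
            for label in row:
                if label not in labels:
                    labels.append(label)
    merged = {}
    for hkl in sorted(set(file1) | set(file2)):
        row1 = file1.get(hkl, {})
        row2 = file2.get(hkl, {})
        out = {}
        for label in labels:
            value = row1.get(label, MISSING)
            if is_missing(value):
                # File 1 has nothing usable, so fall back to file 2.
                value = row2.get(label, MISSING)
            out[label] = value
        merged[hkl] = out
    return merged


file1 = {(1, 0, 0): {"F": 10.0, "SIGF": 1.0},
         (2, 0, 0): {"F": MISSING, "SIGF": MISSING}}
file2 = {(2, 0, 0): {"F": 7.0, "PHI": 90.0},
         (3, 0, 0): {"F": 5.0, "PHI": 45.0}}

merged = unique_merge(file1, file2)
print(merged[(1, 0, 0)]["F"])  # 10.0, taken from file 1
print(merged[(2, 0, 0)]["F"])  # 7.0, file 1 value is missing so file 2 is used
```

Columns that exist in only one file stay at the missing value for reflections found only in the other file.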

Both the input files should have identical labels for h, k and l otherwise the output file will contain three extra columns containing the values of h, k and l derived from the second input file. The output file from this option is of the single record/reflection type (a normal MTZ file).

WARNING: if the input data sets are not properly sorted on h, k and l (the first three columns), the output file may become of mixed type with both types of record present.

(e) File control option CONCAT

This file control option specifies that the data records of the two input files are to be copied to the output file. The option is used to create a multiple record type output file from the two input files by merging them. The output file contains the edited titles from both input files and the edited labels from the first input file. The labels of the first input file should be edited to become identical to the existing labels in the second input file; otherwise the program will abort.

In this option the reflections are sorted. For sorting purposes the missing number flag, for both files, is changed to a large negative value. This is similar to SORTMTZ. The missing number flag is then reset to that of the first file or the default (NaN).
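
As a rough Python sketch of this option: the records of both files are pooled and sorted, after checking that the labels agree. This is not MTZUTILS code. The function name concat is hypothetical, sorting on h, k and l alone is an assumption, and the handling of the missing number flag is left out.

```python
def concat(labels1, rows1, labels2, rows2):
    """Pool the records of two files that share identical column labels."""
    if labels1 != labels2:
        raise ValueError("column labels differ; edit the labels of file 1 first")
    # sorted() is stable, so for equal h, k, l the file 1 record comes first.
    return sorted(rows1 + rows2, key=lambda row: row[:3])


labels = ["H", "K", "L", "I", "SIGI"]
rows1 = [(1, 0, 0, 120.0, 4.0), (2, 0, 0, 80.0, 3.0)]
rows2 = [(1, 0, 0, 118.0, 5.0), (3, 0, 0, 40.0, 2.0)]

result = concat(labels, rows1, labels, rows2)
for row in result:
    print(row)
```

The reflection 1 0 0 appears twice in the result, which is what makes the output a multiple record type file.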

(f) File control option MERGE

This option creates a multi-record type merged MTZ file from two input MTZ files. The columns in the two input files need not be identical. The column labels in the output file will be the common labels from the two files, the unique labels from file 1 and the unique labels from file 2.
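
The set of output labels described above can be sketched in Python. This is an illustration only: the function name merge_labels and the example labels are hypothetical, and the order in which the three groups are written to the output file is an assumption.

```python
def merge_labels(labels1, labels2):
    """Common labels, then labels unique to file 1, then those unique to file 2."""
    common = [label for label in labels1 if label in labels2]
    only1 = [label for label in labels1 if label not in labels2]
    only2 = [label for label in labels2 if label not in labels1]
    return common + only1 + only2


native = ["H", "K", "L", "F", "SIGF"]
derivative = ["H", "K", "L", "M/ISYM", "BATCH", "I", "SIGI"]

out_labels = merge_labels(native, derivative)
print(out_labels)
# ['H', 'K', 'L', 'F', 'SIGF', 'M/ISYM', 'BATCH', 'I', 'SIGI']
```

A record copied from the first file has no values for M/ISYM, BATCH, I and SIGI, so those items are set to the missing number flag in the output.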

Title Editing Option

The title of the output file is derived from the titles of the input files after appropriate editing based on the title editing options. At present, the available options are REPLACE, NOCHANGE and ADD. For each input file there may be one keyword of this type. The option REPLACE indicates that the existing title of the input file should be replaced by a new title before copying to the output file. The new title information is given on the same line. The option NOCHANGE (default) indicates that the title information of the input file should be copied as it is to the output file. The option ADD indicates that the output file should have all the title information supplied on the rest of the line, along with that of the input file.

Column Label Editing Options

These keywords allow the re-naming of column labels of an input file before copying to the output file. For the options UNIQUE and CONCAT, these apply to the first input file. The keywords contain entries of the type

           label1=label2

where label1 is an existing column label and label2 is the replacing label for the particular column. There should be at least one blank between two such assignment statements. The statements may be spread over COLUMN keywords if required and are terminated by. If label1 is not found for a particular assignment then label2 is tried; if that also fails, the program continues after giving an error message.

INPUT AND OUTPUT FILES

The input files are:
HKLIN1
The first input reflection data file
HKLIN2
The optional second input reflection data file
HKLOUT
The output reflection data file

The output file is a reflection data file, normally in standard MTZ format. It may be of mixed record type if the option UNIQUE is used with unsorted input files, and it will be of multiple record type if the option CONCAT is used. The missing number flag for the output file is either the one set in the first file or the default, NaN. The input files need not have the same missing number flag.

NOTES

Where a value is not taken from an input file, a default value of 0 will be supplied for the output file.

PRINTER OUTPUT

The line printer output gives details of the input reflection data files as they are read, details of the commands input to the program and details of the output reflection data file which has been created.

PROGRAM FUNCTION

The MTZ utility program is provided for the purpose of creating a new re-arranged or edited MTZ file from one or two existing files. The program cannot perform any calculation on the data items appearing in the data records. The program is meant to create a new file with
  1. re-arranged columns
  2. re-named column labels
  3. changed title information
  4. items selected from one or two existing files
  5. multiple records for each reflection from two single record files.

The program cannot be used to exclude any data records or to create a file with multiple header labels. The program leaves the input files unaltered and deletes dummy labels before copying to the output file.

BUGS

The functions of this program should be expressed in terms of the relational algebra operations (join, project, select etc.). (MTZ files basically contain two RDB tables: the header information, keyed on keywords (like CELL), and the reflections, keyed on the combined H, K and L columns.) If it were rewritten with this in mind it might be clearer and less buggy.

SEE ALSO

cad

EXAMPLES

UNIQUE example


    #!/bin/sh
    mtzutils     hklin1 fvb_f.mtz \
                 hklin2 2hfl_vhsearch.mtz \
                 hklout unique.mtz \
                 << eof
    HISTORY  testing unique
    CELL 86.16 111.93 71.71 90.0 90.0 90.0
    HEADER ALL
    UNIQUE
    RUN
    eof

UNIQUE with column editing

    #!/bin/sh
    mtzutils     hklin1 fvb_f.mtz \
                 hklin2 2hfl_vhsearch.mtz \
                 hklout unique.mtz \
                 << eof
    SYMMETRY P21212
    HISTORY  testing unique
    CELL 86.16 111.93 71.71 90.0 90.0 90.0
    HEADER ALL
    COLUMNS fvb_F=tom fvb_SIGF=Harry
    COLUMNS 2 2hfl_F=DiCK
    UNIQUE
    RUN
    eof

EXCLUDE example

    #!/bin/sh
    mtzutils     hklin1 fvb_f.mtz \
                 hklin2 2hfl_vhsearch.mtz \
                 hklout unique.mtz \
                 << eof
    SYMMETRY P21212
    HISTORY  testing unique
    CELL 86.16 111.93 71.71 90.0 90.0 90.0
    HEADER ALL
    COLUMNS fvb_F=tom fvb_SIGF=Harry
    COLUMNS 2 2hfl_F=DiCK
    EXCLUDE fvb_DANO
    EXCLUDE 2 2hfl_PHCAL
    RUN
    eof

AXIS example

#!/bin/sh
mtzutils     hklin1 iv96.mtz \
             hklout h0l_0kl.mtz \
             << eof
ONEFILE
INCLUDE ALL
AXIS H0L 0KL 
RUN
eof

RZONE example

#!/bin/sh
mtzutils     hklin1 iv96.mtz \
             hklout zone.mtz \
             << eof
ONEFILE
INCLUDE ALL
! h+k = 2n
RZONE 1 1 0   2 0
RUN
eof

SCALE example

#!/bin/csh -f
mtzutils     hklin1 int1.mtz \
             hklout int1-3.mtz \
             << eof
ONEFILE
SCALE ALL I 3.0  ! could also use SCALE ALL J here
RUN
eof
-- This will attempt to find the sigma columns to scale as well --

SCALE example - changing the sign of your anomalous data


(This is sometimes necessary if your detector software hasn't been set up correctly.)
#!/bin/sh
mtzutils     hklin1 int.mtz \
             hklout int_new.mtz \
             << eof
ONEFILE
SCALE ALL D -1
RUN
eof

SCALE example - choosing specific columns

#!/bin/csh -f
mtzutils     hklin1 int.mtz \
             hklout int_new.mtz \
             << eof
ONEFILE
SCALE F1 SIGF1 1.5 F2 2.0
RUN
eof
-- This will print a warning that you are scaling a column
   without its sigma value - but proceeds anyway --

MERGE example - combine merged native and sorted derivative data

#!/bin/csh -f
mtzutils hklin2 m6cb3_sort.mtz \
         hklin1 m6c8_r \
         hklout temp_m6cb3_resort << eof
merge
eof
#
sortmtz hklin temp_m6cb3_resort hklout m6cb3_resort << eof
H K L M/ISYM BATCH
eof
-- Combine merged native and sorted derivative data by
   interleaving reflection records.
   The data must be re-sorted after this step. --

AUTHORS

Eleanor Dodson and Howard Terry