PROFESSS (CCP4: Supported Program)

NAME

professs - determination of NCS operators from heavy atoms

SYNOPSIS

professs XYZIN foo.pdb [XYZOUT bar.pdb]
[Keyworded input]

DESCRIPTION

'professs' is a tool to help identify NCS-related atoms from a list of heavy atom positions. It assembles atom triplets into similar triangles, applying any necessary symmetry operators. It is less easy to use than 'findncs', but unlike that program it runs extremely quickly.

'professs' takes as input a list of the heavy atom sites, either as a PDB or an ".ha" file. Crystallographic symmetry equivalents are generated for the sites, and the extended list is searched for triangles of 3 atoms with all three spacings less than a given cutoff distance (see the DISTANCE keyword). The results are sorted and tabulated according to the sum of the 3 distances, so that the user may identify equivalent triangles belonging to NCS-related molecules.
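The triangle search described above can be sketched in a few lines of Python. This is only an illustration, not the program's own code: the function name find_triangles and the example coordinates are hypothetical, the sites are assumed to be in orthogonal Angstrom coordinates, and the generation of crystallographic symmetry equivalents is left out.

```python
from itertools import combinations
from math import dist

def find_triangles(sites, cutoff=25.0):
    """Return (sum_of_distances, (i, j, k)) for every triplet of sites whose
    three interatomic distances are all below cutoff, sorted by the sum."""
    triangles = []
    for i, j, k in combinations(range(len(sites)), 3):
        d_ij = dist(sites[i], sites[j])
        d_jk = dist(sites[j], sites[k])
        d_ki = dist(sites[k], sites[i])
        if max(d_ij, d_jk, d_ki) < cutoff:
            triangles.append((d_ij + d_jk + d_ki, (i, j, k)))
    triangles.sort()
    return triangles

# Hypothetical sites (illustration only); the last one is too far away
# from the others to take part in any triangle at the 25 A cutoff.
sites = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (40.0, 0.0, 0.0)]
triangles = find_triangles(sites)
```

Triangles with nearly equal sums of distances are the natural candidates to compare as NCS-related pairs.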

It then finds the NCS operators relating the three atoms which generate pairs of similar triangles. Additional atoms which obey the same operators are added to the list over several cycles. The related atoms are then reduced to a common basic set so that the operators can be compared without the confusing effects of crystallographic symmetry. Any 'loops' within the resulting groups, associated with proper rotational NCS, are listed. Duplicate operators are identified and removed.
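As an illustration of how an operator can be derived from a pair of similar triangles, the sketch below attaches an orthonormal frame to each triangle and builds the rotation and translation that carry one frame onto the other. This is an assumption made for illustration, not necessarily the method used inside 'professs'; all function names are hypothetical, the triangles are assumed to be congruent and listed in matching atom order, and collinear atoms are not handled.

```python
from math import sqrt

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    norm = sqrt(sum(x * x for x in a))
    return tuple(x / norm for x in a)

def frame(p1, p2, p3):
    """Right-handed orthonormal frame (e1, e2, e3) attached to a triangle."""
    e1 = unit(sub(p2, p1))
    e3 = unit(cross(e1, sub(p3, p1)))   # normal to the triangle plane
    e2 = cross(e3, e1)
    return (e1, e2, e3)

def rotate(rot, p):
    return tuple(sum(rot[i][j] * p[j] for j in range(3)) for i in range(3))

def operator_between(tri_a, tri_b):
    """Rotation rot and translation trans such that rot * a + trans = b."""
    fa, fb = frame(*tri_a), frame(*tri_b)
    # rot maps each basis vector of frame A onto the same one of frame B.
    rot = tuple(tuple(sum(fb[k][i] * fa[k][j] for k in range(3))
                      for j in range(3)) for i in range(3))
    trans = sub(tri_b[0], rotate(rot, tri_a[0]))
    return rot, trans

# Hypothetical triangles: tri_b is tri_a rotated by 90 degrees about z
# and shifted by 10 A along x.
tri_a = ((0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0))
tri_b = ((10.0, 0.0, 0.0), (10.0, 3.0, 0.0), (6.0, 0.0, 0.0))
rot, trans = operator_between(tri_a, tri_b)
mapped = [tuple(r + t for r, t in zip(rotate(rot, p), trans)) for p in tri_a]
```

With real heavy atom sites the triangles only match approximately, so a least-squares superposition such as that performed by 'lsqkab' is the more appropriate tool.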

After all unique operators are found they are sorted according to the number of atom pairs and the loop order, and a list of operators output in this order.

The angles between the best operators with proper rotational NCS are tabulated. This can help indicate whether there is higher NCS symmetry in the set of sites; e.g. hexamers or tetramers require that there are orthogonal NCS operators between the same atom sets: 3-folds perpendicular to 2-folds for hexamers, or 3 orthogonal sets of 2-folds for a tetramer.
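The angle between two rotational operators can be illustrated as follows. The sketch extracts the rotation axis from each rotation matrix and reports the acute angle between the axes. The function names are hypothetical and the matrices are idealised examples; how 'professs' itself computes the tabulated angles is not described here. The identity matrix has no axis and is not handled.

```python
from math import acos, degrees, sqrt

def rotation_axis(rot):
    """Unit rotation axis of a proper rotation matrix (sign is arbitrary).

    Uses R + R^T - (trace(R) - 1) I = 2 (1 - cos(theta)) n n^T, which also
    works for 2-folds, where the antisymmetric part of R vanishes.
    """
    trace = rot[0][0] + rot[1][1] + rot[2][2]
    m = [[rot[i][j] + rot[j][i] - (trace - 1.0) * (i == j) for j in range(3)]
         for i in range(3)]
    # Every non-zero column is parallel to the axis; take the longest one.
    columns = [[m[0][j], m[1][j], m[2][j]] for j in range(3)]
    col = max(columns, key=lambda c: sum(x * x for x in c))
    norm = sqrt(sum(x * x for x in col))
    return [x / norm for x in col]

def angle_between_axes(rot_a, rot_b):
    """Acute angle in degrees between the rotation axes of two operators."""
    a, b = rotation_axis(rot_a), rotation_axis(rot_b)
    dot = abs(sum(x * y for x, y in zip(a, b)))
    return degrees(acos(min(1.0, dot)))

# Idealised operators: a 2-fold about x and a 3-fold about z.
s = sqrt(3.0) / 2.0
two_fold_x = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
three_fold_z = [[-0.5, -s, 0.0], [s, -0.5, 0.0], [0.0, 0.0, 1.0]]
angle = angle_between_axes(two_fold_x, three_fold_z)
```

A 3-fold perpendicular to a 2-fold, as in this example, is the arrangement expected for a hexamer.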

If XYZOUT is assigned, a PDB file is output. If keyword LIST is requested, this PDB file contains first the triplets of atoms which make up each triangle, with each one given a different segment ID. For teaching purposes this file may be fed into 'lsqkab' to determine the operators relating the original triangles. The file then gives the full list of atom sets including all additional related atoms.

If LIST is not specified, XYZOUT will have only the full list of atom sets including all additional related atoms for the operators which generate the most complete matches.

INPUT/OUTPUT FILES

XYZIN
Input PDB file containing the heavy atom positions. If a CRYST1 record is present in the file, this will also provide the unit cell dimensions, and possibly the spacegroup. Atoms are renumbered according to their input order, and identified throughout the output by this serial number along with the symmetry operator applied.
XYZOUT (OPTIONAL)
Output PDB file containing two sets of coordinate listings. The atom number throughout will reflect the input order. Atom names, B factors and occupancies are unchanged but the Chain ID, SEGID, and residue numbers will be altered.
  1. triangles of atoms, grouped for input to 'lsqkab'. Each triangle is given a separate seg ID, and each atom within the triangle is numbered, 1, 2 or 3. The atom order is chosen such that the distances are ranked d12 < d23 < d31 (if the differences in the distances are less than the distance tolerance - an isosceles triangle - then both orderings are produced).
  2. full set of related atoms. The second half contains the largest full set(s) of matched atoms. Here the chain IDs are given as F for the first, and S for the second elements. There are some "REMARK" records to give the operators relating the paired sets.
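The atom ordering rule in item 1 above (d12 < d23 < d31, with both orderings produced for near-isosceles triangles) can be sketched as below. This is one possible reading of the rule rather than the program's own logic: the function name ranked_orderings is hypothetical, and two distances are treated as equal when they differ by less than the tolerance.

```python
from itertools import permutations
from math import dist

def ranked_orderings(triangle, tolerance=1.0):
    """Return every vertex ordering (p1, p2, p3) with d12 <= d23 <= d31,
    where distances differing by less than tolerance count as equal."""
    orderings = []
    for p1, p2, p3 in permutations(triangle):
        d12, d23, d31 = dist(p1, p2), dist(p2, p3), dist(p3, p1)
        if d12 < d23 + tolerance and d23 < d31 + tolerance:
            orderings.append((p1, p2, p3))
    return orderings

# Hypothetical triangles (illustration only).
scalene = ((0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0))    # sides 3, 4, 5
isosceles = ((0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 3.0, 0.0))  # two sides of 3
scalene_orders = ranked_orderings(scalene, tolerance=0.5)
isosceles_orders = ranked_orderings(isosceles, tolerance=0.5)
```

The scalene triangle has a single valid ordering, while the isosceles one yields two, which is why a larger tolerance increases the number of triangles written out.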

KEYWORDED INPUT

CELL, DISTANCE, END, LIST, SYMMETRY, TIDYINPUT, TOLERANCE, VERBOSE

CELL <a> <b> <c> <alpha> <beta> <gamma>

Unit cell parameters. Override the cell parameters given in XYZIN.

SYMMETRY <spacegroup_name>

Space group symmetry. Override the spacegroup given in XYZIN.

DISTANCE <distance>

Maximum interatomic distance for analysis in Angstroms. Atom pairs further apart than this distance will be ignored. Default: 25Å.

TOLERANCE <tolerance>

Tolerance on interatomic distances in Angstroms. Distances differing by less than this distance will be considered equal. This is useful when triangles are approximately isosceles or equilateral, in which case the atom order will be ambiguous. Specifying a tolerance will cause all equivalent triangles to be produced.

The tolerance is used in the second stage to choose which atoms will be included into the match sets. After determination of the operators, atom pairs within this distance will be added to the list.

Try 1-3 Angstroms, depending on the quality of your heavy atom positions. Default: 1.0.

TIDYINPUT [FRAC <Xfc> <Yfc> <Zfc>] [ORTH <Xoc> <Yoc> <Zoc>]

Tidy up the input coordinates to place them close to the specified coordinate. This occurs before the rest of the calculation. Symmetry and cell numbers in the log file will refer to the tidied coordinates. If the keyword is given without a coordinate, then the atoms will be placed close to the origin with slight preference for the positive octant. Default: not set.
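The idea behind tidying can be illustrated for fractional coordinates with the short sketch below. It is a simplification under stated assumptions: the function name tidy_fractional is hypothetical, only whole unit cell translations are considered (the program may also choose among symmetry equivalents), and the slight preference for the positive octant is not modelled.

```python
def tidy_fractional(site, target=(0.0, 0.0, 0.0)):
    """Shift a fractional coordinate by whole unit cells so that each
    component lies within half a cell of the target coordinate."""
    return tuple(x - round(x - t) for x, t in zip(site, target))

# Hypothetical fractional coordinate, tidied towards the origin and
# towards the centre of the cell.
near_origin = tidy_fractional((1.2, -0.7, 2.45))
near_centre = tidy_fractional((1.2, -0.7, 2.45), target=(0.5, 0.5, 0.5))
```

Because only lattice translations are applied, the tidied site is crystallographically identical to the input site.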

VERBOSE

Generate a few extra diagnostics. Default: not set.

LIST

  1. All triplets of atoms which make up suitable triangles are output to XYZOUT (if assigned).
  2. All non-identical complete atom sets related by an operator are output to XYZOUT (if assigned). Without LIST, the default is that only sets containing at least half the maximum number of atoms found in a set are output.

END

End input.
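A keyword script can be assembled and passed to the program from Python, as in the sketch below. The helper names professs_keywords and run_professs are hypothetical, as are the file names in the comment. It assumes that 'professs' is on the PATH and reads its keyworded input from standard input, as is usual for CCP4 programs. Only keywords documented above are used.

```python
import subprocess

def professs_keywords(distance=25.0, tolerance=1.0, list_all=False, verbose=False):
    """Build the keyworded input for professs as a single string."""
    lines = [f"DISTANCE {distance:g}", f"TOLERANCE {tolerance:g}"]
    if list_all:
        lines.append("LIST")
    if verbose:
        lines.append("VERBOSE")
    lines.append("END")
    return "\n".join(lines) + "\n"

def run_professs(xyzin, xyzout, keywords):
    """Run professs with the keywords on standard input (assumption: a CCP4
    installation provides the 'professs' executable on the PATH)."""
    return subprocess.run(["professs", "XYZIN", xyzin, "XYZOUT", xyzout],
                          input=keywords, text=True,
                          capture_output=True, check=True)

script = professs_keywords(distance=30, tolerance=1.5, list_all=True)
# Example call with hypothetical file names:
# result = run_professs("sites.pdb", "ncs_sets.pdb", script)
```

The log, including the tables of triangles and operators, would then be available as result.stdout.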

Reading the Output:

The program first lists the triangles of atoms which it has found, then it analyses each pair of triangles as a possible NCS match. For each possible operator, a list of all matching atoms is given. For each pair of atoms, a 'loop factor' is listed. If the NCS operator is an N-fold rotation, the atom will be part of a 'loop' of N atoms (unless one is missing). This, along with an appropriate 3rd polar angle, can confirm the existence of a proper NCS operator.
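The notion of a loop can be illustrated with the sketch below, which follows the atom pairs produced by one operator until the chain either closes or breaks. It does not reproduce the program's 'loop factor', whose exact definition is not given here; the function name find_loops and the atom numbers are hypothetical, and each atom is assumed to map onto at most one partner.

```python
def find_loops(partner):
    """Given a mapping atom -> atom it is carried onto by one operator,
    return the closed cycles ('loops') as lists of atoms."""
    loops, seen = [], set()
    for start in partner:
        if start in seen:
            continue
        path, atom = [], start
        # Follow the chain until it leaves the mapping or revisits an atom.
        while atom in partner and atom not in path:
            path.append(atom)
            atom = partner[atom]
        seen.update(path)
        if atom == start:        # the chain closed on itself
            loops.append(path)
    return loops

# Hypothetical pairs: atoms 1, 2, 3 form a loop of order 3, while the
# chain 4 -> 5 is open because the partner of atom 5 is missing.
pairs = {1: 2, 2: 3, 3: 1, 4: 5}
loops = find_loops(pairs)
orders = [len(loop) for loop in loops]
```

A loop of length N is what one would expect from a proper N-fold rotation acting on a complete set of sites.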

Atoms are numbered according to the input order, and identified by this serial number along with the symmetry operator applied. This is coded by 4 numbers listed in square brackets. The first of these is the number of the crystallographic symmetry operator, and the other three are the unit cell translations applied after the symmetry operator.
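How such a four-number code acts on a site can be sketched as follows, working in fractional coordinates. The function name apply_symmetry_code is hypothetical, and the sketch assumes that operators are numbered from 1 in the order given by the spacegroup; check the log file for the numbering actually used. The example uses the two operators of P2(1) with unique axis b.

```python
def apply_symmetry_code(frac, code, operators):
    """Apply symmetry operator number code[0] (assumed to count from 1),
    then the unit cell translations code[1:4], to a fractional coordinate."""
    number, ta, tb, tc = code
    rot, trans = operators[number - 1]
    moved = [sum(rot[i][j] * frac[j] for j in range(3)) + trans[i]
             for i in range(3)]
    return (moved[0] + ta, moved[1] + tb, moved[2] + tc)

# P2(1), unique axis b: x,y,z and -x,y+1/2,-z
P21 = [
    (((1, 0, 0), (0, 1, 0), (0, 0, 1)), (0.0, 0.0, 0.0)),
    (((-1, 0, 0), (0, 1, 0), (0, 0, -1)), (0.0, 0.5, 0.0)),
]

# Second operator followed by a translation of one cell along a.
site = apply_symmetry_code((0.1, 0.2, 0.3), (2, 1, 0, 0), P21)
```

Here the site (0.1, 0.2, 0.3) is carried to approximately (0.9, 0.7, -0.3).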

If you expect higher orders of NCS, check the table of angles between the best operators. This can help indicate whether there is higher NCS symmetry in the set of sites; e.g. hexamers or tetramers require that there are related NCS operators between the same atoms.

A PDB file may be output.

Problems:

AUTHOR

Kevin Cowtan, York (originally named 'eleanorinabox').

SEE ALSO

dm, lsqkab