CONTACT (CCP4: Supported Program)

NAME

contact - computes various types of contacts in protein structures

SYNOPSIS

contact XYZIN foo.pdb
[Keyworded input]

DESCRIPTION

Program for computing various types of contacts in protein structures. Can also analyse water hydrogen bonding. The program uses a bricking algorithm in which atoms are segregated into 6x6x6 A boxes and contact searching is limited to neighbouring boxes; this is very fast. CONTACT reads a standard Brookhaven data bank file which must contain SCALE cards if MODE = IMOL or AUTO.
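
The bricking idea described above can be illustrated with a short Python sketch. This is not the CONTACT source code: the function name find_contacts, the data layout and the error check are assumptions made for illustration. Atoms are sorted into 6 A boxes and each atom is only compared with atoms in its own box and the 26 neighbouring boxes.

```python
from collections import defaultdict
from itertools import product
import math

BOX = 6.0  # box edge in Angstrom, as in the description above


def find_contacts(coords, dmin=0.0, dmax=3.6, box=BOX):
    """Return sorted (i, j, distance) for atom pairs with dmin <= d <= dmax.

    coords is a list of (x, y, z) tuples. Only the 27 boxes around each
    atom's own box are searched, so dmax must not exceed the box edge.
    """
    if dmax > box:
        raise ValueError("dmax larger than the box edge would miss contacts")

    # Step 1: put every atom index into the box that contains it.
    boxes = defaultdict(list)
    for idx, (x, y, z) in enumerate(coords):
        key = (math.floor(x / box), math.floor(y / box), math.floor(z / box))
        boxes[key].append(idx)

    # Step 2: compare each atom only with atoms in neighbouring boxes.
    contacts = []
    for (bx, by, bz), members in boxes.items():
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            neighbours = boxes.get((bx + dx, by + dy, bz + dz), [])
            for i in members:
                for j in neighbours:
                    if i < j:  # report each pair once
                        d = math.dist(coords[i], coords[j])
                        if dmin <= d <= dmax:
                            contacts.append((i, j, d))
    return sorted(contacts)
```

Because only adjacent boxes are visited, a pair further apart than one box edge can fall outside the search, which is why this sketch refuses a dmax above 6 A.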

Maximum residue number is 9000. Maximum number of atoms is 96000. Restrictions on the number of symmetry operations allowed in IMOL and AUTO modes, present in older versions, have been eased.


  ( PARAMETER (MXATOM=96000,MXRES=9000,MXSYMM=230) )
  MXSYMM is: maximum number of symm. operations + 27

KEYWORDED INPUT

The program uses the routine PARSER to read in the control data. It only checks the first 4 characters of each keyword which can be upper or lower case. The order of the cards is not important and there are plenty of default values (see examples).
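
The four-character, case-insensitive matching described above can be sketched as follows. This is only an illustration in Python, not the PARSER routine itself; the function name match_keyword and the handling of unknown keywords are assumptions.

```python
KEYWORDS = (
    "AMODE", "ANGLE", "ATYPE", "BIGSEARCH", "FROM", "HEXCLUDE", "HOH",
    "LIMITS", "METAL", "MODE", "NOLIST", "SOURCE", "SPACEGROUP", "SYMM",
    "SYMTIT", "TARGET", "TITLE", "TO",
)


def match_keyword(card):
    """Return the keyword named by the first token of an input card.

    Only the first four characters are compared, ignoring case. Returns
    None for a blank card or an unrecognised keyword (an assumption of
    this sketch).
    """
    tokens = card.split()
    if not tokens:
        return None
    head = tokens[0].upper()[:4]
    for keyword in KEYWORDS:
        if keyword[:4] == head:
            return keyword
    return None
```

Under this rule "spac R3", "SPACEGROUP R3" and "Spacegroup R3" all select the same keyword, while SYMM and SYMTIT remain distinct because they differ in the fourth character.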

The possible keywords are:

AMODE, ANGLE, ATYPE, BIGSEARCH, FROM, HEXCLUDE, HOH, LIMITS, METAL, MODE, NOLIST, SOURCE, SPACEGROUP, SYMM, SYMTIT, TARGET, TITLE, TO

TITLE <title>

(default: no title)
Title used on the printer output. <title> is a character string of up to 75 characters.

MODE <mode>

<mode> = ALL,IRES,ISUB,IMOL or AUTO (default: MODE IRES).
ALL
for all interatomic distances for chosen residues.
IRES
for interresidual contacts for chosen residues. It is similar to 'ALL' except that distances between atoms of different residues only will be computed, and distances between main-chain atoms from adjacent residues are also suppressed.
ISUB
for intersubunit contacts (subunits must have different chain names in the Brookhaven file).
IMOL
for intermolecular contacts. This mode requires symmetry information, see keywords SYMM and SPACEGROUP. The program looks for contacts using the supplied symmetry operations. If the symmetry operators are supplied via a spacegroup specification, then the identity operator is removed. The identity operator can be specified explicitly using a SYMM card.

Main-chain to main-chain and side-chain to side-chain contacts are suppressed if the atoms are on the same or adjacent residues and the target symmetry operation is the identity.

AUTO
as IMOL, but additional (primitive) lattice translations are generated automatically and combined with the supplied space-group symmetry in a search for intermolecular contacts. The identity operator is suppressed for lattice translations equal to (0,0,0), so that contacts within the same asymmetric unit are not listed.

The default is to use only single translations (e.g. +A ,-A, +A-B ...etc), which works well if the molecule is reasonably positioned within the cell (not outside). To extend the search to a larger volume (up to two lattice translations in all directions) you must also specify the BIGSEARCH keyword.

Additional output for MODE AUTO

(As in the original contact.f, the default is SOURCE = all input atoms and TARGET = all input atoms.)

using


  contact xyzin   /homes/henrick/pdb/pdb4ins.ent <<eof
  MODE AUTO
  ATYPE ALL
  LIMITS   2 3.66
  eof

gives for LIMITS 2 3.66 Angstrom for PDB file 4ins

Sorted summation for Number of Contacts for atoms between symmetry related molecules Excluding water molecules


 Num contacts TransSymm       Symmetry
 =================================================
     995        555 002      2: -X+Y,  -X,  Z
     818        555 001      1: -Y,  X-Y,  Z
     337        554 007      7: -Y+1/3,  X-Y+2/3,  Z+2/3
     331        455 005      5: -X+Y+2/3,  -X+1/3,  Z+1/3
     179        554 002      2: -X+Y,  -X,  Z
     145        554 009      9:   X, Y, Z
     143        556 009      9:   X, Y, Z
      76        556 001      1: -Y,  X-Y,  Z
       9        554 008      8: -X+Y+1/3,  -X+2/3,  Z+2/3
       9        555 004      4: -Y+2/3,  X-Y+1/3,  Z+1/3
       4        555 008      8: -X+Y+1/3,  -X+2/3,  Z+2/3
       3        555 007      7: -Y+1/3,  X-Y+2/3,  Z+2/3
       3        454 005      5: -X+Y+2/3,  -X+1/3,  Z+1/3
       2        554 004      4: -Y+2/3,  X-Y+1/3,  Z+1/3

Note: the 2 Zinc insulin dodecamer requires the symmetry operations -X+Y, -X, Z and -Y, X-Y, Z (in addition to the identity); hence, in the above table of contacts, the cutoff at 6 Angstrom for a significant oligomer is clear.

For LIMITS 2 3.66 Angstrom one gets Sorted summation for Number of Contacts for atoms between symmetry related molecules Excluding water molecules


 Num contacts TransSymm       Symmetry
 =================================================
      49        555 002      2: -X+Y,  -X,  Z
      39        555 001      1: -Y,  X-Y,  Z
      10        554 007      7: -Y+1/3,  X-Y+2/3,  Z+2/3
       9        554 009      9:   X, Y, Z
       9        556 009      9:   X, Y, Z
       9        554 002      2: -X+Y,  -X,  Z
       8        455 005      5: -X+Y+2/3,  -X+1/3,  Z+1/3

Again the cutoff is clear.

Note: The summation of all atom contacts between symmetry related molecules excludes water molecules, but includes any other HETATM labelled atoms.
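
The TransSymm column in the tables above is not described further in this document. The Python sketch below shows one plausible reading, stated as an assumption: the three digits give the lattice translation along a, b and c with 5 meaning zero, and the second field is the symmetry operation number. This reading is consistent with example 8 in the EXAMPLES section, where contacts through symmetry operation 5 carry the translation label -A and the table lists 455 005. The function name decode_transsymm is hypothetical.

```python
def decode_transsymm(code):
    """Split a 'TransSymm' entry such as '554 007'.

    Returns ((na, nb, nc), symmetry_number), assuming each translation
    digit is offset by 5 so that '555' means no lattice translation.
    """
    trans, symm = code.split()
    shifts = tuple(int(digit) - 5 for digit in trans)
    return shifts, int(symm)
```

With this reading, "554 009" would be the identity operation combined with a translation of -1 along c.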

Again if there is no spacegroup name on the CRYST1 record of the input PDB file then the keyword SPACEGROUP can be used.

ATYPE <atype> (ALL or NON-CARBON)

(default: ATYPE ALL)
ALL
all types of atoms will be used in computations.
NON-CARBON
carbon atoms omitted.

AMODE (=ATYPE)

SPACEGROUP <spacegroup name>

followed by the spacegroup name as given in $LIBD/syminfo.lib e.g. 'R 3 :H' or 'H 3' or H3 or 146. Only required when MODE is IMOL or AUTO. SYMTitles are automatically generated. The identity operation is not applied for lattice translations equal to (0,0,0). If no SYMM or SPAC keyword is given, and symmetry information is required, the program will attempt to find the spacegroup from the CRYST1 line of the input PDB file.

BIGSEARCH

This keyword only takes effect for MODE AUTO, in which case it extends the search from +/-1 lattice vector (the default in MODE AUTO) to +/-2 lattice vectors in every direction (thus searching a larger volume of space).
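
The difference in search volume can be made concrete with a small Python sketch. It is an illustration only, not the program's code; the function names are hypothetical, and the suppression rule shown is the one described for symmetry taken from a spacegroup (identity skipped at translation (0,0,0)).

```python
from itertools import product


def lattice_translations(bigsearch=False):
    """Return all (na, nb, nc) lattice translations searched in MODE AUTO."""
    reach = 2 if bigsearch else 1
    return list(product(range(-reach, reach + 1), repeat=3))


def auto_search_list(symops, bigsearch=False):
    """Pair every symmetry operation with every lattice translation.

    The identity at translation (0, 0, 0) is skipped so that contacts
    within the same asymmetric unit are not listed.
    """
    pairs = []
    for op in symops:
        is_identity = op.replace(" ", "").upper() == "X,Y,Z"
        for shift in lattice_translations(bigsearch):
            if is_identity and shift == (0, 0, 0):
                continue
            pairs.append((op, shift))
    return pairs
```

The default gives 3 x 3 x 3 = 27 translations per symmetry operation, and BIGSEARCH gives 5 x 5 x 5 = 125, so the larger search costs roughly five times as many operator and translation combinations.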

SYMM <symmetry>

where <symmetry> is a symmetry operation as in the International Tables. Only required when MODE is IMOL or AUTO, and more than one card can be given. If the identity is not explicitly given, it will be automatically added for lattice translations not equal to (0,0,0). If the identity is explicitly given, it will be included for all lattice translations including (0,0,0).

SYMTIT <string>

(default: 'symmetry n')
<string> is any meaningful description (up to 15 characters long) of symmetries entered on SYMM cards. SYMTIT cards, if present, must be given in the same order as the SYMM cards. Example: SYMTIT 21 along a

If SYMTIT cards are not supplied, the symmetry description will be : symmetry 1, symmetry 2, etc...

If spacegroup symmetry operations are required, and have not been supplied by the SPACEGROUP or SYMM cards, then they will be taken from the input PDB file CRYST1 line if they are there. SYMTitles are automatically generated in this case.

ANGLE <angh> <ango> <dmin> <maxnb> <bmax>

(default: 120.0 90.0 2.3 4 50.0) Note: <maxnb> is an integer.

When using the ANGLE option, the program calculates the hydrogen position for those target nitrogen atoms where the hydrogen position is unambiguous (i.e., excluding NZ on Lys and the N terminus). The angle O...H...N is calculated and printed. For source...oxygen hydrogen bonds, the angle source...O...C is calculated, where C is the carbon bonded to the oxygen. Limits on both these angles (ANGH and ANGO) must be supplied, and bonds with angles less than these limits are rejected. Suitable values are 120 and 90 degrees.
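
The angle test can be sketched in Python as below. This is a minimal illustration, not the program's implementation: it assumes the hydrogen position is already known (how CONTACT builds it is not shown here), and the function names angle_deg and accept_hbond are hypothetical.

```python
import math


def angle_deg(a, vertex, b):
    """Angle a-vertex-b in degrees for three (x, y, z) points."""
    v1 = [p - q for p, q in zip(a, vertex)]
    v2 = [p - q for p, q in zip(b, vertex)]
    dot = sum(p * q for p, q in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to guard against rounding just outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def accept_hbond(oxygen, hydrogen, nitrogen, angh=120.0):
    """Keep the bond only if the O...H...N angle is at least angh."""
    return angle_deg(oxygen, hydrogen, nitrogen) >= angh
```

The same comparison, with the ANGO limit, would apply to the angle at the oxygen for source...oxygen bonds.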

The ANGLE option can be used to search for hydrogen bonds within the protein, and the bond angles will be calculated as described above. Note that mainchain-mainchain and sidechain-sidechain contacts within one residue or to an immediately adjacent residue are suppressed in this mode. The minimum bond length (DMIN) for bonds to be included and the maximum number of bonds (MAXNB) are read from the same card, in free format, as is the MAXIMUM temperature factor for a source atom to be included in contact search and water analysis (BMAX).

An analysis of all water-protein and water-water hydrogen bonds and the bond angles at the water oxygen is also given. Note: water residue name should be WAT and the water atom name must be O (NOT O1 or OW etc) when using this option.

The ANGLE option may be used in conjunction with all modes, but to ensure the numbers are correct (especially the number of waters in the first and second hydration shells) the IMOL (or even better AUTO) mode should always be used, and identity (symm x,y,z) included to generate all lattice contacts for analysis of water interactions (only water molecules should be selected as the SOURCE range in this case).

The ANGLE operation in this version should behave as in the original, with one possible exception (the behaviour of the original is not certain): in this version all waters with occupancy less than 1.0 are excluded from the water analysis.

In addition, both HOH and WAT are accepted as water residue names.

HOH (=ANGLE)

LIMITS <dmin> <dmax>

(default: LIMITS 0.0 3.6)
<dmin>
the minimum distance between atoms to be included in the printout
<dmax>
the maximum distance to be printed out. Because of the bricking algorithm (see DESCRIPTION section above) some contacts may be missed if <dmax> is greater than 6.0 A.

HEXCLUDE

Ignore all hydrogen atoms in the input file. Default is to include them if they are there.

NOLIST

Turn off printing of individual contacts and just print the overall summary.

SOURCE <n1> [<n2>]

n1, n2
range of source residues. You may input as many SOURCE cards as you like (within array limits). The maximum residue number allowed is 9000. If the range is made of just one residue only n1 is required.

TARGET <n1> [<n2>]

n1, n2
range of target residues. You may input as many TARGET cards as you like (within array limits). The maximum residue number allowed is 9000 (MXRES). If the range is made of just one residue only n1 is required.

FROM | TO

Atom selection can also be carried out using the keywords FROM and TO as used in the CCP4 program DISTANG, i.e. allowed inputs are:

  (i) based on atom numbers

  FROM ATOM 1 TO 561
  TO ATOM 1 TO 561

  (ii) based on residue numbers but now allowing for chain names

  FROM RESIDUE ALL CHAIN A 1 to 125
  TO   RESIDUE ALL CHAIN W 1 to 256

The general input expression is:

  FROM [ [ ATOM <inat0> [TO] <inat1> ] |
         [ RESIDUE <ires0> [TO] <ires1> ]] ...
            [ CHAIN <chainid> ALL | ONS | CA ]
  TO   [ [ ATOM <jnat0> [TO] <jnat1> ] |
         [ RESIDUE <jres0> [TO] <jres1> ]] ...
            [ CHAIN <chainid> ALL | ONS | CA ]
  
If ATOM is specified it is followed by the first and last atom numbers of the range: <inat0> and <inat1> for FROM, <jnat0> and <jnat1> for TO. I.e. FROM atoms <inat0> to <inat1> are checked against TO atoms <jnat0> to <jnat1>. (Atoms are numbered 1 to NAT, in the order read, but the residue order can be varied without restriction. Beware: atoms with occ=0.0 are not counted.)

If RESIDUE is specified it is followed by the first and last residue numbers of the range (<ires0> and <ires1> for FROM, <jres0> and <jres1> for TO), and optionally subsidiary keywords:

   
  CHAIN <chainid> ALL | ONS | CA
      ALL (default) will select all atoms in the requested residues.
      ONS will select just the oxygen and nitrogen atoms.
      CA will select just the CA atoms.

i.e. FROM residues <ires0> to <ires1> are checked against TO residues <jres0> to <jres1> for the appropriate class of atoms.
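
The three atom classes can be pictured as simple filters. The Python sketch below is illustrative only; the function name select_atoms and the representation of an atom as an (atom name, element) pair are assumptions, not part of the program.

```python
def select_atoms(atoms, atom_class="ALL"):
    """Filter (atom_name, element) pairs by the FROM/TO atom class."""
    atom_class = atom_class.upper()
    if atom_class == "ALL":
        return list(atoms)
    if atom_class == "ONS":
        # Oxygen and nitrogen atoms only.
        return [a for a in atoms if a[1].upper() in ("O", "N")]
    if atom_class == "CA":
        # C-alpha atoms only; the element check excludes calcium.
        return [a for a in atoms
                if a[0].strip().upper() == "CA" and a[1].upper() == "C"]
    raise ValueError(f"unknown atom class: {atom_class}")
```

For a serine residue given as N, CA, C, O, CB, OG, the ONS class would keep N, O and OG, and the CA class would keep only CA.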

Example:

The following gives just source chain A to target chain B contacts:

  contact xyzin   /homes/henrick/pdb/pdb4ins.ent <<eof
  LIMITS   2 3.66
  FROM RESIDUE ALL CHAIN A 1 to 21
  TO   RESIDUE ALL CHAIN B 1 to 30
  eof

METAL <Metal> [ <metal-ligand distance> ]

Metal coordination geometry option. The METAL keyword will automatically find all atomic distances from all atoms of element type <Metal> to all other atoms. The element type of the metal is picked up from the ATOM NAME column in the PDB input file (NOT the element type at the end of the line, which is only present in newer PDB released entries).
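
How an element can be read from the atom name column is sketched below in Python. This is an assumption about the approach, not the program's code: it relies on the usual PDB alignment in which the element symbol is right-justified in columns 13-14, so "AL  " is aluminium while " CA " is a C-alpha carbon. The function names are hypothetical; the records are the 1kdn AF3 lines shown in Example 3.

```python
RECORDS = [
    "HETATM 1150 AL   AF3 A 157      69.404  29.227   2.379  1.00 22.45",
    "HETATM 1151  F1  AF3 A 157      69.611  30.838   3.207  1.00 23.32",
    "HETATM 1152  F2  AF3 A 157      68.410  27.856   3.138  1.00 23.97",
    "HETATM 1153  F3  AF3 A 157      69.936  28.691   0.589  1.00 19.15",
]


def element_from_atom_name(record):
    """Return the element implied by the atom-name field of a PDB record.

    Columns 13-14 (string indices 12-13) hold the right-justified
    element symbol at the start of the atom name.
    """
    return record[12:14].strip().upper()


def metal_atoms(records, metal):
    """Select the ATOM/HETATM records whose atom name gives the element."""
    wanted = metal.strip().upper()
    return [r for r in records
            if r.startswith(("ATOM", "HETATM"))
            and element_from_atom_name(r) == wanted]
```

A file in which the metal atom name is not aligned this way would not be recognised by such a scheme, which is one reason to check the atom name column when METAL finds nothing.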

If MODE is not specified elsewhere then the use of the METAL keyword will automatically set the MODE to AUTO, and look for contacts with symmetry related atoms - in this case the symmetry operations must be specified in some manner; see the MODE keyword for more details.

<metal-ligand distance> defaults to a value of 2.35 Å, if not explicitly specified. Pairs of atoms which are less than this distance apart will be marked with a "***" symbol (see Comments on output). Use the LIMITS keyword to set the closest and furthest distance for finding contacts, though note that METAL will automatically set a closest distance of at least 0.25 Å, overriding that set by LIMITS if it is smaller.

The output consists of a list of metal-ligand contacts; for each metal atom a table of angles is also printed for all those atoms closer than <metal-ligand distance>. These are the angles at the metal position formed by each pair of contacting atoms.
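
The table of angles can be reproduced in outline with the Python sketch below. It is an illustration rather than the program's code: the function name coordination_angles is hypothetical, and the coordinates are those of the 1kdn AF3 residue listed in Example 3. Ligands closer than the metal-ligand distance are kept, and the angle at the metal for each pair is obtained from the three interatomic distances by the law of cosines.

```python
import math
from itertools import combinations

# Coordinates copied from the 1kdn AF3 records shown in Example 3.
METAL = ("AL", (69.404, 29.227, 2.379))
LIGANDS = [
    ("F1", (69.611, 30.838, 3.207)),
    ("F2", (68.410, 27.856, 3.138)),
    ("F3", (69.936, 28.691, 0.589)),
]


def coordination_angles(metal_xyz, ligands, max_dist=2.35):
    """Return {(name_a, name_b): angle in degrees} for close ligands.

    Only ligands closer to the metal than max_dist are used.
    """
    close = [(name, xyz, math.dist(metal_xyz, xyz)) for name, xyz in ligands]
    close = [item for item in close if item[2] < max_dist]
    angles = {}
    for (na, xa, da), (nb, xb, db) in combinations(close, 2):
        dab = math.dist(xa, xb)
        cos_angle = (da * da + db * db - dab * dab) / (2.0 * da * db)
        cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
        angles[(na, nb)] = math.degrees(math.acos(cos_angle))
    return angles


if __name__ == "__main__":
    for pair, angle in coordination_angles(METAL[1], LIGANDS, 2.25).items():
        print(pair, round(angle, 1))
```

Lowering the metal-ligand distance shrinks the table, since any ligand beyond the limit drops out of every pair.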

Example 1:


  contact xyzin   /homes/henrick/pdb/pdb4ins.ent <<eof
  METAL ZN 2.35
  eof

Example 2:

If there is no spacegroup name on the CRYST1 line in the input PDB file, use the SPACEGROUP keyword (alternatively, individual symmetry operations can still be given with the SYMM keyword, as in the original contact.f):


  contact xyzin   /homes/henrick/pdb/pdb4ins.ent <<eof
  METAL ZN 2.35
  SPACEGROUP R3
  eof

This gives the output:

zn     1  ZN  His    10B CD2  ...  3.08    [      ]   1: -Y,  X-Y,  Z
              His    10B CE1  ...  3.02    [      ]   9:   X, Y, Z
              His    10B CE1  ...  3.02    [      ]   1: -Y,  X-Y,  Z
              His    10B NE2  ...  2.11 ***[      ]   9:   X, Y, Z
              His    10B NE2  ...  2.10 ***[      ]   1: -Y,  X-Y,  Z
              His    10B NE2  ...  2.11 ***[      ]   2: -X+Y,  -X,  Z
              Wat   201  O    ...  2.19 ***[      ]   1: -Y,  X-Y,  Z
              Wat   201  O    ...  2.19 ***[      ]   2: -X+Y,  -X,  Z
              Wat   251  O    ...  3.32    [      ]   9:   X, Y, Z  
              Wat   251  O    ...  3.32    [      ]   1: -Y,  X-Y,  Z
              Wat   251  O    ...  3.32    [      ]   2: -X+Y,  -X,  Z
              His    10B CD2  ...  3.09    [      ]   9:   X, Y, Z
              His    10B CD2  ...  3.09    [      ]   2: -X+Y,  -X,  Z
              His    10B CE1  ...  3.02    [      ]   2: -X+Y,  -X,  Z
              Wat   201  O    ...  2.19 ***[      ]   9:   X, Y, Z
zn     2  ZN  His    10D CD2  ...  3.14    [      ]   1: -Y,  X-Y,  Z
              His    10D CD2  ...  3.14    [      ]   2: -X+Y,  -X,  Z
              His    10D CE1  ...  2.96    [      ]   1: -Y,  X-Y,  Z
              His    10D CE1  ...  2.96    [      ]   2: -X+Y,  -X,  Z
              His    10D NE2  ...  2.08 ***[      ]   1: -Y,  X-Y,  Z
              His    10D NE2  ...  2.08 ***[      ]   2: -X+Y,  -X,  Z
              Wat   513  O    ...  2.32 ***[    -C]   9:   X, Y, Z
              Wat   513  O    ...  2.32 ***[    -C]   2: -X+Y,  -X,  Z
              His    10D CD2  ...  3.14    [      ]   9:   X, Y, Z
              His    10D CE1  ...  2.96    [      ]   9:   X, Y, Z
              His    10D NE2  ...  2.08 ***[      ]   9:   X, Y, Z
              Wat   513  O    ...  2.32 ***[    -C]   1: -Y,  X-Y,  Z

 for Metal atom  ZN    ZN      1
 ============================================
 1 His    10B NE2       
 2 His    10B NE2           98.9
 3 His    10B NE2           98.7    98.8
 4 Wat   201  O             93.9    90.4   163.0
 5 Wat   201  O            163.1    93.9    90.2    74.9
 6 Wat   201  O             90.2   163.2    93.7    74.8    74.8
                               1       2       3       4       5

 for Metal atom  ZN    ZN      2
 ============================================
 1 His    10D NE2       
 2 His    10D NE2          103.4
 3 Wat   513  O            154.0    96.6
 4 Wat   513  O             96.6    87.7    67.5
 5 His    10D NE2          103.4   103.4    87.7   154.0
 6 Wat   513  O             87.7   154.0    67.5    67.5    96.6
                               1       2       3       4       5

Example 3:

  METAL AL 2.25

For pdb entry 1kdn the aluminium to ligand contacts are within the same residue, not between residues, nor are symmetry related contacts involved. Note that the order of the atoms in the input file is important in this case.

This will work:

HETATM 1150 AL   AF3 A 157      69.404  29.227   2.379  1.00 22.45
HETATM 1151  F1  AF3 A 157      69.611  30.838   3.207  1.00 23.32
HETATM 1152  F2  AF3 A 157      68.410  27.856   3.138  1.00 23.97
HETATM 1153  F3  AF3 A 157      69.936  28.691   0.589  1.00 19.15
 
This will fail:

HETATM 1151  F1  AF3 A 157      69.611  30.838   3.207  1.00 23.32
HETATM 1152  F2  AF3 A 157      68.410  27.856   3.138  1.00 23.97
HETATM 1153  F3  AF3 A 157      69.936  28.691   0.589  1.00 19.15
HETATM 1150 AL   AF3 A 157      69.404  29.227   2.379  1.00 22.45

COMMENTS ON OUTPUT

Please note: this is not a definitive explanation of the output! See also individual keyword information for more details of the output associated with specific keywords.

The main output of the program is a list of the contacts. The information is arranged in columns as follows:

First three columns: describe the "source" atom
           1 - residue name
           2 - residue number
           3 - atom name
Second three columns: describe the "target" atom
           4 - residue name
           5 - residue number
           6 - atom name
In MODEs ALL, IRES and ISUB the target atoms are simply taken from those listed in the input pdb file (limited by the TARGET keyword, if present). In these modes, the next three columns are:
           7 - distance between target and source (angstroms)
           8 - hydrogen bond angle (if ANGLE keyword is present)
           9 - "***", "  *"  or "   " (i.e. blank)
See the comments for keyword ANGLE above for more details of the contents of column 8; it will be empty if ANGLE is not specified. In column 9, "***" indicates a strong possibility of a hydrogen bond at this contact (distance < 3.3 A), "  *" indicates a weak possibility (distance > 3.3 A), and blank indicates that the program considers there is no possibility of a hydrogen bond. (NB: when using the METAL keyword, column 9 instead marks those contacts which lie within the <metal-ligand distance>.)

The final three columns will be empty except in MODE IMOL or AUTO. In these cases, the target atoms are generated from those in the pdb file by a combination of a symmetry operation followed by a lattice translation. These columns give information about the operations used to create these "symmetry related" atoms:

          10 - lattice translations
          11 - number identifying symmetry operation
          12 - name identifying symmetry operation
See the SYMM, SYMTIT and SPACEGROUP keywords for more details of the number (col. 11) and name (col. 12) of the symmetry operators. Column 10 will contain output of the form
          e.g.  [  +B-C] or [+A    ].
Here A, B and C refer to the primitive lattice vectors a, b and c respectively, which are generated from the CRYST1 card in the pdb file. So in these two modes, target atoms may seem to appear more than once - but they will be distinguished by having different associated symmetry and/or translation operations, which means they are symmetry related to the atom listed in columns 4-6 but are at different physical positions.
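
The layout of column 10 can be mimicked with the Python sketch below, which reproduces the two forms quoted above. It is an illustration only: the function name translation_label is hypothetical, and how the program prints translations of two lattice vectors (BIGSEARCH) is not covered, so the sketch rejects them.

```python
def translation_label(na, nb, nc):
    """Format a lattice translation like the column 10 examples."""
    parts = []
    for count, axis in zip((na, nb, nc), "ABC"):
        if count == 0:
            parts.append("  ")  # two blanks keep the columns aligned
        elif count in (1, -1):
            parts.append(("+" if count > 0 else "-") + axis)
        else:
            raise ValueError("only single translations are handled here")
    return "[" + "".join(parts) + "]"
```

For instance, a translation of +1 along b and -1 along c is written as [  +B-C], and an empty bracket means no lattice translation was applied.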

EXAMPLES

1) Intermolecular contacts:

contact xyzin  holo9c4.brk <<eof
TITLE INTER-MOLECULAR CONTACTS FOR HOLO-GAPDH
ATYPE ALL
MODE  IMOL
limits  0.0  3.9
symtit  +A
symm    1+X,Y,Z
symtit  +C
symm    X,Y,1+Z
symtit  21
symm    -X,1/2+Y,-Z
eof

2) Intermolecular contacts (AUTO mode). Default symmetry labels.

contact xyzin xisomerase.brk << eof
mode   auto
symm -Y,X-Y,1/3+Z
symm Y-X,-X,2/3+Z
symm Y,X,-Z
symm -X,Y-X,1/3-Z
symm X-Y,-Y,2/3-Z
eof

3) Contacts between specified parts of the molecule:

contact xyzin holo9c4.brk << eof
TITLE  NAD-WATER CONTACTS
limits 0.0  3.9
SOURCE 336
TARGET 337 1000
eof

4) Bond lengths in NAD molecule (residue number 336):

contact xyzin holo9c4.brk << eof
TITLE NAD BONDS
mode    ALL
limits  0.0  1.9
SOURCE  336
TARGET  336
eof

5) Intersubunit contacts for a fragment of the chain. Subunits must have different chain names in the Brookhaven data file.

contact xyzin holo9c4.brk << eof
title  INTERSUBUNIT CONTACTS
mode   ISUB
limits 0.0  3.9
source 180 210
eof

6) Analysis of water hydrogen bonding (all water contacts)

contact xyzin holo9c4.brk << eof
TITLE WATER CONTACTS
mode auto
angle
symm x,y,z
symm -X,1/2+Y,-Z
SOURCE  340 1000            !water molecules
eof

7) Hydrogen bonds and angles for a piece of chain

contact xyzin holo9c4.brk << eof
ANGLE
LIMITS   0.0 3.4
SOURCE   33 88
TARGET  1 334
eof

8) For MODE AUTO with TO and FROM keywords

  contact xyzin   /homes/henrick/pdb/pdb4ins.ent <<eof
  MODE AUTO
  LIMITS   2 3.66
  FROM RESIDUE ALL CHAIN A 1 to 21
  TO   RESIDUE ALL CHAIN D 1 to 30
  eof

This lists contacts from chain A atoms (identity) to all symmetry-related chain D atoms, i.e.

 Thr     8A O   His     5D CD2  ...  3.00         +C1: -Y,  X-Y,  Z
 Ser     9A C   His     5D NE2  ...  3.45         +C1: -Y,  X-Y,  Z
 Ser     9A O   His     5D CD2  ...  3.39         +C1: -Y,  X-Y,  Z
                His     5D CE1  ...  3.62         +C1: -Y,  X-Y,  Z
                His     5D NE2  ...  2.60 ***     +C1: -Y,  X-Y,  Z
 Gln    15A CG  Phe    25D CE2  ...  3.57     -A    5: -X+Y+2/3,-X+1/3,Z+1/3
 Asn    18A CG  Thr    27D OG1  ...  3.64     -A    5: -X+Y+2/3,-X+1/3,Z+1/3
 Asn    18A OD1 Thr    27D OG1  ...  2.75 *** -A    5: -X+Y+2/3,-X+1/3,Z+1/3

9) Combining selection TO/FROM with AUTO and ANGLE

  contact xyzin   /homes/henrick/pdb/pdb4ins.ent <<eof
  MODE AUTO
  LIMITS   2 3.66
  FROM RESIDUE ALL CHAIN A 1 to 21
  TO   RESIDUE ALL CHAIN D 1 to 30
  ANGLE
  eof

AUTHOR

Tadeusz Skarzynski, Imperial College, London, 1.12.88
(the ANGLE (HOH) mode by Andrew Leslie)

SEE ALSO

NCONT - new contact seeking program with flexible atom selection syntax
ACT - alternative contact seeking program
DISTANG - alternative contact seeking program