SC (CCP4: Supported Program)

NAME

sc - Determine Sc shape complementarity of two interacting molecular surfaces

SYNOPSIS

sc XYZIN foo.pdb [ SCRADII radii.lib ] [ SURFIN1 foo1_in.srf SURFIN2 foo2_in.srf SURFOUT1 foo1_out.srf SURFOUT2 foo2_out.srf ]
[Keyworded input]

DESCRIPTION

The shape correlation statistic Sc (Lawrence and Colman, 1993) can be used to quantify the shape complementarity of protein/protein interfaces and give an idea of the "goodness of fit" between two protein surfaces. The program SC will calculate values of Sc and related statistics for the interface region between two molecules in a Brookhaven coordinate file.

SC also allows the normal products to be merged into GRASP surface files for display in GRASP (Nicholls, 1993).

KEYWORDED INPUT

The input comprises three sections:

Section 1: Molecule definition (compulsory)

The molecule definition commands are used to select which atoms in the input file are to make up the two individual molecules for the Sc calculation. Entries for this section appear twice, once for each molecule (see EXAMPLES):

AT_EXCL, AT_INCL, CHAIN, MOLECULE, ZONE

Section 2: Parameter definition (optional)

The default values for the parameters are set inside the program at compilation time (in the file defaults.h), and should be suitable for most applications. In particular you should avoid using different values for PROBE_RADIUS, TRIM and WEIGHT if you intend to compare your values of Sc with the results of other calculations, or with values found in the literature.

DOT_DENSITY, INTERFACE, PROBE_RADIUS, TRIM, WEIGHT

Section 3: GRASP input/output (optional)

These commands are only required if you want to merge the results of the Sc calculations with existing GRASP surface files for the purposes of graphical display.

GRASP_BACKGROUND, GRASP_MATCH

See NOTES ON GRASP FILES if you intend to use the merging facility.

KEYWORDS

MOLECULE <n>

This selects which molecule to put the subsequent selection in; <n> is either 1 or 2. This keyword is followed by a combination of CHAIN, ZONE, AT_EXCL and/or AT_INCL keywords, which then select the atoms which will be included as the molecule. Selection via these subsequent keywords is logically sequential.

CHAIN <chn>

Include a particular chain. All atoms in chain <chn> will be included in the selected molecule.

ZONE [ <chn1> ] <res1> [ <chn2> ] <res2>

Include a zone of residues. All atoms in and between the named residues will be included in the selected molecule. The chain names <chn1> and <chn2> should be omitted if the chain identifier field is blank within the coordinate file. <res1> and <res2> are the residue sequence numbers (not residue types) that delimit the selected zone.

AT_EXCL [ <chn> ] <res> <atm>

Exclude a particular atom. The atom identified by chain <chn>, residue sequence number <res> and atom name <atm> will be excluded from the selected molecule. The chain name <chn> should be omitted if the chain identifier field is blank within the coordinate file.

AT_INCL [ <chn> ] <res> <atm>

Include a particular atom. The atom identified by chain <chn>, residue sequence number <res> and atom name <atm> will be included in the selected molecule. The chain name <chn> should be omitted if the chain identifier field is blank within the coordinate file.

PROBE_RADIUS <rad>

[Default: 1.7 Å]

Sets the radius of the probe sphere which is used to define the solvent excluded surface.

Note: You should avoid changing the probe radius if you intend to cross-compare the results of the Sc calculation with values obtained elsewhere, as the comparison will be invalid if different probe radii are used.

DOT_DENSITY <dots>

[Default: 15 dots/Å²]

The density of the dots used to calculate the molecular surface - higher values (more dots per unit area) give higher precision but also take longer to run.

TRIM <trim>

[Default: 1.5 Å]

Sets the distance used to generate the peripheral band.

The peripheral band consists of those surface points which are part of the buried portion of the molecular surface but which lie within a distance <trim> of the non-buried (i.e. solvent accessible) surface. Points in the peripheral band are omitted from the calculations.

Note: You should avoid changing the width of the peripheral band if you intend to cross-compare the results of the Sc calculation with values obtained elsewhere, as the value of Sc depends on the width of the excluded band.
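
The trimming rule described above can be illustrated with a short Python sketch. It is not the code used inside SC: the function name, the brute-force distance search and the treatment of points lying exactly at distance <trim> are assumptions made for illustration only.

```python
import math

def trim_peripheral_band(buried, accessible, trim=1.5):
    """Return the buried points that survive trimming.

    A buried point is dropped when it lies within `trim` (in Angstroms)
    of any solvent accessible surface point. Points exactly at `trim`
    are treated as inside the band (an assumption).
    """
    return [
        p for p in buried
        if all(math.dist(p, q) > trim for q in accessible)
    ]

# Hypothetical surface points (x, y, z) in Angstroms.
buried_points = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
accessible_points = [(-0.5, 0.0, 0.0)]

kept = trim_peripheral_band(buried_points, accessible_points)
print(kept)
```

Only the point lying 3.5 Å from the accessible surface point is kept here; the other two fall inside the 1.5 Å band. This also shows why a small buried area can lose most of its points to the band.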

INTERFACE <dist>

[Default: 8 Å]

Distance determining which atoms are used in the calculations. See PROGRAM FUNCTION for details about this parameter before changing it.

WEIGHT <w>

[Default: 0.5 Å⁻²]

This sets the value of the weighting factor used in the calculation of the surface complementarity function S(A->B). (See PRINTER OUTPUT for the definition of S(A->B).)

Note: You should avoid changing the weighting factor if you intend to cross-compare the results of the Sc calculation with values obtained elsewhere, as the value of Sc depends on the weighting used.

GRASP_MATCH <tol>

[Default: 1.5 Å]

The tolerance for equivalencing GRASP and SC surface points. The strategy employed by the program is to assign to each GRASP surface vertex the weighted normal dot product associated with the Connolly surface point nearest to that vertex. If no point employed within the Sc calculation is found within a distance <tol> of the vertex then the vertex is deemed to be part of the non-interacting surface. A suitable value of <tol> will depend on the dot density and resolution of the respective surfaces. Vertices on the non-interacting surface are given the General Property 1 value set by the GRASP_BACKGROUND keyword (below).

GRASP_BACKGROUND <val>

[Default: -2.0]

General Property 1 value for vertices that lie more than GRASP_MATCH from any Connolly point within the interacting surfaces. The aim here is simply to set up a distinctly different value that can hence be displayed in a separate colour within GRASP.
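
The way GRASP_MATCH and GRASP_BACKGROUND work together can be sketched in Python as follows. This is an illustration of the strategy described above, not the routine used by SC; the function name, the brute-force nearest-point search and the sample coordinates are all hypothetical.

```python
import math

def map_to_vertices(vertices, sc_points, sc_values, tol=1.5, background=-2.0):
    """Assign a value to each GRASP vertex.

    Each vertex takes the weighted normal dot product of the nearest SC
    surface point, provided that point lies within `tol`. Otherwise the
    vertex receives `background`.
    """
    assigned = []
    for v in vertices:
        best_value, best_dist = background, tol
        for p, s in zip(sc_points, sc_values):
            d = math.dist(v, p)
            if d <= best_dist:
                best_value, best_dist = s, d
        assigned.append(best_value)
    return assigned

# Hypothetical data: two GRASP vertices and two SC interface points.
grasp_vertices = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
interface_points = [(0.2, 0.0, 0.0), (1.0, 0.0, 0.0)]
interface_values = [0.9, 0.3]

print(map_to_vertices(grasp_vertices, interface_points, interface_values))
```

The first vertex picks up the value of the point 0.2 Å away. The second has no SC point within the tolerance, so it receives the background value and can be shown in a separate colour.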

END

End keyworded input.
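
As an illustration of how the keywords fit together, the Python sketch below assembles a minimal keyworded input that puts one chain in each molecule, and shows one way the program might be driven from a script. The helper names and the chain identifiers A and B are hypothetical. The sketch also assumes that sc is on the PATH and reads its keyworded input from standard input; only the command line form "sc XYZIN foo.pdb" is taken from the SYNOPSIS.

```python
import subprocess

def build_sc_input(chain1, chain2, extra_keywords=()):
    """Return keyworded input that selects one chain per molecule."""
    lines = [
        "MOLECULE 1",
        f"CHAIN {chain1}",
        "MOLECULE 2",
        f"CHAIN {chain2}",
    ]
    lines.extend(extra_keywords)  # optional parameter keywords, e.g. "DOT_DENSITY 15"
    lines.append("END")
    return "\n".join(lines) + "\n"

def run_sc(pdb_path, keyworded_input):
    """Run sc on pdb_path (assumes keywords are read from standard input)."""
    return subprocess.run(
        ["sc", "XYZIN", pdb_path],
        input=keyworded_input,
        text=True,
        capture_output=True,
        check=False,
    )

keywords = build_sc_input("A", "B")
print(keywords)
```

Calling run_sc("foo.pdb", keywords) would then start the program; the call is left out here so that the sketch runs without a CCP4 installation.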

INPUT AND OUTPUT FILES

Input files

XYZIN
[Compulsory]
Input PDB file containing the coordinates of the molecules for which the shape complementarity will be assessed. Note that multiple conformations are not permitted for atoms at the interface of the molecules. There also appear to be problems with H atoms in XYZIN, see KNOWN PROBLEMS below.
SCRADII
[Optional]
reference file containing the radii which will be assigned for atoms in XYZIN. This defaults to $CLIBD/sc_radii.lib but can be reassigned on the command line, for example if you have a modified reference file containing extra Van der Waals radii. You may have to specify the path explicitly to stop the program looking in $CLIBD for your file.
Note: It is recommended that you do not alter the existing entries if you intend to cross-compare the results of the Sc calculation with values obtained elsewhere, as the comparison will be invalid if different atomic radii are used.
SURFIN1, SURFIN2
[Optional]
Two GRASP surface files, one for each molecule. These surfaces will have the weighted normal dot product assigned to each vertex in the interface.
See also NOTES ON GRASP FILES.

Output files

SURFOUT1, SURFOUT2
[Only required if SURFIN1, SURFIN2 are specified]
Output files for the GRASP surfaces, with the weighted normal dot product appended as General Property 1. These can be re-read into GRASP for display.

PRINTER OUTPUT

The program output includes the following loggraph tables for each of the molecules.

  1. Histograms of the distance functions between surfaces, D(1->2) and D(2->1)

    D(A->B) is defined as

    D(A->B)(xA) = |xA - x'A|²

    where xA is a point on the interface (i.e. buried) surface of molecule A and x'A is the nearest surface point to xA on molecule B. (It is noted that differences in shape complementarity are less well discerned by these simple distance metrics. See Lawrence and Colman, 1993.)

  2. Histograms of the surface complementarity functions, S(1->2) and S(2->1)

    S(A->B) (also referred to as the weighted normal dot product) is defined as

    S(A->B)(xA) = (nA.n'A) exp [-w(|xA - x'A|)²]

    where xA, x'A have the same meanings as above, nA,n'A are the normals to the surfaces at those points, and w is a weighting factor.

The shape correlation Sc is then defined as

Sc = [ { S(A->B) } + { S(B->A) } ]/2

where the braces denote the median of the S(A->B), S(B->A) distributions. (See Lawrence and Colman, 1993 for more detailed descriptions of these functions.)
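
The definitions above can be followed through in a short Python sketch. It is not the implementation used by SC: it works on small hand-made point sets rather than Connolly surfaces, the nearest point is found by brute force, and the function names are hypothetical. It also assumes that the caller supplies normals oriented so that perfectly meshing surfaces give a dot product of +1 (for example, outward normals on one surface and inward normals on the other).

```python
import math
from statistics import median

def nearest_index(point, others):
    """Index of the entry in `others` closest to `point`."""
    return min(range(len(others)), key=lambda i: math.dist(point, others[i]))

def s_values(points_a, normals_a, points_b, normals_b, w=0.5):
    """S(A->B) at every point of A: (nA.n'A) * exp(-w * |xA - x'A|^2)."""
    values = []
    for x, n in zip(points_a, normals_a):
        j = nearest_index(x, points_b)
        d = math.dist(x, points_b[j])
        dot = sum(a * b for a, b in zip(n, normals_b[j]))
        values.append(dot * math.exp(-w * d * d))
    return values

def shape_correlation(points_a, normals_a, points_b, normals_b, w=0.5):
    """Sc = (median S(A->B) + median S(B->A)) / 2."""
    s_ab = s_values(points_a, normals_a, points_b, normals_b, w)
    s_ba = s_values(points_b, normals_b, points_a, normals_a, w)
    return (median(s_ab) + median(s_ba)) / 2

# Two flat, parallel patches 1 Angstrom apart (hypothetical data).
pts_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pts_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
normals = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]

sc = shape_correlation(pts_a, normals, pts_b, normals)
print(round(sc, 4))
```

With parallel normals the dot product is 1, so each S value reduces to the distance term exp(-w d²) with d = 1 Å and w = 0.5 Å⁻². Bringing the two patches into contact (d = 0) would give Sc = 1.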

Interfaces with Sc = 1 mesh precisely, whereas interfaces with Sc close to zero are effectively uncorrelated in their topography.

Note that Sc may become rather meaningless when the buried area becomes small, and hence it may not be a good measure for small crystal contacts. This is simply because as the overall buried area becomes smaller and/or more convoluted or disjointed in shape, the percentage removed as part of the peripheral band increases substantially.

PROGRAM FUNCTION

This program computes Sc between two molecules in a numerical fashion. The algorithm is fully detailed in Lawrence and Colman, 1993. Briefly: the molecular surfaces are represented as a series of discrete points (Connolly, 1983) of sufficiently high surface sampling density (set by the DOT_DENSITY keyword) and S(1->2) and S(2->1) are then evaluated at these points.

The interface surfaces are defined as being the portion of the molecular surface of molecule 1 which is buried from solvent by its interaction with molecule 2 (and vice versa). The molecular surface itself is defined (Richards, 1977) as the union of contact and re-entrant portions demarcated by a probe sphere of a given radius (set by the PROBE_RADIUS keyword).

Only atoms within the INTERFACE distance of any "buried" atoms (defined in the Connolly sense) are selected for initial surface computation. This parameter does not enter formally into the evaluation of Sc; its purpose is simply to speed up the computation by excluding atoms remote from the interface. In practice the program does not compute the entire surface of each molecule, but only the surface of the subset of atoms within the INTERFACE distance of the other molecule. A portion of this surface is non-physical, as it is buried within the core of the individual molecule; however, its presence does not affect the computation of Sc because it is remote from the interaction. If there is any doubt about the validity of this approach for a particular molecule, the program should be rerun with a larger value of INTERFACE to check that the computed Sc is stable. Subsequently, a peripheral band of buried points is removed: points are discarded if they lie within a distance TRIM of any solvent accessible surface point.

Cross-comparison of Sc numbers between proteins (i.e. characterisation of surfaces as more or less complementary than other types of surface) is the main interest in SC. This is only valid if the same values of the critical parameters (probe radius, width of the peripheral band, atomic radii, weighting factor) are used in both computations. To this end it is recommended that the default values of PROBE_RADIUS, TRIM and WEIGHT, and the atomic radii set in the sc_radii.lib file, be used so that the results are comparable with values in the literature.

The program includes a modified version of Michael Connolly's subroutine "mds" for calculating molecular surfaces; the original code can be obtained from his website at http://www.biohedron.com. The version contained in SC is provided here with the consent of Michael Connolly. The modifications include a minor bug fix, and use of the CCP4 library routines for exiting on fatal errors ("CCPERR") and for calculating vector products ("CROSS").

INTERACTION WITH GRASP

Sc itself cannot be computed satisfactorily within GRASP, as GRASP uses a rather different approach to surface definition. However qualitative display of the weighted normal products S(A->B) is possible - this is achieved by a simple mapping of this value from the one surface to the other.

  1. Within GRASP compute a molecular surface for each of the interacting molecules.
  2. Write these surfaces out within GRASP.
  3. Read these surfaces into SC and perform the Sc computation. The surfaces will automatically be written out with the S(A->B) values assigned to each surface.
  4. Read the modified surfaces back into GRASP.
  5. Colour the modified surfaces according to General Property 1. An appropriate colour ramp will need to be set up within GRASP to achieve the desired effect. It will probably also be necessary to "open" the interface up via rotating one of the surfaces, otherwise you won't see anything.

There are however some limits to SC's interaction with GRASP. See the NOTES ON GRASP FILES below.

NOTES ON GRASP FILES

To the best of our knowledge, GRASP is only available for Silicon Graphics machines, and since the surface files it produces contain unformatted data these files are not generally portable to other systems, e.g. Digital Alphas.

SC will make a check on the compatibility of input surface files before trying to read them in. In cases where it detects a problem, the files will not be read in, no merging will be performed, and no output surface files will be generated. In these cases, if GRASP output is required it will be necessary to run SC on another machine which has compatible conventions for reading and writing unformatted data.

There have been some reports of bugs in GRASP 1.3.6 which have caused problems with the GRASP output from SC. Please let us know if you experience problems which might be due to such bugs.

FORMAT OF THE RADII FILE

It will be necessary to edit the radii file used by the program, if your input file contains atoms which are not in the file already. It is not recommended that you change the values of radii already in the file, as this will compromise comparison of your calculated Sc values with values used in the literature.

Each entry in the file is a single line with three fields separated by spaces, of the format:

Residue_name    Atom_name      Radius

Either of the name fields can contain one or more wildcards (i.e. the asterisk character '*') to match to multiple residues or atoms, e.g. O* will match to O1, O2 etc. Unidentified residue/atom combinations will cause the program to stop.

The default radii file is sc_radii.lib in $CLIBD; to use a modified radii file in a different directory, assign the filename and path via the SCRADII logical name.
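
The lookup implied by this format can be sketched in Python. The sketch is illustrative only: the entries and radii are invented, and the rule that the first matching line wins is an assumption (the precedence used by SC is not stated here). Note also that fnmatchcase treats '?' and '[...]' as special characters in addition to '*', whereas the radii file is only described as supporting '*'.

```python
from fnmatch import fnmatchcase

def parse_radii(text):
    """Parse lines of the form 'Residue_name Atom_name Radius'."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines (assumption)
        residue, atom, radius = fields
        entries.append((residue, atom, float(radius)))
    return entries

def lookup_radius(entries, residue, atom):
    """Return the radius of the first entry matching residue and atom."""
    for res_pattern, atom_pattern, radius in entries:
        if fnmatchcase(residue, res_pattern) and fnmatchcase(atom, atom_pattern):
            return radius
    # SC itself stops on unidentified residue/atom combinations.
    raise KeyError(f"no radius for residue {residue!r}, atom {atom!r}")

# Invented entries for illustration; these are NOT the values in sc_radii.lib.
sample = """
ALA    CA     2.00
*      O*     1.50
*      N      1.60
"""

radii = parse_radii(sample)
print(lookup_radius(radii, "GLY", "O1"))
```

Here the wildcard entry "* O*" supplies the radius for atom O1 of any residue, while a combination with no matching entry raises an error, mirroring the way the program stops on unidentified atoms.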

KNOWN PROBLEMS

It is essential to remove ALL multiple conformations from the input PDB file (XYZIN). If multiple conformations are present in the file then the program may terminate with the message ERROR IN CHAIN CARD (from the CCP4 libraries) - in which case it is recommended that you check that there are no remaining multiple conformations.

There also appear to be problems with H atoms in XYZIN. The program may stop with error message "SC: imaginary contain". Stripping H atoms from XYZIN seems to cure it. It is not known how general this problem is, nor why it occurs.

If these problems persist, then please report them to CCP4.

EXAMPLES

Two non-runnable Unix example scripts (using GRASP input) can be found in $CEXAM/unix/non-runnable/

  • sc.exam

REFERENCES

  1. M. C. Lawrence and P. M. Colman, J. Mol. Biol., 234, 946-950 (1993)
  2. M. L. Connolly, J. Appl. Crystallogr., 16, 548-558 (1983)
  3. F. M. Richards, Annu. Rev. Biophys. Bioeng., 6, 151-176 (1977)
  4. A. J. Nicholls, Biophys. J., 64, A116 (1993)

AUTHOR

Version 2.0

Copyright Michael Lawrence,
Biomolecular Research Institute,
343 Royal Parade Parkville Victoria Australia

SEE ALSO