The first Structural
Molecular Biology (SMB) Summer School entitled "Making the
most of your synchrotron trip" was held at the Stanford
Synchrotron Radiation Laboratory (SSRL) from 19th-23rd September 2000.
The school was funded as part of the Training and Dissemination program
of the NIH-NCRR Synchrotron Radiation Structural Biology Resource
at SSRL, SSRL's SMB program's National Center for Research Resources
(NCRR) grant from the National Institutes of Health. Additional
funding, through corporate sponsorship from Agouron
Pharmaceuticals, Compaq Computer Corporation, Silicon Graphics,
Area Detector Systems Corporation and the Collaborative Computing
Project Number 4, was used to extend the original schedule, to
provide extra student capacity and to finance additional
travel funds for overseas
tutors. Both Silicon Graphics and Compaq Computer Corporation
contributed computing resources that were essential for the
tutorial sessions. Student places to attend the school were limited
and the registration was heavily oversubscribed within a few days
of the original announcement. In the end, the original capacity was
stretched to allow a total of 26 students to attend. A team of 22
tutors, comprising SSRL staff members and 12
internationally recognized crystallography experts, presented an
intensive training program consisting of lectures, complemented by
hands-on tutorial sessions. A broad cross-section of
state-of-the-art macromolecular crystallography techniques and
methods was covered.
The school opened with a warm welcome by Keith Hodgson, P.I. of the NIH-NCRR Resource and the SLAC associate director for the SSRL facility. He outlined the history behind the SMB research program at SSRL and described how far the experimental facilities and methods have advanced since the first experiments were conducted in the early 1970s. Keith closed with a glimpse of what the future might hold in store for the structural biology community, by highlighting the recent progress with the design of the Linac Coherent Light Source (LCLS) free-electron laser concept and the proposed single bio-molecule imaging experiments, which may become possible at such a facility. Peter Kuhn (SSRL and Stanford University) then gave an overview of the role of synchrotron radiation in structural biology research. He focused on the latest opportunities in macromolecular crystallography and showed how the SSRL facility is being developed to meet many new and exciting challenges, by providing state-of-the-art facilities to the user community. The subsequent lectures in the opening session focused on some of the techniques available for structural biology research at synchrotron radiation sources, with an emphasis on how these techniques can complement the structural information derived from macromolecular crystallography. Graham George (SSRL) and Hiro Tsuruta (SSRL) introduced the students to X-ray Absorption Spectroscopy (XAS) and Small Angle X-ray Scattering (SAXS), which are being pioneered and developed as part of the SMB program at SSRL.
During the second session the focus of the program turned to macromolecular crystallography. The preliminary lectures emphasized some of the sample preparation issues that need to be considered in a crystallography project, before collecting data at a synchrotron radiation source. David Bushnell (Stanford University) described methods for selenomethionine incorporation, in a variety of expression systems, as a way of preparing suitable samples for Multi-wavelength Anomalous Dispersion (MAD) experiments and Bill Weis (Stanford University) discussed several laboratory techniques for obtaining protein samples better suited to crystallization. He also outlined some more rational approaches to heavy atom derivatization. In the last lecture of the opening day Elspeth Garman (University of Oxford, UK) described the identification and optimization of cryo-protection and freezing protocols for macromolecular crystals.
The tutorial sessions on the first day served to familiarize the users with the core mission of the SMB facility at SSRL. The students were introduced to some experimental techniques, including crystal freezing and xenon/krypton derivatization, and they were shown some of the experimental beam lines. Since the SSRL facility was shut down for a period of maintenance and upgrade, the students also had the opportunity to explore the interior of the SPEAR storage ring.
The second day of the summer school was dedicated to data collection and processing. Paul Ellis (SSRL) opened the session by discussing how to avoid many common mistakes during data collection. Ana Gonzalez (SSRL) followed this with a comparison of various strategies and approaches specifically related to MAD phasing experiments. Elspeth Garman offered advice and insight on the collection of ultra-high resolution diffraction data. All the speakers in this session emphasized the importance of planning a careful diffraction experiment, since the quality of the experimental data is crucial for all subsequent structure solution and refinement calculations. Most of the errors introduced during the data collection are difficult or impossible to correct at later stages. However, many of the problems that occur during a data collection experiment can be readily detected during the data reduction and scaling steps, which highlights the importance of processing the diffraction data during the data collection experiment. Andrew Leslie (MRC, University of Cambridge, UK) introduced the first steps in data processing from autoindexing and determining an optimal data collection strategy, through integration to post-refinement, and then Phil Evans (MRC, University of Cambridge, UK) discussed the subsequent scaling and merging steps.
During the second day's tutorial session the students were shown how to process an example data set with MOSFLM and then carry out the scaling and merging with SCALA. The students were able to experiment with a variety of options in each of the programs and monitor their effect on the data quality. The students were also introduced to the CCP4 graphical user interface. As part of the same tutorial session, Aina Cohen gave a presentation and tutorial on using the SSRL scripts to run MOSFLM and SCALA in a rapid and highly automated fashion. These scripts allow even the most inexperienced users to process their data during the diffraction experiment.
After the tutorial session on the second day, Graeme Laver (FRS, Australian National University, Canberra, Australia) gave a special guest lecture on his experiences investigating and designing vaccines and anti-viral drugs against influenza. His adventures with the flu were filled with amusing anecdotes about journeys to far-off islands in search of new scientific insights into the various flu strains. The lecture was followed by the summer school banquet, which was held at SSRL. The event provided the students and tutors with the opportunity to mingle, relax and chat about their experiences.
Day three was focused on the topic of heavy atom location, which is a key step in the structure determination of many macromolecules. Particular attention was given to the determination of large selenomethionine substructures, often used in MAD phasing experiments. Phil Evans opened the session with a thorough introduction to basic Patterson theory and the methods used for interpreting Patterson maps. Thomas Schneider (University of Göttingen, Germany) then described the automated Patterson search methods implemented in the program SHELXS. Ashley Deacon (SSRL) introduced the students to some of the theory behind the so-called “direct methods” of structure determination. He described the Shake-and-Bake algorithm and showed how it could be used to determine large multi-selenium substructures. Thomas Schneider closed the session with a description of the "half-baked" variant of the Shake-and-Bake algorithm, as implemented in the new program SHELXD.
The tutorial session on the third day provided the students with the opportunity to use all the methods described above for Patterson interpretation and substructure determination. Thomas Schneider took the students step-by-step through a manual Patterson superposition example and then showed them how to interpret the output from the automated version incorporated in SHELXS. The students also used the direct methods program SnB to solve a small macromolecular structure and a small selenium substructure example. Thomas Schneider then returned to show the students how to use the new SHELXD program to rapidly solve large selenium substructures. Finally, Duncan McRee (The Scripps Research Institute) demonstrated many of the features available in Xtalview for both Patterson interpretation and general heavy atom location.
Phase determination was the main topic on the fourth day of the summer school. Eleanor Dodson (University of York, UK) gave a lecture on approaches to MAD phasing and a second lecture on phase improvement techniques. Paul Adams (CCI, Lawrence Berkeley National Laboratory) followed this with a discussion of MAD and Single-wavelength Anomalous Diffraction (SAD) phasing strategies with the program CNS. The final two lectures of the day highlighted some of the recent advances that have been made in automating the phase determination process in macromolecular crystallography. In the first, Thomas Terwilliger (Los Alamos National Laboratory) described the automated structure solution and density modification capabilities of the program SOLVE, and then Anastassis Perrakis (EMBL, Grenoble outstation, France) presented the automated model building and refinement algorithms incorporated into the ARP/wARP package.
Similar to previous days, the tutorial session focused on allowing the students to gain hands-on experience using the programs that had been described to them in the earlier lectures. This tutorial session really highlighted the speed with which all necessary phasing calculations can be carried out and how easy it is to set up scripts to control the programs. In the case of CNS and CCP4, graphical user interfaces provide access to all of the programs' functionality, and in the case of SOLVE a simple script is able to carry out the entire structure determination.
On the fifth and final day of the summer school, lectures focused on how to proceed rapidly from an initial interpretable electron density map to a fully refined macromolecular structure. Duncan McRee (The Scripps Research Institute) gave two talks on the program Xtalview. In the first he discussed automatic map interpretation and in the second he gave insights and advice on refinement and validation. Anastassis Perrakis discussed the capabilities and limitations of the ARP/wARP package and illustrated them with several example structures. Thomas Schneider talked about getting the best possible model from an atomic resolution refinement and how to rationally introduce additional parameters into the refinement. Finally, Paul Adams highlighted many of the powerful new features available for refinement within the CNS program. A final short tutorial session gave the students the chance to use the ARP/wARP and CNS packages.
The SMB group at SSRL would like to extend their warm thanks and appreciation to the tutors who traveled from far and wide to provide such a valuable learning experience for all the students and to many of the SSRL staff who participated and helped to organize the event. We hope we can build on the success of this year's event and make the next SMB Summer School an even bigger success.
The first day of the summer school was divided into two sessions. The first provided an overview of structural biology research at synchrotron radiation sources. Talks were given by Keith Hodgson (SSRL / Stanford University) on the SSRL Structural Molecular Biology (SMB) program, Peter Kuhn (SSRL / Stanford University) on the role of synchrotron radiation in structural biology research and the latest developments in macromolecular crystallography, Graham George (SSRL) on x-ray absorption spectroscopy and Hiro Tsuruta (SSRL) on small angle x-ray scattering. The second session focused on sample preparation techniques relating to macromolecular crystallography projects and included talks by David Bushnell (Stanford University), Bill Weis (Stanford University) and Elspeth Garman (University of Oxford, UK).
The first session opened with a welcome and introductory talk by SSRL director Keith Hodgson, P.I. of the NIH-NCRR Synchrotron Radiation Structural Biology Resource at SSRL and SLAC Associate Director for SSRL, who gave an overview of the SMB developments at SSRL, from the first experiments conducted in the early 1970s to the state-of-the-art facilities available through the current SMB program. He also outlined some of the future prospects at SSRL, with new opportunities presented by enhanced x-ray sources, including the upgrade of the Stanford Positron Electron Accelerating Ring (SPEAR) to a 3rd generation synchrotron facility and the design and implementation of the Linac Coherent Light Source (LCLS). The proposed LCLS will mark a breakthrough in structural biology studies, by allowing the imaging of single bio-molecules in the femtosecond time-resolution regime.
The subsequent lectures in the opening session provided an overview of the specific structural biology techniques available to experimenters through the SMB program at SSRL. Peter Kuhn presented the first of these lectures. He outlined the relationship between the major structural biology research methods, including small angle scattering, x-ray absorption spectroscopy and macromolecular crystallography. He then proceeded to describe the latest instrumentation and software developments available for macromolecular crystallography. He showed how ultra-high resolution studies and the structure determination of large macromolecules and macromolecular assemblies are now possible. He showed some of the key developments that have facilitated these new scientific applications, including kappa diffractometers, large area matrix CCD detectors and advanced computer environments, based on high-productivity multiprocessor servers. He also presented the unified data collection environment, based on the BLU-ICE software developed at SSRL. He showed how this environment also meets the needs of the SSRL “Collaboratory for Protein Crystallography”, by providing remote access to both data collection and data analysis facilities. Finally, he described how it paves the way towards an Automated Structural Analysis of Proteins system, which is being designed as part of the Joint Center for Structural Genomics (JCSG). Throughout the lecture, examples of several macromolecular structures solved using the SSRL facilities were given.
The second lecture was given by Graham George and was devoted to X-ray Absorption Spectroscopy (XAS). In the first part of his lecture he gave a simple, but comprehensive description of the physical foundations of both XAS and Extended X-ray Absorption Fine Structure (EXAFS). He then reviewed the commonly used experimental techniques, the instrumentation available on SSRL Beam Line 6-2 and the data reduction and analysis methods. He also explained the importance of running XAS experiments at cryogenic temperatures. He showed that XAS experiments can provide important information about the oxidation states and also give very accurate estimates of bond lengths. The coordination numbers and Debye-Waller factors can also be estimated, although with lower accuracy. In the second part of his lecture Graham gave several practical examples of EXAFS applications to the study of molybdenum- and tungsten-dependent enzymes, such as DMSO reductase and xanthine oxidase. The history of the structure determination of DMSO reductase highlights the importance of combining crystal structure determination with EXAFS measurements. Until recently, the published crystallographic data for DMSO reductase did not match the EXAFS data with respect to the bond lengths for the molybdenum atom. Only the latest 1.3 Å resolution structure was able to correctly identify the coexistence of two different molybdenum species within the DMSO reductase active site and bring the crystal structure and EXAFS data into agreement.
In the final lecture of the opening session Hiro Tsuruta illustrated the applications of Small Angle X-ray Scattering (SAXS) in structural biology. He covered both solution scattering and time-resolved studies, which can complement high resolution structural studies by providing data on the conformational changes that frequently occur within proteins and in macromolecular complexes. Hiro highlighted several areas of particular interest, including the study of protein-protein interactions, protein folding and the assembly and maturation of virus particles and several other properties, which are impossible to investigate using regular cryogenic high-resolution X-ray diffraction methods. Small angle scattering data also offer a good way of verifying structures obtained by conventional crystallographic methods and provide a way to scale and correct X-ray diffraction data with data obtained by electron microscopy. Small angle diffraction data can also offer an alternative approach to low-resolution structure determination. Hiro included a description of the small angle crystallography instrument available on Beam Line 4-2 at SSRL. His talk was illustrated throughout with examples of experimental results obtained from both protein and virus molecules.
The second session of the day focused on sample preparation procedures used for macromolecular crystallographic structure determination. David Bushnell opened the session with a lecture devoted to the incorporation of selenomethionine into macromolecules. Selenomethionine incorporation is one of the major methods of sample preparation for Multi-wavelength Anomalous Dispersion (MAD) experiments. David outlined some of the major difficulties in the method, including the oxidation of selenium, which can blur the selenium absorption edge, and changes in the crystallization conditions caused by the lower solubility of selenomethionyl proteins. David described several practical methods for the incorporation of selenomethionine, including the LeMaster method, the M9 method and the metabolic inhibition method. He detailed several practical recipes and also discussed the incorporation of selenomethionine into non-E. coli expression systems.
Bill Weis devoted his lecture to several more broad-based sample preparation techniques for x-ray crystallographic studies. He reviewed strategies for limited proteolysis and more rational heavy atom incorporation. Limited proteolysis can be used to divide up multi-domain proteins and to remove flexible tail regions and purification tags. This procedure can increase the chances of crystallization and it can also help improve the intrinsic order of poorly diffracting crystals. Bill described the facilities and methods needed and also reviewed the criteria dictating the choice of protease. He outlined experimental procedures and emphasized that special handling procedures are necessary to prevent protease contamination in other areas of the laboratory. He then discussed ways of monitoring sample inhomogeneity caused by aggregation and chemical heterogeneity, including the use of a sizing column, dynamic light scattering, native gels and mass spectrometry. Bill suggested some rational heavy atom incorporation strategies, as an alternative to the traditional “Soak-and-Pray” method. He outlined strategies targeted primarily towards MAD phasing, because of the high quality of the resulting phases. The selection of K or L absorption edges and the calculation of the expected signals can provide a first assessment of whether an experiment is feasible. Bill made several useful suggestions to provide a more systematic approach to heavy atom incorporation, including the selection of heavy atom reagents, for example targeting mercurials if there are free cysteines, the introduction of additional cysteines by site-directed mutagenesis and the use of mass spectrometry to assess whether protein derivatization has been successful prior to crystallization. Many of these approaches were tried and tested in the structure determination of syntaxin.
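The calculation of expected signals mentioned in this lecture can be illustrated with a short Python sketch. It assumes the commonly quoted Hendrickson-style estimates of the Bijvoet and dispersive ratios at zero scattering angle; the summary does not record which formula was actually presented. The function names, the effective scattering value of 6.7 electrons and the figure of roughly 7.8 non-hydrogen atoms per residue are illustrative assumptions, and the f'' and delta f' values must be supplied by the user for the chosen edge and wavelengths.

```python
import math

# Assumed constants (not taken from the lecture summary):
Z_EFF = 6.7              # effective scattering per non-hydrogen protein atom, in electrons
ATOMS_PER_RESIDUE = 7.8  # rough average number of non-hydrogen atoms per residue


def bijvoet_ratio(n_sites, n_residues, f_double_prime):
    """Estimated <|F+ - F-|> / <|F|> for n_sites anomalous scatterers."""
    n_protein_atoms = n_residues * ATOMS_PER_RESIDUE
    return math.sqrt(2.0 * n_sites / n_protein_atoms) * f_double_prime / Z_EFF


def dispersive_ratio(n_sites, n_residues, delta_f_prime):
    """Estimated <|F(lambda1) - F(lambda2)|> / <|F|> between two wavelengths."""
    n_protein_atoms = n_residues * ATOMS_PER_RESIDUE
    return math.sqrt(n_sites / (2.0 * n_protein_atoms)) * abs(delta_f_prime) / Z_EFF


if __name__ == "__main__":
    # Hypothetical case: one selenium per 100 residues, f'' of 6 electrons at the peak.
    print(f"Bijvoet ratio: {bijvoet_ratio(1, 100, 6.0):.3f}")
```

Under these assumptions, one anomalous scatterer per 100 residues with an f'' of about 6 electrons gives a Bijvoet ratio of a few percent, which is the kind of figure that can be compared with the expected measurement accuracy before committing beam time.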
Elspeth Garman discussed crystal-mounting techniques for cryogenic data collection. Cooling samples to cryogenic temperatures is nowadays a universally accepted method of preventing or reducing the dose-dependent radiation damage that occurs to protein crystals during data collection. Historically, several methods of mounting frozen crystals onto the goniometer have been developed and tested. However, one method has won the widest acceptance and application in practical research. Elspeth described the method, where a crystal is soaked in a cryo-solution, containing the mother liquor and an anti-freeze agent, and the crystal is then placed in a small loop made of thin fiber, where surface tension forces hold it in place. The crystal and loop are then immediately immersed in a cryogen, like liquid nitrogen. A rapid cooling rate, in the presence of antifreeze, greatly reduces the formation of ordered ice in the cryo-solution, which freezes in a glass-like phase. Elspeth described some variations in this technique, including the dialysis of the cryo-solution into the crystal mother liquor, as well as growing the crystal in mother liquor that already contains sufficient cryo-protectant. She gave a detailed description of several considerations and problems that are often encountered, including the use of a cryo-device gas stream for cooling crystals, the storage of frozen crystals, the possibility of combining the freezing protocol with heavy atom soaking for MAD experiments, the best way of dealing with ice accumulation during the diffraction experiment and the control of any crystal mosaicity increases. Elspeth gave recommendations on the choice of cryo-solutions and techniques for matching the osmolarity of these solutions with the mother liquor, to prevent any osmotic shock from causing the crystals to crack. Finally, Elspeth mentioned two crystal annealing procedures, which have been used by some researchers to reduce the mosaicity of frozen crystals.
The tutorial sessions on the opening day focused primarily on data collection techniques available at SSRL. The students were split into three groups and these groups did each of the tutorials in rotation. Elspeth Garman, assisted by Jessica Chiu (California Institute of Technology) and Ana Gonzalez (SSRL), helped the students practice some of the crystal mounting techniques Elspeth had described in her earlier lecture. Mike Soltis, Paul Ellis and Aina Cohen (all SSRL) gave students the opportunity to attempt xenon or krypton derivatization of protein crystals, using a specialized instrument that was developed at SSRL. Because of the accessibility of the krypton K-edge (14.3 keV), this technique provides an alternative approach to preparing suitable samples for MAD phasing experiments. Finally, Peter Kuhn gave a tour of the SSRL facility showing the students the internal components of the storage ring, as well as some of the structural biology experimental beam lines and the SSRL data collection control software used for macromolecular crystallography experiments.
The second day of the summer school was dedicated to data collection and processing. Talks were given by Paul Ellis (SSRL), Ana Gonzalez (SSRL), Elspeth Garman (University of Oxford, UK), Andrew Leslie (MRC, University of Cambridge, UK) and Phil Evans (MRC, University of Cambridge, UK). The overall session emphasized that it is worth spending some time planning a careful data collection experiment. The quality of the experimental data is crucial for both the structure solution and the refinement and most errors introduced during the data collection are difficult or impossible to correct at later stages. It was also generally pointed out that some problems with the data collection can be detected during the data processing and scaling. It is crucial, therefore, to process the data as quickly as possible, ideally while the crystal is still on the oscillation camera.
Paul Ellis gave the first lecture of the day on data collection and how to avoid common mistakes, which often result in incomplete or low quality data sets. He gave some guidelines for detecting the presence of heavy atoms from data processing statistics and recognizing split crystals by collecting at least two images at different crystal orientations. He explained how to calculate the optimal angular range for data collection and also the optimal oscillation angle for each image. Paul also gave advice on the best way to mount crystals to maximize the completeness of the data set and he emphasized the importance of maximizing the signal-to-noise ratio of the reflections.
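The choice of oscillation angle can be illustrated with a short Python sketch. It assumes the standard geometric rule of thumb for avoiding overlapped reflections (maximum oscillation in degrees equals 180/pi times the resolution limit divided by the primitive cell axis lying along the beam, minus the mosaicity); this is not necessarily the calculation shown in the lecture, and the function and parameter names are hypothetical.

```python
import math


def max_oscillation_deg(d_min, cell_axis_along_beam, mosaicity_deg):
    """Largest oscillation per image (degrees) before spots at d_min start to overlap.

    d_min and cell_axis_along_beam are in angstroms; mosaicity_deg should include
    any beam divergence. A non-positive result means that overlaps cannot be
    avoided at this resolution for this crystal orientation.
    """
    return math.degrees(d_min / cell_axis_along_beam) - mosaicity_deg


def images_needed(total_range_deg, step_deg):
    """Number of images required to cover total_range_deg in steps of step_deg."""
    return math.ceil(total_range_deg / step_deg)


if __name__ == "__main__":
    # Hypothetical crystal: 2.0 A data, 100 A axis along the beam, 0.3 degree mosaicity.
    step = max_oscillation_deg(2.0, 100.0, 0.3)
    print(f"maximum oscillation: {step:.2f} degrees")
```

The same relation shows why a long cell axis is best aligned close to the rotation axis rather than along the beam: the longer the axis along the beam, the smaller the usable oscillation per image.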
Ana Gonzalez talked about MAD data collection. She described the effect of wavelength selection and data collection strategy on the accurate measurement of the anomalous signal and introduced several possible approaches. Ana discussed the advantages and disadvantages associated with each one, in terms of the time needed for the experiment, the accuracy of the anomalous signal measurement and the quality of the final electron density map. She also described the effect of the number of wavelengths used and the data redundancy on the quality of the experimental electron density maps.
Elspeth Garman spoke about ultra-high resolution data collection. She described how such high-resolution data can often answer questions about both static and dynamic disorder in macromolecules and their surrounding water structure. She also showed how such data can provide accurate information about the stereochemistry of proteins. Elspeth gave practical advice on high-resolution data collection and highlighted the importance of the low-resolution reflections, which must be measured during a separate, shorter exposure data collection pass. She emphasized the importance of avoiding overlapped reflections, by carefully choosing the rotation angle per image, and also of monitoring the data collection in order to detect radiation damage to the crystal.
Andrew Leslie described the data integration program MOSFLM and outlined its main functions. He explained in detail how to use the program to determine an optimal strategy and exposure time for data collection. Andrew also described the algorithm and procedures used in autoindexing, integration and profile fitting. He spent some time explaining background estimation for the image and pointed out that this is crucial in order to provide good error estimates for the data, which in turn has an impact on all subsequent stages of the structure determination process.
Phil Evans talked about the problem of scaling reflections from different frames and from different data sets. He pointed out that constraints introduced into the scaling process can often help correct empirically for systematic errors in the data. He gave a list of statistics provided by the scaling program SCALA (including R-merge, bias and normal probability plots) and he explained how to interpret them, in order to detect various problems with the data. Phil finished by discussing the particular problem of scaling MAD data and mentioned that local scaling is often useful to improve the quality of both the anomalous and the dispersive differences.
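As an illustration of the simplest of these statistics, the following Python sketch computes R-merge using its usual definition: the sum of absolute deviations of each observation from the mean intensity of its reflection, divided by the sum of all observed intensities. It is not SCALA's code. The input layout (a dictionary keyed by symmetry-unique indices) and the choice to skip reflections measured only once are assumptions made for the example.

```python
def r_merge(observations):
    """R-merge for a dict mapping a unique (h, k, l) index to a list of intensities.

    Observations are assumed to be already scaled and reduced to the unique
    index. Reflections with a single observation are skipped, because they
    carry no information about agreement between measurements.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        if len(intensities) < 2:
            continue
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    if denominator == 0.0:
        raise ValueError("no multiply measured reflections")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical observations for three unique reflections.
    example = {
        (1, 0, 0): [100.0, 110.0],
        (0, 1, 0): [50.0, 50.0, 56.0],
        (0, 0, 1): [30.0],
    }
    print(f"R-merge = {r_merge(example):.3f}")
```

R-merge tends to rise with the number of repeated measurements even when the merged data improve, so it is best read alongside the other statistics rather than on its own.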
The afternoon tutorial session was dedicated to processing an example data set with MOSFLM and subsequently scaling the same data set with SCALA. The student participants were able to experiment with different options and corrections in the programs and analyze the effect on the data quality. The students were also introduced to the CCP4 graphical user interface. As part of the same tutorial session, Aina Cohen gave a presentation and tutorial on using the SSRL scripts to run MOSFLM and SCALA in a highly automated fashion. These scripts allow even the most inexperienced users to process their data during the diffraction experiment.
The third day was focused on the location of heavy atoms, which is one of the key steps in the structure determination process. Particular attention was given to the determination of the large anomalous scattering selenomethionine substructures associated with MAD phasing experiments. Talks were given by Phil Evans on basic Patterson theory and the hand interpretation of Patterson maps, Thomas Schneider (University of Göttingen, Germany) on automated Patterson searches as implemented in the SHELXS program and a second presentation on the half-baked direct methods procedure and Ashley Deacon (SSRL) on the Shake-and-Bake direct methods algorithm and its application to ab-initio structure determination and substructure solution.
Phil Evans gave the first lecture of the day. He introduced the basis of Patterson methods, by explaining the mathematical underpinnings of the Patterson function and by showing that the Patterson map gives the vectors between atoms. He outlined how Patterson maps can reveal heavy atom locations and also explained Patterson symmetry and Harker sections. He suggested how to calculate the best Patterson for heavy atom location and indicated the kinds of problems that are often encountered. He suggested that different resolution cut-offs should be tried and that care should be taken to exclude both unreasonably large differences and weak data from the Patterson calculation. He discussed the various approaches to solving a Patterson and illustrated them with many examples.
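The statement that a Patterson map gives the vectors between atoms can be checked with a small one-dimensional toy calculation in Python. This is a minimal sketch, not a crystallographic program: the two-atom "structure", the grid size and the function names are all hypothetical, and a real calculation would be three-dimensional and would use space group symmetry.

```python
import cmath
import math


def patterson_1d(atoms, h_max=20, n_grid=100):
    """Sample a 1D Patterson function on n_grid points of the unit cell.

    atoms is a list of (fractional coordinate, scattering weight) pairs.
    """
    # Intensities |F(h)|^2; the phases are discarded, as in a real experiment.
    intensities = {}
    for h in range(-h_max, h_max + 1):
        f_h = sum(z * cmath.exp(2j * math.pi * h * x) for x, z in atoms)
        intensities[h] = abs(f_h) ** 2
    # Patterson synthesis: a Fourier sum with |F|^2 as coefficients and zero phases.
    return [
        sum(i_h * math.cos(2.0 * math.pi * h * k / n_grid) for h, i_h in intensities.items())
        for k in range(n_grid)
    ]


def top_peaks(pmap, n_peaks):
    """Grid indices of the n_peaks highest local maxima (periodic boundaries)."""
    n = len(pmap)
    maxima = [k for k in range(n) if pmap[k] > pmap[k - 1] and pmap[k] > pmap[(k + 1) % n]]
    maxima.sort(key=lambda k: pmap[k], reverse=True)
    return sorted(maxima[:n_peaks])


if __name__ == "__main__":
    # Two equal atoms at x = 0.10 and x = 0.40: interatomic vectors are +0.30 and -0.30.
    pmap = patterson_1d([(0.10, 1.0), (0.40, 1.0)])
    print([k / len(pmap) for k in top_peaks(pmap, 3)])
```

The three strongest peaks fall at the origin and at plus and minus the interatomic vector (0.30 and 0.70 in fractional coordinates), not at the atomic positions themselves, which is why a Patterson map has to be interpreted rather than read directly.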
Thomas Schneider talked about automated Patterson methods. He focused on the Patterson superposition methods implemented in the program SHELXS. He explained the super-sharp Patterson and also introduced the Patterson minimum function as a way of scoring the sites found. He explained how to interpret the “crossword” table output by SHELXS, using a 4 selenium atom example. He showed how the self- and cross-vectors coupled with the Patterson minimum function could be used to both identify and validate correct sites. Finally he introduced the SHELXD implementation of the Patterson superposition methods and explained how it could be used.
Ashley Deacon gave an introduction to direct methods of structure determination. He showed how triplet phase probability distributions could be derived from the diffraction intensity measurements and how these distributions were dependent on both the size of the structure and the magnitude of the normalized structure factors (E-values). He outlined the Shake-and-Bake dual-space direct methods algorithm, which iterates real and reciprocal space phase refinement and showed how it can be used to solve both ultra-high resolution structures of small macromolecules and also how it provides an effective approach to determining the complex sub-structures associated with large selenomethionyl proteins. He gave several examples including triclinic lysozyme and a 70 selenium atom epimerase structure determined by MAD phasing. He also suggested how both NCS symmetry and peak recycling could be used to validate the sites found in a large selenium substructure.
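The dependence of triplet reliability on structure size and E-values can be illustrated with a short Python sketch. It assumes the classical Cochran distribution for an equal-atom structure, in which the triplet phase sum follows a von Mises distribution with concentration A = 2|E1 E2 E3| / sqrt(N); the summary does not state which form was presented in the lecture, and the function names are hypothetical.

```python
import math


def cochran_a(e1, e2, e3, n_atoms):
    """Concentration parameter A of the triplet distribution for n_atoms equal atoms."""
    return 2.0 * abs(e1 * e2 * e3) / math.sqrt(n_atoms)


def bessel_i(order, x, terms=40):
    """Modified Bessel function of the first kind, from its power series."""
    return sum(
        (x / 2.0) ** (2 * k + order) / (math.factorial(k) * math.factorial(k + order))
        for k in range(terms)
    )


def expected_cos(a):
    """Expected cosine of the triplet phase sum, I1(A) / I0(A)."""
    return bessel_i(1, a) / bessel_i(0, a)


if __name__ == "__main__":
    # The same strong reflections (|E| = 2) in a small and a large structure.
    for n_atoms in (100, 10000):
        a = cochran_a(2.0, 2.0, 2.0, n_atoms)
        print(n_atoms, round(a, 2), round(expected_cos(a), 2))
```

An expected cosine close to 1 means that the triplet phase sum is reliably near zero. The estimate falls off as the square root of the number of atoms, which is one way of seeing why direct methods are applied to a substructure of tens of selenium sites rather than to the whole protein.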
Thomas Schneider completed the day's lecture program by detailing the so-called “half-baked” direct methods approach to structure determination, as implemented in the SHELXD program. He gave a quick overview of the information that can be extracted from a MAD phasing experiment and showed how these data can be used to derive the best coefficients for the SHELXD program. He also highlighted the importance of analyzing the correlations between the various anomalous scattering signals from each wavelength in a MAD experiment. He explained the core algorithm employed by SHELXD and showed how the solution can be scored. He gave several examples, which illustrated the power of the SHELXD approach and also showed some useful ways of assessing the data quality.
The afternoon tutorial session was dedicated to the use of the programs Shake-and-Bake, SHELXD and XtalviewXtalView.
Lectures on Friday morning were given by Eleanor Dodson (University of York, UK) on approaches to MAD phasing and a second talk on phase improvement techniques, Paul Adams (CCI, Lawrence Berkeley Laboratory) on MAD and Single-wavelength Anomalous Diffraction (SAD) phasing strategies, Thomas Terwilliger (Los Alamos National Laboratory) on automated structure solution and density modification and Anastassis Perrakis (EMBL, Grenoble, France) on automated model building and refinement.
Eleanor Dodson opened her discussion of MAD phasing at an elementary level, by explaining Friedel's law and Harker circles. She emphasized the need to have all models with the same hand and referred to the same origin. She also emphasized the importance of accurate error estimates, particularly in the case of MAD, where the phase information is contained in small intensity differences. Her discussion focused on the pseudo-MIR approach to MAD phasing; the analytical (Hendrickson) method was only mentioned in passing. Eleanor discussed the use of probabilistic error models. She pointed out that the probability models used for MIR are not valid for MAD, since the heavy atom sites are no longer independent. In fact, for MAD the heavy atom sites are identical for all the data sets. She recommended the lack-of-closure error as a good measure of data quality. It is better than the figure-of-merit, because the relative weighting of the data sets at different wavelengths is uncertain.
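Friedel's law, and the way anomalous scattering breaks it, can be illustrated with a minimal one-dimensional sketch. The atom positions and scattering factors below are invented for illustration and are not taken from the lecture.

```python
import cmath

def structure_factor(h, atoms):
    """Sum f_j * exp(2*pi*i*h*x_j) over (fractional position, scattering factor) pairs."""
    return sum(f * cmath.exp(2j * cmath.pi * h * x) for x, f in atoms)

def bijvoet_difference(h, atoms):
    """|F(h)| - |F(-h)|; zero when Friedel's law holds."""
    return abs(structure_factor(h, atoms)) - abs(structure_factor(-h, atoms))

# Hypothetical 1D structures: all-real scattering factors versus one
# anomalous scatterer with an imaginary component (f'' = 4).
normal_atoms = [(0.10, 6.0), (0.35, 7.0), (0.80, 8.0)]
anomalous_atoms = [(0.10, 6.0), (0.35, 7.0), (0.80, 30.0 + 4.0j)]

print(bijvoet_difference(1, normal_atoms))     # essentially zero
print(bijvoet_difference(1, anomalous_atoms))  # clearly non-zero
```

With purely real scattering factors the two amplitudes agree to rounding error. Once one atom has an imaginary component, the small difference that remains is the signal that MAD and SAD phasing rely on.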
In her second talk Eleanor Dodson discussed phase improvement strategies. She showed how experimental phases, which generally contain significant errors, can be replaced with more accurate estimates. In general, this is achieved by constraining the phases to be consistent with some a priori knowledge about the molecular structure. These methods involve either the modification of the experimental electron density map or the use of additional non-crystallographic symmetry information. The map modification methods force the regions of disordered solvent between the protein molecules to be flat. New phases are calculated from the resulting map and these are either combined with, or substituted for, the previous phases. The traditional methods of solvent flattening, by directly modifying the electron density map, produce results that can incorporate errors in the original map. However, such model-bias can be largely obviated by the use of histogram matching. This automated procedure takes advantage of the fact that the density in the solvent regions of the map is essentially random, whereas the density in the protein regions is not. Eleanor discussed the use of non-crystallographic symmetry, which can provide very powerful phase constraints. However, it is first necessary to find the operators that relate the identical moieties. Eleanor suggested that, in some cases, if one unit can be found, then molecular replacement techniques can be used to locate the other ones.
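The flattening step described above can be sketched on a toy one-dimensional map. This is only an illustration under simplifying assumptions: the solvent mask is taken as given, and the function name and sample values are hypothetical. A real program would determine the mask from the map and then recompute and combine phases.

```python
def solvent_flatten(density, solvent_mask):
    """Replace every solvent grid point by the mean solvent density.

    density: list of map values; solvent_mask: list of booleans, True = solvent.
    Protein grid points are returned unchanged.
    """
    solvent_values = [d for d, is_solvent in zip(density, solvent_mask) if is_solvent]
    mean_solvent = sum(solvent_values) / len(solvent_values)
    return [mean_solvent if is_solvent else d
            for d, is_solvent in zip(density, solvent_mask)]

# Hypothetical noisy map: protein at both ends, solvent in the middle.
density = [0.9, 1.4, 0.1, -0.1, 0.3, 1.2]
mask = [False, False, True, True, True, False]
flattened = solvent_flatten(density, mask)
print(flattened)
```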
Paul Adams covered the topics of MAD and SAD phasing as implemented in the CNS program package. He reviewed the procedure for selecting wavelengths for a MAD experiment and emphasized the importance of comparing the data sets from each wavelength, to see that they have the expected correlations with each other. In searching for the heavy atoms, particularly multiple selenium atoms, he recommended looking for only about two-thirds of the expected number of atoms to avoid introducing spurious peaks. Paul suggested that the Patterson maps from multiple wavelengths can often be averaged to increase the signal-to-noise ratio, especially when a single difference-Patterson map isn't clearly interpretable. Correct Patterson solutions can be identified by a high correlation among the symmetry related vectors. Paul also recommended the lack of closure as a good measure of the error. He emphasized some practical considerations, such as the importance of accurately scaling the data sets to each other, preferably using an absolute scale. Such an absolute scale enables the use of measured f' and f" values and allows implausibly large anomalous and dispersive differences to be rejected. Any missing heavy atoms can readily be found from difference-Fourier maps, once a preliminary set of phases has been calculated. Paul recommended the use of log-likelihood maps, as described by Bricogne. These maps tend to have less bias than a simple difference-Fourier map. Paul reviewed density modification and emphasized the application of "flipping" (inverting) the random fluctuations in the solvent density, rather than flattening them. This approach reduces model-bias. Finally, Paul compared the MAD and SAD methods. A MAD experiment can provide phases directly. However, it can only be conducted at a synchrotron and it requires the collection of more data, whereas a SAD experiment can be carried out at the home laboratory and requires just a single data set. On the other hand, the SAD approach requires additional density modification to resolve the phase ambiguity.
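The solvent "flipping" idea mentioned above can be sketched in the same toy setting as flattening. The flip factor of -1.0 is a hypothetical choice for illustration, and the function name is invented. The summary does not say what value CNS uses, and in practice the factor is usually tied to the solvent content.

```python
def solvent_flip(density, solvent_mask, flip_factor=-1.0):
    """Invert solvent fluctuations about the mean solvent density.

    A flip_factor of 0.0 would reproduce plain flattening; a negative
    value mirrors each solvent fluctuation about the solvent mean.
    """
    solvent_values = [d for d, is_solvent in zip(density, solvent_mask) if is_solvent]
    mean_solvent = sum(solvent_values) / len(solvent_values)
    return [mean_solvent + flip_factor * (d - mean_solvent) if is_solvent else d
            for d, is_solvent in zip(density, solvent_mask)]

# Hypothetical map with two solvent grid points in the middle.
flip_density = [1.0, 0.2, 0.4, 1.5]
flip_mask = [False, True, True, False]
flipped = solvent_flip(flip_density, flip_mask)
print(flipped)  # the two solvent values swap sides of their mean
```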
Thomas Terwilliger presented ways of finding a set of initial phases using the SOLVE program package. Partial solutions are initially found from Patterson maps and these solutions are then ranked and sorted. These solutions can then be tested and difference-Fourier maps can be used to identify new sites and complete the solutions. A key to making this work in an automated manner is the algorithm used to rank the partial solutions. Thomas described the method used in SOLVE, which checks for self-consistent Patterson maps and examines the resulting electron density maps to see if they statistically resemble valid maps. The process of finding heavy atom sites is iterated until there is no further improvement in the score of the solution. Thomas described the program RESOLVE, which carries out solvent flattening by maximizing phase likelihood. This phase likelihood is estimated by statistically assessing the "believability" of the map. A good weighting of the experimental and calculated phases is required, together with a good estimate of the solvent content. Thomas is also currently developing software for the automatic identification of alpha helices, which will greatly assist in the recognition of correct electron density maps.
Anastassis Perrakis discussed the ARP/wARP package, which can be used to fit a model to an electron density map and also refine the phases. The procedure requires an initial electron density where the individual atoms can be resolved, since it works by placing discrete atoms into the electron density. Possible atomic positions are found by examining the value and curvature of the electron density and by assessing the distance between peaks. Chemical knowledge is enforced upon the map in the form of distance constraints. Fitting the map is essentially a pattern recognition problem and the difficulties were shown by an amusing series of drawings of a cat in various settings. A common fitting method is to find possible pairs of C atoms and to connect them into unbranched chains. Alternatively, the map can be searched for regions that correspond to a tetrapeptide with good bond angles. Since the amino acid sequence is almost always known, the next step is to take a tetrapeptide, slide it along the putative backbone and find a place where it matches the density. The sequence can then be used to fix the connectivity and also to fit the side chains.
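The pair-finding step can be sketched as a simple distance filter. This is not the ARP/wARP implementation. It assumes the pairs in question are consecutive C-alpha positions about 3.8 Å apart, which the summary does not spell out, and the tolerance and peak coordinates are hypothetical.

```python
import math

CA_CA_DISTANCE = 3.8  # Å, typical spacing of consecutive C-alpha atoms (assumed target)

def candidate_pairs(peaks, tolerance=0.5):
    """Return index pairs (i, j), i < j, whose separation is within
    `tolerance` Å of the assumed C-alpha to C-alpha distance."""
    pairs = []
    for i in range(len(peaks)):
        for j in range(i + 1, len(peaks)):
            if abs(math.dist(peaks[i], peaks[j]) - CA_CA_DISTANCE) <= tolerance:
                pairs.append((i, j))
    return pairs

# Hypothetical density peaks (Cartesian coordinates in Å).
peaks = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
print(candidate_pairs(peaks))  # [(0, 1), (1, 2)]
```

Chaining the accepted pairs end to end gives candidate unbranched backbone fragments. The isolated fourth peak joins no pair and is left out.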
Lectures on Saturday morning were given by Duncan McRee (The Scripps Research Institute) on automatic map interpretation and a second talk on refinement and validation, Anastassis Perrakis on ARP/wARP, Thomas Schneider on atomic resolution refinement, and Paul Adams on refinement with CNS.
Duncan McRee discussed the use of XtalView for fitting a model to an electron density map. The program can semiautomatically trace ridgelines in the map, to identify the protein backbone. It can then add residues to this backbone if given a starting point and the direction of the chain. This process is semiautomatic, as the program occasionally makes mistakes. The fitting can be done iteratively, as XtalView will automatically carry out solvent flattening and calculate new phases from the new atomic positions. The program will move atoms into nearby density to refine the model while fitting. The use of spectral B-splines to interpolate the map allows the maps to be calculated on a coarser grid than would otherwise be required and this feature greatly accelerates the calculations.
In his second talk Duncan McRee discussed data validation during refinement. He emphasized the importance of measuring complete low-resolution data and of retaining "unobserved" data. The fact that a reflection is very weak is an important constraint on the structure. It is important to reject outliers in the data, but only on valid statistical grounds. Reflections should never be rejected solely because they disagree with the model. One must pay close attention to the data-to-parameter ratio in choosing what to refine. If waters have been added, it is sometimes useful to delete them and recalculate the map. In this way any waters in noise peaks can be removed. At higher resolution, where anisotropic temperature factors are used, the thermal ellipsoids should be checked for chemically unreasonable shapes. Regions of high temperature factors may be wrongly fitted. If they have not been used in the refinement, the bond torsion angles are a good check of the quality of the fit. As a rule of thumb, the final R-factor should be less than 1/10th the maximum resolution (in Å) and the free R-factor should be about 5% greater than this. However, this consideration can be complicated if there is non-crystallographic symmetry. Finally, a series of maps and density histograms were presented that illustrated the effect of phase errors.
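The rule of thumb above can be written as a small helper. This is a sketch under two interpretive assumptions: the resolution is in Å and the R-factor is a fraction, and "5% greater" means 0.05 in absolute terms. The function name is hypothetical.

```python
def r_factor_guideline(resolution_angstrom):
    """Return (upper bound on final R, rough expected free R) as fractions.

    Assumes R < resolution / 10 and free R about 0.05 above that bound.
    """
    r_upper = resolution_angstrom / 10.0
    r_free_expected = r_upper + 0.05
    return r_upper, r_free_expected

r_upper, r_free = r_factor_guideline(2.0)
print(f"R below about {r_upper:.2f}, free R around {r_free:.2f}")
```

For a 2.0 Å data set this gives a final R-factor below about 0.20 and a free R-factor of roughly 0.25.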
Anastassis Perrakis presented a series of examples of using ARP/wARP to automatically produce refined structures. In the case of the small protein rubredoxin, diffraction data to 0.9 Å resolution were available. It was only necessary to provide the positions of the iron atoms, and a good structure was obtained after about 20 cycles. With data truncated to 1.4 Å resolution it was necessary to also provide the positions of the 4 sulfur atoms in the Fe-S cluster. For xylanase, a larger protein for which 2.0 Å resolution data and SAS phases from a mercury atom were available, the structure refined rapidly after solvent flattening. For leishmanolysin, a protein of 474 residues, MIR phases to 3.2 Å were first extended to 2.5 Å resolution by solvent flattening and then to 2.0 Å resolution by ARP/wARP, in conjunction with automated tracing.
Thomas Schneider discussed the refinement of structures at very high resolution. The discussion was largely in terms of the observation-to-parameter ratio and the use of the SHELXL program. When anisotropic temperature factors are used there are 9 parameters/atom, and if there are n conformations then each atom has 9 parameters/conformation plus n-1 parameters for the occupancy. At relatively low resolution, say 1.2 Å, it may be necessary to restrain the temperature factors by requiring adjacent atoms to have similar values and limiting the size of the values along the direction of the bonds. Isolated waters can also be required to be reasonably spherical. Specific examples of the use of SHELXL to refine proteins at high resolution were presented. Thomas recommended the use of a "riding-hydrogen" model in structures beyond 2.0 Å resolution, where hydrogen atoms are attached to C, O and N atoms at a fixed distance and are not separately refined.
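The parameter bookkeeping described above is easy to reproduce. The sketch below counts 9 parameters per conformation plus n-1 occupancy parameters per atom, as stated in the summary. The function names and the example numbers are hypothetical.

```python
def anisotropic_parameter_count(conformations_per_atom):
    """Total refined parameters for a list giving the number of
    conformations n of each atom: 9 * n + (n - 1) per atom."""
    return sum(9 * n + (n - 1) for n in conformations_per_atom)

def observation_to_parameter_ratio(n_reflections, conformations_per_atom):
    """Number of unique reflections divided by the parameter count."""
    return n_reflections / anisotropic_parameter_count(conformations_per_atom)

# Hypothetical model: 1000 atoms, 50 of them in two conformations.
model = [1] * 950 + [2] * 50
print(anisotropic_parameter_count(model))             # 950*9 + 50*19 = 9500
print(observation_to_parameter_ratio(47500, model))   # 5.0
```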
Paul Adams discussed refinement procedures with the CNS program. CNS minimizes a target function by moving the atoms in the model. The target function includes terms for the chemical reasonableness of the model and the agreement of the model with the observed diffraction intensities. This technique is liable to become trapped in local minima and, therefore, simulated annealing is applied. Simulated annealing adds energy to all or some of the atoms in the model and allows them to move up the gradient and thus escape any local minima. It can use a model of the potential surface for the protein or it can move the atoms randomly. Unlike many refinement techniques, simulated annealing has the ability to correct large errors and is therefore best used in the early stages of refinement. The dangers of over-fitting were discussed, especially in cases where there are more parameters than observations and the parameters are essentially being fitted to noise. The free R-factor is an excellent guard against over-fitting. Another method is to fix the bond lengths and angles and to refine the torsion angles. This approach reduces the number of parameters by about 10-fold and can be very useful at moderate resolution and in the initial stages of a high-resolution refinement, where it greatly increases the radius of convergence.
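The way simulated annealing escapes local minima can be illustrated with a toy one-dimensional search. This is a generic Metropolis-style sketch with random moves, not the CNS procedure. The energy function, cooling schedule and all parameter values are hypothetical.

```python
import math
import random

def toy_energy(x):
    """Double-well surface: local minimum near x = +1, deeper minimum near x = -1."""
    return (x * x - 1.0) ** 2 + 0.3 * x

def simulated_annealing(energy, start, steps=5000, t_start=1.0, t_end=0.01,
                        step_size=0.5, seed=0):
    """Random-move annealing with exponential cooling; returns the best (x, energy) seen."""
    rng = random.Random(seed)
    x = best_x = start
    e = best_e = energy(start)
    for i in range(steps):
        # Temperature falls smoothly from t_start to t_end.
        t = t_start * (t_end / t_start) ** (i / max(steps - 1, 1))
        candidate = x + rng.uniform(-step_size, step_size)
        e_new = energy(candidate)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = candidate, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

best_x, best_e = simulated_annealing(toy_energy, start=1.0)
print(best_x, best_e)
```

A plain downhill minimizer started at x = 1 would stay in the shallow well. The occasional accepted uphill move at high temperature is what gives the search a chance to cross the barrier to the deeper one.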
Content questions and comments: Ana Gonzalez and Ashley Deacon
Last modified: Friday, 29-Apr-2005 11:49:47 PDT.