Download the data from 1_intro.tgz

Files

Reflection data:	hypF-1gxu-1gxt-HG_scaleit1.mtz
Sequence file:	hypF_Ndom.seq
Acylphosphatase model:	1v3z_B.pdb

Introduction to CCP4 programs for MR

The target is an acylphosphatase-like domain of the hydrogenase maturation factor HypF from E. coli; see Rosano et al., JMB, 321, 785 (2002). The HypF-ACP sulphate and phosphate complexes have been deposited in the PDB as 1gxt and 1gxu respectively.

Across the various tutorials we will solve the HypF structure by molecular replacement (MR) with several programs and approaches, using the native 1gxu dataset, which extends to 1.3 Å resolution in space group H32. The target has 91 residues and a Matthews calculation strongly suggests only one molecule in the asymmetric unit. In this tutorial we focus on MR using the program Molrep.
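As a rough illustration of the Matthews calculation mentioned above, the following Python sketch computes the Matthews coefficient and an approximate solvent content. The function names are hypothetical, the factor 1.23 is the usual approximation for proteins, and the cell dimensions are not quoted in this tutorial, so they would have to be read from the MTZ header.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in cubic Angstroms; lengths in Angstroms, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def matthews(volume, n_symops, n_copies, mol_weight):
    """Return (Vm, solvent fraction) for n_copies molecules per asymmetric unit.

    volume     : unit cell volume in cubic Angstroms
    n_symops   : number of symmetry operators of the space group
    n_copies   : assumed number of molecules in the asymmetric unit
    mol_weight : molecular weight of one molecule in Daltons
    """
    vm = volume / (n_symops * n_copies * mol_weight)
    # Approximate solvent fraction, assuming a typical protein density.
    solvent = 1.0 - 1.23 / vm
    return vm, solvent
```

For H32 in the hexagonal setting n_symops would be 18, and the molecular weight of the 91-residue target can be approximated as roughly 91 x 110 Da. Trying n_copies = 1, 2, ... and rejecting values that give an implausible solvent content is the essence of the estimate.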

N.B. hypF-1gxu-1gxt-HG_scaleit1.mtz includes the data from 1gxu, 1gxt and the Hg derivative, as well as some experimental phases based on the Hg sites. Do not forget to select the correct MTZ columns (F=FP1gxu, SIGF=SIGFP1gxu) each time you define the input MTZ file.

Checking the data

We first use Sfcheck to check a few things about the data:

Select Data Reduction and Analysis → Check Data Quality → Analysis with sfcheck to open the sfcheck task window
Enter a title
Make sure that Run Rampage to analyse structure geometry and Run Procheck to analyse structure geometry are unselected (we do not yet have any coordinates) and Run Sfcheck to analyse experimental data only is selected
In the line MTZ in select the file hypF-1gxu-1gxt-HG_scaleit1.mtz
Select the labels F=FP1gxu, SIGF=SIGFP1gxu and FreeR=FREE
Check that a suitable filename has been generated for Sfcheck Output PS
Keep all defaults, and click Run → Run Now

Sfcheck produces a PostScript file containing several useful analyses (see under View Files from Job):

Anisotropy of data (it is not very anisotropic)
Pseudo-translation not detected (from analysis of the native Patterson map)
Overall B from Wilson plot of 21.9 Å²
Also check the log file: View Files from Job → View Job Results (new style) then click the Log File tab
This includes the results of a twinning test: Perfect twinning test <I^2> / <I>^2 : 2.0573
For acentric reflections, a second moment <I^2> / <I>^2 of 2.0 indicates untwinned data, whereas perfectly twinned data would give 1.5
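The second-moment statistic quoted in the log can be sketched in a few lines of Python. This is only an illustration of the statistic, not Sfcheck's implementation; the function names are hypothetical, and the simulation assumes ideal acentric intensities (exponentially distributed) with a perfect twin modelled as the average of two independent intensities.

```python
import random


def second_moment(intensities):
    """Return <I^2> / <I>^2 for a list of intensities."""
    n = len(intensities)
    mean_i = sum(intensities) / n
    mean_i2 = sum(i * i for i in intensities) / n
    return mean_i2 / (mean_i * mean_i)


def simulate_acentric(n, twinned=False, seed=0):
    """Simulate n ideal acentric intensities (assumption: Wilson statistics).

    If twinned is True, each observation is the mean of two independent
    intensities, which models a perfect (50:50) twin.
    """
    rng = random.Random(seed)
    if twinned:
        return [0.5 * (rng.expovariate(1.0) + rng.expovariate(1.0)) for _ in range(n)]
    return [rng.expovariate(1.0) for _ in range(n)]


untwinned = second_moment(simulate_acentric(100000))
twinned = second_moment(simulate_acentric(100000, twinned=True))
print(f"untwinned: {untwinned:.2f}, perfect twin: {twinned:.2f}")
```

With enough simulated reflections the two values should come out close to 2.0 and 1.5 respectively, which is why the observed 2.0573 points to untwinned data.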

Choice of search models

The target is an acylphosphatase-like domain. A search of the PDB reveals two acylphosphatases with a sequence identity to the target of about 31%, viz. 1v3z and 1w2i. Each has two chains in the asymmetric unit, either of which could be used as the basis of a search model.

Normally you would use something like Chainsaw at this point to prepare a search model from the template. As an exercise, we are going to try MR straightaway.

Notes on Sequence Alignment

There are many ways of approaching this, and the different tools will give slightly different results. The sequence identity depends on the definitions used (i.e. treatment of gaps and alignment length), the specific alignment technique, and whether parts have been chopped out of the model.
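To show how much the quoted identity can depend on the definition alone, here is a minimal Python sketch that scores one fixed pairwise alignment with three different denominators. The function name, the option names and the toy sequences are all hypothetical and are not taken from HypF or 1v3z.

```python
def percent_identity(aln_a, aln_b, denominator="aligned"):
    """Percent identity of two aligned sequences ('-' marks a gap).

    denominator:
      "aligned"   - columns where neither sequence has a gap
      "alignment" - all alignment columns, gaps included
      "shorter"   - ungapped length of the shorter sequence
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = list(zip(aln_a, aln_b))
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    if denominator == "aligned":
        denom = sum(1 for x, y in pairs if x != "-" and y != "-")
    elif denominator == "alignment":
        denom = len(pairs)
    elif denominator == "shorter":
        denom = min(len(aln_a.replace("-", "")), len(aln_b.replace("-", "")))
    else:
        raise ValueError("unknown denominator: " + denominator)
    return 100.0 * matches / denom


# Toy alignment (made up for illustration only).
a = "MKV-LAG"
b = "MRVQL-G"
for mode in ("aligned", "alignment", "shorter"):
    print(mode, round(percent_identity(a, b, mode), 1))
```

The same four matching residues give three different percentages here, before any differences between alignment programs are even considered.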

Molrep Run 1

We will use chain B of 1v3z as the search model.

Select the Molecular Replacement module and open the Run Molrep - auto MR task window.
Enter a title
Do Molecular Replacement should be already selected.
For Data select the file hypF-1gxu-1gxt-HG_scaleit1.mtz
Select the labels F=FP1gxu and SIGF=SIGFP1gxu
For Model select the file 1v3z_B.pdb
(Optional) To speed up the calculation, you can apply a high-resolution cutoff of 3 Å; see the folder Experimental Data
Keep all defaults, and click Run → Run Now

When the job has finished, look at the log file (View Files from Job → View Job Results (new style) → Log File tab). Note the following:

Molrep automatically estimates:
```
INFO: expected number of models :    1
INFO: V_model:   61.6% (of asymm. part of u.c.)
```
which is correct. The estimate may be unreliable when there are many monomers in the asymmetric unit, in which case it can be set explicitly with the keyword NMON (see the folder Search Options)
Molrep checks whether or not an anisotropy correction is necessary:
```
INFO: Anisotropicy will not be used
```
The first table is a list of peaks of the Cross Rotation Function (CRF), sorted according to their heights. This is followed by a plot showing which peaks are related
The second table shows the best Translation Function (TF) for each of the CRF peaks (scored according to the correlation coefficient * PKmax). Other TF solutions can be viewed by following View Files from Job → Output Files .. → <proj_dir>_<job_no>_molrep.doc
The final table gives a list of solutions, sorted according to the score
Molrep reports a contrast higher than 3.0, which suggests that the solution is correct

Molrep Run 2

In fact, we can make use of our knowledge of the target, and this will often improve the solution. The search model has a moderately low sequence identity with the target and therefore the majority of the side chains are incorrect. Molrep can make use of the target sequence to improve the search model.

Select the previous job, and click ReRun Job
Most of the parameters should be set correctly, but you should change the title, and the name of the Solution file, so that it is different from the first job
This time, input the target sequence file hypF_Ndom.seq in the Sequence box
Click Run → Run Now

Look at the log file of this job.

After a section about the input MTZ file, there are details of the sequence alignment between the target sequence you have supplied and the sequence of the search model (i.e. the PDB file)
Molrep reports a sequence identity of about 30%. This is lower than other estimates because Molrep is more conservative in introducing gaps into the alignment
Molrep outputs tables for the CRF and TF as before
At this point it may not be apparent that modifying the search model has improved the MR solution. The benefits of model preparation will become clearer when we refine the solutions

Checking the solution

The positioned model can be submitted for a few cycles of automated refinement, then checked manually against 2mFo-DFc and mFo-DFc maps, using a graphics program such as Coot. Since we have a high-resolution dataset, the model can also be passed to ARP/wARP for rebuilding. Refinement, validation and model rebuilding are covered in other tutorials.

Here we will give a brief demonstration of how to refine the solution models using Refmac.

Select the Refinement module and open the Run Refmac5 task window
Enter a title
We will run Refmac with defaults, i.e. restrained refinement with no prior phase information
For MTZ in provide the file hypF-1gxu-1gxt-HG_scaleit1.mtz
For PDB in provide the output file from the first Molrep job
Leave everything else at the defaults (this will run 10 cycles of restrained refinement), and click Run → Run Now

When the job has completed, double-click on the job name in the job list window to open the results page. For this example, we are only interested in making a quick assessment of whether or not MR has worked. To do this we will look at the R/R-free values before and after the 10 cycles of refinement. These are listed in the Result table
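For readers unfamiliar with these statistics, the following Python sketch uses the conventional definition R = sum|Fo - k*Fc| / sum(Fo), evaluated separately for the working reflections and for the free set. It is an illustration only: the function names are hypothetical, the scale factor k is simply passed in, and Refmac's own calculation involves scaling and weighting that are not reproduced here.

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """Conventional R factor between observed and calculated amplitudes."""
    numerator = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def r_work_and_free(f_obs, f_calc, is_free, scale=1.0):
    """Return (R-work, R-free).

    is_free is a list of booleans, True for reflections in the free set
    (those excluded from refinement and used only for cross-validation).
    """
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, is_free) if not free]
    free = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, is_free) if free]
    r_work = r_factor([fo for fo, _ in work], [fc for _, fc in work], scale)
    r_free = r_factor([fo for fo, _ in free], [fc for _, fc in free], scale)
    return r_work, r_free
```

A correct MR solution would normally show both values dropping over the refinement cycles, whereas an incorrect one tends to leave R-free stuck at a high value.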

Repeat the above steps using the output PDB file from the second Molrep job, taking care not to overwrite the MTZ and PDB output files from the first refinement job. Compare the R/R-free values for the two jobs. You can clearly see that modifying the search model has greatly improved the results. Nevertheless, the best way to judge whether a solution is correct is to look at the electron density map. From the Refmac results page, you can launch Coot with the refined map and model loaded by clicking on the Coot button under Output Files.

The Molrep solution is related to the deposited structure 1gxu by the symmetry operation -Y+²⁄₃, X-Y+¹⁄₃, Z+¹⁄₃. Comparison of the structures in CCP4mg or Coot shows that the beta sheet and one of the two helices are well matched, but there are significant differences elsewhere.
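To make the symmetry operation quoted above concrete, this Python sketch applies it to a point in fractional coordinates. The function name and the example point are hypothetical; coordinates in a PDB file are orthogonal (in Å) and would first need converting to fractional coordinates using the unit cell, which is not shown here.

```python
# Rotation part of the operator -Y+2/3, X-Y+1/3, Z+1/3 (rows give new x, y, z).
ROTATION = [
    [0, -1, 0],
    [1, -1, 0],
    [0, 0, 1],
]
# Translation part, in fractions of the cell edges.
TRANSLATION = [2.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]


def apply_symop(frac, rotation=ROTATION, translation=TRANSLATION, wrap=True):
    """Apply a symmetry operator to fractional coordinates (x, y, z).

    If wrap is True the result is moved back into the unit cell (0 <= x < 1).
    """
    new = []
    for row, shift in zip(rotation, translation):
        value = sum(r * f for r, f in zip(row, frac)) + shift
        new.append(value % 1.0 if wrap else value)
    return tuple(new)


point = (0.1, 0.2, 0.3)  # arbitrary example point
print(apply_symop(point))
```

The rotation part is a three-fold axis along c, so applying the full operator three times returns the starting point shifted by exactly one cell along z.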

In general, if we want to compare an MR solution to the deposited structure, then we need to take into account possible symmetry operations and possible changes of origin. Two solutions may be identical, even if it is not obvious from a quick look in a graphics program. This can be checked with the csymmatch utility:

Select the Symmetry match models task in module Coordinate Utilities
Enter the MR solution PDB file as the Work PDB in, and the deposited structure (1gxu) as Reference PDB in
Select Apply origin shift and hand correction and run

The log file reports the symmetry operator and change of origin that give the best match, together with a normalised score for the match. The output PDB file has this transformation applied, and can be compared to the reference PDB file. Of course, we usually don't have a deposited structure to compare with, but the same process is useful for comparing different MR solutions.