iMODFIT Tutorial

This tutorial is a practical guide for learning how to flexibly fit atomic models into low/medium-resolution EM maps with iMODFIT. We include some working examples with simple instructions, as well as a benchmark set. Before going through the tutorial, first get the corresponding files from the download zone. The tutorial is divided into three parts: two basic flexible fitting examples (from open to closed and from closed to open) and the 52-protein benchmark.

 

Basic Flexible Fitting I. From open to closed


First download, uncompress and untar the corresponding tutorial file. You will then have all the files needed to follow the tutorial. We recommend creating a new working directory so that all the output files are stored in it.

To illustrate the basic procedure and the performance of the method, we are going to fit the high-resolution structure of the GroEL monomer in an open state (1aon, cyan) into a simulated EM map of the closed state at 10Å resolution (grey), obtained from the closed structure (1oel, yellow).

[Figure: fitting (open --> closed)]

To perform the fitting just type at the command prompt:

imodfit 1aon.pdb 1oel.ccp4 10 0 -t

where 1aon.pdb is the initial structure, 1oel.ccp4 is the target map, 10 is the resolution in Angstroms, and 0 is the density cutoff that determines which map densities are taken into account. The -t option enables the output of a PDB movie with the fitting trajectory. Here is the screen output:

imodfit>
imodfit> Welcome to iMODFIT v1.28
imodfit>
imodfit> Model PDB file: 1aon.pdb
molinf> Protein   1 chain  1   segment  1 residues: 524 atoms: 3847
molinf> SUMMARY:
molinf> Number of Molecules ... 1
molinf> Number of Chain ....... 1
molinf> Number of Segments .... 1
molinf> Number of Groups ...... 524
molinf> Number of Atoms ....... 3847
molinf>
imodfit> Coarse-Graining model: Full-Atom (no coarse-graining)
imodfit> Selected model number of residues: 524
imodfit> Selected model number of (pseudo)atoms: 3847
imodfit> Target Map file: 1oel.ccp4
imodfit> Best filtration method: 2 FT(x10)=0.090s Kernel(x10)=0.080s
imodfit> Number of Inter-segment coords: 0 (Rot+Trans)
imodfit> Number of Internal Coordinates: 1033 (Hessian rank)
imodfit> Range of used modes: 1-206 (19.9%)
imodfit> Number of excited/selected modes: 4(nex)
imodfit>
imodfit>  Iter     score     Corr. NMA   NMA_time
imodfit>     0  0.336409  0.663591   0   4.33 sec
imodfit>   157  0.319447  0.680553   1   4.52 sec
.................................................
imodfit>  4219  0.024751  0.975249  17   4.15 sec
imodfit> 10000  0.017593  0.982407            END
imodfit>
imodfit> Movie file:                              imodfit_movie.pdb
imodfit> Final Model:                            imodfit_fitted.pdb
imodfit> Score file:                              imodfit_score.txt
imodfit> Log file:                                      imodfit.log
imodfit>
imodfit> Success! Time elapsed  00h. 04' 04''
imodfit> Bye!

The flexibly fitted structure is written to imodfit_fitted.pdb.

iMODFIT also outputs the following files: 

  • imodfit_movie.pdb --> fitting trajectory
  • imodfit_score.txt --> score file to check for convergence
  • imodfit.log --> log of the command used
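
The iteration table in the screen output can also be read programmatically, for example from a saved copy of the terminal output. The following is a minimal Python sketch, assuming the rows keep the layout shown above (iteration, score, correlation). The names parse_iterations and SAMPLE are illustrative and are not part of iMODFIT.

```python
import re

# Matches iteration rows such as:
#   imodfit>   157  0.319447  0.680553   1   4.52 sec
#   imodfit> 10000  0.017593  0.982407            END
ROW = re.compile(r"^imodfit>\s+(\d+)\s+(\d*\.\d+)\s+(\d*\.\d+)")


def parse_iterations(text):
    """Return a list of (iteration, score, correlation) tuples."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append((int(match.group(1)),
                         float(match.group(2)),
                         float(match.group(3))))
    return rows


# A few rows copied from the screen output of this tutorial.
SAMPLE = """\
imodfit>  Iter     score     Corr. NMA   NMA_time
imodfit>     0  0.336409  0.663591   0   4.33 sec
imodfit>   157  0.319447  0.680553   1   4.52 sec
imodfit>  4219  0.024751  0.975249  17   4.15 sec
imodfit> 10000  0.017593  0.982407            END
"""

if __name__ == "__main__":
    for iteration, score, corr in parse_iterations(SAMPLE):
        # The score is 1 - correlation, so both columns should add up to 1.
        print(iteration, score, corr, round(score + corr, 6))
```

Because the score is defined as 1 − correlation, the last value printed on each row is a quick consistency check on the parsed columns.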

Below, some snapshots of the fitting trajectory (cyan) are shown together with the target structure (yellow). The final snapshot, with the fitted structure, is shown on the right.

The fitted structure is only 1.78Å Cα RMSD from the target structure, and the final correlation is high: 0.982. The quality of the fit and the excellent preservation of secondary structure can be appreciated in the flash movies below (front and rear views on the left and right, respectively). Note that no secondary structure constraints were used in any case.

To visualize the results, use your favorite program. To play the trajectory movie (imodfit_movie.pdb) we recommend VMD, but you can also view it in Jmol. The images and the movie were created using VMD with the POVray rendering engine.

Once you have run iMODFIT you should check for convergence. To this end, just plot the score (i.e. 1 − correlation) as a function of the iteration number using GNUplot with the imodfit_score.txt file:
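
A Cα RMSD like the one quoted above can be estimated with a few lines of Python. This is only a sketch and not necessarily the procedure used to obtain the reported value. It assumes that both PDB files use the same residue numbering, that the target structure is already in the reference frame of the map (so no superposition is performed), and that only the first model of each file is relevant. The function names and the use_chain switch are illustrative.

```python
import math


def read_ca_coords(pdb_path, use_chain=True):
    """Return {residue key: (x, y, z)} for the CA atoms of the first model."""
    coords = {}
    with open(pdb_path) as handle:
        for line in handle:
            if line.startswith("ENDMDL"):
                break  # stop after the first model of a multi-model file
            # Fixed-column PDB format: atom name in columns 13-16,
            # chain ID in column 22, residue number in columns 23-26.
            if line.startswith("ATOM") and line[12:16].strip() == "CA":
                resnum = int(line[22:26])
                key = (line[21], resnum) if use_chain else resnum
                xyz = (float(line[30:38]),
                       float(line[38:46]),
                       float(line[46:54]))
                coords.setdefault(key, xyz)  # keep the first alternate location
    return coords


def ca_rmsd(pdb_a, pdb_b, use_chain=True):
    """Calpha RMSD over residues present in both files, without superposition."""
    a = read_ca_coords(pdb_a, use_chain)
    b = read_ca_coords(pdb_b, use_chain)
    common = sorted(set(a) & set(b))
    if not common:
        raise ValueError("no common CA atoms; check chain IDs and numbering")
    total = sum(math.dist(a[key], b[key]) ** 2 for key in common)
    return math.sqrt(total / len(common))
```

For example, ca_rmsd("imodfit_fitted.pdb", "1oel.pdb") compares the fitted model with the closed structure. If the two files use different chain identifiers, pass use_chain=False to match residues by number only.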

> gnuplot -persist
gnuplot> plot "imodfit_score.txt" u 1:2 w l
gnuplot> exit

Note(1): If the curve is not approximately horizontal between iterations 5000 and 10000, you should run iMODFIT again with a larger maximum number of iterations (-i option). Alternatively, you can continue the fitting process by supplying the final fitted structure as the initial model.

Note(2): The conformational refinement is a stochastic process; thus you should not expect to obtain exactly the same screen output or the same results in different runs.
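
The visual check in Note(1) can be complemented by a rough numerical one. The sketch below assumes that imodfit_score.txt holds the iteration number in its first column and the score in its second, as the gnuplot "u 1:2" selection suggests. The function names and the tolerance tol are arbitrary choices made for illustration and should be adjusted to your own case.

```python
def read_scores(path):
    """Read (iteration, score) pairs from the first two columns of a file."""
    points = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 2 or fields[0].startswith("#"):
                continue  # skip blank, short, or comment lines
            try:
                points.append((float(fields[0]), float(fields[1])))
            except ValueError:
                continue  # skip headers or other non-numeric lines
    return points


def tail_slope(points, start=5000, stop=10000):
    """Least-squares slope of score versus iteration within [start, stop]."""
    tail = [(x, y) for x, y in points if start <= x <= stop]
    n = len(tail)
    if n < 2:
        raise ValueError("need at least two points in the selected range")
    mean_x = sum(x for x, _ in tail) / n
    mean_y = sum(y for _, y in tail) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in tail)
    if sxx == 0:
        raise ValueError("all points share the same iteration number")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in tail)
    return sxy / sxx


def looks_converged(points, tol=1e-6, start=5000, stop=10000):
    """True if the score changes by less than tol per iteration (hypothetical threshold)."""
    return abs(tail_slope(points, start, stop)) < tol
```

A typical use would be looks_converged(read_scores("imodfit_score.txt")). A False result suggests following the advice in Note(1).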

 

Basic Flexible Fitting II. From closed to open


iMODFIT can open "closed" structures as well. To perform this fitting just try the following command:

imodfit 1oel.pdb 1aon.ccp4 10 0 -t -o imodfit2

Front and rear views of the fitting results are shown in the flash movies below. The initial closed structure, 1oel, is shown in cyan, the target open one, 1aon, in yellow, and its corresponding open map in grey. The -o option is added to avoid overwriting previous results.

The final structure is again very close to the target, i.e. 1.74Å Cα RMSD (corr.=0.985). As in the previous case, the quality of the fit and the excellent preservation of secondary structure are evident. You can also view the trajectory interactively in Jmol.
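
When several fittings are run in a row, the command line can be assembled from a script. The following Python sketch only uses the arguments and options shown in this tutorial (the four positional arguments, -t and -o). The wrapper functions themselves are illustrative and are not part of iMODFIT, and the sketch assumes that the imodfit executable is on your PATH.

```python
import subprocess


def imodfit_command(pdb, emmap, resolution, cutoff, movie=True, basename=None):
    """Build the argument list for one iMODFIT run."""
    cmd = ["imodfit", pdb, emmap, str(resolution), str(cutoff)]
    if movie:
        cmd.append("-t")           # write the fitting trajectory movie
    if basename is not None:
        cmd += ["-o", basename]    # avoid overwriting previous results
    return cmd


def run_imodfit(*args, **kwargs):
    """Run iMODFIT and raise an error if it exits with a non-zero status."""
    return subprocess.run(imodfit_command(*args, **kwargs), check=True)


if __name__ == "__main__":
    # The two runs of this tutorial, printed rather than executed.
    print(" ".join(imodfit_command("1aon.pdb", "1oel.ccp4", 10, 0)))
    print(" ".join(imodfit_command("1oel.pdb", "1aon.ccp4", 10, 0,
                                   basename="imodfit2")))
```

Passing the arguments as a list avoids shell quoting problems with unusual file names.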

 

The 52 proteins benchmark


To test our methodology we built a benchmark of 52 simulated fitting problems comprising a wide variety of macromolecular motions. To this end, we downloaded 26 different protein pairs from the molecular motions database MolMovDB, with displacements greater than 2Å Cα RMSD and sizes ranging from 100 to 1000 amino acids. Only structures with fewer than 3% Ramachandran outliers (MolProbity) and without broken chains or missing atoms were admitted. The average displacement was 6.9Å with a standard deviation of 3.9Å. Note that each open/closed protein pair represents two different fitting problems, i.e. from open to closed and vice versa. The benchmark can be downloaded either from the iMODFIT "Install" page or directly from here.

Open   Closed  Motion   Name                                             #Residues  Cα RMSD (Å)
1l5e   1l5b    [D-h-2]  Cyanovirin-N                                     101         8.85
1e7xA  1dzsB   [F-?-2]  Virus MS2 coat protein                           129         3.57
1cfd   1cfc    [D-h-2]  Calmodulin                                       148        10.22
1cbuB  1c9kB   [F-s-2]  Adenosylcobinamide Kinase                        180         3.52
1ex6   1ex7    [D-h-2]  Guanylate Kinase (GDK)                           186         4.58
4ake   1ake    [D-h-2]  Adenylate Kinase (ADK)                           214         8.26
1gggA  1wdnA   [D-h-2]  Glutamine Binding Protein                        220         5.34
2lao   1lst    [D-h-2]  Lysine/Arginine/Ornithine (LAO) binding protein  238         8.67
1urp   2dri    [D-h-2]  Ribose Binding Protein                           271         7.69
1ram   1leiA   [D-?-2]  NF-kappa B                                       273         4.96
5at1   8atc    [S-a-2]  Aspartate Carbamoyltransferase                   310         2.41
1ckmA  1ckmB   [D-h-2]  mRNA capping enzyme                              317         4.35
3dap   1dap    [D-h-2]  Diaminopimelic Acid Dehydrogenase                320         5.81
1bp5   1a8e    [D-h-2]  Transferrins (N-terminal lobe)                   329        12.16
1jqj   2pol    [D-s-2]  Beta DNA polymerase III                          366         2.81
1omp   1anf    [D-h-2]  Maltodextrin Binding Protein (MBP)               370         7.23
8adh   6adh    [D-s-2]  Alcohol Dehydrogenase (ADH)                      374         2.43
9aat   1ama    [D-s-2]  Aspartate Amino Transferase (AAT)                401         2.21
1bnc   1dv2    [D-h-2]  Biotin carboxylase                               452         5.38
1rkm   2rkm    [D-h-2]  Oligopeptide-binding protein                     517         5.77
1sx4   1oel    [C----]  GroEL                                            524        15.83
1i7d   1d6m    [D-?-2]  DNA Topoisomerase III                            620         4.16
8ohm   1cu1    [D-h-2]  HCV Helicase                                     645         6.06
1lfg   1lfh    [D-h-2]  Lactoferrin (LF)                                 691         8.08
1ih7   1ig9    [C----]  Phage Rb69 DNA Polymerase                        903         7.16
1su4   1t5s    [C----]  Calcium ATPase pump                              994        17.99
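
Since each open/closed pair is used in both directions, the 26 pairs of the table expand into the 52 fitting problems of the benchmark. The Python sketch below makes this bookkeeping explicit. The identifiers are copied from the table, while the function name and the direction labels are illustrative; how the benchmark files themselves are named is not assumed here.

```python
# (open, closed) identifiers taken from the benchmark table.
PAIRS = [
    ("1l5e", "1l5b"), ("1e7xA", "1dzsB"), ("1cfd", "1cfc"),
    ("1cbuB", "1c9kB"), ("1ex6", "1ex7"), ("4ake", "1ake"),
    ("1gggA", "1wdnA"), ("2lao", "1lst"), ("1urp", "2dri"),
    ("1ram", "1leiA"), ("5at1", "8atc"), ("1ckmA", "1ckmB"),
    ("3dap", "1dap"), ("1bp5", "1a8e"), ("1jqj", "2pol"),
    ("1omp", "1anf"), ("8adh", "6adh"), ("9aat", "1ama"),
    ("1bnc", "1dv2"), ("1rkm", "2rkm"), ("1sx4", "1oel"),
    ("1i7d", "1d6m"), ("8ohm", "1cu1"), ("1lfg", "1lfh"),
    ("1ih7", "1ig9"), ("1su4", "1t5s"),
]


def fitting_problems(pairs):
    """Expand (open, closed) pairs into (initial, target, direction) problems."""
    problems = []
    for open_id, closed_id in pairs:
        # The initial structure is fitted into a map of the target.
        problems.append((open_id, closed_id, "open->closed"))
        problems.append((closed_id, open_id, "closed->open"))
    return problems


if __name__ == "__main__":
    problems = fitting_problems(PAIRS)
    print(len(PAIRS), "pairs,", len(problems), "fitting problems")
    for initial, target, direction in problems[:4]:
        print(initial, "into map of", target, "(" + direction + ")")
```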

MolMovDB's Motion Types:

  1. Motions of Fragments Smaller than Domains
    • [F-s-2] --> I.A. Motion is predominantly shear
    • [F-h-2] --> I.B. Motion is predominantly hinge
    • [F-?-2] --> I.C. Motion can not be fully classified at present
    • [F-n-2] --> I.D. Motion is not hinge or shear
  2. Domain Motions
    • [D-s-2] --> II.A. Motion is predominantly shear
    • [D-h-2] --> II.B. Motion is predominantly hinge
    • [D-?-2] --> II.C. Motion can not be fully classified at present
    • [D-n-2] --> II.D. Motion is not hinge or shear
    • [D-f-2] --> II.E. Motion involves partial refolding of tertiary structure
  3. Larger Movements than Domain Motions involving the Movement of Subunits
    • [S-a-2] --> III.A. Motion involves an allosteric transition
    • [S-n-2] --> III.B. Motion does not involve an allosteric transition
    • [C----] --> C----. Complex Protein Motions
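
When processing the benchmark table in a script, the motion tags can be translated back into the descriptions listed above with a simple lookup. This is a small illustrative sketch: the dictionary only restates the list in shortened form, and the function name describe_motion is an assumption, not part of MolMovDB or iMODFIT.

```python
# Motion tags and shortened descriptions, restating the list above.
MOTION_TYPES = {
    "F-s-2": "fragment motion, predominantly shear",
    "F-h-2": "fragment motion, predominantly hinge",
    "F-?-2": "fragment motion, not fully classified at present",
    "F-n-2": "fragment motion, not hinge or shear",
    "D-s-2": "domain motion, predominantly shear",
    "D-h-2": "domain motion, predominantly hinge",
    "D-?-2": "domain motion, not fully classified at present",
    "D-n-2": "domain motion, not hinge or shear",
    "D-f-2": "domain motion, partial refolding of tertiary structure",
    "S-a-2": "subunit motion, involves an allosteric transition",
    "S-n-2": "subunit motion, does not involve an allosteric transition",
    "C----": "complex protein motion",
}


def describe_motion(tag):
    """Translate a tag such as '[D-h-2]' into its description."""
    key = tag.strip().strip("[]")  # accept tags with or without brackets
    return MOTION_TYPES.get(key, "unknown motion type")


if __name__ == "__main__":
    for tag in ("[D-h-2]", "[S-a-2]", "[C----]"):
        print(tag, "-->", describe_motion(tag))
```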