How to deal with huge systems?

As in iMOD, in iMODFIT, the maximum macromolecular size allowed to perform NMA is constrained by the amount of memory needed to diagonalize Hessian and Kinetic energy matrices, and it depends on the employed architecture. For example, in a standard 32-bit linux box the maximum memory addressed per program is about 2Gb, so it can solve systems up to approximately 15000 degrees of freedom (DoF), i.e. about 7000-8000 aminoacids in proteins or 3000 nucleotids in nucleic acids. On the other hand, 64-bit machines are only limited by available RAM. For example, the diagonalization of a 50000 DoFs system would need a 64-bit computer with about 30Gb of RAM memory.
Therefore  a dimensionality reduction is mandatory when the system under study becomes huge for standard computers. To this end, we can fix some internal coordinates (ICs)  to reduce the number of degrees of freedom and fit the matrices into memory. There are three ways to accomplish this:
  • Fixing custom ICs.
  • Fixing by secondary structure.
  • Fixing ICs randomly.
Here we are going to comment the simplest way to reduce the dimensionality, which is to fix randomly some ratio of dihedral angles. Other possibilites to fix IC are discussed in the following tips, as well as in "How to fix ICS?" section (from iMOD´s tips).
To illustrate this reductionist approach we purpose the following practical examples with GroEL monomer and a HUGE viral system:
 
Fitting the open GroEL monomer structure into the closed 10Å map.

The following command will fix the 50% of available dihedral angles. Note the inter-chain rotational/translational degrees of freedom are always maintained mobile.
imodfit 1aon.pdb 1oel.ccp4 10 0 -r 0.5 -t -o imodfitR05
The fitting result ( imodfitR05_fitted.pdb) is only 1.78Å Cα RMSD from the target structure employed to simulate the map, and the final correlation was high: 0.982. The quality of fitness and the excellent secondary structure maintenance can be appreciated in the flash movie below as well as in the interactive the trajectory movie: Jmol.
Fitting the closed CCMV capsid into the swollen 10Å map.

To illustrate iMODFIT's performance on a huge system we are going to fit the closed capsid structure into a simulated 10&Aring map obtained from the swollen structure pdb. This "swelling", represents a huge concerted motion about 24.0 Å Cα RMSD where 2nm sized openings appear in the protein shell. Atomic structures were obtained from VIPERdb. Input structures are shown below.
As a consequence of both motion and system size we had to carry it out in two steps using some extra options:
imodfit 1cwp_prot.pdb ccmv_swln_10A3.ccp4 10 0 -e 0.05 --nowrmsd --6Dref_delay 20000 -i 200000 -r 1 -t -o ccmvfit
imodfit ccmvfit_fitted.pdb ccmv_swln_10A3.ccp4 10 0 -n 0.99999 --prob plain -i 300000 -r 1 -t -o ccmvfit2
In first step we added: -e 0.05 to speed up convergence, --nowrmsd to disable the Weighted-RMSD alingment (which is not suited to huge motions), and --6Dref_delay to disable the pose refinement step until the structure was close enought to the target map. Note that the fitted structure obtained in this step was still about 6 Å Cα RMSD away form target.
In a second refinement step, we used all available modes (-n 0.99999) with a different mode selection scheme (--prob plain). Since we were closer to the target structure, --nowrmsd, -e, and --6Dref_delay options were removed. Note that in both steps only the rotational/translational ICs were considered (-r 1).
Now a successful fitted structure is obtained. It's only 1.61 Å Cα RMSD from the target structure employed to simulate the map. You can check its beautiful movie below.

How to avoid Secondary Structure distortions?

To avoid Secondary Structure distortions, the SS elements can be fixed. This may result specially useful dealing with huge systems and/or experimental maps.

The easiest procedure is fixing both the IC of α-helices and β-strands. To this end add the "-S HE" option. This way the unique mobile ICs will be the coils dihedral angles and inter-segment rotations/translations:

imodfit 1aon.pdb 1oel.ccp4 10 0 -S HE -t -i 30000 -o imodfitHE

The "HE" characters introduced after "-S" parameter are the single character identifier for α-helices and "E" for β-strands. In addition, you may fix any combination of SS with as many SS identifiers as you need; for example, "−S EC" will fix beta and coil SS ICs.

By default, the SS is computed internally existing three SS types available: "H", "E" and "C" (coil); but any user defined SS can be provided using the −−ss option, as long as the SS identifiers are single characters. For example, you can use DSSP to compute SS ( 1aon.dssp ) and with the aid of this simple perl script you can convert it into our SS file format (1aon.ss).

An increase in the number of iterations (−i 30000) was needed in order to ensure convergence.

In the screen output you can check both the dimensionality ("Mobile Internal Coordinates") and diagonalization time ("NMA_time") reduction respecting to the basic command (tutorial section):

imodfit>
imodfit> Welcome to iMODFIT v1.28
imodfit>
imodfit> Model PDB file: 1aon.pdb
molinf> Protein 1 chain 1 segment 1 residues: 524 atoms: 3847
molinf> SUMMARY:
molinf> Number of Molecules ... 1
molinf> Number of Chain ....... 1
molinf> Number of Segments .... 1
molinf> Number of Groups ...... 524
molinf> Number of Atoms ....... 3847
molinf>
imodfit> Coarse-Graining model: Full-Atom (no coarse-graining)
imodfit> Selected model number of residues: 524
imodfit> Selected model number of (pseudo)atoms: 3847
imodfit> Target Map file: 1oel.ccp4
imodfit> Best filtration method: 2 FT(x10)=0.100s Kernel(x10)=0.080s
imodfit> Number of Inter-segment coords: 0 (Rot+Trans)
imodfit> Number of Internal Coordinates: 1033 (Hessian rank)
imodfit> Input CG-model Fixed Internal Coordinates: 668
imodfit> Input CG-model Mobile Internal Coordinates (size) = 365
imodfit> Range of used modes: 1-73 (20.0%)
imodfit> Number of excited/selected modes: 1(nex)
imodfit>
imodfit> Iter score Corr. NMA NMA_time
imodfit> 0 0.336409 0.663591 0 0.75 sec
imodfit> 199 0.326737 0.673263 1 0.76 sec
imodfit> 384 0.286446 0.713554 2 0.68 sec
.................................................
imodfit> 8850 0.032497 0.967503 17 0.71 sec
imodfit> 30000 0.024034 0.975966 END
imodfit>
imodfit> Movie file: imodfitHE_movie.pdb
imodfit> Final Model: imodfitHE_fitted.pdb
imodfit> Score file: imodfitHE_score.txt
imodfit> Log file: imodfitHE.log
imodfit>
imodfit> Success! Time elapsed 00h. 09' 59''
imodfit> Bye!

The fitting result (  imodfitHE_fitted.pdb) is only 1.99Å Cα RMSD from the target structure , and the final correlation was high: 0.976. The quality of fitness and the excellent secondary structure maintenance can be appreciated in the flash movie below. In addition you can play interactively the trajectory movie in   Jmol.

How to fix domains?

Any domain/s in our macromolecule can be fixed using the −f option. This way both dimensionality will be reduced and undesired distortions will be avoided. For illustrative purposes we will fix two GroEL domains.

GroEL is composed of three domains: apical (top red), intermediate (cyan), and equatorial (bottom red). The apical domain interacts with folding intermediates and GroES, the equatorial domain hydrolyzes ATP, and the intermediate domain is flexible and connects these apical and equatorial domains.

To obtain a successful fitting the structure should be maintained as flexibile as possible, i.e. if we fix very much the structure will be very rigid and the target conformation will not be addressed. To this end the apical and equatorial domains will be fixed while keeping the intermediate domain fully flexible. In the mask file imodfitDOM.fix only the ICs belonging to regions of the intermediate domain are keept mobile: 136-193 and 331-409 (indexed from 1 to 524). Further information about the mask file generation is provided in iMod tips section (to obtain a dummy mask (fully mobile) use the command: imode 1aon.pdb --save_fixfile -o imodfitDOM)

imodfit 1aon.pdb 1oel.ccp4 10 0 -i 50000 -t -f imodfitDOM.fix -o imodfitDOM

The output is the following:

imodfit>
imodfit> Welcome to iMODFIT v1.28
imodfit>
imodfit> Model PDB file: 1aon.pdb
molinf> Protein 1 chain 1 segment 1 residues: 524 atoms: 3847
molinf> SUMMARY:
molinf> Number of Molecules ... 1
molinf> Number of Chain ....... 1
molinf> Number of Segments .... 1
molinf> Number of Groups ...... 524
molinf> Number of Atoms ....... 3847
molinf>
imodfit> Coarse-Graining model: Full-Atom (no coarse-graining)
imodfit> Selected model number of residues: 524
imodfit> Selected model number of (pseudo)atoms: 3847
imodfit> Target Map file: 1oel.ccp4
imodfit> Best filtration method: 2 FT(x10)=0.090s Kernel(x10)=0.080s
imodfit> Number of Inter-segment coords: 0 (Rot+Trans)
imodfit> Number of Internal Coordinates: 1033 (Hessian rank)
imodfit> Input CG-model Fixed Internal Coordinates: 760
imodfit> Input CG-model Mobile Internal Coordinates (size) = 273
imodfit> Range of used modes: 1-54 (19.8%)
imodfit> Number of excited/selected modes: 1(nex)
imodfit>
imodfit> Iter score Corr. NMA NMA_time
imodfit> 0 0.336409 0.663591 0 0.73 sec
imodfit> 199 0.324457 0.675543 1 0.50 sec
imodfit> 379 0.285427 0.714573 2 0.49 sec
.................................................
imodfit> 12857 0.080700 0.919300 14 0.50 sec
imodfit> 50000 0.064124 0.935876 END
imodfit>
imodfit> Movie file: imodfitDOM_movie.pdb
imodfit> Final Model: imodfitDOM_fitted.pdb
imodfit> Score file: imodfitDOM_score.txt
imodfit> Log file: imodfitDOM.log
imodfit>
imodfit> Success! Time elapsed 00h. 16' 18''
imodfit> Bye!

The solution is only 3.77 Å Cα RMSD from the target structure used to simulate the map. This is a good result for a structure with 75% of its ICs fixed. Note that, to obtain a good result with such a fixed structure, the number of iterations was increased (-i 50000). If you increase the number of iterations even more you will reach lower RMSD.

You can play interactively the trajectory movie in Jmol, check the fitted structure imodfitDOM_fitted.pdb and watch the flash movie below. For illustrative purposes the fixed domains were colored in red.