DALAI-GA User Guide

Once the program is downloaded, you are ready to reconstruct the shape and dimensions of the macromolecule under study. This section gives a brief overview of the steps needed to run DALAI-GA. Nevertheless, we recommend following the tutorial with the given examples.

Input files

The current version of the program is designed to work with three input files: 

  • Parameter File (dalai_ga.ini)
  • SAXS normalized profile (FILEINPUT)
  • Initial model search space (MODELINI, optional)

All you need to do is prepare the input files, put them in the directory where you want the output to be written, and launch the program.

SAXS profile

This is your data set, i.e. the target intensity profile from a monodisperse solution. The format is a two-column ASCII file. Each line gives the modulus of the scattering vector S (defined as the reciprocal of the Bragg spacing, i.e. 2 sin(theta)/lambda, in reciprocal Angstrom units) and the intensity value, separated by blanks. The intensity must be normalized to 1.0 at S=0 (WARNING: THE POINT AT S=0 WITH I=1 SHOULD NOT BE INCLUDED). The tutorial uses an example target profile. If the scattering vector values are in Q units (Q = 2 pi S), they must be converted into S units.
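
A minimal sketch of this preparation in Python, assuming the measured data are in Q units and that I(0) has already been estimated elsewhere (for example with GNOM). The function names prepare_profile and format_profile are illustrative only and are not part of DALAI-GA.

```python
import math


def prepare_profile(q_values, intensities, i_zero):
    """Convert (Q, I) pairs into (S, I/I(0)) rows.

    Any point at Q <= 0 is skipped, because the S=0 point must not
    appear in the input file.
    """
    rows = []
    for q, intensity in zip(q_values, intensities):
        if q <= 0.0:
            continue
        s = q / (2.0 * math.pi)  # Q = 2 pi S
        rows.append((s, intensity / i_zero))
    return rows


def format_profile(rows):
    """Format the rows as a blank-separated two-column ASCII table."""
    return "\n".join(f"{s:.6f} {i:.6e}" for s, i in rows) + "\n"
```

The text returned by format_profile can be written to the file named in FILEINPUT.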

Initial search space

This is an initial model composed of closely packed identical spheres, within which the GA searches for a mass distribution compatible with the target SAXS profile. The file is in PDB format with a 2-line header. An example is an ellipsoid of 150x100x100 composed of 162 beads of 7 A radius. The program can generate this initial search space itself from the parameters given in the input parameter file described below.
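
To illustrate what such a search space looks like, the following sketch places hexagonally close-packed bead centers inside an ellipsoid. It is not the generator built into DALAI_GA: the function name hcp_ellipsoid is hypothetical, the three dimensions are assumed to be full axis lengths, and a bead is kept whenever its center lies inside the ellipsoid. Because of these assumptions, the bead count will not necessarily match the 162 beads of the example file.

```python
import math


def hcp_ellipsoid(axes, radius):
    """Return centers of hexagonally close-packed beads inside an ellipsoid.

    axes: full lengths of the ellipsoid along x, y and z (assumption).
    radius: bead radius, in the same units as axes.
    """
    ha, hb, hc = (0.5 * d for d in axes)          # semi-axes
    dy = math.sqrt(3.0) * radius                  # row spacing within a layer
    dz = 2.0 * math.sqrt(6.0) / 3.0 * radius      # spacing between layers
    ni = int(ha / (2.0 * radius)) + 2
    nj = int(hb / dy) + 2
    nk = int(hc / dz) + 2
    beads = []
    for k in range(-nk, nk + 1):
        for j in range(-nj, nj + 1):
            for i in range(-ni, ni + 1):
                # Standard HCP lattice: neighbouring centers are 2*radius apart.
                x = (2 * i + (j + k) % 2) * radius
                y = dy * (j + (k % 2) / 3.0)
                z = dz * k
                if (x / ha) ** 2 + (y / hb) ** 2 + (z / hc) ** 2 <= 1.0:
                    beads.append((x, y, z))
    return beads
```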

Parameter file

The input parameter file is called dalai_ga.ini and includes all the necessary parameters and the input file names. The following table gives a detailed description of these parameters:

Parameter | Description | Example value
FILEINPUT | The filename of the SAXS profile, for example the target profile used in the tutorial. | 2bb2.int
FILEMODEL | The search space bead model in PDB format. | L162.pdb
ELLSIZE | DALAI_GA can generate an initial search space itself: a triaxial ellipsoid with the dimensions given in this parameter. In this case, FILEMODEL must be NONE. | 150 100 100
RADIUS | The radius of the beads that compose the ellipsoid. As an initial guess, select a radius that produces initial models of fewer than 200-300 beads. DALAI_GA will refine the model later with smaller beads. | 7.0

Once the best model for a given bead size is found, a new GA search with smaller beads is performed. This iterative process is repeated until 3 A beads are reached. This parameter gives the reduction of the radius for the next step. The default value is 1.

END_R | The minimum bead radius used. | 3.0
MASS+/- | When the program starts, a random initial population is generated; the maximum and minimum number of beads of the starting models are specified here. | 40 10
RG+/- | The target radius of gyration and the corresponding allowed deviation, applied only to the initial generation. | 30 5
NL/NC | These values control the speed of convergence of the algorithm: the smaller the values, the faster it converges; the larger the values, the safer your results will be. | 10 100 200 500
DISPLAY | Defines the level of diagnostic messages: 1 means standard verbosity and 0 means no messages. | 1
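
The iterative bead refinement described above can be pictured as a simple radius schedule. The sketch below only illustrates the sequence of radii implied by the example values (RADIUS 7.0, a reduction of 1 per step, END_R 3.0); radius_schedule is a hypothetical helper, and how DALAI_GA treats a step that does not land exactly on END_R is not covered here.

```python
def radius_schedule(start_radius, end_radius, step=1.0):
    """List the bead radii visited from start_radius down to end_radius."""
    radii = []
    r = start_radius
    # Small tolerance so that rounding errors do not drop the last radius.
    while r >= end_radius - 1e-9:
        radii.append(round(r, 6))
        r -= step
    return radii


print(radius_schedule(7.0, 3.0))  # [7.0, 6.0, 5.0, 4.0, 3.0]
```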

Output files

DALAI_GA produces two kinds of output files (plus some auxiliary files that you can ignore): the structures of the best models, and the experimental and calculated SAXS profiles. They are named bestxxxx, where xxxx is the generation number, followed by one of the extensions below.

*.dat -> Files with the dat extension contain the calculated and experimental SAXS profiles.
            Each file has five columns:
            1) S values.
            2) Normalized intensities of the model.
            3) Experimental normalized intensities.
            4) Normalized intensities of the model on a logarithmic scale.
            5) Experimental normalized intensities on a logarithmic scale.

*.pdb -> Bead model in PDB format. You can easily visualise it with RasMol (using the spacefill command) or GRASP (by generating a molecular surface).

Every 1000 steps, the best model configuration is saved in best_(number of generations).dat and .pdb files, while the current best configuration is stored in best.dat and best.pdb. The best model obtained at a given resolution is saved as well.
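
As an illustration of how the five-column *.dat files can be inspected outside the program, the sketch below parses the text of such a file and computes the root-mean-square difference between the two logarithmic columns. The helper names are hypothetical, and this number is only a convenient summary of the fit. It is not claimed to be the fitness function used by DALAI_GA.

```python
import math


def read_dat(text):
    """Parse five-column rows: S, I_model, I_exp, log I_model, log I_exp."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        try:
            rows.append(tuple(float(f) for f in fields))
        except ValueError:
            continue  # skip lines that are not purely numeric
    return rows


def rms_log_difference(rows):
    """RMS difference between columns 4 and 5 (model vs. experiment)."""
    squared = [(r[3] - r[4]) ** 2 for r in rows]
    return math.sqrt(sum(squared) / len(squared))
```

For example, rms_log_difference(read_dat(open("best.dat").read())) gives one number per run that can be compared between runs.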

Practical advice

1) The better the data, the better the model.

All ab initio shape determination methods depend strongly on the quality of the SAXS data, so:

  • Repeat measurements to reduce the noise.
  • Measurements with different camera lengths are recommended to extend the maximum S vector, and therefore the resolution.
  • Subtract the buffer accurately. Make sure that the product I(S) x S^2 decays at higher angles and that I(S) x S^4 follows Porod's law.
  • The data must be normalized to 1 at S=0. The use of GNOM is recommended to obtain the value of I(0) from the SAXS profile.
  • Errors in the normalization and buffer subtraction can produce failures and strong deviations in the mass estimation.
  • SAXS at low angle is very sensitive to inhomogeneities in the sample. If a small part of this region is suspected to be affected by this problem, it may be removed without much loss of shape information. A difference between the experimental and model Rg values normally indicates errors in this part of the profile.
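
As a quick cross-check of I(0) and Rg at low angle, the Guinier approximation can be fitted in S units: ln I(S) = ln I(0) - (4 pi^2 Rg^2 / 3) S^2. The sketch below is an illustration only; guinier_fit is a hypothetical helper that is not part of DALAI-GA or GNOM, and the approximation is usually trusted only for roughly 2 pi S Rg < 1.3, so only the lowest-angle points should be passed in.

```python
import math


def guinier_fit(s_values, intensities):
    """Least-squares fit of ln I against S^2; returns (Rg, I0)."""
    x = [s * s for s in s_values]
    y = [math.log(i) for i in intensities]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0.0:
        raise ValueError("intensity does not decay with S; no Guinier region")
    # slope = -(4 pi^2 / 3) Rg^2  =>  Rg = sqrt(-3 slope) / (2 pi)
    rg = math.sqrt(-3.0 * slope) / (2.0 * math.pi)
    return rg, math.exp(intercept)
```

A fitted Rg that changes noticeably when the first few points are dropped is a hint that the lowest angles are affected by sample inhomogeneities.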

2) Reduce the computation time.

The first step of the search is to define the initial configurational search space. As an initial trial, we advise an ellipsoid with hexagonal packing of beads and dimensions at least 30 A larger than the maximal dimension of the particle, Dmax, as obtained from the SAXS profile (use GNOM to obtain Dmax). The size of the spheres must be compatible with the maximum scattering vector, but even with a good data set you should start with relatively large spheres (the number of spheres in the initial models should be between 20 and 40 for optimal results; the bead size of the initial configurational space can be estimated from the Mw in daltons using figure 2 of Chacon et al., J. Mol. Biol. 299, 1289-1302 (2000)). DALAI-GA will take the data up to this resolution and will use the rest when the model is refined. Run several initial trials with low values of NL/NC (10 100 200 500) and no more than 500 iterations of DALAI_GA. Look at the dimensions and the number of spheres of the resulting models to check the initial search space. Good performance can be obtained if the number of beads in the search space is less than ten times the number of beads in the resulting models.
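
Two of these rules of thumb are easy to check automatically after a trial run. The sketch below is a hypothetical helper, not part of DALAI_GA, and it makes one assumption that the text does not settle: the 30 A margin is applied to the longest ellipsoid dimension only.

```python
def check_search_space(ellipsoid_dims, dmax, n_space_beads, n_model_beads,
                       margin=30.0):
    """Return a list of warnings for the search-space rules of thumb."""
    warnings = []
    # Assumption: only the longest dimension must exceed Dmax by the margin.
    if max(ellipsoid_dims) < dmax + margin:
        warnings.append(
            f"longest ellipsoid dimension is less than Dmax + {margin:g} A"
        )
    # The search space should hold fewer than ten times the model's beads.
    if n_space_beads >= 10 * n_model_beads:
        warnings.append(
            "search space has ten or more times the beads of the model"
        )
    return warnings
```

An empty list means that neither rule is violated; it does not guarantee that the search space is adequate.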

3) Be sure of reproducibility.

Once you have a good initial search space, run DALAI-GA at least 10 times. The stochastic character of the method and the intrinsic degeneracy of the inverse scattering problem lead to different final models. In practice, all of them should correspond to a given shape and dimensions. Be patient enough to superimpose the models manually. Enantiomorphic/symmetric solutions may appear with small bead numbers.

Run DALAI_GA at least once with safe NL/NC parameters (500 1000 1000 2000).
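
Before superimposing the models, a quick numerical comparison of the runs can already reveal outliers. The sketch below computes the radius of gyration of each bead model from its bead centers and the spread of a quantity over the runs. The helper names are hypothetical, beads are treated as equally weighted points at their centers (the finite bead size is ignored), and reading the coordinates from the PDB files is left out.

```python
import math


def radius_of_gyration(centers):
    """Rg of equally weighted points given as (x, y, z) tuples."""
    n = len(centers)
    cx = sum(p[0] for p in centers) / n
    cy = sum(p[1] for p in centers) / n
    cz = sum(p[2] for p in centers) / n
    total = sum(
        (p[0] - cx) ** 2 + (p[1] - cy) ** 2 + (p[2] - cz) ** 2
        for p in centers
    )
    return math.sqrt(total / n)


def spread(values):
    """Mean and (population) standard deviation over the runs."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)
```

Applying spread to the Rg values and to the bead counts of the ten runs shows at a glance whether the models agree in overall dimensions; agreement in shape still has to be judged by superposition.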

4) Do not go too far (too small a bead size)

Remember that the bead size must be in agreement with the S range. So do not use spheres that are too small if the data do not have sufficient resolution; in this case the final models will diverge due to overfitting.