This chapter deals with operations that change the configuration of a system or describe how it changes in time. Minimization and molecular dynamics produce trajectories, whereas a normal mode calculation produces a normal mode object.
Minimization and dynamics algorithms produce sequences of configurations that are often stored for later analysis. In fact, they are often the most valuable result of a lengthy simulation run. To make sure that the use of trajectory files is not limited by machine compatibility, MMTK stores trajectories in netCDF files. These files contain binary data, minimizing disk space usage, but are freely interchangeable between different machines. In addition, there are a number of programs that can perform standard operations on arbitrary netCDF files, and which can be used directly on MMTK trajectory files. Finally, netCDF files are self-describing, i.e. contain most of the information needed to interpret their contents.
A trajectory object is created by Trajectory(object, name, mode, comment), where object is the object for which the trajectory is intended, name is the filename, and mode determines what can be done with the file: "r" allows only reading, "w" replaces the file by a new one that can be read and written, and "a" allows reading and appending new data, creating the file if it doesn't exist. The mode argument is optional and defaults to "r". The last argument is also optional and specifies a comment to be added to the trajectory file for identification. A trajectory is closed by calling the method close.
In most cases, trajectories are created for a universe object. If they are created for some other object (e.g. a single molecule or a collection), all its components must be parts of a single universe. The trajectory file will then contain a description of the original universe (i.e. the same geometry and force field) but with only the requested items in it. If among the requested items there are subparts of chemical objects (e.g. individual atoms of a molecule), the full chemical objects are stored, but with trajectory data only for the requested subparts.
When a trajectory is opened for reading, the first argument (the universe) can be None. In that case, MMTK creates a universe from the description contained in the trajectory file. This universe will contain the same objects as the one for which the trajectory file was created, but will not necessarily have all the properties of the original universe (the description contains only the names and types of the objects in the universe, not any modifications applied to them). The universe can be accessed via the attribute universe of the trajectory.
If the trajectory was created with partial data for some of the objects, reading data from it will set the data for the missing parts to "undefined". Analysis operations on such systems must be done very carefully. In most cases, the trajectory data will contain the atomic configurations, and in that case the "defined" atoms can be extracted with the method atomsWithDefinedPositions().
MMTK trajectory files can store various data: atomic positions, velocities, energies, energy gradients etc. Each trajectory-producing algorithm offers a set of quantities from which the user can choose what to put into the trajectory. Since a detailed selection would be too tedious, the data is divided into classes, e.g. the class "energy" stands for potential energy, kinetic energy, and whatever other energy-related quantities an algorithm produces.
Every trajectory file contains a history of its creation. The creation of the file is logged with time and date, as well as each operation that adds data to it with parameters and the time/date of start and end. This information, together with the comment and the number of atoms and steps contained in the file, can be obtained with the function trajectoryInfo(filename).
It is possible to read data from a trajectory file while it is being created. For efficiency, trajectory data is not written to the file at every time step, but only approximately every 15 minutes. Therefore the amount of data available for reading may be somewhat less than what has been produced already.
Minimizers and dynamics integrators accept various optional parameter specifications. All of them are selected by keywords, have reasonable default values, and can be specified when the integrator is created or when it is called.
The trajectory output specifications always have the same form: trajectory=specification. If specification is None (the default), no trajectory file is produced. Otherwise specification must be a five-element tuple. The first element is the number of the first step to be recorded. The second element is the number of the step before which the trajectory recording should be ended; None indicates writing up to the last step. The third element indicates the number of steps between two consecutive configurations to be written. The fourth element is the trajectory object. A string can be supplied instead, which is taken to be a filename, and a trajectory with that name is implicitly opened in mode "a" and closed at the end. The last element is another tuple consisting of strings which name the classes of data to be included. The classes are described below for each trajectory-generating operation.
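The first three elements of the specification behave like the arguments of a range. The following minimal sketch is not part of MMTK; the helper recorded_steps is hypothetical, and it assumes that steps are numbered from zero and that the second element is exclusive, as the wording "the step before which" suggests. Whether the configuration after the final step is also written is not modeled here.

```python
def recorded_steps(first, last, skip, total_steps):
    """Return the step numbers selected by (first, last, skip).

    Assumption: steps are numbered from 0 and `last` is exclusive.
    `last = None` means "up to the end of the run".
    """
    end = total_steps if last is None else min(last, total_steps)
    return list(range(first, end, skip))


# A specification that stores every 10th step of a 1000-step run.
spec = (0, None, 10, "insulin.nc", ("time", "energy", "configuration"))
first, last, skip, target, data_classes = spec
steps = recorded_steps(first, last, skip, total_steps=1000)
```

With this reading, a 1000-step run with a skip of 10 stores 100 configurations.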
It is also possible to have a protocol of a trajectory production printed to the screen or to a file with a parameter of the form log=specification. Printing specifications are the same as trajectory specifications, except that standard file objects are used instead of trajectory objects. For printing to the screen, use stdout. As a convenient shorthand, a protocol specification consisting of a single integer n can be used instead of (0, None, n, stdout, ("energy", )).
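The integer shorthand can be pictured as a simple expansion step. The sketch below is an illustration only; expand_log_spec is a hypothetical helper, not an MMTK function, and it uses sys.stdout for the stdout object mentioned in the text.

```python
import sys


def expand_log_spec(spec):
    """Expand the integer shorthand n into the full five-element tuple."""
    if spec is None:
        return None                      # no protocol output requested
    if isinstance(spec, int):
        return (0, None, spec, sys.stdout, ("energy",))
    return tuple(spec)                   # already a full specification


full = expand_log_spec(100)
```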
It is possible to specify a list of options that modify the basic algorithm. Such options are indicated by an optional parameter of the form options=[option1, option2, ...]. Each list element can either be just the name of the option, meaning that it is applied at every step, or a tuple containing the option and a frequency specification. The frequency specification can be one number (the number of steps between application of the option) or three numbers (first step, last step, frequency). The individual options are described in the individual sections below.
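The different forms of a list element can be reduced to one common schedule. The sketch below is hypothetical and not MMTK code: it assumes that the three-number form is written as a flat tuple (option, first, last, frequency) and that last is exclusive; only the two-element form (option, n) is confirmed by the examples in this chapter. Strings stand in for the option objects.

```python
def normalize_option(entry):
    """Return (option, first, last, skip) for one element of the options list.

    Assumed layouts:
      option                    -> applied at every step
      (option, n)               -> applied every n steps
      (option, first, last, n)  -> applied every n steps from first up to last
    """
    if not isinstance(entry, tuple):
        return (entry, 0, None, 1)
    if len(entry) == 2:
        return (entry[0], 0, None, entry[1])
    if len(entry) == 4:
        return entry
    raise ValueError("unsupported option specification: %r" % (entry,))


def applies_at(step, first, last, skip):
    """True if an option with this schedule is applied at the given step."""
    if step < first or (last is not None and step >= last):
        return False
    return (step - first) % skip == 0
```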
During the course of a minimization or molecular dynamics algorithm, the atoms move to different positions. It is possible to exclude specific atoms from this movement, i.e. to fix them at their initial positions. This has no influence whatsoever on energy or force calculations; the only effect is that the atoms' positions never change. Fixed atoms are specified by giving them an attribute fixed with a value of one. Atoms that do not have an attribute fixed, or one with a value of zero, move according to the selected algorithm.
MMTK has two energy minimizers using different algorithms: steepest descent and conjugate gradient. Steepest descent minimization is very inefficient if the goal is to find a local minimum of the potential energy. However, it has the advantage of always moving towards the minimum that is closest to the starting point and is therefore ideal for removing bad contacts in an unreasonably high-energy configuration. For finding local minima, the conjugate gradient algorithm should be used.
Minimizers accept three specific option parameters: steps=integer to specify the maximum number of steps (default is 100), step_size=distance to specify an initial step length used in the search for a minimum (default is 2 pm), and convergence=gradient to specify the gradient norm (more precisely the root-mean-square length) at which the minimization should stop (default is 0.01 kJ/mol/nm).
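To show how the three parameters interact, here is a minimal textbook steepest descent with an adaptive step length. It is a sketch under stated assumptions, not MMTK's implementation: the step adaptation factors (1.2 and 0.5) are arbitrary choices, each coordinate is treated as a one-dimensional "atom" when computing the root-mean-square gradient, and the defaults mirror the values quoted above in MMTK units (nm and kJ/mol).

```python
import math


def rms_gradient(gradient):
    """Root-mean-square length of the gradient components."""
    return math.sqrt(sum(g * g for g in gradient) / len(gradient))


def steepest_descent(energy, gradient, x, steps=100, step_size=0.002,
                     convergence=0.01):
    """Minimize energy(x) by moving along -gradient with an adaptive step.

    Returns (x, number_of_steps_taken).
    """
    e = energy(x)
    for n in range(steps):
        g = gradient(x)
        if rms_gradient(g) < convergence:
            return x, n                   # converged before using all steps
        length = math.sqrt(sum(gi * gi for gi in g))
        trial = [xi - step_size * gi / length for xi, gi in zip(x, g)]
        e_trial = energy(trial)
        if e_trial < e:
            x, e = trial, e_trial
            step_size *= 1.2              # accept and lengthen the step
        else:
            step_size *= 0.5              # reject and shorten the step
    return x, steps


# Toy problem: two coordinates in an isotropic harmonic well.
K = 100.0
energy = lambda x: 0.5 * K * sum(xi * xi for xi in x)
gradient = lambda x: [K * xi for xi in x]
x_min, n_steps = steepest_descent(energy, gradient, [1.0, -0.5], steps=1000)
```

The loop ends either when the gradient criterion is met or when the step limit is reached, whichever comes first.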
There are three classes of trajectory data: "energy" includes the potential energy and the norm of its gradient, "configuration" stands for the atomic positions, and "gradients" stands for the energy gradients at each atom position.
SteepestDescentMinimizer(universe) creates a minimizer that uses the steepest descent algorithm; ConjugateGradientMinimizer(universe) creates a minimizer based on the conjugate gradient algorithm.
The following example performs 100 steps of steepest descent minimization without producing any trajectory or printed output:
    from mmtk import *

    world = InfiniteUniverse(AmberForceField())
    world.protein = Protein('insulin')
    minimizer = SteepestDescentMinimizer(world)
    minimizer(steps = 100)
The integration of the classical equations of motion for an atomic system requires not only positions, but also velocities for all atoms. Usually the velocities are initialized to random values drawn from a normal distribution with a variance corresponding to a certain temperature. This is done by calling the method initializeVelocitiesToTemperature(temperature) on a universe. Note that the velocities are assigned atom by atom; no attempt is made to remove global translation or rotation of the total system or any part of the system.
During equilibration of a system, it is common to multiply all velocities by a common factor to restore the intended temperature. This can be done explicitly by calling the method scaleVelocitiesToTemperature(temperature) on a universe, or by an integration option as explained below.
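The two operations on velocities can be sketched in plain Python. This is an illustration of the underlying formulas, not MMTK's code: the function names are hypothetical, three degrees of freedom per atom are assumed (no constraints, no removal of global motion), and units follow the MMTK convention (amu, nm, ps, kJ/mol), in which 1 kJ/mol equals 1 amu nm^2/ps^2.

```python
import math
import random

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def random_velocities(masses, temperature, rng=random):
    """Draw one velocity vector per atom; each Cartesian component is
    normally distributed with variance k_B*T/m."""
    velocities = []
    for m in masses:
        sigma = math.sqrt(K_B * temperature / m)
        velocities.append(tuple(rng.gauss(0.0, sigma) for _ in range(3)))
    return velocities


def kinetic_temperature(masses, velocities):
    """Temperature from the kinetic energy, assuming 3 degrees of freedom
    per atom."""
    twice_kinetic = sum(m * sum(c * c for c in v)
                        for m, v in zip(masses, velocities))
    return twice_kinetic / (3 * len(masses) * K_B)


def scale_velocities(masses, velocities, target):
    """Multiply all velocities by one factor so that the kinetic
    temperature equals the target temperature."""
    factor = math.sqrt(target / kinetic_temperature(masses, velocities))
    return [tuple(factor * c for c in v) for v in velocities]


masses = [12.011] * 5000                      # 5000 carbon-like atoms
velocities = random_velocities(masses, 300.0, random.Random(0))
scaled = scale_velocities(masses, velocities, 300.0)
```

Because the velocities are random, the temperature right after initialization only approximates the requested value; scaling restores it exactly.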
A molecular dynamics integrator based on the "Velocity Verlet" algorithm is created by VelocityVerletIntegrator(universe). The specific option parameters are steps=integer to specify the number of steps (default 100) and delta_t=time to specify the time step (default 1 fs).
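For readers unfamiliar with the algorithm, the following sketch shows one common formulation of the Velocity Verlet scheme for a single particle in one dimension. It illustrates the method only and says nothing about how MMTK implements it; the function and its arguments are hypothetical.

```python
def velocity_verlet(force, mass, x, v, delta_t, steps):
    """Integrate one particle in one dimension; returns the final (x, v)."""
    f = force(x)
    for _ in range(steps):
        v_half = v + 0.5 * delta_t * f / mass   # half-step velocity update
        x = x + delta_t * v_half                # full-step position update
        f = force(x)                            # force at the new position
        v = v_half + 0.5 * delta_t * f / mass   # second half-step for velocity
    return x, v


# Harmonic oscillator with unit mass and force constant; exact x(t) = cos(t).
x, v = velocity_verlet(lambda x: -x, 1.0, 1.0, 0.0, delta_t=0.01, steps=1000)
```

The total energy 0.5*v*v + 0.5*x*x stays close to its initial value of 0.5, which is the practical reason for choosing a small time step.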
There are six classes of trajectory data: "energy" includes the potential energy and the kinetic energy, "time" stands for the time, "temperature" or "thermodynamic" stands for the temperature, "configuration" stands for the atomic positions, "velocities" stands for the atomic velocities, and "gradients" stands for the energy gradients at each atom position.
There are four options that can be used to modify the basic integration algorithm. TranslationRemover() specifies subtraction of the global translational momentum from the system, and RotationRemover() subtraction of the global angular momentum. VelocityScaler(temperature) multiplies all velocities by a global factor such that the temperature assumes the specified value; an optional second argument (defaulting to zero) indicates the temperature deviation allowed before scaling is applied. Heater(initial, final, gradient) also scales the velocities, but to a varying temperature defined by min(initial+time*gradient, final).
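The temperature logic behind the two scaling options is simple enough to write down directly. The helpers below are hypothetical and only restate the rules given in the text; they assume a heating ramp (positive gradient) and that scaling is triggered only when the deviation strictly exceeds the allowed window.

```python
def heater_temperature(initial, final, gradient, time):
    """Target temperature of the heating ramp at a given time."""
    return min(initial + time * gradient, final)


def needs_scaling(current, target, window=0.0):
    """True if the current temperature deviates from the target by more
    than the allowed window."""
    return abs(current - target) > window


# Ramp from 50 K to 300 K at 10 K per time unit.
t_early = heater_temperature(50.0, 300.0, 10.0, 5.0)
t_late = heater_temperature(50.0, 300.0, 10.0, 40.0)
```

Once the ramp reaches the final temperature, the target stays there for the rest of the run.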
The following example performs a 1000 step dynamics integration, storing every 10th step in a trajectory file and removing the total translation and rotation every 50th step:
    from sys import stdout
    from mmtk import *

    world = InfiniteUniverse(AmberForceField())
    world.protein = Protein('insulin')
    world.initializeVelocitiesToTemperature(300.*Units.K)
    integrator = VelocityVerletIntegrator(
        world, delta_t = 1.*Units.fs,
        log = (0, None, 100, stdout, ("energy", )),
        trajectory = (0, None, 10, "insulin.nc",
                      ("time", "energy", "configuration")),
        options = [(TranslationRemover(), 50),
                   (RotationRemover(), 50)])
    integrator(steps = 1000)
A snapshot generator allows writing the current system state to a trajectory. It works much like a zero-step minimization or dynamics run, i.e. it takes the same optional arguments for specifying the trajectory and protocol output. A snapshot generator is created by SnapshotGenerator(universe).
Normal modes are calculated by creating a normal mode object with NormalModes(universe). An optional temperature parameter, which defaults to 300 K, is used to generate an appropriate amplitude for the normal mode displacements. Note that this amplitude is totally unrealistic for lower-frequency modes due to the approximations involved in normal mode analysis; however, it is still useful for visualization. You can set the temperature to None to obtain unscaled normal modes.
A normal mode object is a sequence object, i.e. individual modes can be extracted by indexing, and standard for-loops can be used to iterate over all modes. A single mode is represented by a special vector variable object storing the temperature-scaled displacement vector and the frequency. The frequency is available via the attribute frequency.
Normal mode objects provide the method reduceToRange(first, last), which removes all modes outside the range specified. This is mainly useful when saving a normal mode object to a file, in order to reduce the file size.
The following example performs a minimization (printing the energy at every 50th step), calculates the normal modes, prints all the frequencies, and adds the displacements of the 7th mode (the first internal mode) to the configuration:
    from sys import stdout
    from mmtk import *

    world = InfiniteUniverse(AmberForceField())
    world.protein = Protein('insulin')
    minimizer = ConjugateGradientMinimizer(world, steps = 10000,
                                           convergence = 0.002,
                                           log=(0, None, 50, stdout, ("energy",)))
    minimizer()
    modes = NormalModes(world)
    for mode in modes:
        print mode.frequency
    world.setConfiguration(world.configuration()+modes[6])
A more general calculation method is offered by the class SubspaceNormalModes(universe, vectors), which calculates normal modes in the subspace spanned by a given set of displacement vectors. The argument vectors must be a sequence of displacement vectors (variable objects) that are not mass-weighted and need not be orthogonal or normalized. The first step in the calculation is the construction of an orthonormal basis for this subspace by singular value decomposition. Then the full second-derivative matrix is calculated and multiplied from both sides by the basis matrix. The resulting smaller matrix is then diagonalized, and the eigenvectors multiplied by the basis vectors to obtain atomic displacements again.
For large systems the memory requirements for the calculation of the full second-derivative matrix may be prohibitive. In this case the variant FiniteDifferenceSubspaceNormalModes(universe, vectors) can be used, which calculates the force constant matrix directly in the specified subspace by numerical (finite difference) differentiation. The offset for the finite-difference calculations can be specified by an optional third parameter; the default value is 0.0001 nm amu^(1/2).
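The idea of differentiating only along the subspace directions can be illustrated as follows. This is a sketch under assumptions, not MMTK's algorithm: it uses central differences (the scheme MMTK uses is not stated here), ignores mass weighting, and expects an orthonormal basis; the function name is hypothetical.

```python
import math


def reduced_force_constants(gradient, x, basis, offset=1.0e-4):
    """Approximate K[i][j] = b_i . (H b_j) by central differences of the
    gradient, where H is the second-derivative matrix.

    Only one pair of gradient evaluations per basis vector is needed;
    the full matrix H is never formed.
    """
    columns = []
    for b in basis:
        plus = gradient([xi + offset * bi for xi, bi in zip(x, b)])
        minus = gradient([xi - offset * bi for xi, bi in zip(x, b)])
        # Finite-difference estimate of the vector H b
        columns.append([(p - m) / (2.0 * offset) for p, m in zip(plus, minus)])
    return [[sum(bi * c for bi, c in zip(b, col)) for col in columns]
            for b in basis]


# Toy system: three uncoupled harmonic coordinates with different stiffness.
k = [1.0, 2.0, 4.0]
gradient = lambda x: [ki * xi for ki, xi in zip(k, x)]
s = math.sqrt(0.5)
basis = [[1.0, 0.0, 0.0], [0.0, s, s]]       # two orthonormal vectors
K_reduced = reduced_force_constants(gradient, [0.0, 0.0, 0.0], basis)
```

For this toy system the reduced matrix is diagonal, with the entries 1 and (2 + 4)/2 = 3; diagonalizing the small matrix would then yield the modes within the subspace.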
Another way to deal with large systems, supposing that the interactions are short-ranged and the second-derivative matrix is therefore sparse, is the variant SparseMatrixSubspaceNormalModes(universe, vectors). This variant calculates the full Cartesian second-derivative matrix stored in a sparse-matrix format, and then calculates the reduced matrix for the subspace by multiplication with the subspace basis.