Constructing a molecular system

The construction of a complete system for simulation or analysis involves some or all of the following operations:

MMTK offers a large range of functions to deal with these tasks.

These are the topics treated in this chapter:


Creating chemical objects

Chemical objects (atoms, molecules, complexes) are created from definitions in the database. Since these definitions contain most of the necessary information, the subsequent creation of the objects is a simple procedure.

All objects are created by their class name with the name of the definition file as first parameter. Additional optional parameters can be specified to modify the object being created. The following optional parameters can be used for all object types:

Some examples with additional explanations for specific types:

Proteins, peptide chains, and nucleotide chains

MMTK contains special support for working with proteins and nucleic acids. As described in the chapter on the database, proteins can be described by a special database definition file. However, sometimes it is useful to create protein objects directly in an application program.

Protein('insulin') creates a protein object for insulin from a database file. Protein(list_of_molecules) creates a protein directly from a list of molecules, which will normally be peptide chains. The process of creating a protein from peptide chains includes the automatic creation of disulfide bridges, both within chains and between chains.

Peptide chains are created from a sequence of residues by PeptideChain(list_of_residues). The residue list can be a PDBPeptideChain object (see the documentation of the PDB module), a list of three-letter residue names, or a string consisting of one-letter residue names. The optional parameter model (default is 'all') specifies the model used for the residues. The currently implemented models are

The optional parameter n_terminus=[0/1] (default is 1) specifies whether the chain starts with an N-terminus. The optional parameter c_terminus=[0/1] (default is 1) specifies whether the chain ends with a C-terminus.
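The two textual forms of a residue list mentioned above (three-letter names and a one-letter string) carry the same information. The following standalone sketch shows the correspondence for the twenty standard amino acids. It does not use MMTK; the helper name residue_names and the choice of upper-case names are assumptions made for this illustration only.

```python
# Standalone illustration, not MMTK code: convert a one-letter sequence
# string into the equivalent list of three-letter residue names.
ONE_TO_THREE = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}


def residue_names(sequence):
    """Return a list of three-letter names (hypothetical helper).

    A string is read as one-letter codes; any other sequence is taken
    to be a list of three-letter names already.
    """
    if isinstance(sequence, str):
        return [ONE_TO_THREE[code] for code in sequence.upper()]
    return [name.upper() for name in sequence]


# Both spellings describe the same tripeptide.
from_string = residue_names("AGS")
from_list = residue_names(["Ala", "Gly", "Ser"])
```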

The construction of nucleotide chains is very similar. The constructor is called NucleotideChain and the residue list can be either a PDBNucleotideChain object (see the documentation of the PDB module) or a list of two-letter residue names. The first letter of a residue name indicates the sugar type ('R' for ribose and 'D' for deoxyribose), and the second letter defines the base ('A', 'C', and 'G', plus 'T' for DNA and 'U' for RNA). The optional parameters defining the models are the same as for peptide chains, except that the model "calpha" does not exist. The parameters that describe the presence of termini are terminus_5 and terminus_3.
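The two-letter naming scheme can be made concrete with a small decoder. This is a standalone sketch, not part of MMTK: the function decode_residue is hypothetical, and the rule that 'T' only goes with 'D' and 'U' only with 'R' is one reading of the description above.

```python
# Standalone illustration, not MMTK code: decode two-letter nucleotide
# residue names into (sugar, base) pairs.
SUGARS = {"R": "ribose", "D": "deoxyribose"}
BASES = {"A": "adenine", "C": "cytosine", "G": "guanine",
         "T": "thymine", "U": "uracil"}


def decode_residue(name):
    """Return (sugar, base) for a name such as 'DA' or 'RU'."""
    if len(name) != 2:
        raise ValueError("expected a two-letter residue name: %r" % (name,))
    sugar_code, base_code = name[0].upper(), name[1].upper()
    if sugar_code not in SUGARS or base_code not in BASES:
        raise ValueError("unknown residue name: %r" % (name,))
    # Assumption: thymine occurs only in DNA, uracil only in RNA.
    if (sugar_code, base_code) in (("R", "T"), ("D", "U")):
        raise ValueError("base does not match sugar type: %r" % (name,))
    return SUGARS[sugar_code], BASES[base_code]


dna = ["DA", "DC", "DG", "DT"]
decoded = [decode_residue(name) for name in dna]
```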

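Most frequently proteins and nucleotide chains are created from a PDB file. The PDB files often contain solvent (water) as well, and perhaps some other molecules. MMTK provides two convenience functions for extracting all that information from PDB sequences: waterFromPDBSequence, which collects the water molecules, and unknownResiduesFromPDBSequence, which collects the atoms of everything else.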

The following example shows how all information from a PDB file can be used to build a corresponding system in MMTK:

# Read the PDB file, producing a PDB sequence object
sequence = PDBFile('something.pdb').readSequenceWithConfiguration()

# Create the objects for the peptide chains
chains = map(PeptideChain, sequence)

# Create the protein object
protein = Protein(chains)

# Create a collection of water molecules
water = waterFromPDBSequence(sequence)

# Create an atom collection corresponding to everything else
other = unknownResiduesFromPDBSequence(sequence)

Collections

Often it is useful to treat a collection of several objects as a single entity. A good example is a large number of solvent molecules surrounding a solute. MMTK has special collection objects for this purpose.

Collection() creates an empty collection. An optional argument specifies objects to be put into the collection. This argument can take several different forms: a chemical object, another collection, or a list of chemical objects and/or other collections.

It is also possible to add objects to a collection after it has been created. This is done by calling the method addObject(object) on the collection. The object to be added can be a chemical object, another collection, or a list of chemical objects and/or other collections.
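The three accepted argument forms (a single object, another collection, or a list mixing both) suggest that a collection stores a flat set of objects. The sketch below mimics that behaviour with a hypothetical stand-in class, TinyCollection, and uses strings in place of chemical objects. It is not the MMTK implementation, and the flattening of nested collections is an assumption.

```python
# Standalone illustration, not MMTK code: a container that accepts an
# object, another container, or a list of either, and stores them flat.
class TinyCollection:
    def __init__(self, objects=None):
        self.objects = []
        if objects is not None:
            self.addObject(objects)

    def addObject(self, obj):
        """Add one object, a collection, or a list of objects/collections."""
        if isinstance(obj, TinyCollection):
            for item in obj.objects:
                self.addObject(item)
        elif isinstance(obj, (list, tuple)):
            for item in obj:
                self.addObject(item)
        else:
            self.objects.append(obj)

    def __len__(self):
        return len(self.objects)

    def __getitem__(self, index):
        return self.objects[index]


# Strings stand in for chemical objects here.
solvent = TinyCollection(["water1", "water2"])
system = TinyCollection("solute")
system.addObject([solvent, "ion"])
```

After these calls, system holds the four objects in insertion order, regardless of how they were grouped when they were added.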


Force fields

MMTK has only one built-in force field, which is the Amber force field. It has two optional parameters, one for specifying how the van-der-Waals interactions are calculated, and an analogous one for electrostatic interactions. Both parameters default to None. An instance of the Amber force field is created by AmberForceField(vdw_options, electrostatic_options).

Each of the arguments can be a dictionary, a number, or None.

The dictionary must have an entry "method", whose value is a string that specifies the calculation method. Currently the permitted values are "direct" (explicit summation over all pairs, using the minimum-image convention in case of periodic universes), and "cutoff" (like "direct", but with a cutoff value specified by a dictionary entry "cutoff"). Note that a cutoff does not modify the interactions for short distances; it only sets the interactions for pairs with a distance above the cutoff to zero.
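The difference between "direct" and "cutoff" can be illustrated with a single pair interaction. The sketch below is standalone Python, not MMTK code: it uses a Lennard-Jones term in reduced units as a stand-in for a van der Waals interaction, and the names minimum_image, lennard_jones and pair_energy are hypothetical. It shows that the cutoff leaves the energy unchanged below the cutoff distance and sets it to zero above, and that the minimum-image convention is applied before the distance is tested.

```python
import math


def minimum_image(delta, box_length):
    """Wrap one component of a distance vector to the nearest image."""
    return delta - box_length * round(delta / box_length)


def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Plain Lennard-Jones energy in reduced units."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def pair_energy(p1, p2, cutoff=None, box=None):
    """Energy of one pair; box is a tuple of orthorhombic edge lengths."""
    delta = [b - a for a, b in zip(p1, p2)]
    if box is not None:
        delta = [minimum_image(d, length) for d, length in zip(delta, box)]
    r = math.sqrt(sum(d * d for d in delta))
    # The cutoff does not reshape the potential; it only drops distant pairs.
    if cutoff is not None and r > cutoff:
        return 0.0
    return lennard_jones(r)


origin = (0.0, 0.0, 0.0)
near = pair_energy(origin, (1.5, 0.0, 0.0), cutoff=2.0)
far = pair_energy(origin, (3.0, 0.0, 0.0), cutoff=2.0)
wrapped = pair_energy(origin, (3.5, 0.0, 0.0), cutoff=2.0, box=(4.0, 4.0, 4.0))
```

In the last call the nearest periodic image is only 0.5 away, so the pair is inside the cutoff even though the unwrapped distance is 3.5.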

For electrostatic interactions, there is the additional method "multipole", which uses the DPMTA library for calculating electrostatic interactions. Note that this method provides only energy and forces, but no second-derivative matrix. There are many optional dictionary entries for this method, all of which are set to reasonable default values. The entries are "spatial_decomposition_levels", "multipole_expansion_terms", "use_fft", "fft_blocking_factor", "macroscopic_expansion_terms", and "multipole_acceptance". For the precise meaning of these options, refer to the DPMTA manual.

If a number is passed instead of a dictionary, it is interpreted as a cutoff value and the method is set to "cutoff". A value of None indicates the default method, which is "direct" for van-der-Waals interactions and for electrostatic interactions in non-periodic universes, and "multipole" for electrostatics in periodic universes.
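The rules for None, numbers, and dictionaries can be summarized as a small decision function. The following is a standalone sketch of those rules as described in this section, not MMTK's own code; the function resolve_options and its interaction and periodic arguments are hypothetical.

```python
# Standalone illustration, not MMTK code: normalize a force field option
# (None, a number, or a dictionary) into an options dictionary.
def resolve_options(option, interaction, periodic):
    """interaction is 'vdw' or 'electrostatic'; periodic is a bool."""
    if option is None:
        # Defaults: multipole only for electrostatics in periodic universes.
        if interaction == "electrostatic" and periodic:
            return {"method": "multipole"}
        return {"method": "direct"}
    if isinstance(option, dict):
        if "method" not in option:
            raise ValueError('the dictionary needs a "method" entry')
        allowed = ["direct", "cutoff"]
        if interaction == "electrostatic":
            allowed.append("multipole")
        if option["method"] not in allowed:
            raise ValueError("method not available: %r" % (option["method"],))
        if option["method"] == "cutoff" and "cutoff" not in option:
            raise ValueError('method "cutoff" needs a "cutoff" entry')
        return dict(option)
    if isinstance(option, (int, float)) and not isinstance(option, bool):
        # A bare number is shorthand for a cutoff value.
        return {"method": "cutoff", "cutoff": option}
    raise TypeError("expected None, a number, or a dictionary")


vdw = resolve_options(1.2, "vdw", periodic=True)
electrostatic = resolve_options(None, "electrostatic", periodic=True)
```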


Creating universes

A universe describes a complete molecular system consisting of any number of chemical objects and a specification of their interactions (i.e. a force field) and surroundings: boundary conditions, external fields, thermostats, etc. Universes are created empty; the chemical objects are then created one by one and added to the universe.

InfiniteUniverse(force_field) creates an infinite, i.e. unbounded, universe. OrthorhombicPeriodicUniverse((a, b, c), force_field) creates an orthorhombic periodic universe with lattice constants a, b, and c; the special case of a cubic elementary cell is created by CubicPeriodicUniverse(a, force_field). An optional parameter name=string specifies a name for all types of universe.

Two types of objects can be added to a universe: chemical objects and collections. There are two ways to add an object to a universe: calling the method addObject(object) will simply add the object, whereas assigning the object to an attribute of a universe (for example universe.protein = Protein('insulin')) also creates an attribute by which the added object can be referred to.

It is also possible to remove objects from a universe by calling the method removeObject(object). The object to be removed can be a chemical object (which must be in the universe, otherwise an exception will be raised) or a list or collection, in which case all its elements will be removed.


Referring to objects and parts of objects

Most MMTK objects (in fact all except for atoms) have a hierarchical structure of parts of which they consist. For many operations it is necessary to access specific parts in this hierarchy.

In most cases, parts are attributes with a specific name. For example, the oxygen atom in every water molecule is an attribute with the name "O". Therefore if w refers to a water molecule, then w.O refers to its oxygen atom. For a more complicated example, if m refers to a molecule that has a methyl group called "M1", then m.M1.C refers to the carbon atom of that methyl group. The names of attributes are defined in the database.

Some objects consist of parts that need not have unique names, for example the elements of a collection, the residues in a peptide chain, or the chains in a protein. Such parts are accessed by indices; the objects that contain them are Python sequence types. This means that the following operations are possible:

Peptide and nucleotide chains also allow the operation of slicing: if p refers to a peptide chain, then p[1:-1] is a subchain extending from the second to the next-to-last residue.
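Because these containers behave like Python sequences, the usual zero-based indexing and slicing rules apply. The sketch below uses a plain list of residue names as a stand-in for a chain, so it runs without MMTK. With a real chain the slice would be a subchain object rather than a list, and the elements would be residue objects rather than strings.

```python
# Standalone illustration: a plain list stands in for a peptide chain.
chain = ["ALA", "GLY", "SER", "VAL", "LEU", "THR"]

n_residues = len(chain)   # number of residues
first = chain[0]          # indices start at zero
fifth = chain[4]          # the fifth residue has index 4
last = chain[-1]          # negative indices count from the end
inner = chain[1:-1]       # second to next-to-last residue

# Iteration visits the parts in order.
visited = [residue for residue in chain]
```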

The structure of peptide and nucleotide chains

Since peptide and nucleotide chains are not constructed from an explicit definition file in the database, it is not evident where their hierarchical structure comes from. But it is only the top-level structure that is treated in a special way. The constituents of peptide and nucleotide chains, residues, are normal group objects. The definition files for these group objects are in the MMTK standard database and can be freely inspected and even modified or overridden by an entry in a database that is listed earlier in MMTKPATH.

Peptide chains are made up of amino acid residues, each of which is a group consisting of two other groups, one being called "peptide" and the other "sidechain". The first group contains the peptide group and the Cα and Hα atoms; everything else is contained in the sidechain. The Cα atom of the fifth residue of peptide chain p is therefore referred to as p[4].peptide.C_alpha.

Nucleotide chains are made up of nucleotide residues, each of which is a group consisting of two or three other groups. One group is called "sugar" and is either a ribose or a deoxyribose group; the second one is called "base" and is one of the five standard bases. All but the first residue in a nucleotide chain also have a subgroup called "phosphate" describing the link between residues.


Analyzing and modifying coordinates

When a molecule is created, there are two ways to influence the coordinates of its atoms: the relative positions of the atoms can be chosen using the configuration parameter from a choice of configurations in the molecule's database definition, and the absolute position of the center of mass can be set with the position parameter. More coordinate operations can be applied after the molecule's creation.

All the operations below can be applied to all objects for which they make sense, i.e. to atoms and objects consisting of atoms. This includes not only groups, molecules, etc., but also universes and collections. All operations are methods to be called on an object.

The first group of operations calculates quantities that depend on the atomic coordinates. They all take as an optional argument the universe configuration (see below) for which they are to be calculated; the default is the current configuration.

The second group of operations is used to move the atoms of an object. They always act on the current configuration.

The last group of operations compares two configurations. The first one must be given explicitly, the second one is optional and defaults to the current configuration (see below).

Configurations

A configuration object specifies all atomic coordinates of a universe. Every universe has a current configuration, which is what all operations act on by default. It is also the configuration that is updated by minimizations, molecular dynamics, etc. The current configuration can be obtained by calling the method configuration(). Configuration objects can be indexed with an atom object to obtain that atom's position.

There are two ways to create configuration objects: by making a copy of the current configuration (with copy(universe.configuration())), or by reading a configuration from a trajectory file.

Transformations

Transformation objects specify a general displacement consisting of a rotation around the origin of the coordinate system followed by a translation. Transformation objects corresponding to pure translations can be created with Translation(displacement); transformation objects describing pure rotations with Rotation(axis, angle) or Rotation(rotation_matrix). Multiplication of transformation objects returns a composite transformation.

The translational component of any transformation can be obtained by calling the method translation(); the rotational component is obtained analogously with rotation(). The displacement vector for a pure translation can be extracted with the method displacement(); a tuple of axis and angle can be extracted from a pure rotation by calling axisAndAngle().
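The idea of a transformation as "rotation around the origin, then translation", and of multiplication as composition, can be sketched in a few lines of standalone Python. This is not the MMTK implementation: the class RigidTransform and the helpers make_translation and make_rotation_z are hypothetical, and the convention that the right-hand factor of a product is applied first is an assumption of this sketch.

```python
import math


class RigidTransform:
    """Maps a point x to R x + t (rotation about the origin, then shift)."""

    def __init__(self, rotation, translation):
        self.rotation = rotation        # 3x3 matrix as nested lists
        self.translation = translation  # list of three numbers

    def __call__(self, point):
        rotated = [sum(self.rotation[i][j] * point[j] for j in range(3))
                   for i in range(3)]
        return [rotated[i] + self.translation[i] for i in range(3)]

    def __mul__(self, other):
        # Composite transformation: (self * other)(x) == self(other(x)).
        rot = [[sum(self.rotation[i][k] * other.rotation[k][j]
                    for k in range(3)) for j in range(3)] for i in range(3)]
        return RigidTransform(rot, self(other.translation))


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def make_translation(displacement):
    return RigidTransform(IDENTITY, list(displacement))


def make_rotation_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return RigidTransform([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]],
                          [0.0, 0.0, 0.0])


shift = make_translation([1.0, 0.0, 0.0])
quarter_turn = make_rotation_z(math.pi / 2)

# The order of the factors matters: rotations and translations do not commute.
rotate_then_shift = (shift * quarter_turn)([1.0, 0.0, 0.0])
shift_then_rotate = (quarter_turn * shift)([1.0, 0.0, 0.0])
```

Up to rounding, rotate_then_shift is the point (1, 1, 0) and shift_then_rotate is (0, 2, 0), which shows why the order of multiplication has to be chosen deliberately.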

