Blog

The Blog is a collection of News, Blog posts, technical tips and tricks and other writings from my sphere of interest.
- Esben Jannik Bjerrum

May

Scripting Molecular Mechanics Calculations using Tinker and sdf2xyz2sdf

Molecular Forcefield Energy Terms

Molecular Forcefield Energy Terms

Molecular mechanics treats molecules as ball and spring systems using classical mechanical physics. Molecules are modeled as ”balls on springs”, and the total conformational energy are described as a sum of spring constants, torsional terms, vdWalls interaction, Couloumbic interactions etc. This collection of terms are known as a force field, and enables fast conformational searching and minimization of molecular conformations, enabling scientist to get reasonable conformations of molecules. http://en.wikipedia.org/wiki/Force_field_(chemistry)

This is useful in e.g. Drug Discovery medicinal chemistry projects, for evaluating conformations of not yet synthesized and biological tested compounds, to see if their prefered conformation fits the binding conformation of e.g. known ligands, 3D QSAR models or binding sites.

In comparison with quantum mechanical calculations, which models the electrons around the nuclei of the molecule, molecular are much faster, albeit careful parameterization of the constants in the force field is necessary, to obtain accurate and realistic results. In a chemical forcefield a carbon atom are not just a carbon atom, but there exists atom types e.g. C, double bonded C, triple bonded C, aromatic C etc. The atom types are then assigned different parameters reflecting the in reality observed differences in bond lengths, angles, etc.

For biomolecules there exists a large collection of force fields and programs for doing e.g. molecular dynamics calculations, but for smaller molecules, the options are more limited. Probably because more ”interesting” atom combinations are used in small drug-like molecules, than the more limited alphabet of the natural amino-acids in proteins. More complicated atom type schemes are thus necessary to account for a larger area of chemical space.

During my Ph.D studies I used http://www.schrodinger.com/MacroModel/ available as part of Schrődingers suite of programs for my molecular mechanics needs. This is however, not freely available which can limit its usefulness for embedding into own scripts and general usage by students and chemists.

http://dasher.wustl.edu/tinker/ is a freely available, open source collection of molecular mechanics programs, that supports a variety of force fields, including MMFF, specifically designed for small drug-like organic molecules. However, the collection of scripts do not contain a feature to automatically assign the atom types from a connectivity table as e.g. found in a Mol- or SDF file, and the default way seem to be to edit the atom types by hand in the XYZ file which is the tinker-formatted file containing the Cartesian coordinates of the atoms, the connectivity and the atom types. This seriously limits the usefulness if large collections of thousand of molecules needs to be treated by scripting the calculations.

Luckily, Paolo Tosco, wrote such a tool and made it available for the public. http://sourceforge.net/projects/sdf2xyz2sdf/, which in combination with Tinker gives access to doing automated assignment of atom types for the MMFF force field and scripted calculations on large collection of compounds.

Binary Tinker packages for Linux are available on the Tinker homepage, or it can be compiled from source, and the sdf2xyz2sdf program has to downloaded directly from the files on sourceforge as the download button points directly to the openbabel_for_open3Dtools package, which do not include the sdf2xyz2sdf tool. Compiling from source was simple

Untar the tgz 
$ ./bootstrap
$ ./configure
$ make
# make install

For making a small example, open-babel was used to generate an SDF file containing glycerol (http://openbabel.org).

obabel -:"C(O)C(O)CO" -O Glycerol.sdf --gen3D --title Glycerolsdf2tinkerxyz < Glycerol.sdf

obabel takes the direct input as smiles and outputs an SDF file with the molecule title Glycerol. This is used as input in the subsequent sdf2tinkerxyz command. It can be done with the one-liner:

obabel -:"C(O)C(O)CO" -osdf --gen3D --title Glycerol | sdf2tinkerxyz

This produces the Glycerol.xyz and Glycerol.key input files for the tinker programs. The Glycerol.XYZ file contain the coordinates, atom types and connectivity and the key file contain the charges of the atoms.

The tinker programs are in their default interactive, e.g. by executing analyze, the program will ask for the xyz-file, the force field parameter file and the options for the calculation. An example session could look like this.

$ analyze 
 
 ############################################################################## 
 ############################################################################## 
 ##                                                                          ## 
 ##             TINKER  ---  Software Tools for Molecular Design             ## 
 ##                                                                          ## 
 ##                        Version 6.0   October 2011                        ## 
 ##                                                                          ## 
 ##               Copyright (c)  Jay William Ponder  1990-2011               ## 
 ##                            All Rights Reserved                           ## 
 ##                                                                          ## 
 ############################################################################## 
 ############################################################################## 
 
 
 Enter Cartesian Coordinate File Name :  Glycerol.xyz 
 
 Enter Potential Parameter File Name :  /home/esben/open3dtools/share/tinker/mmff.prm 
 
 Additional Partial Charges for Specific Atoms : 
 
      Atom              Charge 
 
         1              0.2800 
         2             -0.6800 
         3              0.2800 
         4             -0.6800 
         5              0.2800 
         6             -0.6800 
         7              0.0000 
         8              0.0000 
         9              0.4000 
        10              0.0000 
        11              0.4000 
        12              0.0000 
        13              0.0000 
        14              0.4000 
 
 The TINKER Analysis Facility can Provide : 
 
 General System and Force Field Information [G] 
 Force Field Parameters for Interactions [P] 
 Total Potential Energy and its Components [E] 
 Energy Breakdown over Each of the Atoms [A] 
 List of the Large Individual Interactions [L] 
 Details for All Individual Interactions [D] 
 Electrostatic, Inertial & Virial Properties [M] 
 
 Enter the Desired Analysis Types [G,P,E,A,L,D,M] :  G 
 
 Overall System Contents : 
 
 Number of Atoms                               14 
 Number of Molecules                            1 
 Total System Mass                        92.0490 
 
 Force Field Name :                        MMFF94 
 
 VDW Function                       BUFFERED-14-7 
 Size Descriptor                            R-MIN 
 Size Unit Type                            RADIUS 
 Size Combining Rule                       MMFF94 
 Well Depth Rule                           MMFF94 
 
 Electrostatics                    PARTIAL CHARGE

 

In a way, this is nice to get to know the program and its options, but for scripting, options can be specified by adding keywords to the .key file.

# Solvation Model 
SOLVATE GBSA 


#Force Field Parameters 
PARAMETERS /home/esben/open3dtools/share/tinker/mmffs.prm 

The analyze program can then be run non-interactively e.g.

analyze Glycerol.xyz -k Glycerol.key E > Glycerol.log

The resulting energy calculations are output in the log-file.

Total Potential Energy : 23.2894 Kcal/mole 



Energy Component Breakdown : Kcal/mole Interactions 



Bond Stretching 0.4992 13 
Angle Bending 0.5269 21 
Stretch Bend 0.0470 21 
Torsional Angle 3.3721 27 
Van der Waals 4.2014 57 
Charge-Charge 27.1685 20 
Implicit Solvation -12.5256 59 

 

For scripting, regular expressions and grep come in handy for extracting the energy in Kcal/mol.

grep -o -P 'Total Potential Energy :\s+\K([0-9]+.[0-9]+\w)' Glycerol.log

Everything is thus in place to script the calculations as all steps can be done without manual intervention.

 

Best Regards and Stay Tuned

Esben Jannik Bjerrum

www.wildcardconsulting.dk


Comment

  1. Jimmy
    November 23, 2016 at 08:35 Reply

    I need to to thank you for this very good read!! I certainly loved every little
    bit of it. I’ve got you saved as a favorite to look at new stuff you

Leave a Reply

Your email address will not be published. Required fields are marked *