The Good, the Bad and the Ugly RDKit molecules

Esben Jannik Bjerrum/ October 28, 2015/ Blog, Cheminformatics, RDkit/ 3 comments

RDKit is a nice cheminformatics toolkit with Python bindings. Over the years, Wildcard Pharmaceutical Consulting has used it extensively in several Python programming projects. However, RDKit strives to ensure that the molecules it creates make chemical sense, which can be a show stopper when working with large SD files from various sources. OpenBabel is less picky about molecules and can be used for visualizing and troubleshooting “broken” molecules.
But it is also possible to load “unsanitizable” molecules into RDKit molecule objects and then visualise them, as the following Python prompt example shows. The molecule is created from a senseless SMILES string, but it could just as well have come from a large SD file that needed to be automatically curated and standardized before developing a QSAR model or loading into a chemical database.

Python 2.7.3 (default, Jun 22 2015, 19:33:41)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from rdkit import Chem
>>> from rdkit.Chem import Draw
>>> mol = Chem.MolFromSmiles("c1ccccc1(C)(C)")
[10:39:29] Can't kekulize mol

RDKit doesn’t think a pentavalent carbon is a sensible idea and can’t kekulize the molecule. Me neither, but users sometimes draw one methyl too many on an aromatic ring by accident, so this does turn up in the wild. It is a complete show stopper for a Python script, because MolFromSmiles returns None instead of a molecule and every later call on the result fails. But we can ask RDKit NOT to sanitize the molecule.

>>> mol = Chem.MolFromSmiles("c1ccccc1(C)(C)", sanitize=False)
>>> Draw.MolToFile(mol, "BadMolecule.png", kekulize=False)

And there it is, our “Bad” molecule:
[Image: BadMolecule.png]
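For the automatic curation case mentioned earlier, the same trick can be wrapped in a small loop. The following is only a sketch under my own assumptions: the function name `triage` and the tuple layout of the `bad` list are made up for illustration, and SMILES strings stand in for records from an SD file. It parses every input without sanitization, then attempts sanitization separately with `catchErrors=True`, which makes `Chem.SanitizeMol` return the failing operation instead of raising an exception.

```python
from rdkit import Chem


def triage(smiles_list):
    """Split SMILES into sanitizable molecules and ones set aside for inspection.

    Hypothetical helper for illustration; not part of RDKit.
    """
    good, bad = [], []
    for smiles in smiles_list:
        mol = Chem.MolFromSmiles(smiles, sanitize=False)
        if mol is None:
            # The string is not even valid SMILES syntax; nothing to draw.
            bad.append((smiles, None, "unparseable"))
            continue
        # With catchErrors=True, SanitizeMol returns the operation that
        # failed (or SANITIZE_NONE on success) instead of raising.
        failed = Chem.SanitizeMol(mol, catchErrors=True)
        if failed == Chem.SanitizeFlags.SANITIZE_NONE:
            good.append(mol)
        else:
            # Re-parse so the stored molecule is untouched by the failed attempt.
            raw = Chem.MolFromSmiles(smiles, sanitize=False)
            bad.append((smiles, raw, str(failed)))
    return good, bad


good, bad = triage(["c1ccccc1C", "c1ccccc1(C)(C)"])
print(len(good), "sanitized,", len(bad), "set aside for inspection")
```

The unsanitized molecules in `bad` can then be drawn with `kekulize=False` as shown above. Which sanitization step is reported as failing may differ between RDKit versions, so the sketch only stores it as a string rather than relying on a particular value.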
 
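More advanced work will sometimes require the molecule to have an updated property cache. Passing strict=False computes the valence information without raising an error for atoms such as our pentavalent carbon.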

>>> mol.UpdatePropertyCache(strict=False)
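As an illustration of what the updated property cache makes possible, here is a rough sketch that hunts down the offending atom. It sums the bond orders around each atom (an aromatic bond counts as 1.5), adds the hydrogen count that becomes available after `UpdatePropertyCache(strict=False)`, and compares the total with the default valence from RDKit's periodic table. The list name `overvalent` is my own, and the comparison is a crude check I am assuming is good enough for simple organic atoms; it is not RDKit's own valence model.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("c1ccccc1(C)(C)", sanitize=False)
# Compute valence information without raising on chemically odd atoms.
mol.UpdatePropertyCache(strict=False)

periodic_table = Chem.GetPeriodicTable()
overvalent = []  # hypothetical name: indices of atoms with too many bonds
for atom in mol.GetAtoms():
    # Sum of bond orders; each aromatic bond contributes 1.5.
    bond_order_sum = sum(bond.GetBondTypeAsDouble() for bond in atom.GetBonds())
    # Hydrogen counts are available now that the property cache is updated.
    total = bond_order_sum + atom.GetTotalNumHs()
    allowed = periodic_table.GetDefaultValence(atom.GetAtomicNum())
    if total > allowed:
        overvalent.append(atom.GetIdx())

print("Atoms with too many bonds:", overvalent)
```

For the molecule above this should point at the ring carbon carrying the two methyl groups, which is the atom a curation script would want to report or repair.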

