Blog

The blog is a collection of news, blog posts, technical tips and tricks, and other writings from my sphere of interest.
- Esben Jannik Bjerrum

Oct

The Good, the Bad and the Ugly RDKit molecules


RDKit is a nice cheminformatics toolkit with Python bindings. Wildcard Pharmaceutical Consulting has used it extensively over the years for a number of Python programming projects. However, RDKit strives to ensure that the molecules it creates make chemical sense, which can be a show stopper when working with large SD files from various sources. OpenBabel is less picky about molecules and can be used for visualizing and troubleshooting “broken” molecules.

But it is possible to load “unsanitizable” molecules into RDKit molecule objects and then visualise them, as the following Python prompt example shows. The molecule is created from a nonsensical SMILES string, but it could just as well have come from a large SD file that needed to be automatically curated and standardized before developing a QSAR model or loading into a chemical database.

Python 2.7.3 (default, Jun 22 2015, 19:33:41)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from rdkit import Chem
>>> from rdkit.Chem import Draw
>>> mol = Chem.MolFromSmiles("c1ccccc1(C)(C)")
[10:39:29] Can't kekulize mol

RDKit doesn’t think pentavalent carbon is a sensible idea and can’t kekulize the molecule. Neither do I, but users sometimes accidentally draw one methyl too many on an aromatic ring, so this is encountered in the wild. It is a complete show stopper for a Python script, because MolFromSmiles returns None instead of a molecule object. But we can ask RDKit NOT to sanitize the molecule.

>>> mol = Chem.MolFromSmiles("c1ccccc1(C)(C)", sanitize=False)
>>> Draw.MolToFile(mol, "BadMolecule.png", kekulize=False)

And there it is, our “Bad” molecule…

[Image: BadMolecule.png, the drawing of the unsanitized molecule]
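
In a script, the same fallback can be wrapped in a small helper so that one bad record does not stop the whole run. The following is only a sketch: the function name load_lenient and its return convention are made up for illustration, and it assumes an RDKit release recent enough to provide Chem.DetectChemistryProblems (2019.09 or later, as far as I know).

```python
from rdkit import Chem


def load_lenient(smiles):
    """Parse a SMILES string, falling back to an unsanitized molecule.

    Hypothetical helper. Returns (mol, problems): `problems` is an empty
    list when normal parsing worked, and `mol` is None only when the
    string cannot be parsed as SMILES at all.
    """
    mol = Chem.MolFromSmiles(smiles)  # None if parsing or sanitization fails
    if mol is not None:
        return mol, []

    # Second attempt: skip sanitization, as in the prompt example above.
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    if mol is None:
        return None, ["unparseable SMILES"]  # syntax error, not chemistry

    # Collect the chemistry problems as messages instead of raising.
    problems = [p.Message() for p in Chem.DetectChemistryProblems(mol)]
    return mol, problems


bad_mol, bad_problems = load_lenient("c1ccccc1(C)(C)")
good_mol, good_problems = load_lenient("Cc1ccccc1")
for message in bad_problems:
    print(message)
```

Molecules that come back with a non-empty problem list can then be set aside for drawing and manual inspection, while the clean ones continue through the pipeline.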

 

More advanced work will sometimes require the molecule to have an updated property cache, which can be computed without strict valence checking:

>>> mol.UpdatePropertyCache(strict=False)
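
As a rough illustration of what the updated cache is for: valence-dependent queries such as hydrogen counts may raise an error on a freshly loaded unsanitized molecule, but should work once the cache has been filled in non-strict mode. The sketch below reuses the senseless SMILES string from above and is not a full curation recipe.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("c1ccccc1(C)(C)", sanitize=False)

# strict=False computes valences without raising on the over-valent carbon.
mol.UpdatePropertyCache(strict=False)

# Number of explicit neighbours per atom (does not depend on the cache).
degrees = [atom.GetDegree() for atom in mol.GetAtoms()]
# Hydrogen counts rely on the valence information computed above.
h_counts = [atom.GetTotalNumHs() for atom in mol.GetAtoms()]

for atom, degree, n_h in zip(mol.GetAtoms(), degrees, h_counts):
    print(atom.GetIdx(), atom.GetSymbol(), degree, n_h)
```

The ring carbon with four neighbours (two aromatic ring bonds plus two methyl groups) is the one RDKit objects to.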

