RDKit

Jun

One of the more popular blog posts, based on monthly visitors, is the old Create a Simple Object Oriented GUIDE GUI in MatLAB, but since I don’t program in MATLAB at the moment, I thought it would be nice to write an update about how this could be done ...

Mar

The process of expanding an otherwise limited dataset in order to train a neural network more efficiently is known as data augmentation. For images, a variety of techniques have been used, such as flipping, rotation, sub-segmenting, cropping and zooming. The mirror image of a cat is ...
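The image techniques named in this excerpt can be sketched in a few lines of NumPy. This is only a minimal illustration, not code from the post: the `augment` function, the variant names and the tiny 4x4 test "image" are all made up for the example.

```python
import numpy as np

def augment(image):
    """Return simple variants of a 2D (H, W) or 3D (H, W, C) image array."""
    h, w = image.shape[:2]
    return {
        "original": image,
        "flip_lr": np.fliplr(image),   # mirror image (left-right flip)
        "flip_ud": np.flipud(image),   # upside-down flip
        "rot90": np.rot90(image),      # 90 degree counter-clockwise rotation
        # crop the central region, dropping a quarter of the size on each side
        "center_crop": image[h // 4 : h - h // 4, w // 4 : w - w // 4],
    }

# Hypothetical 4x4 single-channel "image" used only for illustration.
image = np.arange(16).reshape(4, 4)
variants = augment(image)

for name, variant in variants.items():
    print(name, variant.shape)
```

Each variant would normally keep the label of the original image, so one labelled picture turns into several training examples.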

Jan

I found some interesting toxicology datasets from the Tox21 challenge and wanted to see if it was possible to build a toxicology predictor using a deep neural network. I don't know how many layers a neural network actually needs to be called "deep", but it's a buzzword, so ...

Dec

As covered before, chemical space is huge. So it would be nice if this multidimensional molecular space could be reduced and visualized to get an idea of how query molecules relate to one another. This small tutorial shows a simple example of how this could be done with the PCA decomposition from scikit-learn and molecular fingerprints calculated with ...
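A rough sketch of the idea could look like the following. It assumes that the fingerprints are Morgan fingerprints computed with RDKit, which the excerpt does not confirm; the SMILES strings, the `fingerprint_array` helper, the radius and the bit length are likewise assumptions chosen for illustration.

```python
import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem
from sklearn.decomposition import PCA

# Hypothetical example molecules as SMILES strings (not from the original post).
smiles = ["CCO", "CCCO", "CCCCO", "c1ccccc1", "Cc1ccccc1", "CC(=O)O"]

def fingerprint_array(smi, radius=2, n_bits=2048):
    """Morgan fingerprint of one SMILES string as a 0/1 NumPy vector."""
    mol = Chem.MolFromSmiles(smi)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)
    arr = np.zeros((0,), dtype=np.int8)
    DataStructs.ConvertToNumpyArray(fp, arr)  # resizes arr and fills in the bits
    return arr

# One row per molecule, one column per fingerprint bit.
X = np.array([fingerprint_array(s) for s in smiles])

# Project the high-dimensional fingerprints onto the first two principal components.
pca = PCA(n_components=2)
coords = pca.fit_transform(X)

for smi, (pc1, pc2) in zip(smiles, coords):
    print(f"{smi:12s} PC1={pc1:7.3f} PC2={pc2:7.3f}")
```

The two columns of `coords` can then be drawn as a scatter plot, where molecules with similar fingerprints would be expected to land close to each other.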

Dec

Neural networks are interesting models underlying many of the newest AI applications and algorithms. Recent advances in training algorithms and GPU-enabled code, together with publicly available, highly efficient libraries such as Google's TensorFlow or Theano, make them highly interesting for modelling molecular data. Here I explore the high-level neural network ...

Nov

Neural networks are interesting algorithms, but sometimes also a bit spooky. In this blog post I explore the possibilities for teaching neural networks to generate completely novel drug-like molecules. I have experimented for some time with recurrent neural networks with the LSTM architecture. In short, recurrent neural networks differ from more traditional feed-forward neural networks because they do ...

Oct

RDKit UGM 2016

I'm looking forward to attending the RDKit user group meeting for the first time, from 26-28 October 2016 in Basel, Switzerland. RDKit is an open source chemoinformatics toolkit written in C++ with Python bindings and extensions. Additionally, it has a database cartridge, which makes it quite useful for handling chemical information and storage ...

Apr

While working with chemical databases and importing molecules, I have encountered numerous problems with the way chemical structures are drawn. Most often the problem arises because creative users in the past had trouble registering a compound the way they wanted. Sometimes the used ...

Mar

Toxic compounds are most often something we try to avoid when designing novel pharmaceutical compounds, so it would be nice to get a prediction of whether a compound is toxic even before resources are used to synthesize it. But what if it comes back as predicted ...

Feb

The chemoinformatics package RDKit has its strength in handling small organic molecules. These molecules are characterized by a large diversity in chemical structures. A description of the exact way the atoms bond together is necessary to understand what molecule it is. Biological macromolecules are often built from repeating sequences of standard building blocks, such as amino acids or ...
