Cheminformatics

Aug

Excel is widely used in businesses all over the world, and its flexibility makes it suitable for many diverse tasks. I’ve built a fair number of Excel templates over the years to solve an array of very different tasks. I’ve also ...

Jun

One of the more popular blog posts, based on monthly visitors, is the old "Create a Simple Object Oriented GUIDE GUI in MatLAB", but since I don’t program in MATLAB at the moment, I thought it could be nice to write an update about how this could be done ...

Mar

The process of expanding an otherwise limited dataset in order to train a neural network more efficiently is known as data augmentation. For images, a variety of techniques have been used, such as flipping, rotation, sub-segmenting, cropping, and zooming. The mirror image of a cat is ...

Jan

I found some interesting toxicology datasets from the Tox21 challenge and wanted to see if it was possible to build a toxicology predictor using a deep neural network. I don't know how many layers a neural network actually has to have to be called "deep", but it's a buzzword, so ...

Jan

In the last blog post, the battle-tested principal component analysis (PCA) was used as a dimensionality reduction tool. This time we'll take a deeper look into chemical space using a deep learning autoencoder, testing some of the newer neural-network-based tools that have shown promising results. ...

Dec

As covered before, chemical space is huge. So it would be nice if this multidimensional molecular space could be reduced and visualized to get an idea of how query molecules relate to one another. This small tutorial shows a simple example of how this could be done with the PCA decomposition from scikit-learn and molecular fingerprints calculated with ...

Dec

Neural networks are interesting models underlying many of the newest AI applications and algorithms. Recent advances in training algorithms and GPU-enabled code, together with publicly available, highly efficient libraries such as Google's TensorFlow or Theano, make them highly interesting for modelling molecular data. Here I explore the high level Neural Network ...

Nov

Neural networks are interesting algorithms, but sometimes also a bit spooky. In this blog post I explore the possibilities for teaching neural networks to generate completely novel drug-like molecules. I have experimented for some time with recurrent neural networks with the LSTM architecture. In short, recurrent neural networks differ from more traditional feedforward neural networks because they do ...

Oct

RDKit UGM 2016

I'm looking forward to attending the RDKit user group meeting for the first time, from 26-28 October 2016 in Basel, Switzerland. RDKit is an open source chemoinformatics toolkit written in C++ with Python bindings and extensions. Additionally, it has a database cartridge, which makes it quite useful for handling chemical information and storage ...

Mar

Toxic compounds are most often something that we try to avoid when designing novel pharmaceutical compounds, so it would be nice to get a prediction of whether a compound is toxic even before resources are used to synthesize it. But what if it comes back as predicted ...
