Cheminformatics

Dec

I've previously written about molecular generators based on long short-term memory recurrent neural networks (LSTM-RNNs). The networks learn rules about how SMILES strings of molecules are formatted and are then able to create novel SMILES following the same rules by iterating through the characters. The results are "creative" computers that can ...

Dec

The SMILES enumeration code on GitHub has been revamped and turned into an object for easier use. It can work in conjunction with a SMILES iterator object that gives on-the-fly enumeration and vectorization for training SMILES-based recurrent neural network (RNN) models of molecules for ...

Nov

The film Inception with Leonardo DiCaprio is about dreams within dreams, and gave rise to the meme "We need to go deeper". The title has also given its name to the Inception networks used by Google. I recently stumbled across two interesting ...

Nov

Yesterday I was the external examiner at Khanhvi Tran's Master's thesis defense. She had been working on upgrading the SmartCYP program from version 2.4 to 3.0, together with her supervisors Associate Professor Lars Olsen, Ph.D. student Marco Montefiori and Professor Flemming Steen Jørgensen. The program predicts site of metabolism (SOM) on ...

Aug

Excel is widely used in businesses all over the world and can be used for many diverse tasks due to the flexibility of the program. I’ve made a fair number of Excel templates over the years to solve an array of very different tasks. I’ve also ...

Jun

One of the more popular blog posts, based on monthly visitors, is the old "Create a Simple Object Oriented GUIDE GUI in MatLAB", but since I don’t program in MATLAB at the moment, I thought it could be nice to make an update about how this could be done ...

Mar

The process of expanding an otherwise limited dataset in order to train a neural network more efficiently is known as data augmentation. For images, a variety of techniques have been used, such as flipping, rotation, sub-segmenting and cropping, and zooming. The mirror image of a cat is ...

Jan

I found some interesting toxicology datasets from the Tox21 challenge, and wanted to see if it was possible to build a toxicology predictor using a deep neural network. I don't know how many layers a neural network actually has to have to be called "deep", but it's a buzzword, so ...

Jan

In the last blog post, the battle-tested principal components analysis (PCA) was used as a dimensionality reduction tool. This time we'll take a deeper look into chemical space using a deep learning neural autoencoder, testing some of the newer tools based on neural networks, which have shown promising results. ...

Dec

As covered before, chemical space is huge. So it would be nice if this multidimensional molecular space could be reduced and visualized to get an idea of how query molecules relate to one another. This small tutorial shows a simple example of how this could be done with the PCA decomposition from scikit-learn and molecular fingerprints calculated with ...
