Cheminformatics

Oct

Earlier I wrote a blog post about how to build SMILES-based autoencoders in Keras. It has since been a much-visited page, so the topic seems to interest a lot of people. Thank you for reading. One thing that worried me was how nearby molecules seem more related when comparing the SMILES strings rather than the molecules themselves. ...

Dec

UPDATE: Be sure to check out the follow-up to this post if you want to improve the model: Learn how to improve SMILES based molecular autoencoders with heteroencoders. I've previously written about molecular generators based on long short-term memory recurrent neural networks (LSTM-RNNs). The networks learn rules about how SMILES strings ...

Dec

The SMILES enumeration code at GitHub has been revamped and revised into an object for easier use. It can work in conjunction with a SMILES iterator object that gives on-the-fly enumeration and vectorization for training of SMILES-based Recurrent Neural Network (RNN) models of molecules for ...

Nov

The film Inception with Leonardo DiCaprio is about dreams within dreams, and gave rise to the meme "We need to go deeper". The title has also given its name to the Inception networks used by Google. I recently stumbled across two interesting ...

Nov

Yesterday I was the external examiner at Khanhvi Tran's Master's thesis defense. She had been working on upgrading the SmartCYP program from version 2.4 to 3.0, together with her supervisors Associate Professor Lars Olsen, Ph.D. student Marco Montefiori and Professor Flemming Steen Jørgensen. The program predicts site of metabolism (SOM) on ...

Aug

Excel is widely used in businesses all over the world and can be used for many diverse tasks due to the flexibility of the program. I’ve made a fair number of Excel templates over the years to solve an array of very different tasks. I’ve also ...

Jun

One of the more popular blog posts based on monthly visitors is the old Create a Simple Object Oriented GUIDE GUI in MatLAB, but since I don’t program in MATLAB at the moment, I thought it could be nice to make an update about how this could be done ...

Mar

The process of expanding an otherwise limited dataset in order to more efficiently train a neural network is known as data augmentation. For images, a variety of techniques have been used, such as flipping, rotation, sub-segmenting, cropping, and zooming. The mirror image of a cat is ...

Jan

I found some interesting toxicology datasets from the Tox21 challenge and wanted to see if it was possible to build a toxicology predictor using a deep neural network. I don't know how many layers a neural network actually has to have to be called "deep", but it's a buzzword, so ...

Jan

In the last blog post, the battle-tested principal components analysis (PCA) was used as a dimensionality reduction tool. This time we'll take a deeper look into chemical space using a deep learning neural autoencoder, testing some of the newer tools based on neural networks, which have shown promising results. ...