Boosting Docking-based Virtual Screening with Deep Learning
Janaina Cruz Pereira, Ernesto Raul Caffarena, and Cicero dos Santos
arXiv e-Print archive - 2016 via Local arXiv
Keywords: q-bio.QM
First published: 2016/08/17
Abstract: In this work, we propose a deep learning approach to improve docking-based
virtual screening. The introduced deep neural network, DeepVS, uses the output
of a docking program and learns how to extract relevant features from basic
data such as atom and residue types obtained from protein-ligand complexes.
Our approach introduces the use of atom and amino acid embeddings and
implements an effective way of creating distributed vector representations of
protein-ligand complexes by modeling the compound as a set of atom contexts
that is further processed by a convolutional layer. One of the main advantages
of the proposed method is that it does not require feature engineering. We
evaluate DeepVS on the Directory of Useful Decoys (DUD), using the output of
two docking programs: AutoDock Vina 1.1.2 and DOCK 6.6. Using a strict evaluation
with leave-one-out cross-validation, DeepVS outperforms the docking programs in
both AUC ROC and enrichment factor. Moreover, using the output of
AutoDock Vina 1.1.2, DeepVS achieves an AUC ROC of 0.81, which, to the best of
our knowledge, is the best AUC reported so far for virtual screening using the
40 receptors from DUD.
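
The abstract reports results as AUC ROC and enrichment factor. As a rough illustration of what these two metrics measure for a ranked list of compounds, here is a minimal sketch. It is not the authors' evaluation code: the function names, the toy scores, and the convention that higher scores mean "more likely active" are assumptions made for this example, and the enrichment factor uses one common definition (hit rate in the top fraction divided by the hit rate in the whole library).

```python
def auc_roc(scores, labels):
    """AUC ROC as the probability that a randomly chosen active (label 1)
    is scored above a randomly chosen decoy (label 0); ties count as 0.5.
    Assumes at least one active and one decoy are present."""
    actives = [s for s, y in zip(scores, labels) if y == 1]
    decoys = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


def enrichment_factor(scores, labels, fraction):
    """Enrichment factor at the top `fraction` of the ranked list:
    (actives in top / size of top) / (total actives / total compounds).
    Assumed convention: higher score = predicted more likely active."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(round(fraction * len(ranked))))
    actives_in_top = sum(y for _, y in ranked[:n_top])
    total_actives = sum(labels)
    return (actives_in_top / n_top) / (total_actives / len(ranked))


# Hypothetical toy data: 3 actives and 7 decoys for a single receptor.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

print("AUC ROC:", auc_roc(toy_scores, toy_labels))
print("EF at 20%:", enrichment_factor(toy_scores, toy_labels, 0.20))
```

Under leave-one-out cross-validation over receptors, as described in the abstract, metrics like these would be computed on the held-out receptor's compounds and then averaged across all receptors. An AUC ROC of 0.5 corresponds to random ranking, and an enrichment factor of 1 means the top of the list contains actives at the same rate as the full library.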