Journal
MOLECULAR INFORMATICS
Volume 29, Issue 6-7, Pages 499-508
Publisher
WILEY-V C H VERLAG GMBH
DOI: 10.1002/minf.201000052
Keywords
Bioinformatics; Chemogenomics; Drug design; Protein-Ligand interactions; Proteochemometrics
Funding
- Uppsala University [Kof 07]
- Knut and Alice Wallenberg Foundation
- Swedish Foundation for Strategic Research
A proteochemometrics model was induced from all interaction data in the BindingDB database, comprising 7078 protein-ligand complexes with representatives from all major drug target categories. Proteins were represented by alignment-independent sequence descriptors holding information on properties such as hydrophobicity, charge, and secondary structure. Ligands were represented by commonly used QSAR descriptors. The inhibition constant (pKi) values of the protein-ligand complexes were discretized into high and low interaction activity. Different machine-learning techniques were used to induce models relating protein and ligand properties to the interaction activity. The best-performing technique was decision trees, which gave an accuracy of 80% and an area under the ROC curve of 0.81. The tree pointed to the protein and ligand properties that are relevant for the interaction. Because the approach requires neither alignments nor knowledge of protein 3D structures, virtually all available protein-ligand interaction data could be utilized, thus opening a way to completely general interaction models that may span entire proteomes.
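The data preparation and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the pKi threshold, the descriptor values, the classifier scores, and all function names below are hypothetical and chosen only for illustration. The sketch shows how pKi values might be discretized into high and low activity, how a complex could be described by joining protein and ligand descriptors, and how accuracy and the area under the ROC curve are computed.

```python
# Toy illustration only: the threshold, data, and names are assumptions,
# not values or methods taken from the paper.

PKI_THRESHOLD = 7.0  # hypothetical cut-off; the abstract does not state one


def discretize(pki, threshold=PKI_THRESHOLD):
    """Map a pKi value to 1 (high activity) or 0 (low activity)."""
    return 1 if pki >= threshold else 0


def complex_features(protein_desc, ligand_desc):
    """Describe a complex by concatenating protein and ligand descriptors."""
    return list(protein_desc) + list(ligand_desc)


def accuracy(labels, predictions):
    """Fraction of complexes whose predicted class equals the true class."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive outscores a random negative (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need both classes to compute AUC")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up pKi values and made-up classifier scores for five complexes.
pki_values = [8.2, 5.1, 7.4, 6.3, 9.0]
labels = [discretize(v) for v in pki_values]
scores = [0.9, 0.2, 0.4, 0.6, 0.8]
predictions = [1 if s >= 0.5 else 0 for s in scores]

# One complex: two made-up protein descriptors and two ligand descriptors.
features = complex_features([0.3, -1.2], [310.4, 2.1])

print(labels, accuracy(labels, predictions), roc_auc(labels, scores))
```

In a real setting the scores would come from a trained classifier such as a decision tree, and both metrics would be estimated on held-out complexes rather than on the data used for training.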