Proteochemometric modeling in a Bayesian framework - Institut Pasteur
Journal article. Journal of Cheminformatics. Year: 2014

Proteochemometric modeling in a Bayesian framework

Abstract

Proteochemometric modeling (PCM) is an approach to bioactivity prediction that models the relationship between protein and chemical information. Gaussian Processes (GP), based on Bayesian inference, provide the most objective estimation of the uncertainty in predictions, thus permitting the evaluation of the applicability domain (AD) of the model. Furthermore, the experimental error on bioactivity measurements can be used as input for this probabilistic model. In this study, we apply GP implemented with a panel of kernels to three diverse, multispecies PCM datasets. The first dataset consists of information on 8 human and rat adenosine receptors, a number of small-molecule ligands, and their binding affinities. The second consists of the catalytic activity of four dengue virus NS3 proteases on 56 small peptides. Finally, we gathered bioactivity information for small-molecule ligands on 91 aminergic GPCRs from 9 different species, leading to a dataset of 24,593 datapoints with a matrix completeness of only 2.43%. GP models trained on these datasets are statistically sound, at the same level of statistical significance as Support Vector Machines (SVM), with $R_0^2$ values on the external dataset ranging from 0.68 to 0.92, and RMSEP values close to the experimental error. Furthermore, the best GP models, obtained with the normalized polynomial and radial kernels, provide confidence intervals for the predictions in agreement with the cumulative Gaussian distribution. GP models were also interpreted on the basis of individual targets and of ligand descriptors. In the dengue dataset, the model interpretation in terms of the amino-acid positions in the tetra-peptide ligands gave biologically meaningful results.
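To make the idea of a GP prediction with an uncertainty estimate concrete, the following is a minimal sketch of GP regression with a radial (squared-exponential) kernel. It is not the authors' implementation: the function names, the fixed kernel hyperparameters, and the toy descriptor vectors are all assumptions made for illustration, and a real PCM model would optimize the hyperparameters and use concatenated ligand and target descriptors. The `noise_var` argument plays the role of the experimental error on the bioactivity measurements, and the returned standard deviation is what a confidence interval for a new prediction would be built from.

```python
import math

def rbf_kernel(x, y, length_scale=1.0, signal_var=1.0):
    """Radial (squared-exponential) kernel between two descriptor vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return signal_var * math.exp(-0.5 * sq_dist / length_scale ** 2)

def cholesky(A):
    """Lower-triangular L with L * L^T = A, for symmetric positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L z = b."""
    z = [0.0] * len(b)
    for i in range(len(b)):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    return z

def solve_lower_transposed(L, b):
    """Back substitution: solve L^T z = b."""
    n = len(b)
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (b[i] - sum(L[k][i] * z[k] for k in range(i + 1, n))) / L[i][i]
    return z

def gp_predict(X_train, y_train, x_new, noise_var):
    """Posterior mean and standard deviation of a zero-mean GP at x_new.

    y_train is assumed to be centered; noise_var is the variance of the
    experimental error, added to the diagonal of the kernel matrix.
    """
    n = len(X_train)
    K = [[rbf_kernel(X_train[i], X_train[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_lower_transposed(L, solve_lower(L, y_train))
    k_star = [rbf_kernel(x, x_new) for x in X_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = rbf_kernel(x_new, x_new) - sum(vi * vi for vi in v) + noise_var
    return mean, math.sqrt(max(var, 0.0))

# Hypothetical toy data: each row stands in for a (ligand, target) descriptor pair.
X_train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y_train = [-1.0, 0.0, 0.0, 1.0]  # centered bioactivity values (made up)

mean, sd = gp_predict(X_train, y_train, [0.5, 0.5], noise_var=0.01)
lower, upper = mean - 1.96 * sd, mean + 1.96 * sd  # approximate 95% interval
print(f"prediction = {mean:.3f}, 95% interval = [{lower:.3f}, {upper:.3f}]")
```

In this sketch the predicted standard deviation grows as a query point moves away from the training data, which is the behavior that makes a GP usable for judging whether a prediction falls inside the applicability domain.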

Dates and versions

pasteur-01107505, version 1 (20-01-2015)

Cite

Isidro Cortes-Ciriano, Gerard van Westen, Eelke B Lenselink, Daniel Murrell, Andreas Bender, et al. Proteochemometric modeling in a Bayesian framework. Journal of Cheminformatics, 2014, 6, pp.35. ⟨10.1186/1758-2946-6-35⟩. ⟨pasteur-01107505⟩

Collections

PASTEUR CNRS
