A. Bender, Databases: Compound bioactivities go public, Nature Chemical Biology, vol.6, issue.5, p.309, 2010.
DOI : 10.1038/nchembio.354

A. Gaulton, L. Bellis, A. Bento, J. Chambers, M. Davies et al., ChEMBL: a large-scale bioactivity database for drug discovery, Nucleic Acids Research, vol.40, issue.D1, pp.1100-1107, 2011.
DOI : 10.1093/nar/gkr777

Y. Wang, J. Xiao, T. Suzek, J. Zhang, J. Wang et al., PubChem's BioAssay Database, Nucleic Acids Research, vol.40, issue.D1, pp.400-412, 2012.
DOI : 10.1093/nar/gkr1132

G. Van-westen, J. Wegner, A. Ijzerman, H. Van-vlijmen, and A. Bender, Proteochemometric modeling as a tool to design selective compounds and for extrapolating to novel targets, Med. Chem. Commun., vol.2, issue.1, pp.16-30, 2011.
DOI : 10.1039/C0MD00165A

I. Cortes-ciriano, Q. Ain, V. Subramanian, E. Lenselink et al., Polypharmacology modelling using proteochemometrics (PCM): recent methodological developments, applications to target families, and future prospects, Med. Chem. Commun., vol.6, issue.1, pp.24-50, 2015.
DOI : 10.1039/C4MD00216D

R Core Team, R: a language and environment for statistical computing. R Foundation for Statistical Computing, 2013.

R. Gentleman, V. Carey, D. Bates, B. Bolstad, M. Dettling et al., Bioconductor: open software development for computational biology and bioinformatics, Genome Biology, vol.5, issue.10, p.80, 2004.
DOI : 10.1186/gb-2004-5-10-r80

S. Mente and M. Kuhn, The Use of the R Language for Medicinal Chemistry Applications, Current Topics in Medicinal Chemistry, vol.12, issue.18, pp.1957-1964, 2012.
DOI : 10.2174/156802612804910322

Y. Cao, A. Charisi, L. Cheng, T. Jiang, and T. Girke, ChemmineR: a compound mining framework for R, Bioinformatics, vol.24, issue.15, pp.1733-1734, 2008.
DOI : 10.1093/bioinformatics/btn307

R. Guha, Chemical informatics functionality in R, J Stat Softw, vol.18, issue.5, pp.1-16, 2007.

M. Kuhn, Building predictive models in R using the caret package, J Stat Softw, vol.28, issue.5, pp.1-26, 2008.

D. Rognan, Chemogenomic approaches to rational drug design, British Journal of Pharmacology, vol.152, issue.1, pp.38-52, 2007.
DOI : 10.1038/sj.bjp.0707307

URL : https://hal.archives-ouvertes.fr/hal-00195211

C. Yap, PaDEL-descriptor: An open source software to calculate molecular descriptors and fingerprints, Journal of Computational Chemistry, vol.32, issue.7, pp.1466-1474, 2011.
DOI : 10.1002/jcc.21707

D. Rogers and M. Hahn, Extended-Connectivity Fingerprints, Journal of Chemical Information and Modeling, vol.50, issue.5, pp.742-754, 2010.
DOI : 10.1021/ci100050t

G. Landrum, RDKit: open-source cheminformatics, 2006.

C. Steinbeck, Y. Han, S. Kuhn, O. Horlacher, E. Luttmann et al., The Chemistry Development Kit (CDK): An Open-Source Java Library for Chemo- and Bioinformatics, Journal of Chemical Information and Computer Sciences, vol.43, issue.2, pp.493-500, 2003.
DOI : 10.1021/ci025584y

L. Hall and L. Kier, Electrotopological State Indices for Atom Types: A Novel Combination of Electronic, Topological, and Valence State Information, Journal of Chemical Information and Modeling, vol.35, issue.6, pp.1039-1045, 1995.
DOI : 10.1021/ci00028a014

J. Durant, B. Leland, D. Henry, and J. Nourse, Reoptimization of MDL Keys for Use in Drug Discovery, Journal of Chemical Information and Computer Sciences, vol.42, issue.6, pp.1273-1280, 2002.
DOI : 10.1021/ci010132r

N. O'Boyle, M. Banck, C. James, C. Morley, T. Vandermeersch et al., Open Babel: An open chemical toolbox, Journal of Cheminformatics, vol.3, issue.1, p.33, 2011.
DOI : 10.1186/1758-2946-3-33

J. Klekota and F. Roth, Chemical substructures that enrich for biological activity, Bioinformatics, vol.24, issue.21, pp.2518-2525, 2008.
DOI : 10.1093/bioinformatics/btn479

N. Xiao and Q. Xu, Protr: Protein sequence descriptor calculation and similarity computation with R. R package version 0.2-1.

G. Van-westen, R. Swier, I. Cortes-ciriano et al., Benchmarking of protein descriptor sets in proteochemometric modeling (part 2): Modeling performance of 13 amino acid descriptor sets, J Cheminf, vol.5, issue.1, p.42, 2013.

G. Van-westen, R. Swier, J. Wegner, A. Ijzerman, H. Van-vlijmen et al., Benchmarking of protein descriptor sets in proteochemometric modeling (part 1): comparative study of 13 amino acid descriptor sets, Journal of Cheminformatics, vol.5, issue.1, p.41, 2013.
DOI : 10.1186/1758-2946-5-41

G. Van-westen, O. Van-den-hoven, R. Van-der-pijl, T. Mulder-krieger, H. De-vries et al., Identifying Novel Adenosine Receptor Ligands by Simultaneous Proteochemometric Modeling of Rat and Human Bioactivity Data, Journal of Medicinal Chemistry, vol.55, issue.16, pp.7010-7020, 2012.
DOI : 10.1021/jm3003069

I. Cortes-ciriano, D. Murrell, G. Van-westen, A. Bender, and T. Malliavin, Prediction of the potency of mammalian cyclooxygenase inhibitors with ensemble proteochemometric modeling, Journal of Cheminformatics, vol.7, issue.1, 2014.
DOI : 10.1186/s13321-014-0049-z

URL : https://hal.archives-ouvertes.fr/pasteur-01398459

C. Andersson, M. Gustafsson, and H. Strömbergsson, Quantitative Chemogenomics: Machine-Learning Models of Protein-Ligand Interaction, Current Topics in Medicinal Chemistry, vol.11, issue.15, pp.1978-1993, 2011.
DOI : 10.2174/156802611796391249

M. Kuhn and K. Johnson, Applied predictive modeling, 2013.
DOI : 10.1007/978-1-4614-6849-3

Z. Mayer, CaretEnsemble: framework for combining caret models into ensembles. R Package Version 1, 2013.

R. Caruana, A. Niculescu-mizil, G. Crew, and A. Ksikes, Ensemble selection from libraries of models, Twenty-first international conference on Machine learning, ICML '04, p.18, 2004.
DOI : 10.1145/1015330.1015432

S. Wold, M. Sjöström, and L. Eriksson, PLS-regression: a basic tool of chemometrics, Chemometrics and Intelligent Laboratory Systems, vol.58, issue.2, pp.109-130, 2001.
DOI : 10.1016/S0169-7439(01)00155-1

L. Breiman, Random forests, Machine Learning, vol.45, issue.1, pp.5-32, 2001.
DOI : 10.1023/A:1010933404324

D. Hawkins, S. Basak, and D. Mills, Assessing Model Fit by Cross-Validation, Journal of Chemical Information and Computer Sciences, vol.43, issue.2, pp.579-586, 2003.
DOI : 10.1021/ci025626i

V. Consonni, D. Ballabio, and R. Todeschini, Evaluation of model predictive ability by external validation techniques, J. Chemom, vol.24, issue.3-4, p.194, 2010.

A. Golbraikh and A. Tropsha, Beware of q2!, Journal of Molecular Graphics and Modelling, vol.20, issue.4, pp.269-276, 2002.
DOI : 10.1016/S1093-3263(01)00123-1

A. Tropsha, P. Gramatica, and V. Gombar, The Importance of Being Earnest: Validation is the Absolute Essential for Successful Application and Interpretation of QSPR Models, QSAR & Combinatorial Science, vol.22, issue.1, pp.69-77, 2003.
DOI : 10.1002/qsar.200390007

I. Cortes-ciriano, G. Van-westen, E. Lenselink, D. Murrell, A. Bender et al., Proteochemometric modeling in a Bayesian framework, Journal of Cheminformatics, vol.6, issue.1, p.35, 2014.
DOI : 10.1186/1758-2946-6-35

URL : https://hal.archives-ouvertes.fr/pasteur-01107505

H. Wickham, Ggplot2: elegant graphics for data analysis, 2009.

J. Wang, G. Krudy, T. Hou, W. Zhang, G. Holland et al., Development of Reliable Aqueous Solubility Models and Their Application in Druglike Analysis, Journal of Chemical Information and Modeling, vol.47, issue.4, pp.1395-1404, 2007.
DOI : 10.1021/ci700096r

G. Rimon, R. Sidhu, D. Lauver, J. Lee, N. Sharma et al., Coxibs interfere with the action of aspirin by binding tightly to one monomer of cyclooxygenase-1, Proceedings of the National Academy of Sciences, vol.107, issue.1, pp.28-33, 2010.
DOI : 10.1073/pnas.0909765106

F. Kruger and J. Overington, Global Analysis of Small Molecule Binding to Related Protein Targets, PLoS Computational Biology, vol.8, issue.1, p.1002333, 2012.
DOI : 10.1371/journal.pcbi.1002333