MSclassifR: an R package for supervised classification of mass spectra with machine learning methods - Institut Pasteur
Preprint, Working Paper Year: 2022

MSclassifR: an R package for supervised classification of mass spectra with machine learning methods

Abstract

MSclassifR is an R package designed to improve the classification of mass spectra obtained by MALDI-TOF mass spectrometry. It provides functions for processing mass spectra, identifying discriminant m/z values, and making predictions, and it introduces new algorithms for selecting discriminant m/z values and for prediction. To assess these methods, extensive tests were conducted on challenging real datasets: bacterial subspecies of the Mycobacterium abscessus complex, virulent and avirulent phenotypes of Escherichia coli, different species of Streptococci, and nasal swabs from individuals infected and uninfected with SARS-CoV-2. Multiple datasets of varying sizes were also derived from these real datasets to evaluate the robustness of the algorithms. The machine learning-based pipelines in MSclassifR achieved high accuracy and Kappa values. On an in-house dataset, some pipelines achieved more than 95% mean accuracy, whereas a commercial system achieved only 62% mean accuracy. Certain methods were more resilient than others to changes in dataset size when constructing machine learning-based pipelines. These simulations also helped determine the minimum training set sizes required to obtain reliable results. The package is freely available online, and its open-source nature encourages collaborative development and customization and fosters innovation within the community focused on improving diagnosis based on MALDI-TOF spectra.
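The abstract reports accuracy and Kappa values without defining them. As a minimal sketch, assuming "Kappa" refers to Cohen's kappa (the abstract does not say which statistic is used), the two metrics can be computed from true and predicted class labels as shown below. This is written in Python for illustration only; it is not part of MSclassifR and does not reflect its API, and the function name and example labels are hypothetical.

```python
from collections import Counter


def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) for two equal-length label sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement: fraction of spectra assigned to their true class.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Expected chance agreement, from the marginal class frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    # Kappa is undefined when chance agreement is already perfect.
    kappa = (p_o - p_e) / (1 - p_e) if p_e < 1 else float("nan")
    return p_o, kappa


# Hypothetical labels for six spectra from two classes (illustrative only).
y_true = ["A", "A", "A", "B", "B", "B"]
y_pred = ["A", "A", "B", "B", "B", "B"]
acc, kappa = accuracy_and_kappa(y_true, y_pred)
print(f"accuracy={acc:.3f}, kappa={kappa:.3f}")
```

Kappa rescales accuracy by the agreement expected from class frequencies alone, which is why it is often reported alongside accuracy when classes are imbalanced.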
File under embargo
Visibility date undetermined

Dates and versions

pasteur-04093599, version 1 (10-05-2023)
pasteur-04093599, version 2 (18-10-2023)

Cite

Alexandre Godmer, Yahia Benzerara, Emmanuelle Varon, Nicolas Veziris, Karen Druart, et al. MSclassifR: an R package for supervised classification of mass spectra with machine learning methods. 2023. ⟨pasteur-04093599v2⟩