Journal articles

An automatic tool to analyze and cluster macromolecular conformations based on self-organizing maps.

Abstract : Sampling the conformational space of biological macromolecules generates large and highly complex data sets. Data-mining techniques such as clustering can extract meaningful information from them. Among these techniques, the self-organizing map (SOM) algorithm has shown great promise, in particular because its computation time rises only linearly with the size of the data set. Whereas SOMs are generally used with few neurons, we investigate here their behavior with large numbers of neurons. We present a Python library implementing the full SOM analysis workflow. Large SOMs can readily be applied to large data sets and, coupled with visualization tools, have very interesting properties. Descriptors for each conformation of a trajectory are calculated and mapped onto a 3D landscape, the U-matrix, which reports the distance between neighboring neurons. To delineate clusters, we developed the flooding algorithm, which hierarchically identifies local basins of the U-matrix from the global minimum to the maximum.

Availability and implementation: The Python implementation of the SOM library is freely available on GitHub: https://github.com/bougui505/SOM.

Contact: michael.nilges@pasteur.fr or guillaume.bouvier@pasteur.fr

Supplementary information: Supplementary data are available at Bioinformatics online.
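To make the two central objects of the abstract concrete, the following is a minimal sketch of a rectangular SOM and its U-matrix, written with the Python standard library only. It is not the API of the authors' library: the function names (`best_matching_unit`, `train_som`, `u_matrix`), the linearly decaying learning rate and neighborhood radius, the non-periodic grid, and the four-neighbor U-matrix are all assumptions made for illustration. The flooding algorithm is not reproduced here, since the abstract does not specify it in enough detail.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the (row, col) index of the neuron whose weight vector is closest to x."""
    best, best_dist = (0, 0), math.inf
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            d = math.dist(w, x)
            if d < best_dist:
                best, best_dist = (i, j), d
    return best


def train_som(data, rows, cols, n_iter=2000, seed=0):
    """Train a rows x cols SOM on a list of equal-length descriptor vectors.

    Assumed schedule: learning rate and Gaussian neighborhood radius
    both decay linearly with the iteration number.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # weights[i][j] is the prototype vector of neuron (i, j)
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        frac = t / n_iter
        lr = 0.5 * (1.0 - frac)
        sigma = max(sigma0 * (1.0 - frac), 0.5)
        x = rng.choice(data)
        bi, bj = best_matching_unit(weights, x)
        for i in range(rows):
            for j in range(cols):
                # Neurons close to the winner on the grid are pulled more strongly.
                grid_d2 = (i - bi) ** 2 + (j - bj) ** 2
                h = lr * math.exp(-grid_d2 / (2.0 * sigma ** 2))
                w = weights[i][j]
                for k in range(dim):
                    w[k] += h * (x[k] - w[k])
    return weights


def u_matrix(weights):
    """Mean distance from each neuron's weight vector to its grid neighbors."""
    rows, cols = len(weights), len(weights[0])
    u = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            dists = [math.dist(weights[i][j], weights[ni][nj])
                     for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                     if 0 <= ni < rows and 0 <= nj < cols]
            u[i][j] = sum(dists) / len(dists) if dists else 0.0
    return u


if __name__ == "__main__":
    # Toy "descriptors": two well-separated groups of 2D points.
    rng = random.Random(42)
    data = ([[rng.gauss(0.0, 0.3), rng.gauss(0.0, 0.3)] for _ in range(50)]
            + [[rng.gauss(10.0, 0.3), rng.gauss(10.0, 0.3)] for _ in range(50)])
    som = train_som(data, rows=5, cols=5)
    for row in u_matrix(som):
        print(" ".join(f"{v:6.2f}" for v in row))
```

In such a sketch, low U-matrix values mark neurons whose neighbors hold similar prototypes (basins), while high values mark the ridges between them; a basin-finding step such as the flooding algorithm would operate on this landscape.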

https://hal-pasteur.archives-ouvertes.fr/pasteur-01145388
Contributor : Maya Um
Submitted on : Tuesday, April 14, 2020 - 5:39:18 PM
Last modification on : Wednesday, April 15, 2020 - 8:57:56 AM

Citation

Guillaume Bouvier, Nathan Desdouits, Mathias Ferber, Arnaud Blondel, Michael Nilges. An automatic tool to analyze and cluster macromolecular conformations based on self-organizing maps. Bioinformatics, Oxford University Press (OUP), 2015, 31 (9), pp.1490-1492. ⟨10.1093/bioinformatics/btu849⟩. ⟨pasteur-01145388⟩
