https://hal-pasteur.archives-ouvertes.fr/pasteur-02348884Koehl, PatricePatriceKoehlUC Davis - University of California [Davis] - University of CaliforniaDelarue, MarcMarcDelarueDynamique structurale des Macromolécules / Structural Dynamics of Macromolecules - Institut Pasteur [Paris] - CNRS - Centre National de la Recherche ScientifiqueOrland, HenriHenriOrlandIPHT - Institut de Physique Théorique - UMR CNRS 3681 - CEA - Commissariat à l'énergie atomique et aux énergies alternatives - Université Paris-Saclay - CNRS - Centre National de la Recherche ScientifiqueOptimal transport at finite temperatureHAL CCSD2019[CHIM.CRIS] Chemical Sciences/Cristallography[SDV.BBM.BS] Life Sciences [q-bio]/Biochemistry, Molecular Biology/Structural Biology [q-bio.BM][SDV.BBM] Life Sciences [q-bio]/Biochemistry, Molecular Biology[INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM][PHYS.PHYS.PHYS-BIO-PH] Physics [physics]/Physics [physics]/Biological Physics [physics.bio-ph][SDV.BC] Life Sciences [q-bio]/Cellular Biologyde TARRAGON, Marie2019-11-05 14:42:562022-04-07 10:10:372019-11-12 16:28:51enJournal articles10.1103/PhysRevE.100.0133101Optimal transport (OT) has become a discipline by itself that offers solutions to a wide range of theoretical problems in probability and mathematics. Despite its appealing theoretical properties, solving the OT problem involves the resolution of a linear program whose computational cost can quickly become prohibitive whenever the size of the problem exceeds a few hundred points. The recent introduction of entropy regularization, however, has led to the development of fast algorithms for solving an approximate OT problem. The successes of those algorithms have resulted in a popularization of the applications of OT in several applied fields such as imaging sciences and machine learning, and in data sciences in general. Problems remain, however, as to the numerical convergence of those regularized approximations towards the actual OT solution. In addition, the physical meaning of this regularization is unclear. In this paper, we propose an approach to solving the discrete OT problem using techniques adapted from statistical physics. Our first contribution is to fully describe this formalism, including all the proofs of its main claims. In particular we derive a strongly concave effective free energy function that captures the constraints of the optimal transport problem at a finite temperature. Its maximum defines a pseudo distance between the two set of weighted points that are compared, which satisfies the triangular inequalities. The temperature dependent OT pseudo distance decreases monotonically to the standard OT distance, providing a robust framework for temperature annealing. Our second contribution is to show that the implementation of this formalism has the same properties as the regularized OT algorithms in time complexity, making it a competitive approach to solving the OT problem. We illustrate applications of the framework to the problem of protein fold recognition based on sequence information only.