Journal article, Bioinformatics, 2019

Accounting for ambiguity in ancestral sequence reconstruction


Abstract

Motivation: The reconstruction of ancestral genetic sequences from the analysis of contemporaneous data is a powerful tool to improve our understanding of molecular evolution. Various statistical criteria defined in a phylogenetic framework can be used to infer nucleotide, amino-acid or codon states at internal nodes of the tree, for every position along the sequence. These criteria generally select the single state that maximizes (or minimizes) a given objective. Although perfectly sensible from a statistical perspective, that strategy fails to convey useful information about the level of uncertainty associated with the inference.

Results: The present study introduces a new criterion for ancestral sequence reconstruction, the minimum posterior expected error (MPEE), that selects a single state whenever the signal conveyed by the data is strong, and a combination of multiple states otherwise. We also assess the performance of a criterion based on the Brier scoring scheme which, like MPEE, does not rely on any tuning parameters. The precision and accuracy of several other criteria that involve arbitrarily set tuning parameters are also evaluated. Large-scale simulations demonstrate the benefits of using the MPEE and Brier-based criteria, with a substantial increase in the accuracy of the inference of past sequences compared to the standard approach and realistic compromises on the precision of the solutions returned.

Availability and implementation: The software package PhyML (https://github.com/stephaneguindon/phyml) provides an implementation of the Maximum A Posteriori (MAP) and MPEE criteria for reconstructing ancestral nucleotide and amino-acid sequences.
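The contrast drawn in the abstract, one state when the signal is strong and several states otherwise, can be illustrated with a small sketch. The abstract does not give the exact definitions of the MPEE or Brier-based criteria, so the code below rests on an assumed formulation that may differ from the paper's: a set of states is turned into a uniform prediction over its members, and the set with the lowest posterior expected Brier score is kept. The function names (`map_state`, `expected_brier`, `brier_state_set`) and the posterior values are hypothetical and are not part of PhyML.

```python
from typing import Dict, FrozenSet, Iterable


def map_state(posterior: Dict[str, float]) -> str:
    """Standard MAP choice: the single state with the highest posterior."""
    return max(posterior, key=posterior.get)


def expected_brier(posterior: Dict[str, float], chosen: Iterable[str]) -> float:
    """Posterior expected Brier score of a uniform prediction over `chosen`.

    Assumed formulation (not taken from the paper): each chosen state gets
    probability 1/k, every other state gets 0, and the squared-error score is
    averaged over the posterior distribution of the true state.
    """
    chosen = set(chosen)
    k = len(chosen)
    prediction = {s: (1.0 / k if s in chosen else 0.0) for s in posterior}
    total = 0.0
    for true_state, p_true in posterior.items():
        score = sum(
            (prediction[s] - (1.0 if s == true_state else 0.0)) ** 2
            for s in posterior
        )
        total += p_true * score
    return total


def brier_state_set(posterior: Dict[str, float]) -> FrozenSet[str]:
    """Return the set of states with the lowest expected Brier score.

    For a fixed set size k, the best set is made of the k most probable
    states, so only these nested sets need to be compared.
    """
    ranked = sorted(posterior, key=posterior.get, reverse=True)
    best_set: FrozenSet[str] = frozenset(ranked[:1])
    best_score = expected_brier(posterior, best_set)
    for k in range(2, len(ranked) + 1):
        candidate = frozenset(ranked[:k])
        score = expected_brier(posterior, candidate)
        if score < best_score:  # strict: ties keep the smaller set
            best_set, best_score = candidate, score
    return best_set


# Hypothetical posteriors at one site of one internal node.
strong = {"A": 0.97, "C": 0.01, "G": 0.01, "T": 0.01}
ambiguous = {"A": 0.48, "C": 0.47, "G": 0.03, "T": 0.02}

print(map_state(strong), sorted(brier_state_set(strong)))        # A ['A']
print(map_state(ambiguous), sorted(brier_state_set(ambiguous)))  # A ['A', 'C']
```

With the second posterior, MAP still reports A alone, whereas the set-valued rule returns both A and C, which makes the uncertainty at that site visible without any tuning threshold.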
Main file
ancestral_clean.pdf (507.73 KB)

Dates and versions

pasteur-02404399, version 1 (11-12-2019)

Identifiers

HAL Id: pasteur-02404399
DOI: 10.1093/bioinformatics/btz249

Cite

Adrien Oliva, Sylvain Pulicani, Vincent Lefort, Laurent Brehelin, Olivier Gascuel, et al. Accounting for ambiguity in ancestral sequence reconstruction. Bioinformatics, 2019, 35 (21), pp. 4290-4297. ⟨10.1093/bioinformatics/btz249⟩. ⟨pasteur-02404399⟩