Cite this article as: Criscuolo and Gribaldo: BMGE (Block Mapping and Gathering with Entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments, BMC Evolutionary Biology, vol.10, p.210, 2010.