
Section: Mathematical & Computational Biology
Topic: Evolution, Genetics/genomics, Microbiology
A systematic assessment of phylogenomic approaches for microbial species tree reconstruction
Corresponding author(s): Bansal, Mukul S. (mukul.bansal@uconn.edu)
10.24072/pcjournal.579 - Peer Community Journal, Volume 5 (2025), article no. e72.
Peer reviewed and recommended by PCI
A key challenge in microbial phylogenomics is that microbial gene families are often affected by extensive horizontal gene transfer (HGT). As a result, most existing methods for microbial phylogenomics can only make use of a small subset of the gene families present in the microbial genomes under consideration, potentially biasing their results and affecting their accuracy. To address this challenge, several methods have recently been developed for inferring microbial species trees from genome-scale datasets of gene families affected by evolutionary events such as HGT, gene duplication, and gene loss. In this work, we use extensive simulated and real biological datasets to systematically assess the accuracies of four recently developed methods for microbial phylogenomics, SpeciesRax, ASTRAL-Pro 2, PhyloGTP, and AleRax, under a range of different conditions. Our analysis reveals important insights into the relative performance of these methods on datasets with different characteristics, identifies shared weaknesses when analyzing complex biological datasets, and demonstrates the importance of accounting for gene tree inference error/uncertainty for improved species tree reconstruction. Among other results, we find that (i) AleRax, the only method that explicitly accounts for gene tree inference error/uncertainty, shows the best species tree reconstruction accuracy among all tested methods, (ii) PhyloGTP (developed previously by the authors of this paper) shows the best overall accuracy among methods that do not account for gene tree error and uncertainty, (iii) ASTRAL-Pro 2 is less accurate than the other methods across nearly all tested conditions, and (iv) explicitly accounting for gene tree inference error/uncertainty can lead to substantial improvements in species tree reconstruction accuracy.
Importantly, we also find that all methods, including AleRax and PhyloGTP, are susceptible to biases present in complex real biological datasets and can sometimes yield misleading phylogenies.
Type: Research article
Weiner, Samson 1; Feng, Yutian 2; Gogarten, J. Peter 2, 3; Bansal, Mukul S. 1, 3

Weiner, S.; Feng, Y.; Gogarten, J. P.; Bansal, M. S. A systematic assessment of phylogenomic approaches for microbial species tree reconstruction. Peer Community Journal, Volume 5 (2025), article no. e72. https://doi.org/10.24072/pcjournal.579
PCI peer reviews and recommendation, and links to data, scripts, code and supplementary information: 10.24072/pci.mcb.100408
Conflict of interest of the recommender and peer reviewers:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.