Section: Mathematical & Computational Biology
Topic: Biophysics and computational biology, Evolution, Genetics/Genomics
An approximate likelihood method reveals ancient gene flow between human, chimpanzee and gorilla
Corresponding author(s): Galtier, Nicolas (nicolas.galtier@umontpellier.fr)
10.24072/pcjournal.359 - Peer Community Journal, Volume 4 (2024), article no. e3.
Peer reviewed and recommended by PCI
Gene flow and incomplete lineage sorting are two distinct sources of phylogenetic conflict, i.e., gene trees that differ in topology from each other and from the species tree. Distinguishing between the two processes is a key objective of current evolutionary genomics. This is most often pursued via the so-called ABBA-BABA type of method, which relies on a prediction of symmetry of gene tree discordance made by the incomplete lineage sorting hypothesis. Gene flow, however, need not be asymmetric, and when it is not, ABBA-BABA approaches do not properly measure the prevalence of gene flow. I introduce Aphid, an approximate maximum-likelihood method aimed at quantifying the sources of phylogenetic conflict via topology and branch length analysis of three-species gene trees. Aphid draws information from the fact that gene trees affected by gene flow tend to have shorter branches, and gene trees affected by incomplete lineage sorting longer branches, than the average gene tree. Accounting for the among-loci variance in mutation rate and gene flow time, Aphid returns estimates of the speciation times and ancestral effective population size, and a posterior assessment of the contribution of gene flow and incomplete lineage sorting to the conflict. Simulations suggest that Aphid is reasonably robust to a wide range of conditions. Analysis of coding and non-coding data in primates illustrates the potential of the approach and reveals that a substantial fraction of the human/chimpanzee/gorilla phylogenetic conflict is due to ancient gene flow. Aphid also predicts older speciation times and a smaller estimated effective population size in this group, compared to existing analyses assuming no gene flow.
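The symmetry prediction that ABBA-BABA-type methods rely on can be illustrated with a short sketch. Under incomplete lineage sorting alone, the standard three-species coalescent result gives each of the two discordant topologies the same expected frequency, exp(-T)/3, where T is the internal branch length in coalescent units. The function names and topology labels below are illustrative assumptions; this is not Aphid's code, nor that of any published ABBA-BABA implementation.

```python
from math import exp


def ils_topology_probs(t_internal):
    """Expected gene-tree topology frequencies under ILS alone.

    Species tree ((A,B),C); t_internal is the internal branch length in
    coalescent units. Hypothetical helper, for illustration only.
    """
    # Each discordant topology has probability exp(-T)/3.
    discordant = exp(-t_internal) / 3.0
    return {
        "AB": 1.0 - 2.0 * discordant,  # topology matching the species tree
        "AC": discordant,              # first discordant topology
        "BC": discordant,              # second discordant topology
    }


def discordance_imbalance(n_ac, n_bc):
    """Normalised difference between the two discordant topology counts.

    Values near 0 are what ILS alone predicts, but they are also what
    symmetric gene flow would produce.
    """
    total = n_ac + n_bc
    if total == 0:
        return 0.0
    return (n_bc - n_ac) / total
```

A balanced pair of counts yields an imbalance of zero whether the discordance comes from incomplete lineage sorting or from gene flow of equal intensity in both directions. Topology counts alone therefore cannot separate the two in that case, which is why the abstract stresses the extra information carried by branch lengths.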
Type: Research article
Galtier, Nicolas
Galtier, Nicolas. An approximate likelihood method reveals ancient gene flow between human, chimpanzee and gorilla. Peer Community Journal, Volume 4 (2024), article no. e3. doi : 10.24072/pcjournal.359. https://peercommunityjournal.org/articles/10.24072/pcjournal.359/
PCI peer reviews and recommendation, and links to data, scripts, code and supplementary information: 10.24072/pci.mcb.100199
Conflict of interest of the recommender and peer reviewers:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.