Section: Evolutionary Biology
Topic: Evolution, Genetics/genomics, Population biology

On the potential for GWAS with phenotypic population means and allele-frequency data (popGWAS)

Corresponding author(s): Pfenninger, Markus (Markus.Pfenninger@senckenberg.de)

10.24072/pcjournal.544 - Peer Community Journal, Volume 5 (2025), article no. e40

Peer reviewed and recommended by PCI

It is vital to understand the genomic basis of differences in ecologically important traits if we are to understand the impact of global change on biodiversity and enhance our ability for targeted intervention. This study explores the potential of a novel genome-wide association study (GWAS) approach for identifying loci underlying quantitative polygenic traits in natural populations, based on phenotypic population means and genome-wide allele-frequency data as obtained, e.g., by PoolSeq approaches. Extensive population genetic forward simulations demonstrate that the approach is generally effective for oligogenic and moderately polygenic traits and relatively insensitive to low heritability. However, applicability is limited for highly polygenic architectures and pronounced population structure. The required sample size is moderate: very good results are already obtained with a few dozen scored populations. When combined with machine learning for feature selection, the method performs very well in predicting population means. The data efficiency of the method, particularly when using pooled sequencing and bulk phenotyping, makes GWAS more accessible for research in biodiversity genomics. Moreover, in a direct comparison to individual-based GWAS, the proposed method performed consistently better with regard to the number of true positive loci identified and prediction accuracy. Overall, this study highlights the promise of popGWAS for dissecting the genetic basis of complex traits in natural populations.
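The core idea described in the abstract, associating population mean phenotypes with population allele frequencies, can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the author's simulation or analysis pipeline: it assumes unlinked loci, independent uniformly distributed allele frequencies, a purely additive trait with equal effects, and a plain per-locus Pearson correlation as the association statistic. All function names and parameter values (`simulate`, `rank_loci`, 48 populations, 100 loci, 3 causal loci) are hypothetical choices made for illustration.

```python
import random


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def simulate(n_pops=48, n_loci=100, n_causal=3, effect=1.0, noise_sd=0.05, seed=1):
    """Toy data: per-population allele frequencies and mean phenotypes.

    Assumption: the first n_causal loci are causal, and the expected
    population mean of an additive trait is the sum of 2 * p * effect
    over the causal loci, plus Gaussian noise on the measured mean.
    """
    rng = random.Random(seed)
    causal = list(range(n_causal))
    freqs = [[rng.random() for _ in range(n_loci)] for _ in range(n_pops)]
    means = [
        sum(2 * pop[j] * effect for j in causal) + rng.gauss(0, noise_sd)
        for pop in freqs
    ]
    return freqs, means, causal


def rank_loci(freqs, means):
    """Rank loci by |correlation| between allele frequency and mean phenotype."""
    n_loci = len(freqs[0])
    scores = []
    for j in range(n_loci):
        column = [pop[j] for pop in freqs]  # frequency of locus j across populations
        scores.append((abs(pearson_r(column, means)), j))
    scores.sort(reverse=True)
    return [j for _, j in scores]


freqs, means, causal = simulate()
top = rank_loci(freqs, means)[:10]
n_hits = len(set(top) & set(causal))
print(f"causal loci among the 10 top-ranked loci: {n_hits} of {len(causal)}")
```

In this toy setting each causal locus explains a large share of the variance in population means, so it tends to rank near the top. The sketch deliberately ignores linkage, population structure, and highly polygenic architectures, which are the conditions under which the abstract reports that the approach is limited.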

DOI: 10.24072/pcjournal.544
Type: Research article

Pfenninger, Markus 1, 2, 3

1 Dept. Molecular Ecology, Senckenberg Biodiversity and Climate Research Centre, Georg-Voigt-Str. 14-16, D-60325 – Frankfurt am Main, Germany
2 LOEWE Centre for Translational Biodiversity Genomics, Senckenberg Biodiversity and Climate Research Centre, Senckenberganlage 25, D-60325 – Frankfurt am Main, Germany
3 Institute for Molecular and Organismic Evolution, Johannes Gutenberg University, Johann-Joachim-Becher-Weg 7, D-55128 – Mainz, Germany
License: CC-BY 4.0
Copyrights: The authors retain unrestricted copyrights and publishing rights
@article{10_24072_pcjournal_544,
     author = {Pfenninger, Markus},
     title = {On the potential for {GWAS} with phenotypic population means and allele-frequency data {(popGWAS)}},
     journal = {Peer Community Journal},
     eid = {e40},
     year = {2025},
     publisher = {Peer Community In},
     volume = {5},
     doi = {10.24072/pcjournal.544},
     language = {en},
     url = {https://peercommunityjournal.org/articles/10.24072/pcjournal.544/}
}
TY  - JOUR
AU  - Pfenninger, Markus
TI  - On the potential for GWAS with phenotypic population means and allele-frequency data (popGWAS)
JO  - Peer Community Journal
PY  - 2025
VL  - 5
PB  - Peer Community In
UR  - https://peercommunityjournal.org/articles/10.24072/pcjournal.544/
DO  - 10.24072/pcjournal.544
LA  - en
ID  - 10_24072_pcjournal_544
ER  - 
%0 Journal Article
%A Pfenninger, Markus
%T On the potential for GWAS with phenotypic population means and allele-frequency data (popGWAS)
%J Peer Community Journal
%D 2025
%V 5
%I Peer Community In
%U https://peercommunityjournal.org/articles/10.24072/pcjournal.544/
%R 10.24072/pcjournal.544
%G en
%F 10_24072_pcjournal_544
Pfenninger, M. On the potential for GWAS with phenotypic population means and allele-frequency data (popGWAS). Peer Community Journal, Volume 5 (2025), article no. e40. https://doi.org/10.24072/pcjournal.544

PCI peer reviews and recommendation, and links to data, scripts, code and supplementary information: 10.24072/pci.evolbiol.100834

Conflict of interest of the recommender and peer reviewers:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.

