A simple procedure to detect, test for the presence of stuttering, and cure stuttered data with spreadsheet programs

10.24072/pcjournal.165 - Peer Community Journal, Volume 2 (2022), article no. e52.


Microsatellites are powerful markers for empirical population genetics, but they may be affected by amplification problems such as stuttering, which produces heterozygote deficits between alleles that differ by one repeat. In this paper, we present a simple procedure that aims to detect stuttering at each locus across all subsamples and requires only an interactive spreadsheet application on any operating system. We compare the performance of this procedure with that of MicroChecker on simulations of dioecious pangamic populations, monoecious selfing populations and clonal populations, with or without stuttering, and on real data from vectors and parasites. We also propose a cure for affected loci and compare the results with those expected without stuttering. In sexual populations (dioecious or selfing), the new procedure appeared more than three times more efficient than MicroChecker. The cure was able to restore Wright's FIS of stuttered data to the expected value, particularly in selfing simulations. In clones, the lack of segregation artificially increased false detection of stuttering, and only loci with highly significant stuttering tests and strongly deviating from the others could be usefully cured, in which case the FIS estimate could be much improved. When in doubt, and whenever possible, removing affected loci that cannot be cured may help to shift population genetics parameter estimates towards more reliable values.
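To make the quantities named in the abstract concrete, the following is a minimal sketch and not the authors' spreadsheet procedure. It assumes genotypes are given as pairs of allele sizes, uses the simple form FIS = 1 - Ho/He without small-sample correction, and compares observed with Hardy-Weinberg expected counts of heterozygotes whose alleles differ by one repeat. The function names and the `repeat_len` parameter are hypothetical, chosen for illustration only.

```python
from collections import Counter
import math


def f_is(genotypes):
    """Simple F_IS = 1 - Ho/He for one locus in one subsample.

    genotypes: list of (allele_a, allele_b) tuples, alleles as fragment sizes.
    He is the plain expected heterozygosity 1 - sum(p_i^2); no unbiased
    correction is applied (an assumption of this sketch).
    """
    n = len(genotypes)
    counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    he = 1.0 - sum((c / total) ** 2 for c in counts.values())
    if he == 0:
        return math.nan  # monomorphic locus: F_IS is undefined
    ho = sum(1 for a, b in genotypes if a != b) / n
    return 1.0 - ho / he


def one_repeat_heterozygotes(genotypes, repeat_len=2):
    """Observed and Hardy-Weinberg expected counts of heterozygotes
    whose two alleles differ by exactly one repeat (repeat_len bases)."""
    n = len(genotypes)
    counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    observed = sum(1 for a, b in genotypes if abs(a - b) == repeat_len)
    expected = 0.0
    for a in counts:
        b = a + repeat_len
        if b in counts:
            expected += 2 * (counts[a] / total) * (counts[b] / total) * n
    return observed, expected


if __name__ == "__main__":
    # Hypothetical dinucleotide locus with one heterozygote among four individuals.
    sample = [(100, 100), (100, 102), (102, 102), (102, 102)]
    print(f_is(sample))
    print(one_repeat_heterozygotes(sample))
```

A deficit of observed relative to expected one-repeat heterozygotes, together with a positive FIS, is the kind of signal the abstract attributes to stuttering. How the paper tests that signal for significance is not described in this section and is not reproduced here.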

DOI: 10.24072/pcjournal.165
De Meeûs, Thierry 1; Noûs, Camille 2

1 Univ Montpellier, Cirad, IRD, Intertryp - Montpellier, France
2 Cogitamus laboratory – France
License: CC-BY 4.0
Copyrights: The authors retain unrestricted copyrights and publishing rights
De Meeûs, Thierry; Noûs, Camille. A simple procedure to detect, test for the presence of stuttering, and cure stuttered data with spreadsheet programs. Peer Community Journal, Volume 2 (2022), article no. e52. doi: 10.24072/pcjournal.165. https://peercommunityjournal.org/articles/10.24072/pcjournal.165/

Peer reviewed and recommended by PCI: 10.24072/pci.zool.100016

