Development of nine microsatellite loci for Trypanosoma lewisi, a potential human pathogen in Western Africa and South-East Asia, and preliminary population genetics analyses

Trypanosoma lewisi belongs to the so-called atypical trypanosomes that occasionally affect humans. It shares the same hosts and flea vectors as other medically relevant pathogens such as Yersinia pestis, the agent of plague. Increasing knowledge on the population structure (reproductive mode, population size, dispersal) of this parasite thus represents a challenging but important issue. The use of polymorphic genetic markers, together with suitable population genetics tools, is a convenient way to achieve such objectives. To date, the population biology of T. lewisi is poorly known and, to our knowledge, no population genetics studies have ever been conducted. Here, we present the development of nine microsatellite markers for this species. We investigated their polymorphism in different countries of Africa and South-East Asia, using DNA extracted from the spleens of rodent reservoirs (essentially rat species). Several amplification problems arose, especially with South-East Asian individuals. This led us to retain only those individuals with complete genotypes (most of them originating from West Africa, notably Cotonou, Benin) to ensure an optimal estimate of heterozygosity. Our results pointed towards a mainly (at least 95-99%) clonal mode of propagation, a strong subdivision at the smallest scale available (urban neighborhoods, i.e., 0.250 km²), and a generation time most probably shorter than 4 months. In future studies, more extensive sampling at smaller geographic scales (i.e., households), within a one- or two-month window and with improved amplification conditions, should lead to a more precise picture of the fine population structure of this parasite.


Introduction
The two classic forms of human trypanosomiasis are sleeping sickness or human African trypanosomiasis (HAT, also known as gHAT, due to Trypanosoma brucei gambiense, and rHAT, due to T. b. rhodesiense) and Chagas disease, or American trypanosomiasis, due to Trypanosoma cruzi. Other trypanosomoses normally exclusively infect animals in Africa, where they are called "African animal trypanosomoses" (or AAT; Nagana, surra and dourine), as well as in Asia and America (surra and dourine). Consequently, the latter parasites are considered atypical when they infect humans (Truc, Büscher et al., 2013). The agents of atypical human trypanosomoses (a-HT) that have been documented to induce pathologies are: Trypanosoma evansi, Trypanosoma lewisi, Trypanosoma congolense, Trypanosoma brucei brucei and Trypanosoma vivax. These trypanosomes usually infect cattle, equines, camelids, suids and rodents. In humans, these parasites are considered to be non- or poorly pathogenic. Human infections have ended in spontaneous cure, in pathologic profiles requiring treatment, and sometimes in fatal outcomes (Doke & Kar, 2011). Following advances in molecular biology technologies, more recent human cases have been described in India, Gambia, Egypt, Thailand and Vietnam (Wabale, Nalage et al., 2015; Chau, Chau et al., 2016). In Africa, confusion of these atypical forms with sleeping sickness is suspected. Indeed, a patient with a mixed infection with T. brucei and T. congolense was identified in Côte d'Ivoire, and successfully treated (Truc, Jamonneau et al., 1998). More recently, during an investigation, T. congolense DNA was detected in the blood of 11 out of 480 subjects tested in the Maro sleeping sickness focus in Chad (Ibrahim, Weber et al., 2021).
Trypanosoma lewisi is a worldwide blood parasite of rodents transmitted by fleas (Hoare, 1972). This parasite is resistant to normal human serum (Lun, Wen et al., 2015). Several surveys found that small mammals (rodents and shrews) were infected with T. lewisi in West African villages and cities (e.g., Tatard, Garba et al., 2017; Rossi, Kadaouré et al., 2018; Dobigny, Gauthier et al., 2019) as well as in Southeast Asian villages (Pumhom, Morand et al., 2015). The high prevalence found in small mammals, especially rats, within the domestic and peri-domestic environment suggests that many people, especially infants, may be at high risk of trypanosome spill-over from rodents.
Heterozygosity levels and population structure of T. lewisi remain unknown, and so does its reproductive mode. Yet, such knowledge is crucial to understand the epidemiology of this potential zoonotic disease and to get clues on its transmission dynamics. We therefore decided to develop T. lewisi-specific microsatellite markers and to investigate further the genetic diversity and population structure of this parasite. Such markers would indeed open the gate to investigations of the population genetic co-structure of the rodent reservoir and flea vector, together with other pathosystems that are responsible for major zoonotic diseases such as plague and murine typhus.

Ethical statement
In Benin, research was conducted within the framework of the research agreement between the Republic of Benin and the French National Institute for Sustainable Development (IRD) that was reapproved on the 6th April 2017, as well as the partnership agreement between IRD and the University of Abomey-Calavi (signed on the 30th September 2010 and renewed on the 3rd July 2019).
In Senegal, research was carried out under the framework agreement established between IRD, the Republic of Senegal and the Senegalese Head Office of Waters and Forests (available upon request). At the time of sampling, no ethics agreement was required to investigate pest rodents in these two countries.
In Lao Republic and Thailand, ethics agreements were obtained from the National Ethics Committee of Health Research (Ministry of Health Council of Medical Sciences, 51/NECHR) and the Ethical Committee of Mahidol University, Bangkok (0517.1116/661), respectively. Samples from Cambodia were used by courtesy of the Pasteur Institute of Cambodia (CeroPath project, coord. P. Buchy).
In all countries, explicit oral agreements were systematically obtained from local traditional (e.g., family and household heads, shop, firm and garden owners) as well as administrative (City Hall services, urban district chiefs) authorities before rodent trapping.
None of the rodent species captured for this study has protected status according to IUCN/CITES. Rodents were captured and brought alive to the lab, where they were treated in a respectful manner in accordance with the guidelines of the American Society of Mammalogists (Sikes & Gannon, 2011), sedated and then sacrificed by cervical dislocation as recommended by Mills et al. (Mills, Yates et al., 1995). Handling procedures were performed under our laboratory agreement for experiments on wild animals (no. 34-169-1).
Access to and benefit-sharing of genetic resources in Benin produced during the course of the present study was authorized by the Benin national authorities following the Nagoya international protocol (permit 608/DGEFC/DCPRN/PF-APA/SA). The other samples were collected before the implementation of Nagoya protocol-associated procedures. Moreover, there is no possibility of commercial use of any of the genetic diversity evidenced during this work, and the co-authorship with our partners from the countries involved testifies to the access and benefit sharing on the utilization of the genetic diversity studied in this paper.
Biological material transfers to France have been systematically approved by the Regional Head of Veterinary Service Hérault, France.
Samples and associated data were deposited in the Small Mammal Collection at the IRD/CBGP (https://doi.org/10.15454/WWNUPO) as well as at URIB/LARBA/EPAC and Kasetsart University (Thailand). They are available upon request.

Sampling
Isolates (or ramets) of T. lewisi came from two continental landmasses with heterogeneous subsample sizes and cohort compositions (225 isolates; Table 1). Most DNAs were extracted directly from qPCR-positive rodent spleens: in Thailand, Cambodia and Lao RP (Pumhom et al., 2015); in Niger and Nigeria (Tatard et al., 2017); in Senegal (Cassan, Diagne et al., 2018); and in Benin (Dobigny et al., 2019). However, 12 ramets from Thailand and DRC were extracted from isolated strains cultivated in vivo in rats. Incubation in the mammal host lasts five to six days, followed by a multiplication phase of 7 to 10 days (Hoare, 1972; Zhang, Li et al., 2019), after which non-multiplying adult forms (trypomastigotes) appear and stay in the blood for weeks if not months (Hoare, 1972, page 221). Inside the flea, the entire cycle lasts five days, but the parasite then remains infective for up to a year (Hoare, 1972, page 229). Allowing for large variances around these median values, we assumed that the entire cycle is completed within two months. Considering a two-month generation time, sampling corresponded to 39 different cohorts (Table 1). For African isolates (Benin and Niger), rats could be sampled in different neighborhoods (quarters) of two cities (Cotonou and Niamey) (Table 1).

DNA extraction and detection of Trypanosoma lewisi-carrying samples
Total DNA was extracted from ethanol-preserved spleen tissue and pellets of in vivo cultured trypanosomes using the DNeasy 96 Blood and Tissue Kit (Qiagen) according to the manufacturer's instructions. Whole DNA was eluted with 200 µL of elution buffer.
Screening for the presence of Trypanosoma in rodent samples was carried out with a qPCR-based assay targeting a 131 bp-long fragment of the 18S rRNA gene, with two primers (TRYPA1: AGGAATGAAGGAGGGTAGTTCG, TRYPA2: CACACTTTGGTTCTTGATTGAGG) and a pair of hybridization probes (TRYPA3: LCRed640-AGAATTTCACCTCTGACGCCCAGT-Ph, TRYPA4: GCTGTAGTTCGTCTTGGTGCGGTCT-FITC), using a LightCycler® 480 (Roche Diagnostics). Each reaction was duplicated and set in 10 µL final volume using the LC480 Probes Master Kit (Roche Diagnostics, Meylan, France) with 0.5 µM of each primer, 0.25 µM of each probe and 0.25 µL of uracil-DNA-glycosylase (UDG) (Biolabs, Courtaboeuf, France). After an initial incubation step at 50 °C for 1 min and a denaturation step at 95 °C for 10 min, cycling was performed for 50 cycles with a denaturation step at 95 °C for 10 s, an annealing step at 56 °C for 10 s and an extension step at 72 °C for 15 s. All qPCR-positive samples were sequenced for 400 bp-long fragments of the SSU rDNA gene to determine the Trypanosoma species, using the primer pair TRYPASEQ1 (ACACTGCAAACGATGACACC) and TRYPASEQ2 (TCAACCAAACAAATCACTCCA). The reaction was carried out in a 50 µL final volume containing 2.5 mM MgCl2, 50 mM KCl, 10 mM Tris-HCl (pH 8.3), 2 mM of each dNTP, 0.2 µM of each primer and 1.25 U of Fast Start Taq DNA polymerase (Roche Diagnostics, Meylan, France). An initial denaturation step was performed for 10 min at 95 °C. Amplification then ran for 45 cycles with denaturation at 95 °C for 30 s, annealing at 60 °C for 30 s and extension at 72 °C for 1 min (Dobigny, Poirier et al., 2011; Pumhom et al., 2015; Tatard et al., 2017; Cassan et al., 2018; Dobigny et al., 2019).

Development of the microsatellite loci
The DNA library was prepared from the strain Wery L307 24/9/68 (isolated in Kinshasa in 1968 by Pr Wéry of the Institute of Tropical Medicine in Antwerp, Belgium, kindly provided by Etienne Pays, Université Libre de Bruxelles, Belgium) using the Nextera DNA sample kit (Illumina, San Diego, CA, USA). Sequencing was performed on a MiSeq Sequencer (Illumina). We used the QDD identification tool (Meglécz, Pech et al., 2014) to detect fragments containing microsatellite markers and to design primers. The primer sequences are available in the supplementary file S1.
A total of 58 primer pairs were screened on 17 T. lewisi DNA samples from Benin and Thailand. Amplification products were first visualized using 2% agarose gel electrophoresis. A total of 29 primer pairs with amplification products of the expected size were retained. These primer pairs were then evaluated for their polymorphism on all extracts using an ABI3500xL sequencer (Applied Biosystems, Waltham, Massachusetts, USA) and the GENEMAPPER 4.1 software. This allowed retaining nine primer pairs (Table 2). The specificity of these primer pairs was tested using purified DNA from 15 Trypanosomatidae (Table 3).

PCR conditions
Touch-down PCR reactions were as follows: 3 min at 95 °C for the initial denaturation, followed by 10 cycles of 96 °C for 30 s, annealing temperature + 5 °C for 30 s and 72 °C for 1 min, then by 30 cycles of 96 °C for 30 s, annealing temperature for 30 s and 72 °C for 1 min, and finally 5 min at 72 °C for final elongation. PCRs were carried out in a thermocycler (Eppendorf® Mastercycler® nexus) in 10 µL final volume, containing 0.5 U MP Biomedicals Taq DNA polymerase, 1X reaction buffer, 200 µmol/L dNTPs, 20 pmol of each primer and 1 µL DNA. PCR tests for specificity were carried out in 25 µL final volume containing the same mix.
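For illustration only, the thermal program described above can be written out as an explicit list of steps. This is a minimal Python sketch, not part of the original protocol: the function name `touchdown_program` and the 55 °C annealing temperature are hypothetical placeholders (locus-specific annealing temperatures are those of Table 2), and ramping times between steps are ignored.

```python
def touchdown_program(annealing_c):
    """Return the PCR program as a list of (label, temperature in °C, seconds)."""
    steps = [("initial denaturation", 95.0, 180)]
    # First phase: 10 cycles with annealing 5 °C above the locus annealing temperature
    for _ in range(10):
        steps += [("denaturation", 96.0, 30),
                  ("annealing", annealing_c + 5.0, 30),
                  ("extension", 72.0, 60)]
    # Second phase: 30 cycles at the locus annealing temperature
    for _ in range(30):
        steps += [("denaturation", 96.0, 30),
                  ("annealing", annealing_c, 30),
                  ("extension", 72.0, 60)]
    steps.append(("final elongation", 72.0, 300))
    return steps


# 55 °C is a placeholder value, not an annealing temperature taken from the paper
program = touchdown_program(55.0)
hold_minutes = sum(seconds for _, _, seconds in program) / 60
print(len(program), "steps,", hold_minutes, "min of hold time")
```

Under these assumptions, the program amounts to 40 cycles in total plus the initial denaturation and final elongation.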

Quality tests for loci and samples
There was a substantial number of blanks (Bs) (no amplification) and unreadable genotypes (Us) (Supplementary file S1, available at https://zenodo.org/record/7234790). We thus used the same approach as the one developed by Kaboré et al. (Kaboré, MacLeod et al., 2011). If missing data (Bs and Us) reflect poor quality of the corresponding isolates (poor conservation conditions of the extract, poor PCR conditions and/or outlier individuals displaying a poor match of primers with the targeted flanking sequences), they should also correlate with an increase in dropouts of one allele in heterozygous individuals (fake homozygous profiles). We thus expected a negative correlation, within the same individual, between the number of blanks (NBs) and the number of heterozygous loci (NHz), and between the number of unreadable loci (NUs) and NHz. Additionally, NBs and NUs should be positively correlated. We measured and tested these correlations with one-sided Spearman's rank correlation tests under the package Rcommander (rcmdr) (Fox, 2005; Fox, 2007) for R version 4.0.5 (R-Core-Team, 2020).
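As a reading aid, the rank correlation used in these quality tests can be sketched in a few lines of Python. This is not the authors' script (the analyses were run with rcmdr in R); the function names and the toy counts below are hypothetical, and only the coefficient is computed, not the one-sided p-value.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy data (invented): isolates with more missing loci show fewer heterozygous loci
n_missing = [0, 1, 2, 3, 4]
n_heterozygous = [5, 4, 3, 2, 1]
print(spearman_rho(n_missing, n_heterozygous))
```

A negative coefficient between missing data and heterozygous loci is the pattern expected if amplification failures also cause allelic dropouts. The sketch assumes that neither variable is constant, otherwise the denominator is zero.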

Finding the relevant structure levels
Population genetics analyses were performed on datasets previously converted into the appropriate format by Create v2.37 (Coombs, Letcher et al., 2008).
We first tried to study the effect of cohorts. To do this, we tested subdivision between each possible pair of cohorts of the same quarter. This was done with the G-based test (Goudet, Raymond et al., 1996) over all loci with 10,000 randomizations of individuals between subsample pairs. This procedure was identified as the most powerful way to combine tests over loci (De Meeûs, Guégan et al., 2009). This procedure could only be undertaken in Cotonou subsamples, more precisely in the Agla, Ladji and Saint-Jean neighborhoods, and between cohorts 37 and 39 (see Table 1 and Supplementary File S1, available at https://zenodo.org/record/7234790).
We also used subsamples from Benin and Niger to test for the existence of a Wahlund effect when cohorts were ignored. To do so, we compared the number of significant linkage disequilibrium (LD) tests between locus pairs, and Wright's FIS (Wright, 1965) (a measure of the relative contribution of nonrandom union of gametes to inbreeding), estimated with Weir and Cockerham's (1984) method. These computations were also implemented in Fstat. For the LD tests, we used the G-based randomization test with 10,000 randomizations, combined over all subsamples for each locus pair, which was shown to be the most powerful combination method (De Meeûs et al., 2009). Since LD tests produce series of non-independent tests, we adjusted the p-values obtained with the Benjamini and Yekutieli procedure (Benjamini & Yekutieli, 2001) with R (command p.adjust). Levels of significance (at BY level) were compared using a one-sided signed rank test for paired data with rcmdr, the pairing unit being the locus pair.
The nLD/HT criterion (Manangwa, De Meeûs et al., 2019) was used to study the correlation between the number of times a locus occurred in a significant LD pair (nLD), and its total genetic diversity estimated with Nei's unbiased HT (Nei & Chesser, 1983). This was tested with a two-sided Spearman rank correlation test with rcmdr. For FIS comparisons between data with or without cohorts, we used a one-sided Wilcoxon signed rank test for paired data with rcmdr, the pairing unit being the locus.
To test for the effect of neighborhoods (quarters), we could only use subsamples from Benin, cohorts 37 and 39 separately, because of temporal issues or missing information. We measured the subdivision index (FST) in each cohort, and tested its significance with the G-based test in Fstat 2.9.4. We then averaged the FST, and FST', and combined the two p-values with MultiTest, as described above. We also used 95% confidence intervals (95%CI) obtained by 5,000 bootstraps over loci in Fstat 2.9.4.
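To make the multiple-testing adjustment concrete, here is a minimal Python sketch of the Benjamini and Yekutieli step-up adjustment, written to follow the usual definition also applied by R's p.adjust with method "BY". The function name `by_adjust` and the toy p-values are hypothetical; the authors ran this step in R.

```python
def by_adjust(p_values):
    """Benjamini-Yekutieli adjusted p-values for a series of dependent tests."""
    n = len(p_values)
    # Harmonic-sum penalty that distinguishes BY from Benjamini-Hochberg
    harmonic = sum(1.0 / k for k in range(1, n + 1))
    # Visit p-values from largest to smallest, keeping a running minimum capped at 1
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = n - position  # rank of this p-value in ascending order
        candidate = harmonic * n / rank * p_values[idx]
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Toy example with three invented LD test p-values
print(by_adjust([0.01, 0.02, 0.03]))
```

Adjusted values are returned in the input order, so each one can be matched back to its locus pair.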

Tracking amplification problems in complete genotypes within cohorts and quarters
Wright's FIS and FST were estimated with Weir and Cockerham's unbiased estimators in Fstat. We computed 95%CI of jackknives over subsamples to draw a picture of the variation of these indices across subsamples. We also used 5,000 bootstraps over loci to obtain 95%CI across loci. In order to detect amplification problems, the following criteria were used.
First, the ratio between the standard errors of FIS (StdrdErrFIS) and FST (StdrdErrFST), obtained by jackknives over loci, RS=StdrdErrFIS/StdrdErrFST, was measured. This statistic can be indicative of the presence of null alleles if above 2, as is a positive correlation between FIS and FST (De Meeûs, 2018). Correlation was tested with a one-sided Spearman's rank correlation test with rcmdr.
Second, short allele dominance (SAD) was tested, for each locus, with the FIT / allele size correlation criterion (Manangwa et al., 2019), with a one-sided (negative correlation) Spearman's rank correlation test. In case of doubt, we also computed the regression between FIS and allele size, weighted by the product pT(1-pT), where pT is the total allele frequency of the allele as provided by Fstat (All_W) (De Meeûs, Humair et al., 2004). In case of negative slope, we halved the p-value to obtain a one-sided test result retrieved with rcmdr.
Third, not knowing the reproductive system may make it difficult to undertake a specific stuttering test (De Meeûs & Noûs, 2022). This is why, in case of stuttering suspicion, we chose to directly pool alleles with one repeat difference or less (i.e., imperfect microsatellite loci), taking care that pooling groups always contained at least one allele with frequency above 0.05. Indeed, pooling rare alleles together may result in a fairly frequent artificial allele with an unjustified weight on the results (De Meeûs, Chan et al., 2021).
Peer Community Journal, Vol. 2 (2022), article e69 https://doi.org/10.24072/pcjournal.188

Population genetics structure analyses
Linkage disequilibrium (LD) was tested between each pair of loci with the G-based randomization test with 10,000 shufflings of genotypes between loci, as described above. We computed Weir and Cockerham (Weir & Cockerham, 1984) unbiased estimators of Wright's F-statistics: FIS measures the inbreeding of individuals relative to the inbreeding of subsamples; FST represents the inbreeding of subsamples relative to inbreeding of the total sample; and FIT corresponds to the inbreeding of individuals relative to inbreeding in the total sample. Significant deviation from 0 was tested with 10,000 randomizations: of alleles between individuals within each subsample to explore deviation from panmixia, of individuals between subsamples to investigate subdivision, and of alleles between subsamples to test for the deviation of FIT from 0. The statistics used were the unbiased estimators of FIS and FIT (panmixia within subsamples and within the total sample, respectively), and the G-statistic (subdivision).
Jackknives over populations were used to obtain 95%CI around loci values and 5,000 bootstraps over loci to get the 95%CI globally. Jackknife 95%CI use the standard error (SE) of the F's to retrieve the average value ± SE×t0.05,n-1, where t0.05,n-1 is the Student parameter with type 1 error 0.05, and n is the number of items (here, subsamples) used. As this procedure assumes a normal distribution, which F-statistics cannot follow, it has only an illustrative purpose. For their part, bootstrap 95%CI do not assume any distribution and can thus be used for statistical decisions (De Meeûs, McCoy et al., 2007).
We checked the importance of clonal reproduction by studying the behavior of LD, FIS, FST, FIT, and the number of repeated multi-locus genotypes (MLGs), as described in previous papers (Balloux, Lehmann et al., 2003; Arnaud-Haond, Alberto et al., 2005; De Meeûs, Lehmann et al., 2006; Arnaud-Haond, Duarte et al., 2007; Séré, Kabore et al., 2014). According to these references, in purely clonal populations, all loci with enough polymorphism are expected to be in LD; FIS should be constantly negative with a small variance across loci; FST cannot exceed 0.5 in the most subdivided populations, in which case FIT=0; and finally, many identical multi-locus genotypes (MLGs) should be present. In populations where the clonal rate is very high (e.g., c>0.95), but with a small sexual rate, the proportion of loci in significant LD should drop, a strong variance of FIS should be observed, strongly subdivided populations may display FST>0.5 and FIT>0, and fewer repeated MLGs should be observed. When the sexual rate becomes more important (c<0.95), all parameters should tend to mimic panmixia, except LD and MLGs, which should be more important than expected under panmixia if populations keep substantial levels of clonality. We measured the MLG diversity using R=(G-1)/(N-1), where G is the number of different MLGs and N is the subsample size (Arnaud-Haond et al., 2005). In pure clones, R should be much smaller than 1, while in panmictic populations of reasonable size, R→1.
In pure clones, a direct relationship is also known to link Nei's estimator of local genetic diversity HS (Nei & Chesser, 1983) and FIS, which may lead to an "expected" value for pure clones, with a large number of possible alleles and 'perfect' data (no amplification problems): HS≥0.5 and FIS_exp=-(1-HS)/HS (Séré et al., 2014). This allows defining a superimposition criterion for clonal organisms. Here we defined a slightly different criterion as compared to what was previously published (Séré et al., 2014): SFIS=1-abs(FIS-FIS_exp)/abs(FIS_exp), where "abs" means absolute value. When SFIS≥0.95, data are considered to fit the null hypothesis (pure clonality, many alleles and no amplification problem). According to Séré et al.'s (2014) simulations, 0.95>SFIS>0.5 may be explained by amplification problems, while SFIS<0.5 probably reflects the occurrence of rare sexual events (e.g., clonal rate 0.99<c<0.999).
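The MLG diversity index R=(G-1)/(N-1) defined above can be illustrated with a short Python sketch. The function name `clonal_richness` and the toy genotypes are hypothetical; each multi-locus genotype is represented here as a tuple of per-locus genotype strings.

```python
def clonal_richness(genotypes):
    """Return R = (G - 1) / (N - 1) for a subsample of multi-locus genotypes."""
    n = len(genotypes)
    if n < 2:
        raise ValueError("R is undefined for subsamples of fewer than two isolates")
    g = len(set(genotypes))  # number of distinct multi-locus genotypes
    return (g - 1) / (n - 1)


# Toy subsample (invented): four isolates typed at two loci, two of them identical
subsample = [("124/126", "160/160"),
             ("124/126", "160/160"),
             ("124/124", "158/160"),
             ("126/126", "160/160")]
print(clonal_richness(subsample))
```

With three distinct MLGs among four isolates, R=2/3. A subsample made of a single repeated MLG gives R=0, and a subsample with no repeated MLG gives R=1.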
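The superimposition criterion defined above can be computed directly from FIS and HS. The following Python sketch transcribes the two formulas of the text; the function names and the numerical values in the example are hypothetical and are not results from this study.

```python
def expected_fis_clonal(hs):
    """Expected FIS under pure clonality: FIS_exp = -(1 - HS) / HS."""
    return -(1.0 - hs) / hs


def superimposition(fis, hs):
    """SFIS = 1 - |FIS - FIS_exp| / |FIS_exp|, as defined in the text."""
    fis_exp = expected_fis_clonal(hs)
    return 1.0 - abs(fis - fis_exp) / abs(fis_exp)


# Invented example: HS = 0.8 gives FIS_exp = -0.25; an observed FIS of -0.2 gives SFIS = 0.8
print(expected_fis_clonal(0.8), superimposition(-0.2, 0.8))
```

In this invented case, SFIS falls between 0.5 and 0.95, the range that the text associates with amplification problems. The formula is undefined when HS=1, where FIS_exp=0.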
Statistical scripts (simple R commands) are provided in the Material and Methods section.
The specificity for T. lewisi of the nine microsatellite primer pairs was tested using DNA from 15 other Trypanosomatidae species: none of these was amplified by any of the nine primer pairs (data not shown).

Quality tests for loci over the entire dataset
All correlation tests between number of heterozygous sites and amplification failures were highly significant: ρ=-0.65 (p-value<0.0001) between NBs and NHz; ρ=-0.27 (p-value<0.0001) between NUs and NHz; the correlation between NBs and NUs was also highly significant. We thus pooled NBs and NUs as the number of missing genotypes (NMs), which was strongly correlated with NHz (ρ=-0.67, p-value<0.0001) (Figure 1). This means that missing data corresponded to individuals with amplification problems, such as null alleles or allelic dropouts. Because of the probable importance of clonality in T. lewisi reproduction, it was not possible to correct for these issues, and we chose to keep only isolates with complete genotypes, since the absence of missing data most probably reflects no or minimal amplification failures.

Finding the relevant structure levels
The correlation between GST and HS was significantly negative both considering cohorts (ρ=-0.3571, p-value=0.0288) and ignoring those (ρ=-0.7381, p-value=0.0229). There was a small and marginally non-significant differentiation between cohorts 37 and 39 (FST'=0.0347, p-value=0.0794). Nevertheless, some evidence of a Wahlund effect was detected when cohorts were ignored: data with cohorts yielded a slightly smaller FIS=0.015 than the data ignoring them (FIS=0.025), but the difference was not significant (p-value=0.1563). There was no Wahlund effect signature with the nLD/HT criterion in the dataset without cohorts (p-value=0.6512). However, the theoretical behavior of this criterion in clonal populations has never been explored. Moreover, the p-values of LD tests when cohorts were ignored were significantly smaller than in the data with cohorts (p-value=0.0119). As a consequence, we chose to be conservative and to keep the cohort as potentially useful information.

Tracking amplification problems in complete genotypes within cohorts and quarters
For these analyses, both Cotonou (Benin, cohorts 37 and 39) and Niamey (Niger, cohort 12) neighborhoods were included.
FIS was non-significant but highly variable across loci: FIS=0.021, 95%CI=[-0.134, 0.165] (p-value=0.5486). The standard error ratio was RS=2.7. This may be explained by rare sexual recombination events (De Meeûs et al., 2006) and/or by null alleles and/or SAD (Séré et al., 2014). The correlation between FIS and FST was strongly negative (ρ=-0.8333). The absence of a significant correlation between FIS and the number of missing genotypes that the initial dataset displayed (ρ=-0.5714, p-value=0.9339) suggested that null alleles poorly explain the observed patterns.
In order to circumvent the difficulty of detecting and testing stuttering in clonal populations (De Meeûs & Noûs, 2022), we directly corrected loci with stuttering suspicion (i.e., loci LEW2, LEW12, LEW32, LEW42 and LEW55) following recent recommendations (De Meeûs et al., 2021). As such, alleles 126 and 124 of locus LEW2 were pooled. In the same manner, we pooled alleles 160 and 158 for LEW12; 261 and 263, 289 and 287 for LEW32; 314 and 312 for LEW42; as well as 343-349 and 341 for LEW55. The resulting dataset led to a clear decrease of FIS for all loci but LEW32, which displayed only a very weak decrease (Table 4). As a consequence, stuttering corrections were applied in all subsequent analyses. Among the 27 locus pairs for which a test was possible, there were seven locus pairs in significant LD (25.9%), three of which remained significant even after BY correction (11%).
There was a global and significant heterozygote excess within subsamples: FIS=-0.105, 95%CI=[-0.309, 0.079] (p-value=0.0326), with an important variation across loci, as illustrated by the standard error of FIS obtained by jackknife over loci (SE=0.11), which was 2.44 times the SE of FST. The fact that the correlation was negative between FIS values on the one hand, and FST or the number of missing genotypes in the initial dataset on the other hand, confirmed a minor role for null alleles. No evidence of SAD could be found.
The general MLG diversity was quite high (Table 5), with much higher values in Cotonou subsamples than those obtained in Niamey.

NJTree analysis over all countries
We used isolates with complete genotypes at all nine loci, among all available isolates (African and Asian). Only two isolates from South-East Asia (Thailand) fulfilled this condition. Both were grouped into one Operational Taxonomic Unit (OTU). The results are presented in Figure 2. For African subsamples, OTUs corresponded to the subsamples described in Table 5. Genetic distances were quite large between OTUs. The smallest distances occurred between cohorts of the same neighborhood (quarter) (Figure 2). Subsamples from the same country also belonged to the same lineage, except for Senegalese OTUs (very small subsamples).

Discussion
Isolates from South-East Asia suffered from many amplification problems, probably due to primer mismatches and/or degradation of DNA samples, thus precluding any conclusions about this particular subdataset.Consequently, future studies of T. lewisi from that part of the world will most probably require the de novo design of dedicated loci and primers.
In African isolates, though to a lesser extent, many loci obviously suffered from some amplification failures. This means that future studies would probably benefit from a refinement of the primers described here, and/or from an adaptation of PCR conditions in order to improve amplification success. Isolates with a qPCR cycle threshold Ct<30 (see Appendix) provided amplification success at all 9 loci for the present data set, as a probable result of a higher concentration of trypanosome DNA. For future studies, we thus recommend selecting extracts with a Ct<30. Alternatively, other techniques such as those relying on high-throughput sequencing-based production of microsatellite markers (Lepais, Chancerel et al., 2020) could be of great interest for further fine-scale investigation of population genetics in T. lewisi.
On the other hand, once isolates with missing data were excluded from the dataset, some characteristics of rodent-borne T. lewisi population structure could be described for the first time using the present panel of newly developed microsatellite loci. First, the most coherent genetically defined units retrieved here clearly corresponded to West African urban neighborhoods within a two-month window. It is even plausible that they could correspond to smaller areas, such as households and/or rodent demes. Unfortunately, the sampling design, together with possible amplification biases (but see below), may have masked part of the heterozygous profiles, making any definitive conclusion difficult. It was also difficult to conclude on the exact reproductive mode of T. lewisi on the sole basis of the present microsatellite-based study. If one considers that real subpopulations corresponded to clusters at the household and/or rat deme levels, our results may be compatible with full clonality together with the presence of some amplification problems and/or a residual Wahlund effect. However, they would also be compatible with rare events of sex, though still with a clonal rate close to 1 (c>0.99). According to De Meeûs (2015), with a limited number of alleles K, small mutation rates and full clonality, the expected probability of identity between individuals of the same subpopulation would be QS=(K+1)/2K, meaning that the genetic diversity should be HS=1-QS=(K-1)/2K. If all the isolates from the entire initial dataset are taken into account, it results, on average, in K'=14 possible alleles, which gives HS'=0.46. This value is surprisingly close to the value that we computed for the uncured dataset in West African isolates (HS=0.457). The fact that the superimposition criterion from Séré et al. (Séré et al., 2014) was not applicable may originate from a combination of amplification problems, Wahlund effects and small subsample sizes. Another expectation in pure clones is the probability of finding two identical alleles within individuals, QI=1/K=0.342. This value is much smaller than what was found in our African dataset before stuttering cure, with QI_obs=0.554, and does not support a total absence of sexual segregation. The absence of null allele or SAD signatures, the relatively low proportion of locus pairs in significant LD as well as of repeated MLGs in most datasets, the strong subdivision measures (with FST>0.5 and FIT>0) and the total lack of superimposition of FIS with the expected one under full clonality represent other strong arguments in favor of sexual recombination in these populations, particularly in Cotonou. Given the relatively low genetic diversity observed, which may also originate from very low mutation rates and/or very small effective subpopulation sizes, it is probable that T. lewisi subpopulations are mainly clonal but experience rare events of sexual recombination, i.e., 0.99<c<1, and are strongly subdivided into small units, potentially corresponding to neighborhoods or even smaller ensembles (e.g., reservoirs' demes or households). These units always appeared quite isolated from each other, compatible with small numbers of immigrants per generation. Though preliminary and clearly deserving further confirmation, our results have important consequences in terms of the ecology of T. lewisi as well as of other pathogenic agents that share similar reservoir and vector species, such as the plague bacillus or Rickettsia typhi, the agent of murine typhus, among many others (Azad, Radulovic et al., 1997).
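The expected diversity under full clonality quoted in this paragraph can be checked numerically. The Python sketch below transcribes QS=(K+1)/2K and HS=1-QS=(K-1)/2K; the function names are hypothetical, and K=14 is the average number of possible alleles reported in the text.

```python
def clonal_identity_between(k):
    """QS = (K + 1) / (2K): expected identity between individuals of a subpopulation."""
    return (k + 1) / (2 * k)


def clonal_diversity(k):
    """HS = 1 - QS = (K - 1) / (2K): expected local genetic diversity."""
    return 1 - clonal_identity_between(k)


# K' = 14 possible alleles on average, as reported in the text
print(round(clonal_diversity(14), 2))
```

With K'=14, HS'=13/28, which rounds to the 0.46 reported above.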
Finally, a two-month window to define different cohorts appeared to be a reasonable hypothesis, since subsamples spaced four months apart provided close to significant subdivision signatures. This suggests that T. lewisi generation time fits well with the average or median life cycle time of 1-2 months previously proposed for T. lewisi (Hoare, 1972). This is also in very good agreement with that found in T. brucei gambiense (Koffi, De Meeûs et al., 2009).
The nine newly developed T. lewisi-specific microsatellite markers provide important insights into the ecology of African (not Asian) atypical trypanosomes, and open the gate to promising avenues of research. However, our conclusions will need to be refined using larger sample sizes, at smaller geographic scales and, if possible, with better amplified markers and samples tested for DNA quality. As far as the study of Asian isolates is concerned, our data strongly suggest that a new and specifically designed set of markers could be required. Indeed, the spectacular proportion of amplification failures observed here for a panel of South-East Asian samples could be linked to poor-quality DNA, although these isolates may also belong to lineages very distant from their African counterparts.

Figure 1: Correlation between the number of missing genotypes (blanks or not interpretable results) and the number of heterozygous sites in Trypanosoma lewisi isolates. The regression equation and coefficient of determination are presented for the sake of illustration. Spearman's rank correlation and one-sided test outputs are also provided.

Figure 2: Neighbor-Joining Tree of Trypanosoma lewisi Operational Taxonomic Units (OTUs) from Africa and Thailand with complete genotypes at nine microsatellite loci. OTU names indicate their cohort number (e.g., C37), and their country, town and neighborhood of origin. The number of isolates per OTU can be found in Table 1. Thailand was represented by only two individuals, while ref-Wery is a reference strain from the Democratic Republic of Congo. The tree was drawn from a Cavalli-Sforza and Edwards' chord distance matrix (na: information not available).

Table 2: Primers and characteristics of the nine microsatellite markers retained

Table 4: Comparisons, for each locus, of the FIS before and after stuttering correction, except for loci for which cure was not relevant (NR) (i.e., loci with no suspicion of stuttering). NA: not available (monomorphic locus in the subsamples studied). See text for details.