Genomics

EukProt: A database of genome-scale predicted proteins across the diversity of eukaryotes

10.24072/pcjournal.173 - Peer Community Journal, Volume 2 (2022), article no. e56.


EukProt is a database of published and publicly available predicted protein sets selected to represent the breadth of eukaryotic diversity, currently including 993 species from all major supergroups as well as orphan taxa. The goal of the database is to provide a single, convenient resource for gene-based research across the spectrum of eukaryotic life, such as phylogenomics and gene family evolution. Each species is placed within the UniEuk taxonomic framework to facilitate downstream analyses, and each data set is associated with a unique, persistent identifier to facilitate comparison and replication among analyses. The database is regularly updated, and all versions will be permanently stored and made available via FigShare. The current version includes a number of updates, notably ‘The Comparative Set’ (TCS), a reduced taxonomic set of 196 predicted proteomes selected for high estimated completeness while maintaining substantial phylogenetic breadth. A BLAST web server and graphical displays of data set completeness are available at http://evocellbio.com/eukprot/. We invite the community to suggest new data sets and new annotation features for inclusion in subsequent versions, with the goal of building a collaborative resource that will promote research to understand eukaryotic diversity and diversification.

DOI: 10.24072/pcjournal.173
Richter, Daniel J. 1; Berney, Cédric 2, 3; Strassert, Jürgen F. H. 4; Poh, Yu-Ping 5; Herman, Emily K. 6; Muñoz-Gómez, Sergio A. 7; Wideman, Jeremy G. 5; Burki, Fabien 8; de Vargas, Colomban 2, 3

1 Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra) – Barcelona, Spain
2 Research Federation for the study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE – Paris, France
3 Sorbonne Université, CNRS, Station Biologique de Roscoff, UMR 7144, ECOMAP – Roscoff, France
4 Department of Evolutionary and Integrative Ecology, Leibniz Institute of Freshwater Ecology and Inland Fisheries (IGB) – Berlin, Germany
5 Biodesign Center for Mechanisms of Evolution, School of Life Sciences, Arizona State University – Tempe, Arizona, United States of America
6 Department of Agricultural, Food and Nutritional Sciences, Faculty of Agricultural, Life, and Environmental Sciences, University of Alberta – Edmonton, Alberta, Canada
7 Unité d’Ecologie, Systématique et Evolution, Université Paris-Saclay – Orsay, France
8 Department of Organismal Biology and Science for Life Laboratory, Uppsala University – Uppsala, Sweden
License: CC-BY 4.0
Copyrights: The authors retain unrestricted copyrights and publishing rights
@article{10_24072_pcjournal_173,
     author = {Richter, Daniel J. and Berney, C\'edric and Strassert, J\"urgen F. H. and Poh, Yu-Ping and Herman, Emily K. and Mu\~noz-G\'omez, Sergio A. and Wideman, Jeremy G. and Burki, Fabien and de Vargas, Colomban},
     title = {EukProt: {A} database of genome-scale predicted proteins across the diversity of eukaryotes},
     journal = {Peer Community Journal},
     eid = {e56},
     publisher = {Peer Community In},
     volume = {2},
     year = {2022},
     doi = {10.24072/pcjournal.173},
     url = {https://peercommunityjournal.org/articles/10.24072/pcjournal.173/}
}
Richter, Daniel J.; Berney, Cédric; Strassert, Jürgen F. H.; Poh, Yu-Ping; Herman, Emily K.; Muñoz-Gómez, Sergio A.; Wideman, Jeremy G.; Burki, Fabien; de Vargas, Colomban. EukProt: A database of genome-scale predicted proteins across the diversity of eukaryotes. Peer Community Journal, Volume 2 (2022), article no. e56. doi: 10.24072/pcjournal.173. https://peercommunityjournal.org/articles/10.24072/pcjournal.173/

Peer reviewed and recommended by PCI: 10.24072/pci.genomics.100021

