EukProt: A database of genome-scale predicted proteins across the diversity of eukaryotes

10.24072/pcjournal.173 - Peer Community Journal, Volume 2 (2022), article no. e56.

EukProt is a database of published and publicly available predicted protein sets selected to represent the breadth of eukaryotic diversity, currently including 993 species from all major supergroups as well as orphan taxa. The goal of the database is to provide a single, convenient resource for gene-based research across the spectrum of eukaryotic life, such as phylogenomics and gene family evolution. Each species is placed within the UniEuk taxonomic framework in order to facilitate downstream analyses, and each data set is associated with a unique, persistent identifier to facilitate comparison and replication among analyses. The database is regularly updated, and all versions will be permanently stored and made available via FigShare. The current version has a number of updates, notably ‘The Comparative Set’ (TCS), a reduced taxonomic set of 196 predicted proteomes with high estimated completeness that maintains substantial phylogenetic breadth. A BLAST web server and graphical displays of data set completeness are available online. We invite the community to provide suggestions for new data sets and new annotation features to be included in subsequent versions, with the goal of building a collaborative resource that will promote research to understand eukaryotic diversity and diversification.

Richter, Daniel J. 1; Berney, Cédric 2, 3; Strassert, Jürgen F. H. 4; Poh, Yu-Ping 5; Herman, Emily K. 6; Muñoz-Gómez, Sergio A. 7; Wideman, Jeremy G. 5; Burki, Fabien 8; de Vargas, Colomban 2, 3

1 Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra) – Barcelona, Spain
2 Research Federation for the study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE – Paris, France
3 Sorbonne Université, CNRS, Station Biologique de Roscoff, UMR 7144, ECOMAP – Roscoff, France
4 Department of Evolutionary and Integrative Ecology, Leibniz Institute of Freshwater Ecology and Inland Fisheries (IGB) – Berlin, Germany
5 Biodesign Center for Mechanisms of Evolution, School of Life Sciences, Arizona State University – Tempe, Arizona, United States of America
6 Department of Agricultural, Food and Nutritional Sciences, Faculty of Agricultural, Life, and Environmental Sciences, University of Alberta – Edmonton, Alberta, Canada
7 Unité d’Ecologie, Systématique et Evolution, Université Paris-Saclay – Orsay, France
8 Department of Organismal Biology and Science for Life Laboratory, Uppsala University – Uppsala, Sweden
License: CC-BY 4.0
Copyrights: The authors retain unrestricted copyrights and publishing rights
Peer reviewed and recommended by PCI: 10.24072/pci.genomics.100021

