Section: Genomics
Topic: Biophysics and computational biology, Genetics/genomics
Conference: JOBIM
GrAnnoT, a tool for efficient and reliable annotation transfer through pangenome graph
Corresponding author(s): Marthe, Nina (nina.marthe@ird.fr); Sabot, François (francois.sabot@ird.fr)
10.24072/pcjournal.651 - Peer Community Journal, Volume 5 (2025), article no. e133
Peer reviewed and recommended by PCI
The increasing availability of genome sequences has highlighted the limitations of using a single reference genome to represent the diversity within a species. Pangenomes, encompassing the genomic information from multiple genomes, thus offer a more comprehensive representation of intraspecific diversity. However, pangenomes in the form of variation graphs often lack annotation information and tools for manipulating it, which limits their utility for downstream analyses. Here we introduce GrAnnoT, a tool designed for efficient and reliable integration of annotation information into such variation graphs. It projects existing annotations from a source genome to the variation graph and subsequently to the other embedded genomes. GrAnnoT was benchmarked against state-of-the-art tools on pangenome variation graphs and linear genomes from Asian rice, and tested on human and E. coli data. The results demonstrate that GrAnnoT is consensual, conservative, and fast. It provides informative outputs, such as presence-absence matrices for genes and alignments of transferred features between source and target genomes, which help in the study of genomic variation and evolution. GrAnnoT's robustness and replicability across different species make it a valuable tool for enhancing pangenome analyses. GrAnnoT is available under the GNU GPLv3 licence at https://forge.ird.fr/diade/dynadiv/grannot.
Type: Research article
Marthe, Nina 1; Zytnicki, Matthias 2; Sabot, François 1
CC-BY 4.0
@article{10_24072_pcjournal_651,
author = {Marthe, Nina and Zytnicki, Matthias and Sabot, Fran\c{c}ois},
title = {GrAnnoT, a tool for efficient and reliable annotation transfer through pangenome graph},
journal = {Peer Community Journal},
eid = {e133},
year = {2025},
publisher = {Peer Community In},
volume = {5},
doi = {10.24072/pcjournal.651},
language = {en},
url = {https://peercommunityjournal.org/articles/10.24072/pcjournal.651/}
}
TY  - JOUR
AU  - Marthe, Nina
AU  - Zytnicki, Matthias
AU  - Sabot, François
TI  - GrAnnoT, a tool for efficient and reliable annotation transfer through pangenome graph
JO  - Peer Community Journal
PY  - 2025
VL  - 5
PB  - Peer Community In
UR  - https://peercommunityjournal.org/articles/10.24072/pcjournal.651/
DO  - 10.24072/pcjournal.651
LA  - en
ID  - 10_24072_pcjournal_651
ER  -
%0 Journal Article
%A Marthe, Nina
%A Zytnicki, Matthias
%A Sabot, François
%T GrAnnoT, a tool for efficient and reliable annotation transfer through pangenome graph
%J Peer Community Journal
%D 2025
%V 5
%I Peer Community In
%U https://peercommunityjournal.org/articles/10.24072/pcjournal.651/
%R 10.24072/pcjournal.651
%G en
%F 10_24072_pcjournal_651
Marthe, N.; Zytnicki, M.; Sabot, F. GrAnnoT, a tool for efficient and reliable annotation transfer through pangenome graph. Peer Community Journal, Volume 5 (2025), article no. e133. https://doi.org/10.24072/pcjournal.651
PCI peer reviews and recommendation, and links to data, scripts, code and supplementary information: 10.24072/pci.genomics.100432
Conflict of interest of the recommender and peer reviewers:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.