Section: Genomics
Topic: Genetics/Genomics
A deep dive into genome assemblies of non-vertebrate animals
Corresponding author(s): Guiglielmoni, Nadège (nguiglie@uni-koeln.de)
10.24072/pcjournal.128 - Peer Community Journal, Volume 2 (2022), article no. e29.
Peer reviewed and recommended by PCI
Non-vertebrate species represent about 95% of known metazoan (animal) diversity. They remain to this day relatively unexplored genetically, but understanding their genome structure and function is pivotal for expanding our current knowledge of evolution, ecology and biodiversity. Following the continuous improvements and decreasing costs of sequencing technologies, many genome assembly tools have been released, leading to a large number of genome projects being completed in recent years. In this review, we examine the current state of genome projects of non-vertebrate animal species. We present an overview of available sequencing technologies, assembly approaches, pre- and post-processing steps, and genome assembly evaluation methods, as well as their application to non-vertebrate animal genomes.
Type: Review article
Guiglielmoni, Nadège 1, 2; Rivera-Vicéns, Ramón 3; Koszul, Romain 4; Flot, Jean-François 1, 5
@article{10_24072_pcjournal_128,
  author = {Guiglielmoni, Nad\`ege and Rivera-Vic\'ens, Ram\'on and Koszul, Romain and Flot, Jean-Fran\c{c}ois},
  title = {A deep dive into genome assemblies of non-vertebrate animals},
  journal = {Peer Community Journal},
  eid = {e29},
  publisher = {Peer Community In},
  volume = {2},
  year = {2022},
  doi = {10.24072/pcjournal.128},
  url = {https://peercommunityjournal.org/articles/10.24072/pcjournal.128/}
}
TY - JOUR
AU - Guiglielmoni, Nadège
AU - Rivera-Vicéns, Ramón
AU - Koszul, Romain
AU - Flot, Jean-François
TI - A deep dive into genome assemblies of non-vertebrate animals
JO - Peer Community Journal
PY - 2022
VL - 2
PB - Peer Community In
UR - https://peercommunityjournal.org/articles/10.24072/pcjournal.128/
DO - 10.24072/pcjournal.128
ID - 10_24072_pcjournal_128
ER -
%0 Journal Article
%A Guiglielmoni, Nadège
%A Rivera-Vicéns, Ramón
%A Koszul, Romain
%A Flot, Jean-François
%T A deep dive into genome assemblies of non-vertebrate animals
%J Peer Community Journal
%D 2022
%V 2
%I Peer Community In
%U https://peercommunityjournal.org/articles/10.24072/pcjournal.128/
%R 10.24072/pcjournal.128
%F 10_24072_pcjournal_128
Guiglielmoni, Nadège; Rivera-Vicéns, Ramón; Koszul, Romain; Flot, Jean-François. A deep dive into genome assemblies of non-vertebrate animals. Peer Community Journal, Volume 2 (2022), article no. e29. doi : 10.24072/pcjournal.128. https://peercommunityjournal.org/articles/10.24072/pcjournal.128/
PCI peer reviews and recommendation, and links to data, scripts, code and supplementary information: 10.24072/pci.genomics.100016
Conflict of interest of the recommender and peer reviewers:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.
[1] New Approaches for Genome Assembly and Scaffolding, Annual Review of Animal Biosciences, Volume 7 (2019) no. 1, pp. 17-40 | DOI
[2] GenBank, https://www.ncbi.nlm.nih.gov/genbank/, 2021
[3] Red List, www.iucnredlist.org/resources/summary-statistics
[4] The Timetree of Life, Systematic Biology, Volume 58 (2009) no. 4, pp. 461-462 | DOI
[5] Animal biodiversity: An update of classification and diversity in 2013”. In: Animal Biodiversity: An Outline of Higher-level Classification and Survey of Taxonomic Richness (Addenda 2013), Zootaxa, Volume 3703 (2013) no. 1 | DOI
[6] Insect genomes: progress and challenges, Insect Molecular Biology, Volume 28 (2019) no. 6, pp. 739-758 | DOI
[7] Research trends in ecosystem services provided by insects, Basic and Applied Ecology, Volume 26 (2018), pp. 8-23 | DOI
[8] The draft genome of whitefly Bemisia tabaci MEAM1, a global crop pest, provides novel insights into virus transmission, host adaptation, and insecticide resistance, BMC Biology, Volume 14 (2016) no. 1 | DOI
[9] Genome Sequence of Aedes aegypti, a Major Arbovirus Vector, Science, Volume 316 (2007) no. 5832, pp. 1718-1723 | DOI
[10] The Genome of Anopheles darlingi , the main neotropical malaria vector, Nucleic Acids Research, Volume 41 (2013) no. 15, pp. 7387-7400 | DOI
[11] Improved reference genome of Aedes aegypti informs arbovirus vector control, Nature, Volume 563 (2018) no. 7732, pp. 501-507 | DOI
[12] De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds, Science, Volume 356 (2017) no. 6333, pp. 92-95 | DOI
[13] Toward a genome sequence for every animal: Where are we now?, Proceedings of the National Academy of Sciences, Volume 118 (2021) no. 52 | DOI
[14] Marine natural products, Natural Product Reports, Volume 37 (2020) no. 2, pp. 175-223 | DOI
[15] Marine Natural Products: A Source of Novel Anticancer Drugs, Marine Drugs, Volume 17 (2019) no. 9 | DOI
[16] Antibacterial products of marine organisms, Applied Microbiology and Biotechnology, Volume 99 (2015) no. 10, pp. 4145-4173 | DOI
[17] Terpenoids in Marine Heterobranch Molluscs, Marine Drugs, Volume 18 (2020) no. 3 | DOI
[18] Natural Products from Sponges, Symbiotic Microbiomes of Coral Reefs Sponges and Corals, Springer Netherlands, Dordrecht, 2019, pp. 329-463 | DOI
[19] Molluscan Genomics: Implications for Biology and Aquaculture, Current Molecular Biology Reports, Volume 3 (2017) no. 4, pp. 297-305 | DOI
[20] Invertebrates, ecosystem services and climate change, Biological Reviews, Volume 88 (2012) no. 2, pp. 327-348 | DOI
[21] Molluscan genomics: the road so far and the way forward, Hydrobiologia, Volume 847 (2019) no. 7, pp. 1705-1726 | DOI
[22] The Biology and Evolution of Calcite and Aragonite Mineralization in Octocorallia, Frontiers in Ecology and Evolution, Volume 9 (2021) | DOI
[23] Molecular mechanisms of biomineralization in marine invertebrates, Journal of Experimental Biology, Volume 223 (2020) no. 11 | DOI
[24] Using the Acropora digitifera genome to understand coral responses to environmental change, Nature, Volume 476 (2011) no. 7360, pp. 320-323 | DOI
[25] The Roles of Introgression and Climate Change in the Rise to Dominance of Acropora Corals, Current Biology, Volume 28 (2018) no. 21 | DOI
[26] Population genetics of the coral Acropora millepora: Toward genomic prediction of bleaching, Science, Volume 369 (2020) no. 6501 | DOI
[27] Eighteen Coral Genomes Reveal the Evolutionary Origin of Acropora Strategies to Accommodate Environmental Changes, Molecular Biology and Evolution, Volume 38 (2020) no. 1, pp. 16-30 | DOI
[28] Phylogenetic tree building in the genomic age, Nature Reviews Genetics, Volume 21 (2020) no. 7, pp. 428-444 | DOI
[29] Genomic insights into the evolutionary origin of Myxozoa within Cnidaria, Proceedings of the National Academy of Sciences, Volume 112 (2015) no. 48, pp. 14912-14917 | DOI
[30] Comparative genomics and the nature of placozoan species, PLOS Biology, Volume 16 (2018) no. 7 | DOI
[31] The Mutational Meltdown in Asexual Populations, Journal of Heredity, Volume 84 (1993) no. 5, pp. 339-344 | DOI
[32] The Ecoresponsive Genome of Daphnia pulex, Science, Volume 331 (2011) no. 6017, pp. 555-561 | DOI
[33] A New Reference Genome Assembly for the Microcrustacean Daphnia pulex, G3 Genes|Genomes|Genetics, Volume 7 (2017) no. 5, pp. 1405-1416 | DOI
[34] Signatures of the Evolution of Parthenogenesis and Cryptobiosis in the Genomes of Panagrolaimid Nematodes, iScience, Volume 21 (2019), pp. 587-602 | DOI
[35] Haplotype divergence supports long-term asexuality in the oribatid mite Oppiella nova, Proceedings of the National Academy of Sciences, Volume 118 (2021) no. 38 | DOI
[36] Chromosome-level genome assembly reveals homologous chromosomes and recombination in asexual rotifer Adineta vaga, Science Advances, Volume 7 (2021) no. 41 | DOI
[37] The Nagoya Protocol on Access to Genetic Resources and the Fair and Equitable Sharing of Benefits Arising from their Utilization to the Convention on Biological Diversity, Review of European Community & International Environmental Law, Volume 20 (2011) no. 1, pp. 47-61 | DOI
[38] BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs, Bioinformatics, Volume 31 (2015) no. 19, pp. 3210-3212 | DOI
[39] BUSCO Applications from Quality Assessments to Gene Prediction and Phylogenomics, Molecular Biology and Evolution, Volume 35 (2017) no. 3, pp. 543-548 | DOI
[40] BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes, Molecular Biology and Evolution, Volume 38 (2021) no. 10, pp. 4647-4654 | DOI
[41] Basic local alignment search tool, Journal of Molecular Biology, Volume 215 (1990) no. 3, pp. 403-410 | DOI
[42] The Global Invertebrate Genomics Alliance (GIGA): Developing Community Resources to Study Diverse Invertebrate Genomes, Journal of Heredity, Volume 105 (2013) no. 1, pp. 1-18 | DOI
[43] Advancing genomics through the Global Invertebrate Genomics Alliance (GIGA), Invertebrate Systematics, Volume 31 (2017) no. 1 | DOI
[44] Earth BioGenome Project: Sequencing life for the future of life, Proceedings of the National Academy of Sciences, Volume 115 (2018) no. 17, pp. 4325-4333 | DOI
[45] Darwin Tree of Life, www.darwintreeoflife.org, 2021
[46] Aquatic Symbiosis Genomics Project, www. sanger.ac.uk/collaboration/aquatic-symbiosis-genomics-project, 2021
[47] The era of reference genomes in conservation genomics, Trends in Ecology & Evolution, Volume 37 (2022) no. 3, pp. 197-202 | DOI
[48] DNA sequencing with chain-terminating inhibitors, Proceedings of the National Academy of Sciences, Volume 74 (1977) no. 12, pp. 5463-5467 | DOI
[49] Life with 6000 Genes, Science, Volume 274 (1996) no. 5287, pp. 546-567 | DOI
[50] Genome Sequence of the Nematode C. elegans: A Platform for Investigating Biology, Science, Volume 282 (1998) no. 5396, pp. 2012-2018 | DOI
[51] The A, C, G, and T of Genome Assembly, BioMed Research International, Volume 2016 (2016), pp. 1-10 | DOI
[52] Initial sequencing and analysis of the human genome, Nature, Volume 409 (2001) no. 6822, pp. 860-921 | DOI
[53] Bioinformatics challenges of new sequencing technology, Trends in Genetics, Volume 24 (2008) no. 3, pp. 142-149 | DOI
[54] Genome sequencing in microfabricated high-density picolitre reactors, Nature, Volume 437 (2005) no. 7057, pp. 376-380 | DOI
[55] An integrated semiconductor device enabling non-optical genome sequencing, Nature, Volume 475 (2011) no. 7356, pp. 348-352 | DOI
[56] Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding, Genome Research, Volume 19 (2009) no. 9, pp. 1527-1541 | DOI
[57] Sequencing technologies — the next generation, Nature Reviews Genetics, Volume 11 (2009) no. 1, pp. 31-46 | DOI
[58] Accurate whole human genome sequencing using reversible terminator chemistry, Nature, Volume 456 (2008) no. 7218, pp. 53-59 | DOI
[59] Long reads: their purpose and place, Human Molecular Genetics, Volume 27 (2018) no. R2 | DOI
[60] Real-Time DNA Sequencing from Single Polymerase Molecules, Science, Volume 323 (2009) no. 5910, pp. 133-138 | DOI
[61] Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome, Nature Biotechnology, Volume 37 (2019) no. 10, pp. 1155-1162 | DOI
[62] Three decades of nanopore sequencing, Nature Biotechnology, Volume 34 (2016) no. 5, pp. 518-524 | DOI
[63] The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community, Genome Biology, Volume 17 (2016) no. 1 | DOI
[64] Performance of neural network basecalling tools for Oxford Nanopore sequencing, Genome Biology, Volume 20 (2019) no. 1 | DOI
[65] Nanopore sequencing and assembly of a human genome with ultra-long reads, Nature Biotechnology, Volume 36 (2018) no. 4, pp. 338-345 | DOI
[66] https://github.com/nanoporetech/bonito
[67] Poreover, https://github.com/jordisr/poreover, 2017
[68] Oxford Nanopore R10.4 long-read sequencing enables near-perfect bacterial genomes from pure cultures and metagenomes without short-read or reference polishing, bioRxiv (2021) | DOI
[69] Optimization of high molecular weight DNA extraction methods in shrimp for a long-read sequencing platform, PeerJ, Volume 8 (2020) | DOI
[70] High-Throughput Gene Mapping in Caenorhabditis elegans, Genome Research, Volume 12 (2002) no. 7, pp. 1100-1105 | DOI
[71] The Atlas Genome Assembly System, Genome Research, Volume 14 (2004) no. 4, pp. 721-732 | DOI
[72] CAP3: A DNA Sequence Assembly Program, Genome Research, Volume 9 (1999) no. 9, pp. 868-877 | DOI
[73] Consensus generation and variant detection by Celera Assembler, Bioinformatics, Volume 24 (2008) no. 8, pp. 1035-1040 | DOI
[74] An Eulerian path approach to DNA fragment assembly, Proceedings of the National Academy of Sciences, Volume 98 (2001) no. 17, pp. 9748-9753 | DOI
[75] Whole-Genome Shotgun Assembly and Analysis of the Genome of Fugu rubripes, Science, Volume 297 (2002) no. 5585, pp. 1301-1310 | DOI
[76] Minimus: a fast, lightweight genome assembler, BMC Bioinformatics, Volume 8 (2007) no. 1 | DOI
[77] Genome sequence assembly using trace signals and additional sequence information, German Conference on Bioinformatics, Volume 99 (1999), pp. 45-56
[78] Base-Calling of Automated Sequencer Traces Using Phred. II. Error Probabilities, Genome Research, Volume 8 (1998) no. 3, pp. 186-194 | DOI
[79] The Phusion Assembler, Genome Research, Volume 13 (2002) no. 1, pp. 81-90 | DOI
[80] Scoring-and-unfolding trimmed tree assembler: concepts, constructs and comparisons, Bioinformatics, Volume 27 (2010) no. 2, pp. 153-160 | DOI
[81] TIGR Assembler: A New Tool for Assembling Large Shotgun Sequencing Projects, Genome Science and Technology, Volume 1 (1995) no. 1, pp. 9-19 | DOI
[82] ABySS: A parallel assembler for short read sequence data, Genome Research, Volume 19 (2009) no. 6, pp. 1117-1123 | DOI
[83] ABySS 2.0: resource-efficient assembly of large genomes using a Bloom filter, Genome Research, Volume 27 (2017) no. 5, pp. 768-777 | DOI
[84] ALLPATHS: De novo assembly of whole-genome shotgun microreads, Genome Research, Volume 18 (2008) no. 5, pp. 810-820 | DOI
[85] BASE: a practical de novo assembler for large genomes using long NGS reads, BMC Genomics, Volume 17 (2016) no. S5 | DOI
[86] Aggressive assembly of pyrosequencing reads with mates, Bioinformatics, Volume 24 (2008) no. 24, pp. 2818-2824 | DOI
[87] De novo bacterial genome sequencing: Millions of very short reads assembled on a desktop computer, Genome Research, Volume 18 (2008) no. 5, pp. 802-809 | DOI
[88] EPGA: de novo assembly using the distributions of reads and insert size, Bioinformatics, Volume 31 (2014) no. 6, pp. 825-833 | DOI
[89] Short read fragment assembly of bacterial genomes, Genome Research, Volume 18 (2007) no. 2, pp. 324-330 | DOI
[90] Gossamer -- a resource-efficient de novo assembler, Bioinformatics, Volume 28 (2012) no. 14, pp. 1937-1938 | DOI
[91] IDBA – A Practical Iterative de Bruijn Graph De Novo Assembler, Lecture Notes in Computer Science, Springer Berlin Heidelberg, Berlin, Heidelberg, 2010, pp. 426-440 | DOI
[92] ISEA: Iterative Seed-Extension Algorithm for De Novo Assembly Using Paired-End Information and Insert Size Distribution, IEEE/ACM Transactions on Computational Biology and Bioinformatics, Volume 14 (2017) no. 4, pp. 916-925 | DOI
[93] Assembler for de novo assembly of large genomes, Proceedings of the National Academy of Sciences, Volume 110 (2013) no. 36 | DOI
[94] LightAssembler: fast and memory-efficient assembly algorithm for high-throughput sequencing reads, Bioinformatics, Volume 32 (2016) no. 21, pp. 3215-3223 | DOI
[95] Meraculous: De Novo Genome Assembly with Short Paired-End Reads, PLoS ONE, Volume 6 (2011) no. 8 | DOI
[96] https://cals.arizona.edu/swes/maier_lab/kartchner/documentation/index.php/home/docs/newbler, 2012
[97] PCAP: A Whole-Genome Assembly Program, Genome Research, Volume 13 (2003) no. 9, pp. 2164-2170 | DOI
[98] PERGA: A Paired-End Read Guided De Novo Assembler for Extending Contigs Using SVM and Look Ahead Approach, PLoS ONE, Volume 9 (2014) no. 12 | DOI
[99] Efficient de novo assembly of highly heterozygous genomes from whole-genome shotgun short reads, Genome Research, Volume 24 (2014) no. 8, pp. 1384-1395 | DOI
[100] PE-Assembler: de novo assembler using short paired-end reads, Bioinformatics, Volume 27 (2011) no. 2, pp. 167-174 | DOI
[101] QSRA – a quality-value guided de novo short read assembler, BMC Bioinformatics, Volume 10 (2009) no. 1 | DOI
[102] Ray: Simultaneous Assembly of Reads from a Mix of High-Throughput Sequencing Technologies, Journal of Computational Biology, Volume 17 (2010) no. 11, pp. 1519-1533 | DOI
[103] Readjoiner: a fast and memory efficient string graph-based sequence assembler, BMC Bioinformatics, Volume 13 (2012) no. 1 | DOI
[104] Efficient de novo assembly of large genomes using compressed data structures, Genome Research, Volume 22 (2011) no. 3, pp. 549-556 | DOI
[105] SHARCGS, a fast and highly accurate short-read assembly algorithm for de novo genomic sequencing, Genome Research, Volume 17 (2007) no. 11, pp. 1697-1706 | DOI
[106] De novo assembly of human genomes with massively parallel short read sequencing, Genome Research, Volume 20 (2009) no. 2, pp. 265-272 | DOI
[107] SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler, GigaScience, Volume 1 (2012) no. 1 | DOI
[108] SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing, Journal of Computational Biology, Volume 19 (2012) no. 5, pp. 455-477 | DOI
[109] Exploiting sparseness in de novo genome assembly, BMC Bioinformatics, Volume 13 (2012) no. S6 | DOI
[110] Assembling millions of short DNA sequences using SSAKE, Bioinformatics, Volume 23 (2006) no. 4, pp. 500-501 | DOI
[111] A fast hybrid short read fragment assembly algorithm, Bioinformatics, Volume 25 (2009) no. 17, pp. 2279-2280 | DOI
[112] Extending assembly of short DNA sequences to handle error, Bioinformatics, Volume 23 (2007) no. 21, pp. 2942-2944 | DOI
[113] Using the Velvet de novo Assembler for Short‐Read Sequencing Technologies, Current Protocols in Bioinformatics, Volume 31 (2010) no. 1 | DOI
[114] Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation, Genome Research, Volume 27 (2017) no. 5, pp. 722-736 | DOI
[115] Phased diploid genome assembly with single-molecule real-time sequencing, Nature Methods, Volume 13 (2016) no. 12, pp. 1050-1054 | DOI
[116] Assembly of long, error-prone reads using repeat graphs, Nature Biotechnology, Volume 37 (2019) no. 5, pp. 540-546 | DOI
[117] HINGE: long-read assembly achieves optimal repeat resolution, Genome Research, Volume 27 (2017) no. 5, pp. 747-756 | DOI
[118] MECAT: fast mapping, error correction, and de novo assembly for single-molecule sequencing reads, Nature Methods, Volume 14 (2017) no. 11, pp. 1072-1074 | DOI
[119] Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences, Bioinformatics, Volume 32 (2016) no. 14, pp. 2103-2110 | DOI
[120] Efficient assembly of nanopore reads via highly accurate and intact error correction, Nature Communications, Volume 12 (2021) no. 1 | DOI
[121] NextDenovo, https://github.com/Nextomics/NextDenovo, 2019
[122] Yet another de novo genome assembler, International Symposium on Image and Signal Processing and Analysis (ISPA) (2019), pp. 147-151 | DOI
[123] Time- and memory-efficient genome assembly with Raven, Nature Computational Science, Volume 1 (2021) no. 5, pp. 332-336 | DOI
[124] Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomes, Nature Biotechnology, Volume 38 (2020) no. 9, pp. 1044-1053 | DOI
[125] SMARTdenovo: a de novo assembler using long noisy reads, Gigabyte, Volume 2021 (2021), pp. 1-9 | DOI
[126] wtdbg, https://github.com/ruanjue/wtdbg, 2016
[127] Fast and accurate long-read assembly with wtdbg2, Nature Methods, Volume 17 (2019) no. 2, pp. 155-158 | DOI
[128] HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads, Genome Research, Volume 30 (2020) no. 9, pp. 1291-1305 | DOI
[129] Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm, Nature Methods, Volume 18 (2021) no. 2, pp. 170-175 | DOI
[130] IPA, https://github.com/PacificBiosciences/pbbioconda , 2018
[131] LJA: Assembling Long and Accurate Reads Using Multiplex de Bruijn Graphs, bioRxiv, 2021 no. 2020.12.10.420448 | DOI
[132] Minimizer-space de Bruijn graphs: Whole-genome assembly of long reads in minutes on a personal computer, Cell Systems, Volume 12 (2021) no. 10 | DOI
[133] MBG: Minimizer-based sparse de Bruijn Graph construction, Bioinformatics, Volume 37 (2021) no. 16, pp. 2476-2478 | DOI
[134] Human genome assembly in 100 minutes, bioRxiv (2019) no. 705616 | DOI
[135] A greedy approximation algorithm for constructing shortest common superstrings, Theoretical Computer Science, Volume 57 (1988) no. 1, pp. 131-145 | DOI
[136] NOVOPlasty: de novo assembly of organelle genomes from whole genome data, Nucleic Acids Research, Volume 45 (2017) no. 4 | DOI
[137] A strategy of DNA sequencing employing computer programs, Nucleic Acids Research, Volume 6 (1979) no. 7, pp. 2601-2610 | DOI
[138] The Sequence of the Human Genome, Science, Volume 291 (2001) no. 5507, pp. 1304-1351 | DOI
[139] A Combinatorial Problem, Koninklijke Nederlandse Akademie v. Wetenschappen, Volume 49 (1946), pp. 758-764
[140] 48, L’Intermédiaire des Mathématiciens, Volume 1 (1894), pp. 107-110
[141] How to apply de Bruijn graphs to genome assembly, Nature Biotechnology, Volume 29 (2011) no. 11, pp. 987-991 | DOI
[142] A cnidarian parasite of salmon (Myxozoa: Henneguya) lacks a mitochondrial genome, Proceedings of the National Academy of Sciences, Volume 117 (2020) no. 10, pp. 5358-5363 | DOI
[143] An evolutionarily-conserved Wnt3/β-catenin/Sp5 feedback loop restricts head organizer activity in Hydra, Nature Communications, Volume 10 (2019) no. 1 | DOI
[144] Revisiting an Old Riddle: What Determines Genetic Diversity Levels within Species?, PLoS Biology, Volume 10 (2012) no. 9 | DOI
[145] Filtlong, https://github.com/rrwick/Filtlong, 2017
[146] CoLoRMap: Correcting Long Reads by Mapping short reads, Bioinformatics, Volume 32 (2016) no. 17 | DOI
[147] Hercules: a profile HMM-based hybrid error correction algorithm for long reads, Nucleic Acids Research, Volume 46 (2018) no. 21 | DOI
[148] Hybrid correction of highly noisy long reads using a variable-order de Bruijn graph, Bioinformatics, Volume 34 (2018) no. 24, pp. 4213-4222 | DOI
[149] Jabba: hybrid error correction for long sequencing reads, Algorithms for Molecular Biology, Volume 11 (2016) no. 1 | DOI
[150] LoRDEC: accurate and efficient long read error correction, Bioinformatics, Volume 30 (2014) no. 24, pp. 3506-3514 | DOI
[151] Accurate self-correction of errors in long reads using de Bruijn graphs, Bioinformatics, Volume 33 (2017) no. 6, pp. 799-806 | DOI
[152] Genome assembly using Nanopore-guided long and error-free DNA reads, BMC Genomics, Volume 16 (2015) no. 1 | DOI
[153] proovread : large-scale high-accuracy PacBio correction through iterative short read consensus, Bioinformatics, Volume 30 (2014) no. 21, pp. 3004-3011 | DOI
[154] Scalable long read self-correction and assembly polishing with multiple sequence alignment, Scientific Reports, Volume 11 (2021) no. 1 | DOI
[155] Non Hybrid Long Read Consensus Using Local De Bruijn Graph Assembly, bioRxiv (2017) no. 106252 | DOI
[156] FLAS: fast and high-throughput algorithm for PacBio long-read self-correction, Bioinformatics, Volume 35 (2019) no. 20, pp. 3953-3960 | DOI
[157] HALC: High throughput algorithm for long read error correction, BMC Bioinformatics, Volume 18 (2017) no. 1 | DOI
[158] ntEdit: scalable genome sequence polishing, Bioinformatics, Volume 35 (2019) no. 21, pp. 4430-4432 | DOI
[159] Pilon: An Integrated Tool for Comprehensive Microbial Variant Detection and Genome Assembly Improvement, PLoS ONE, Volume 9 (2014) no. 11 | DOI
[160] The genome polishing tool POLCA makes fast and accurate corrections in genome assemblies, PLOS Computational Biology, Volume 16 (2020) no. 6 | DOI
[161] Apollo: a sequencing-technology-independent, scalable and accurate assembly polishing algorithm, Bioinformatics, Volume 36 (2020) no. 12, pp. 3669-3679 | DOI
[162] Hapo-G, haplotype-aware polishing of genome assemblies with accurate reads, NAR Genomics and Bioinformatics, Volume 3 (2021) no. 2 | DOI
[163] “HyPo : super fast & accurate polisher for long read assemblies, bioRxiv (2019) no. 2019.12.19.882506 | DOI
[164] Fast and accurate de novo genome assembly from long uncorrected reads, Genome Research, Volume 27 (2017) no. 5, pp. 737-746 | DOI
[165] GenomicConsensus, https://github.com/PacificBiosciences/GenomicConsensus, 2014
[166] Medaka, https://github.com/nanoporetech/medaka, 2014
[167] NextPolish: a fast and efficient genome polishing tool for long-read assembly, Bioinformatics, Volume 36 (2019) no. 7, pp. 2253-2255 | DOI
[168] Nanopolish, https://github.com jts/nanopolish, 2014
[169] HaploMerger2: rebuilding both haploid sub-assemblies from high-heterozygosity diploid genome assembly, Bioinformatics, Volume 33 (2017) no. 16, pp. 2577-2579 | DOI
[170] Identifying and removing haplotypic duplication in primary genome assemblies, Bioinformatics, Volume 36 (2020) no. 9, pp. 2896-2898 | DOI
[171] Purge Haplotigs: allelic contig reassignment for third-gen diploid genome assemblies, BMC Bioinformatics, Volume 19 (2018) no. 1 | DOI
[172] Hierarchical Scaffolding With Bambus, Genome Research, Volume 14 (2004) no. 1, pp. 149-159 | DOI
[173] Solving scaffolding problem with repeats, bioRxiv (2018) no. 330472 | DOI
[174] BESST - Efficient scaffolding of large fragmented assemblies, BMC Bioinformatics, Volume 15 (2014) no. 1 | DOI
[175] BOSS: a novel scaffolding algorithm based on an optimized scaffold graph, Bioinformatics, Volume 33 (2016) no. 2, pp. 169-176 | DOI
[176] GRASS: a generic algorithm for scaffolding next-generation sequencing assemblies, Bioinformatics, Volume 28 (2012) no. 11, pp. 1429-1437 | DOI
[177] Fast scaffolding with small independent mixed integer programs, Bioinformatics, Volume 27 (2011) no. 23, pp. 3259-3265 | DOI
[178] Opera: Reconstructing Optimal Genomic Scaffolds with High-Throughput Paired-End Sequences, Journal of Computational Biology, Volume 18 (2011) no. 11, pp. 1681-1691 | DOI
[179] ScaffMatch: scaffolding algorithm based on maximum weight matching, Bioinformatics, Volume 31 (2015) no. 16, pp. 2632-2638 | DOI
[180] ScaffoldScaffolder: solving contig orientation via bidirected to directed graph reduction, Bioinformatics, Volume 32 (2016) no. 1, pp. 17-24 | DOI
[181] SCARPA: scaffolding reads with practical algorithms, Bioinformatics, Volume 29 (2012) no. 4, pp. 428-434 | DOI
[182] SCOP: a novel scaffolding algorithm based on contig classification and optimization, Bioinformatics, Volume 35 (2018) no. 7, pp. 1142-1150 | DOI
[183] SLIQ: Simple Linear Inequalities for Efficient Contig Scaffolding, Journal of Computational Biology, Volume 19 (2012) no. 10, pp. 1162-1175 | DOI
[184] SOPRA: Scaffolding algorithm for paired reads via statistical optimization, BMC Bioinformatics, Volume 11 (2010) no. 1 | DOI
[185] Scaffolding pre-assembled contigs using SSPACE, Bioinformatics, Volume 27 (2010) no. 4, pp. 578-579 | DOI
[186] WiseScaffolder: an algorithm for the semi-automatic scaffolding of Next Generation Sequencing data, BMC Bioinformatics, Volume 16 (2015) no. 1 | DOI
[187] DENTIST — using long reads for closing assembly gaps at high accuracy, GigaScience, Volume 11 (2022) | DOI
[188] Gapless provides combined scaffolding, gap filling and assembly correction with long reads, bioRxiv, 2022 | DOI
[189] LINKS: Scalable, alignment-free scaffolding of draft genomes with long reads, GigaScience, Volume 4 (2015) no. 1 | DOI
[190] LRScaf: improving draft genomes using long noisy reads, BMC Genomics, Volume 20 (2019) no. 1 | DOI
[191] Scaffolding and completing genome assemblies in real-time with nanopore sequencing, Nature Communications, Volume 8 (2017) no. 1 | DOI
[192] Mind the Gap: Upgrading Genomes with Pacific Biosciences RS Long-Read Sequencing Technology, PLoS ONE, Volume 7 (2012) no. 11 | DOI
[193] RAILS and Cobbler: Scaffolding and automated finishing of draft genomes using long DNA sequences, The Journal of Open Source Software, Volume 1 (2016) no. 7 | DOI
[194] SLR: a scaffolding algorithm based on long reads and contig classification, BMC Bioinformatics, Volume 20 (2019) no. 1 | DOI
[195] SMIS, https://www.sanger.ac.uk/tool/smis/, 2015
[196] Single molecule sequencing-guided scaffolding and correction of draft assemblies, BMC Genomics, Volume 18 (2017) no. S10 | DOI
[197] SSPACE-LongRead: scaffolding bacterial draft genomes using long read sequence information, BMC Bioinformatics, Volume 15 (2014) no. 1 | DOI
[198] ALLMAPS: robust scaffold ordering based on multiple maps, Genome Biology, Volume 16 (2015) no. 1 | DOI
[199] AGORA: Assembly Guided by Optical Restriction Alignment, BMC Bioinformatics, Volume 13 (2012) no. 1 | DOI
[200] BiSCoT: improving large eukaryotic genome assemblies with optical maps, PeerJ, Volume 8 (2020) | DOI
[201] OMGS: Optical Map-Based Genome Scaffolding, Journal of Computational Biology, Volume 27 (2020) no. 4, pp. 519-533 | DOI
[202] Tools and pipelines for BioNano data: molecule assembly pipeline and FASTA super scaffolding tool, BMC Genomics, Volume 16 (2015) no. 1 | DOI
[203] Scaffolding and validation of bacterial genome assemblies using optical restriction maps, Bioinformatics, Volume 24 (2008) no. 10, pp. 1229-1235 | DOI
[204] ARBitR: an overlap-aware genome assembly scaffolder for linked reads, Bioinformatics, Volume 37 (2020) no. 15, pp. 2203-2205 | DOI
[205] Genome assembly from synthetic long read clouds, Bioinformatics, Volume 32 (2016) no. 12 | DOI
[206] ARCS: scaffolding genome drafts with linked reads, Bioinformatics, Volume 34 (2017) no. 5, pp. 725-731 | DOI
[207] ARKS: chromosome-scale scaffolding of human genome drafts with linked read kmers, BMC Bioinformatics, Volume 19 (2018) no. 1 | DOI
[208] In vitro, long-range sequence information for de novo genome assembly via transposase contiguity, Genome Research, Volume 24 (2014) no. 12, pp. 2041-2049 | DOI
[209] Scaff10X, https://github.com/wtsi-hpag/Scaff10X, 2018
[210] High-throughput genome scaffolding from in vivo DNA interaction frequency, Nature Biotechnology, Volume 31 (2013) no. 12, pp. 1143-1147 | DOI
[211] High-quality genome (re)assembly using chromosomal contact data, Nature Communications, Volume 5 (2014) no. 1 | DOI
[212] Hi-C guided assemblies reveal conserved regulatory topologies on X and autosomes despite extensive genome shuffling, Genes & Development, Volume 33 (2019) no. 21-22, pp. 1591-1612 | DOI
[213] instaGRAAL: chromosome-level quality scaffolding of genomes using a proximity ligation-based scaffolder, Genome Biology, Volume 21 (2020) no. 1 | DOI
[214] Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions, Nature Biotechnology, Volume 31 (2013) no. 12, pp. 1119-1125 | DOI
[215] Efficient iterative Hi-C scaffolder based on N-best neighbors, BMC Bioinformatics, Volume 22 (2021) no. 1 | DOI
[216] Scaffolding of long read assemblies using long range contact information, BMC Genomics, Volume 18 (2017) no. 1 | DOI
[217] Integrating Hi-C links with assembly graphs for chromosome-scale assembly, PLOS Computational Biology, Volume 15 (2019) no. 8 | DOI
[218] scaffhic, https://github.com/wtsi-hpag/scaffHiC, 2019
[219] YaHS: yet another Hi-C scaffolding tool. Version v1.1a, Zenodo (2021) | DOI
[220] Toward almost closed genomes with GapFiller, Genome Biology, Volume 13 (2012) no. 6 | DOI
[221] GAPPadder: a sensitive approach for closing gaps on draft genomes with short sequence reads, BMC Genomics, Volume 20 (2019) no. S5 | DOI
[222] Sealer: a scalable gap-closing application for finishing draft genomes, BMC Bioinformatics, Volume 16 (2015) no. 1 | DOI
[223] FGAP: an automated gap closing tool, BMC Research Notes, Volume 7 (2014) no. 1 | DOI
[224] GMcloser: closing gaps in assemblies accurately with a likelihood-based selection of contig or long-read alignments, Bioinformatics, Volume 31 (2015) no. 23, pp. 3733-3741 | DOI
[225] LR_Gapcloser: a tiling path-based gap closer that uses long reads to complete genome assembly, GigaScience, Volume 8 (2019) no. 1 | DOI
[226] PGcloser: Fast Parallel Gap-Closing Tool Using Long-Reads or Contigs to Fill Gaps in Genomes, Evolutionary Bioinformatics, Volume 16 (2020) | DOI
[227] TGS-GapCloser: A fast and accurate gap closer for large genomes with low coverage of error-prone long reads, GigaScience, Volume 9 (2020) no. 9 | DOI
[228] Sequencing DNA with nanopores: Troubles and biases, PLOS ONE, Volume 16 (2021) no. 10 | DOI
[229] Long-read error correction: a survey and qualitative comparison, bioRxiv, 2020 | DOI
[230] Widespread false gene gains caused by duplication errors in genome assemblies, 2021 | DOI
[231] Overcoming uncollapsed haplotypes in long-read assemblies of non-model organisms, BMC Bioinformatics, Volume 22 (2021) no. 1 | DOI
[232] Assembly of the Working Draft of the Human Genome with GigAssembler, Genome Research, Volume 11 (2001) no. 9, pp. 1541-1548 | DOI
[233] Modern technologies and algorithms for scaffolding assembled genomes, PLOS Computational Biology, Volume 15 (2019) no. 6 | DOI
[234] Using linkage maps to correct and scaffold de novo genome assemblies: methods, challenges, and computational tools, Frontiers in Genetics, Volume 6 (2015) | DOI
[235] Ordered Restriction Maps of Saccharomyces cerevisiae Chromosomes Constructed by Optical Mapping, Science, Volume 262 (1993) no. 5130, pp. 110-114 | DOI
[236] A Systematically Improved High Quality Genome and Transcriptome of the Human Blood Fluke Schistosoma mansoni, PLoS Neglected Tropical Diseases, Volume 6 (2012) no. 1 | DOI
[237] The genome of the harpacticoid copepod Tigriopus japonicus: Potential for its use in marine molecular ecotoxicology, Aquatic Toxicology, Volume 222 (2020) | DOI
[238] Advances in optical mapping for genomic research, Computational and Structural Biotechnology Journal, Volume 18 (2020), pp. 2051-2062 | DOI
[239] The genome of Onchocerca volvulus, agent of river blindness, Nature Microbiology, Volume 2 (2016) no. 2 | DOI
[240] Comparative genome analysis of programmed DNA elimination in nematodes, Genome Research, Volume 27 (2017) no. 12, pp. 2001-2014 | DOI
[241] The genomes of four tapeworm species reveal adaptations to parasitism, Nature, Volume 496 (2013) no. 7443, pp. 57-63 | DOI
[242] Complete representation of a tapeworm genome reveals chromosomes capped by centromeres, necessitating a dual role in segregation and protection, BMC Biology, Volume 18 (2020) no. 1 | DOI
[243] The Iron-Responsive Genome of the Chiton Acanthopleura granulata, Genome Biology and Evolution, Volume 13 (2020) no. 1 | DOI
[244] Haplotype tagging reveals parallel formation of hybrid races in two butterfly species, Proceedings of the National Academy of Sciences, Volume 118 (2021) no. 25 | DOI
[245] Ultralow-input single-tube linked-read library method enables short-read second-generation sequencing systems to routinely generate highly accurate and economical long-range sequencing information, Genome Research, Volume 30 (2020) no. 6, pp. 898-909 | DOI
[246] The genetic basis of a social polymorphism in halictid bees, Nature Communications, Volume 9 (2018) no. 1 | DOI
[247] SuperNova, https://github.com/10XGenomics/supernova, 2016
[248] A chromosome-scale assembly of the major African malaria vector Anopheles funestus, GigaScience, Volume 8 (2019) no. 6 | DOI
[249] Capturing Chromosome Conformation, Science, Volume 295 (2002) no. 5558, pp. 1306-1311 | DOI
[250] Exploring the three-dimensional organization of genomes: interpreting chromatin interaction data, Nature Reviews Genetics, Volume 14 (2013) no. 6, pp. 390-403 | DOI
[251] Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome, Science, Volume 326 (2009) no. 5950, pp. 289-293 | DOI
[252] Contact genomics: scaffolding and phasing (meta)genomes using chromosome 3D physical signatures, FEBS Letters, Volume 589 (2015) no. 20 Part A, pp. 2966-2974 | DOI
[253] Three invariant Hi-C interaction patterns: Applications to genome assembly, Methods, Volume 142 (2018), pp. 89-99 | DOI
[254] Divergent evolutionary trajectories following speciation in two ectoparasitic honey bee mites, Communications Biology, Volume 2 (2019) no. 1 | DOI
[255] Chromosome-level assembly of the horseshoe crab genome provides insights into its genome evolution, Nature Communications, Volume 11 (2020) no. 1 | DOI
[256] Chromosome-level genome assembly and annotation of two lineages of the ant Cataglyphis hispanica: steppingstones towards genomic studies of hybridogenesis and thermal adaptation in desert ants, bioRxiv (2022) | DOI
[257] Lineage dynamics of the endosymbiotic cell type in the soft coral Xenia, Nature, Volume 582 (2020) no. 7813, pp. 534-538 | DOI
[258] Chromosome-level reference genome of the jellyfish Rhopilema esculentum, GigaScience, Volume 9 (2020) no. 4 | DOI
[259] Chromosomal-Level Genome Assembly of the Sea Urchin Lytechinus variegatus Substantially Improves Functional Genomic Analyses, Genome Biology and Evolution, Volume 12 (2020) no. 7, pp. 1080-1086 | DOI
[260] An initial comparative genomic autopsy of wasting disease in sea stars, Molecular Ecology, Volume 29 (2020) no. 6, pp. 1087-1102 | DOI
[261] Chromosomal-level assembly of the blood clam, Scapharca (Anadara) broughtonii, using long sequence reads and Hi-C, GigaScience, Volume 8 (2019) no. 7 | DOI
[262] The Scaly-foot Snail genome and implications for the origins of biomineralised armour, Nature Communications, Volume 11 (2020) no. 1 | DOI
[263] Comparative analysis of the Mercenaria mercenaria genome provides insights into the diversity of transposable elements and immune molecules in bivalve mollusks, BMC Genomics, Volume 23 (2022) no. 1 | DOI
[264] Chromosome-Level Assembly of the Caenorhabditis remanei Genome Reveals Conserved Patterns of Nematode Genome Organization, Genetics, Volume 214 (2020) no. 4, pp. 769-780 | DOI
[265] Chromosome-level reference genome of X12, a highly virulent race of the soybean cyst nematode Heterodera glycines, Molecular Ecology Resources, Volume 19 (2019) no. 6, pp. 1637-1646 | DOI
[266] High-quality Schistosoma haematobium genome achieved by single-molecule and long-range sequencing, GigaScience, Volume 8 (2019) no. 9 | DOI
[267] Tracing animal genomic evolution with the chromosomal-level assembly of the freshwater sponge Ephydatia muelleri, Nature Communications, Volume 11 (2020) no. 1 | DOI
[268] Acoel genome reveals the regulatory landscape of whole-body regeneration, Science, Volume 363 (2019) no. 6432 | DOI
[269] HASLR: Fast Hybrid Assembly of Long Reads, iScience, Volume 23 (2020) no. 8 | DOI
[270] The MaSuRCA genome assembler, Bioinformatics, Volume 29 (2013) no. 21, pp. 2669-2677 | DOI
[271] Efficient hybrid de novo assembly of human genomes with WENGAN, Nature Biotechnology, Volume 39 (2020) no. 4, pp. 422-430 | DOI
[272] First estimates of genome size in ribbon worms (phylum Nemertea) using flow cytometry and Feulgen image analysis densitometry, Canadian Journal of Zoology, Volume 92 (2014) no. 10, pp. 847-851 | DOI
[273] BBtools, https://sourceforge.net/projects/bbmap/, 2013
[274] GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes, Nature Communications, Volume 11 (2020) no. 1 | DOI
[275] KAT: a K-mer analysis toolkit to quality control NGS datasets and genome assemblies, Bioinformatics (2016) | DOI
[276] Mapping-based genome size estimation, bioRxiv, 2019 | DOI
[277] QUAST: quality assessment tool for genome assemblies, Bioinformatics, Volume 29 (2013) no. 8, pp. 1072-1075 | DOI
[278] A chromosomal-level genome assembly for the giant African snail Achatina fulica, GigaScience, Volume 8 (2019) no. 10 | DOI
[279] Factorial estimating assembly base errors using k-mer abundance difference (KAD) between short reads and genome assembled sequences, NAR Genomics and Bioinformatics, Volume 2 (2020) no. 3 | DOI
[280] BlobTools: Interrogation of genome assemblies, F1000Research, Volume 6 (2017) | DOI
[281] BlobToolKit – Interactive Quality Assessment of Genome Assemblies, G3 Genes|Genomes|Genetics, Volume 10 (2020) no. 4, pp. 1361-1374 | DOI
[282] Evidence for extensive horizontal gene transfer from the draft genome of a tardigrade, Proceedings of the National Academy of Sciences, Volume 112 (2015) no. 52, pp. 15976-15981 | DOI
[283] No evidence for extensive horizontal gene transfer in the genome of the tardigrade Hypsibius dujardini, Proceedings of the National Academy of Sciences, Volume 113 (2016) no. 18, pp. 5053-5058 | DOI
[284] Unzipping haplotypes in diploid and polyploid genomes, Computational and Structural Biotechnology Journal, Volume 18 (2020), pp. 66-72 | DOI
[285] De novo assembly of haplotype-resolved genomes with trio binning, Nature Biotechnology, Volume 36 (2018) no. 12, pp. 1174-1182 | DOI
[286] HapCUT2: robust and accurate haplotype assembly for diverse sequencing technologies, Genome Research, Volume 27 (2016) no. 5, pp. 801-812 | DOI
[287] WhatsHap: Weighted Haplotype Assembly for Future-Generation Sequencing Reads, Journal of Computational Biology, Volume 22 (2015) no. 6, pp. 498-509 | DOI
[288] Novel approaches for the exploitation of high throughput sequencing data, PhD thesis, Université Rennes (2017)
[289] Platanus-allee is a de novo haplotype assembler enabling a comprehensive access to divergent heterozygous regions, Nature Communications, Volume 10 (2019) no. 1 | DOI
[290] Fully phased human genome assembly without parental data using single-cell strand sequencing and long reads, Nature Biotechnology, Volume 39 (2020) no. 3, pp. 302-308 | DOI
[291] Haplotype-resolved genome analyses of a heterozygous diploid potato, Nature Genetics, Volume 52 (2020) no. 10, pp. 1018-1023 | DOI
[292] Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly, Genome Biology, Volume 22 (2021) no. 1 | DOI
[293] Assembly of allele-aware, chromosomal-scale autopolyploid genomes based on Hi-C data, Nature Plants, Volume 5 (2019) no. 8, pp. 833-845 | DOI
[294] GraphUnzip: unzipping assembly graphs with long reads and Hi-C, bioRxiv, 2021 | DOI
[295] Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C, Nature Communications, Volume 12 (2021) no. 1 | DOI
[296] Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies, Genome Biology, Volume 21 (2020) no. 1 | DOI
[297] Next-generation genome annotation: we still struggle to get it right, Genome Biology, Volume 20 (2019) no. 1 | DOI
[298] The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data, Volume 3 (2016) no. 1 | DOI
[299] VGP assembly pipeline (Galaxy Training Materials), training.galaxyproject.org/training-material/topics/assembly/tutorials/vgp_genome_assembly/tutorial.html
[300] Supplementary table to "A deep dive into genome assemblies of non-vertebrate animals": https://figshare.com/articles/dataset/a_deep_dive_into_genome_assemblies_of_non-vertebrates_tsv/19672440, 2022
[301] Genome assembly tools, https://github.com/nadegeguiglielmoni/genome_assembly_tools, 2022