Accurate long-read de novo assembly evaluation with Inspector
- PMID: 34775997
- PMCID: PMC8590762
- DOI: 10.1186/s13059-021-02527-4
Abstract
Long-read de novo genome assembly continues to advance rapidly. However, effective tools for accurately evaluating assembly results, especially for structural errors, are lacking. We present Inspector, a reference-free long-read de novo assembly evaluator that faithfully reports the types of errors and their precise locations. Notably, Inspector can correct assembly errors using consensus sequences derived from the raw reads covering the erroneous regions. Using in silico assemblies and long-read assemblies produced from multiple long-read datasets and assemblers, we demonstrate that, in addition to providing generic metrics, Inspector can accurately identify both large-scale and small-scale assembly errors.
Keywords: Assembly error; Assembly evaluation; De novo assembly; Genome assembly; Long reads.
© 2021. The Author(s).
Conflict of interest statement
The authors declare that they have no competing interests.