Pandora: nucleotide-resolution bacterial pan-genomics with reference graphs
- PMID: 34521456
- PMCID: PMC8442373
- DOI: 10.1186/s13059-021-02473-1
Abstract
We present pandora, a novel pan-genome graph structure and set of algorithms for identifying variants across the full bacterial pan-genome. Because much bacterial adaptability hinges on the accessory genome, methods that analyze SNPs only in the core genome are limited. Pandora approximates a sequenced genome as a recombinant of references, detects novel variation, and pan-genotypes multiple samples. Using a reference graph of 578 Escherichia coli genomes, we compare 20 diverse isolates. Pandora recovers more rare SNPs than single-reference-based tools, is significantly better than picking the closest RefSeq reference, and provides a stable framework for analyzing diverse samples without reference bias.
Keywords: Accessory genome; Genome graph; Nanopore; Pan-genome.
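The idea of approximating a sequenced genome as a recombinant of references can be illustrated with a toy dynamic program. The sketch below is not pandora's algorithm or data structure. It assumes pre-aligned, equal-length sequences, and the function name `mosaic_path` and the `switch_cost` parameter are hypothetical, introduced only for illustration. It finds the cheapest way to explain a sample as a mosaic of references, charging 1 per mismatch and `switch_cost` per change of reference.

```python
from typing import List, Tuple


def mosaic_path(sample: str, refs: List[str], switch_cost: int = 2) -> Tuple[int, List[int]]:
    """Toy model: return (total cost, reference index used at each position)."""
    n = len(sample)
    if n == 0 or not refs:
        raise ValueError("need a non-empty sample and at least one reference")
    if any(len(r) != n for r in refs):
        raise ValueError("toy model assumes pre-aligned, equal-length sequences")

    k = len(refs)
    # cost[j]: cheapest explanation of the prefix seen so far, ending on reference j
    cost = [0 if refs[j][0] == sample[0] else 1 for j in range(k)]
    back: List[List[int]] = []  # back[i-1][j]: reference used at position i-1

    for i in range(1, n):
        best_prev = min(range(k), key=lambda j: cost[j])
        new_cost, ptr = [], []
        for j in range(k):
            stay = cost[j]                          # keep copying from reference j
            switch = cost[best_prev] + switch_cost  # recombine from the best other state
            if stay <= switch:
                prev, base = j, stay
            else:
                prev, base = best_prev, switch
            new_cost.append(base + (0 if refs[j][i] == sample[i] else 1))
            ptr.append(prev)
        cost = new_cost
        back.append(ptr)

    # Trace back from the cheapest final state to recover the mosaic.
    end = min(range(k), key=lambda j: cost[j])
    path = [end]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return cost[end], path


if __name__ == "__main__":
    total, path = mosaic_path("AAAACCCC", ["AAAAAAAA", "CCCCCCCC"])
    print(total, path)
```

In this example the sample matches the first reference over its first half and the second reference over its second half, so a single switch is cheaper than accepting four mismatches against either reference alone.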
© 2021. The Author(s).
Conflict of interest statement
The authors declare that they have no competing interests.