Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement
- PMID: 25409509
- PMCID: PMC4237348
- DOI: 10.1371/journal.pone.0112963
Abstract
Advances in modern sequencing technologies allow us to generate sufficient data to analyze hundreds of bacterial genomes from a single machine in a single day. This potential for sequencing massive numbers of genomes calls for fully automated methods to produce high-quality assemblies and variant calls. We introduce Pilon, a fully automated, all-in-one tool for correcting draft assemblies and calling sequence variants of multiple sizes, including very large insertions and deletions. Pilon works with many types of sequence data, but is particularly strong when supplied with paired-end data from two Illumina libraries with small (e.g., 180 bp) and large (e.g., 3-5 Kb) inserts. Pilon significantly improves draft genome assemblies by correcting bases, fixing mis-assemblies and filling gaps. For both haploid and diploid genomes, Pilon produces more contiguous genomes with fewer errors, enabling identification of more biologically relevant genes. Furthermore, Pilon identifies small variants with high accuracy as compared to state-of-the-art tools and is unique in its ability to accurately identify large sequence variants including duplications and resolve large insertions. Pilon is being used to improve the assemblies of thousands of new genomes and to identify variants from thousands of clinically relevant bacterial strains. Pilon is freely available as open source software.
