Platon: identification and characterization of bacterial plasmid contigs in short-read draft assemblies exploiting protein sequence-based replicon distribution scores
- PMID: 32579097
- PMCID: PMC7660248
- DOI: 10.1099/mgen.0.000398
Abstract
Plasmids are extrachromosomal genetic elements that replicate independently of the chromosome and play a vital role in the environmental adaptation of bacteria. Due to potential mobilization or conjugation capabilities, plasmids are important genetic vehicles for antimicrobial resistance genes and virulence factors with huge and increasing clinical implications. They are therefore subject to large genomic studies within the scientific community worldwide. As a result of rapidly improving next-generation sequencing methods, the quantity of sequenced bacterial genomes is constantly increasing, in turn raising the need for specialized tools to (i) extract plasmid sequences from draft assemblies, (ii) derive their origin and distribution, and (iii) further investigate their genetic repertoire. Recently, several bioinformatic methods and tools have emerged to tackle this issue; however, a combination of high sensitivity and specificity in plasmid sequence identification is rarely achieved in a taxon-independent manner. In addition, many software tools are not appropriate for large high-throughput analyses or cannot be included in existing software pipelines due to their technical design or software implementation. In this study, we investigated differences in the replicon distributions of protein-coding genes on a large scale as a new approach to distinguish plasmid-borne from chromosome-borne contigs. We defined and computed statistical discrimination thresholds for a new metric: the replicon distribution score (RDS), which achieved an accuracy of 96.6 %. The final performance was further improved by the combination of the RDS metric with heuristics exploiting several plasmid-specific higher-level contig characterizations. We implemented this workflow in a new high-throughput taxon-independent bioinformatics software tool called Platon for the recruitment and characterization of plasmid-borne contigs from short-read draft assemblies. 
Compared to PlasFlow, Platon achieved a higher accuracy (97.5 %) and more balanced predictions (F1=82.6 %) tested on a broad range of bacterial taxa and better or equal performance against the targeted tools PlasmidFinder and PlaScope on sequenced Escherichia coli isolates. Platon is available at: http://platon.computational.bio/.
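The abstract reports both an accuracy and an F1 score, and the two can differ considerably when chromosomal contigs greatly outnumber plasmid contigs. The following is a minimal sketch of how both metrics are derived from a confusion matrix. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the study's benchmark data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    Here "positive" is assumed to mean "contig predicted as plasmid-borne".
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical, imbalanced example: 100 plasmid contigs vs. 900 chromosomal contigs.
metrics = classification_metrics(tp=80, fp=20, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, the many correctly classified chromosomal contigs lift the accuracy to 0.96, while F1, which ignores true negatives, stays at 0.80. This is why F1 is the more informative indicator of balanced plasmid predictions on imbalanced data.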
Keywords: whole-genome sequencing; NGS; bacteria; plasmids.
Conflict of interest statement
The authors declare that there are no conflicts of interest.
Similar articles
- Detection of plasmid contigs in draft genome assemblies using customized Kraken databases. Microb Genom. 2021 Apr;7(4):000550. doi: 10.1099/mgen.0.000550. PMID: 33826492. Free PMC article.
- PlaScope: a targeted approach to assess the plasmidome from genome assemblies at the species level. Microb Genom. 2018 Sep;4(9):e000211. doi: 10.1099/mgen.0.000211. PMID: 30265232. Free PMC article.
- Complete Assembly of Escherichia coli Sequence Type 131 Genomes Using Long Reads Demonstrates Antibiotic Resistance Gene Variation within Diverse Plasmid and Chromosomal Contexts. mSphere. 2019 May 8;4(3):e00130-19. doi: 10.1128/mSphere.00130-19. PMID: 31068432. Free PMC article.
- PlasmidFinder and In Silico pMLST: Identification and Typing of Plasmid Replicons in Whole-Genome Sequencing (WGS). Methods Mol Biol. 2020;2075:285-294. doi: 10.1007/978-1-4939-9877-7_20. PMID: 31584170.
- Plasmer: an Accurate and Sensitive Bacterial Plasmid Prediction Tool Based on Machine Learning of Shared k-mers and Genomic Features. Microbiol Spectr. 2023 Jun 15;11(3):e0464522. doi: 10.1128/spectrum.04645-22. Epub 2023 May 16. PMID: 37191574. Free PMC article.
Cited by
- Diverse plasmid systems and their ecology across human gut metagenomes revealed by PlasX and MobMess. Nat Microbiol. 2024 Mar;9(3):830-847. doi: 10.1038/s41564-024-01610-3. Epub 2024 Mar 4. PMID: 38443576. Free PMC article.
- PlasmidEC and gplas2: an optimized short-read approach to predict and reconstruct antibiotic resistance plasmids in Escherichia coli. Microb Genom. 2024 Feb;10(2):001193. doi: 10.1099/mgen.0.001193. PMID: 38376388. Free PMC article.
- Identification of knowledge gaps in whole-genome sequence analysis of multi-resistant thermotolerant Campylobacter spp. BMC Genomics. 2024 Feb 8;25(1):156. doi: 10.1186/s12864-024-10014-w. PMID: 38331708. Free PMC article.
- Complete genome sequences of 17 Salmonella enterica serovar Schwarzengrund isolates carrying an IncFIB-IncFIC (FII) fusion plasmid. Microbiol Resour Announc. 2024 Feb 15;13(2):e0106223. doi: 10.1128/mra.01062-23. Epub 2024 Jan 17. PMID: 38231183. Free PMC article.
- Differential responses of the gut microbiome and resistome to antibiotic exposures in infants and adults. Nat Commun. 2023 Dec 22;14(1):8526. doi: 10.1038/s41467-023-44289-6. PMID: 38135681. Free PMC article.
