Bioinformatics. 2021 May 1;37(4):575-577.
doi: 10.1093/bioinformatics/btaa728.

mixtureS: a novel tool for bacterial strain genome reconstruction from reads

Xin Li et al. Bioinformatics.

Abstract

Motivation: It is essential to study bacterial strains in environmental samples. Existing methods and tools often depend on known strains or known variations, cannot work on individual samples, are unreliable, or are difficult to use. It is thus important to develop more user-friendly tools that can identify bacterial strains more accurately.

Results: We developed a new tool called mixtureS that can de novo identify bacterial strains from shotgun reads of a clonal or metagenomic sample, without prior knowledge about the strains and their variations. Tested on 243 simulated datasets and 195 experimental datasets, mixtureS reliably identified the strains, their numbers and their abundance. Compared with three tools, mixtureS showed better performance in almost all simulated datasets and the vast majority of experimental datasets.

Availability and implementation: The mixtureS source code and tool are available at http://www.cs.ucf.edu/~xiaoman/mixtureS/.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
The mixtureS tool and its performance. (A) The three main steps in mixtureS. (B) Performance of mixtureS and other tools on simulated data. (C) Performance of mixtureS and other tools on experimental data. MAE (mean absolute error) on the y-axis is the average absolute difference between the predicted abundance of each predicted strain and the known abundance of its corresponding known strain, taken across strains and samples.
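
The MAE described in the caption can be illustrated with a minimal Python sketch. It is not taken from the mixtureS source code. It assumes each predicted strain has already been matched to its known strain (the caption does not say how that matching is done), and the function name, sample names and abundance values are hypothetical, chosen only for illustration.

```python
def mean_absolute_error(pairs):
    """Average |predicted - known| over (predicted, known) abundance pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one (predicted, known) pair is required")
    return sum(abs(predicted - known) for predicted, known in pairs) / len(pairs)


# Hypothetical, already-matched abundances: sample -> [(predicted, known), ...].
# These numbers are made up for illustration and are not results from the paper.
abundances_by_sample = {
    "sample_a": [(0.7, 0.6), (0.3, 0.4)],
    "sample_b": [(0.5, 0.5)],
}

# Pool the pairs across strains and samples, as the caption describes.
matched_pairs = [
    pair for pairs in abundances_by_sample.values() for pair in pairs
]

mae = mean_absolute_error(matched_pairs)
print(f"MAE = {mae:.4f}")
```

A lower MAE means the predicted strain abundances sit closer to the known ones. Pooling every matched pair before averaging is one reading of "across strains and samples"; averaging per sample first would give a different value when samples contain different numbers of strains.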
