Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes
- PMID: 24997787
- DOI: 10.1038/nbt.2939
Abstract
Most current approaches for analyzing metagenomic data rely on comparisons to reference genomes, but the microbial diversity of many environments extends far beyond what is covered by reference databases. De novo segregation of complex metagenomic data into specific biological entities, such as particular bacterial strains or viruses, remains a largely unsolved problem. Here we present a method, based on binning co-abundant genes across a series of metagenomic samples, that enables comprehensive discovery of new microbial organisms, viruses and co-inherited genetic entities and aids assembly of microbial genomes without the need for reference sequences. We demonstrate the method on data from 396 human gut microbiome samples and identify 7,381 co-abundance gene groups (CAGs), including 741 metagenomic species (MGS). We use these to assemble 238 high-quality microbial genomes and identify affiliations between MGS and hundreds of viruses or genetic entities. Our method provides the means for comprehensive profiling of the diversity within complex metagenomic samples.
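The core idea of the abstract, grouping genes whose abundances rise and fall together across samples, can be illustrated with a minimal sketch. This is not the published algorithm: it is a simplified greedy grouping by Pearson correlation, and the function names, the `min_corr` threshold and the toy abundance values are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-abundance signal
    return cov / sqrt(var_x * var_y)


def bin_co_abundant_genes(profiles, min_corr=0.9):
    """Greedily group genes whose profiles correlate with a seed gene.

    profiles: dict mapping gene id -> list of abundances, one per sample.
    min_corr: hypothetical threshold, chosen for illustration only.
    """
    unassigned = list(profiles)  # dicts preserve insertion order
    groups = []
    while unassigned:
        seed = unassigned[0]
        # The seed always joins its own group, even if its profile is flat.
        group = [
            g for g in unassigned
            if g == seed or pearson(profiles[seed], profiles[g]) >= min_corr
        ]
        groups.append(group)
        unassigned = [g for g in unassigned if g not in group]
    return groups


# Toy data (made-up numbers): 5 genes measured across 4 samples.
profiles = {
    "geneA": [10, 20, 30, 40],
    "geneB": [11, 19, 31, 42],
    "geneC": [40, 30, 20, 10],
    "geneD": [39, 31, 22, 9],
    "geneE": [5, 5, 5, 5],
}
groups = bin_co_abundant_genes(profiles)
print(groups)
```

In this toy input, geneA and geneB track each other, geneC and geneD follow the opposite trend, and geneE is constant, so the sketch yields three groups. A real analysis would involve millions of genes and hundreds of samples, where an all-against-seed scan like this one would likely need a more scalable clustering strategy.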
Similar articles
- MSPminer: abundance-based reconstitution of microbial pan-genomes from shotgun metagenomic data. Bioinformatics. 2019 May 1;35(9):1544-1552. doi: 10.1093/bioinformatics/bty830. PMID: 30252023. Free PMC article.
- Clustering co-abundant genes identifies components of the gut microbiome that are reproducibly associated with colorectal cancer and inflammatory bowel disease. Microbiome. 2019 Aug 1;7(1):110. doi: 10.1186/s40168-019-0722-6. PMID: 31370880. Free PMC article.
- ReprDB and panDB: minimalist databases with maximal microbial representation. Microbiome. 2018 Jan 18;6(1):15. doi: 10.1186/s40168-018-0399-2. PMID: 29347966. Free PMC article.
- Classification of metagenomic sequences: methods and challenges. Brief Bioinform. 2012 Nov;13(6):669-81. doi: 10.1093/bib/bbs054. Epub 2012 Sep 8. PMID: 22962338. Review.
- Application of computational approaches to analyze metagenomic data. J Microbiol. 2021 Mar;59(3):233-241. doi: 10.1007/s12275-021-0632-8. Epub 2021 Feb 10. PMID: 33565054. Review.
Cited by
- Draft genome sequences of 24 microbial strains assembled from direct sequencing from 4 stool samples. Genome Announc. 2015 May 28;3(3):e00526-15. doi: 10.1128/genomeA.00526-15. PMID: 26021920. Free PMC article.
- MAGICIAN: MAG simulation for investigating criteria for bioinformatic analysis. BMC Genomics. 2024 Jan 12;25(1):55. doi: 10.1186/s12864-023-09912-2. PMID: 38216924. Free PMC article.
- Salinity affects microbial function genes related to nutrient cycling in arid regions. Front Microbiol. 2024 Jun 14;15:1407760. doi: 10.3389/fmicb.2024.1407760. eCollection 2024. PMID: 38946896. Free PMC article.
- AFITbin: a metagenomic contig binning method using aggregate l-mer frequency based on initial and terminal nucleotides. BMC Bioinformatics. 2024 Jul 16;25(1):241. doi: 10.1186/s12859-024-05859-7. PMID: 39014300. Free PMC article.
- The evolution of bacterial genome assemblies - where do we need to go next? Microbiome Res Rep. 2022 Apr 12;1(3):15. doi: 10.20517/mrr.2022.02. eCollection 2022. PMID: 38046358. Free PMC article.
