PLoS One. 2017 May 3;12(5):e0176469.
doi: 10.1371/journal.pone.0176469. eCollection 2017.

MGmapper: Reference Based Mapping and Taxonomy Annotation of Metagenomics Sequence Reads

Free PMC article

Thomas Nordahl Petersen et al. PLoS One. 2017.

Abstract

An increasing number of species and gene identification studies rely on next-generation sequence analysis of either single-isolate or metagenomics samples. Several methods are available to perform taxonomic annotation, and a previous metagenomics benchmark study showed that large numbers of false positive species annotations are a problem unless thresholds or post-processing are applied to separate correct from false annotations. MGmapper is a package that processes raw next-generation sequence data and performs reference-based sequence assignment, followed by a post-processing analysis to produce reliable taxonomy annotation at species and strain level resolution. An in vitro bacterial mock community sample comprising 8 genera, 11 species and 12 strains was previously used to benchmark metagenomics classification methods. After applying a post-processing filter, we obtained 100% correct taxonomy assignments at species and genus level. Sensitivity and precision of 75% were obtained for strain level annotations. In a species-level comparison with Kraken, MGmapper assigned taxonomy using 84.8% of the sequence reads, compared with 70.5% for Kraken; both methods identified all species with no false positives. Extensive read count statistics are provided in plain text and Excel sheets for both rejected and accepted taxonomy annotations. Custom databases can be used with the command-line version of MGmapper, and the complete pipeline is freely available as a Bitbucket package (https://bitbucket.org/genomicepidemiology/mgmapper). A web version (https://cge.cbs.dtu.dk/services/MGmapper) provides the basic functionality for analysis of small fastq datasets.
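As a reminder of how the sensitivity and precision figures quoted above are defined, the following minimal sketch computes both from a set of predicted and a set of true taxonomy labels. The strain names and counts are hypothetical and chosen only for illustration; they are not the mock community data from the study, and the function name is not part of MGmapper.

```python
def precision_and_sensitivity(predicted, truth):
    """Return (precision, sensitivity) for two collections of taxon labels."""
    predicted, truth = set(predicted), set(truth)
    tp = len(predicted & truth)   # annotated and truly present
    fp = len(predicted - truth)   # annotated but not present
    fn = len(truth - predicted)   # present but not annotated
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, sensitivity


# Hypothetical labels, not taken from the paper.
truth = {"strain_A", "strain_B", "strain_C", "strain_D"}
predicted = {"strain_A", "strain_B", "strain_C", "strain_X"}

precision, sensitivity = precision_and_sensitivity(predicted, truth)
print(f"precision={precision:.2f} sensitivity={sensitivity:.2f}")
```

Precision penalizes false positive annotations, while sensitivity penalizes taxa that are present but missed, which is why the abstract reports both.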

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. A schematic flowchart for processing of paired-end sequences with MGmapper.
MGmapper processes fastq reads in four steps: (I) Trimming, and mapping of reads against a phiX bacteriophage to remove potential positive control reads. (II) Mapping to the specified reference databases, and post-processing of BWA-MEM alignments to remove reads with a low alignment score or insufficient alignment coverage. (III) Identification of best hits. In bestmode, a read pair is assigned to only one specific reference sequence, based on the highest sum of alignment scores. In fullmode, a read pair is assigned to a reference sequence even if a higher alignment score is found when mapping to another reference sequence database; this provides the best target match considering only the sequences present in one particular reference database. (IV) Compilation of abundance statistics, read and nucleotide counts, depth, coverage, and summary reports.
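The difference between bestmode and fullmode in steps (II) and (III) can be illustrated with a small sketch. This is not MGmapper's code: the `Hit` record, the function names, the threshold values and the tie handling (first hit seen wins) are all assumptions made for illustration, and the example hits are invented.

```python
from collections import namedtuple

# Hypothetical record for one read pair aligned to one reference sequence.
Hit = namedtuple("Hit", "pair_id database reference score_sum aligned_fraction")


def filter_hits(hits, min_score=30, min_fraction=0.8):
    """Drop hits with a low alignment score or insufficient alignment coverage.

    Both thresholds are illustrative assumptions, not MGmapper defaults.
    """
    return [h for h in hits
            if h.score_sum >= min_score and h.aligned_fraction >= min_fraction]


def assign_bestmode(hits):
    """Keep one hit per read pair: the highest score sum across all databases."""
    best = {}
    for h in hits:
        current = best.get(h.pair_id)
        if current is None or h.score_sum > current.score_sum:
            best[h.pair_id] = h
    return best


def assign_fullmode(hits):
    """Keep the best hit per read pair within each database independently."""
    best = {}
    for h in hits:
        key = (h.pair_id, h.database)
        current = best.get(key)
        if current is None or h.score_sum > current.score_sum:
            best[key] = h
    return best


# Invented example hits.
hits = [
    Hit("pair1", "Bacteria", "ref_b1", 280, 0.98),
    Hit("pair1", "Plasmid", "ref_p1", 250, 0.95),
    Hit("pair2", "Bacteria", "ref_b2", 20, 0.99),   # rejected: low score
    Hit("pair3", "Plasmid", "ref_p2", 260, 0.50),   # rejected: low coverage
]

kept = filter_hits(hits)
best = assign_bestmode(kept)
full = assign_fullmode(kept)
print(best["pair1"].reference)
print(sorted(full))
```

In this toy input, bestmode assigns `pair1` only to the Bacteria reference, whereas fullmode also retains its weaker Plasmid hit, because each database is considered on its own.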

Grant support

This study was funded by the European Community's Seventh Framework Programme [FP7/2007–2013], under grant agreement n°613754, and by The Villum Foundation (VKR023052). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.