Shotgun metagenome data of a defined mock community using Oxford Nanopore, PacBio and Illumina technologies

Sci Data. 2019 Nov 26;6(1):285. doi: 10.1038/s41597-019-0287-z.


Metagenomic sequence data from defined mock communities is crucial for the assessment of sequencing platform performance and downstream analyses, including assembly, binning and taxonomic assignment. We report a comparison of shotgun metagenome sequencing and assembly metrics of a defined microbial mock community using the Oxford Nanopore Technologies (ONT) MinION, PacBio and Illumina sequencing platforms. Our synthetic microbial community BMock12 consists of 12 bacterial strains with genome sizes spanning 3.2-7.2 Mbp, 40-73% GC content, and 1.5-7.3% repeats. Size selection of both PacBio and ONT sequencing libraries prior to sequencing was essential to yield comparable relative abundances of organisms among all sequencing technologies. While the Illumina-based metagenome assembly yielded good coverage with few misassemblies, both the Illumina + ONT and Illumina + PacBio hybrid assemblies greatly improved contiguity but also increased misassemblies, most notably in genomes with high sequence similarity to each other. Our resulting datasets allow evaluation and benchmarking of bioinformatics software on Illumina, PacBio and ONT platforms in parallel.
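Two of the quantities mentioned in the abstract, GC content and assembly contiguity, can be illustrated with a minimal Python sketch. The abstract does not say which contiguity metric was used; N50 is assumed here only because it is a common choice, and the function names `gc_content` and `n50` are hypothetical helpers written for this illustration, not part of the published workflow.

```python
from typing import Iterable


def gc_content(seq: str) -> float:
    """Return the percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    # Ambiguous bases such as N are ignored; an empty or all-N input gives 0.0.
    return 100.0 * gc / acgt if acgt else 0.0


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths (assumed contiguity metric).

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # no contigs supplied


if __name__ == "__main__":
    # Toy inputs, not data from the study.
    print(gc_content("ATGCGC"))
    print(n50([2, 3, 4, 5, 6]))
```

A higher N50 for the same total assembly length indicates fewer, longer contigs, which is the sense in which a hybrid assembly can be more contiguous than a short-read-only one.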

Publication types

  • Dataset
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Bacteria / classification
  • High-Throughput Nucleotide Sequencing
  • Metagenome*
  • Microbiota*
  • Sequence Analysis, DNA / methods*