Assembly of the Complete Sitka Spruce Chloroplast Genome Using 10X Genomics' GemCode Sequencing Data

PLoS One. 2016 Sep 15;11(9):e0163059. doi: 10.1371/journal.pone.0163059. eCollection 2016.

Abstract

The linked-read sequencing library preparation platform by 10X Genomics produces barcoded sequencing libraries, which are subsequently sequenced using Illumina short-read sequencing technology. In this new approach, long fragments of DNA are partitioned into separate micro-reactions, where the same index sequence is incorporated into each of the sequencing fragment inserts derived from a given long fragment. In this study, we exploited this property by using reads from index sequences associated with a large number of reads to assemble the chloroplast genome of the Sitka spruce tree (Picea sitchensis). Here we report on the first Sitka spruce chloroplast genome assembled exclusively from P. sitchensis genomic libraries prepared using the 10X Genomics protocol. We show that the resulting 124,049-base-pair genome shares high sequence similarity with the related white spruce and Norway spruce chloroplast genomes, but diverges substantially from a previously published P. sitchensis-P. thunbergii chimeric genome. The use of reads from high-frequency indices enabled separation of the nuclear genome reads from those of the chloroplast, which resulted in the simplification of the de Bruijn graphs used at the various stages of assembly.
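
The read selection described above, keeping only reads whose index sequence is shared by many reads, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `select_high_frequency_reads`, the `(index, sequence)` read representation, the toy reads, and the cutoff `min_reads_per_index` are all hypothetical, and the abstract does not state how the cutoff was chosen.

```python
from collections import Counter


def select_high_frequency_reads(reads, min_reads_per_index):
    """Keep reads whose index sequence is shared by at least
    `min_reads_per_index` reads.

    `reads` is an iterable of (index_sequence, read_sequence) pairs.
    Both the representation and the cutoff are assumptions made for
    illustration; they are not taken from the paper.
    """
    reads = list(reads)
    # Count how many reads carry each index sequence.
    index_counts = Counter(index for index, _ in reads)
    # Indices at or above the cutoff are treated as high-frequency.
    high_frequency = {
        index for index, count in index_counts.items()
        if count >= min_reads_per_index
    }
    # Preserve the original read order while filtering.
    return [(index, seq) for index, seq in reads if index in high_frequency]


# Toy input: three reads share index "AAAC", two share "CCAT", one has "GGTT".
reads = [
    ("AAAC", "ACGT"),
    ("AAAC", "CGTA"),
    ("AAAC", "GTAC"),
    ("GGTT", "TTGA"),
    ("CCAT", "GGCA"),
    ("CCAT", "CATG"),
]
selected = select_high_frequency_reads(reads, min_reads_per_index=3)
```

With the toy cutoff of 3, only the reads tagged "AAAC" are retained. In practice the selected reads would then be passed to an assembler, and the cutoff would need to be tuned to the library.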

MeSH terms

  • Chloroplasts / genetics*
  • Genome, Plant*
  • Phylogeny
  • Picea / classification
  • Picea / genetics*

Grants and funding

This study was supported by the BC Cancer Foundation, Genome BC and Genome Canada under grant award 212SEQ. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.