PacBio SMRT assembly of a complex multi-replicon genome reveals chlorocatechol degradative operon in a region of genome plasticity

Gene. 2016 Jul 25;586(2):239-47. doi: 10.1016/j.gene.2016.04.018. Epub 2016 Apr 7.

Abstract

We have sequenced a Burkholderia genome that contains multiple replicons and large repetitive elements, features that make it inherently difficult to assemble with short-read sequencing technologies. We illustrate how the integrated long-read correction algorithms implemented through the PacBio Single Molecule Real-Time (SMRT) sequencing technology successfully provided a de novo assembly that, without any further modification, is a reasonable estimate of both gene content and genome organization. This assembly is comparable to those of related organisms assembled by more labour-intensive methods. Our assembled genome revealed regions of genome plasticity for further investigation, one of which harbours a chlorocatechol degradative operon highly homologous to those previously identified on globally ubiquitous plasmids. In an ideal world, this assembly would still require experimental validation to confirm gene order and copy number of repeated elements. However, we submit that, particularly in instances where a polished genome is not the primary goal of the sequencing project, PacBio SMRT sequencing provides a financially viable option for generating a biologically relevant genome estimate that can be utilized by other researchers for comparative studies.

Keywords: Burkholderia; Chlorocatechol degradation genes; Genome assembly; Mobile elements; Next generation sequencing; Plasmid.

MeSH terms

  • Burkholderia / genetics*
  • Catechols / metabolism*
  • Genome, Bacterial*
  • High-Throughput Nucleotide Sequencing
  • Operon*
  • Repetitive Sequences, Nucleic Acid
  • Replicon
  • Sequence Analysis, DNA
  • Species Specificity

Substances

  • Catechols