A pipeline for completing bacterial genomes using in silico and wet lab approaches

BMC Genomics. 2015;16 Suppl 3(Suppl 3):S7. doi: 10.1186/1471-2164-16-S3-S7. Epub 2015 Jan 29.

Abstract

Background: Despite the large volume of genome sequencing data produced by next-generation sequencing technologies and the highly sophisticated software dedicated to handling these types of data, gaps are commonly found in draft genome assemblies. The existence of gaps compromises our ability to take full advantage of the genome data. This study aims to identify a practical approach for biologists to complete their own genome assemblies using commonly available tools and resources.

Results: A pipeline was developed to assemble complete genomes primarily from next-generation sequencing (NGS) data. The input to the pipeline is paired-end Illumina sequence reads, and the output is a high-quality complete genome sequence. The pipeline alternates between computational and biological methods over seven steps. It combines the strengths of de novo assembly, reference-based assembly, customized programming, use of public databases, and wet lab experimentation. The application of the pipeline is demonstrated by the completion of a bacterial genome, that of Thermotoga sp. strain RQ7, a hydrogen-producing strain.
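As an illustration of the kind of customized programming such a pipeline might involve, the following minimal sketch locates gaps in a draft sequence. It assumes gaps appear as runs of N in a scaffold, which is a common convention but is not stated in the abstract. The function name find_gaps and the example sequence are hypothetical and are not taken from the paper.

```python
import re


def find_gaps(sequence, min_length=1):
    """Return (start, end) pairs for runs of N in a sequence.

    Coordinates are 0-based and half-open, so sequence[start:end]
    is the gap itself. Assumes gaps are encoded as N or n; this is
    an assumption, not a detail taken from the paper.
    """
    pattern = re.compile(r"[Nn]{%d,}" % min_length)
    return [(m.start(), m.end()) for m in pattern.finditer(sequence)]


# Hypothetical scaffold with two gaps, used only for illustration.
scaffold = "ACGTNNNNNACGTTTNNACG"
for start, end in find_gaps(scaffold):
    print(f"gap at {start}-{end}, length {end - start}")
```

Raising min_length filters out short runs of ambiguous bases, which may be useful when only gaps large enough to need wet lab closure are of interest.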

Conclusions: The developed pipeline provides an example of effective integration of computational and biological principles. It highlights the complementary roles that in silico and wet lab methodologies play in bioinformatics studies. Its constituent principles and methods are applicable to similar studies on both prokaryotic and eukaryotic genomes.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • Genome, Bacterial*
  • Gram-Negative Anaerobic Straight, Curved, and Helical Rods / classification*
  • Gram-Negative Anaerobic Straight, Curved, and Helical Rods / genetics*
  • High-Throughput Nucleotide Sequencing*
  • Sequence Analysis, DNA*
  • Software*
  • Thermotoga maritima / genetics
  • Thermotoga neapolitana / genetics