A combined de novo assembly approach increases the quality of prokaryotic draft genomes

Folia Microbiol (Praha). 2022 Oct;67(5):801-810. doi: 10.1007/s12223-022-00980-7. Epub 2022 Jun 6.

Abstract

Next-generation sequencing methods provide comprehensive data for the structural and functional analysis of the genome. Draft genomes with a low contig number and a high N50 value can give insight into the structure of the genome and provide information for its annotation. In this study, we designed a pipeline that can be used to assemble prokaryotic draft genomes with a low number of contigs and a high N50 value. We aimed to use a combination of two de novo assembly tools (SPAdes and IDBA-Hybrid) and to evaluate the impact of this approach on the quality metrics of the assemblies. The pipeline was tested on raw short-read sequence data (< 300 bp) for a total of 10 species from four different genera. To obtain the final draft genomes, we first assembled the reads using SPAdes and used the 16S rRNA sequence extracted from this assembly to find a closely related organism. The IDBA-Hybrid assembler was then used to produce a second assembly, guided by the genome of the closely related organism. Finally, SPAdes was run again using the second assembly, produced by IDBA-Hybrid, as a hint. The results were evaluated using QUAST and BUSCO. The pipeline succeeded in reducing the number of contigs and increasing the N50 values of the draft genome assemblies while preserving the coverage of the draft genomes.
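
For readers unfamiliar with the N50 metric referred to above, the following is a minimal sketch of how it is conventionally computed from a list of contig lengths: the length of the contig at which the cumulative length, counted from the longest contig downward, first reaches half of the total assembly length. The function name `n50` and the example lengths are illustrative assumptions and are not taken from the study; QUAST reports this statistic directly.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths (in bp).

    Illustrative helper, not part of the published pipeline.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")

    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths for two assemblies of the same total size.
contiguous = [100, 80, 60, 40, 20]
fragmented = [50, 50, 50, 50, 50, 50]

print(n50(contiguous))
print(n50(fragmented))
```

With the same total length, the assembly with fewer, longer contigs yields the higher N50, which is why the two metrics are reported together.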

Keywords: Bacteria; De novo assembly; Draft genome; NGS; Prokaryotes; Short reads.

MeSH terms

  • High-Throughput Nucleotide Sequencing* / methods
  • Sequence Analysis, DNA / methods