Improved de novo genome assembly and analysis of the Chinese cucurbit Siraitia grosvenorii, also known as monk fruit or luo-han-guo

Gigascience. 2018 Jun 1;7(6):giy067. doi: 10.1093/gigascience/giy067.

Abstract

Background: Luo-han-guo (Siraitia grosvenorii), also called monk fruit, is a member of the Cucurbitaceae family. Monk fruit has become an important research subject because of the pharmacological and economic potential of its noncaloric, extremely sweet components (mogrosides). It is also commonly used in traditional Chinese medicine to treat lung congestion, sore throat, and constipation. Recently, a single reference genome became available for monk fruit, assembled from Illumina reads at 36.9x genome coverage. This assembly has a relatively short contig N50 length (34.2 kb) and lacks integrated annotations. These drawbacks make it difficult to use as a reference for assembling transcriptomes and discovering novel functional genes.

Findings: Here, we present a new high-quality draft of the S. grosvenorii genome, assembled from 31 Gb (∼73.8x) of single-molecule real-time (SMRT) long reads and polished with ∼50 Gb of Illumina paired-end reads. The final genome assembly is approximately 469.5 Mb, with a contig N50 length of 432,384 bp, a 12.6-fold improvement over the previous assembly. We further annotated 237.3 Mb of repetitive sequence and 30,565 consensus protein-coding genes with combined evidence. Phylogenetic analysis showed that S. grosvenorii diverged from other members of the Cucurbitaceae family approximately 40.9 million years ago. With comprehensive transcriptomic analysis and differential expression testing, we identified 4,606 genes up-regulated in the early fruit compared to the leaf, a number of which were linked to metabolic pathways regulating fruit development and ripening.
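
The contig N50 and the 12.6-fold figure quoted above can be checked with a short calculation. The following is a minimal Python sketch, not the authors' pipeline: the function name n50 and the toy contig lengths are hypothetical, and only the two N50 values (34.2 kb and 432,384 bp) are taken from the abstract.

```python
def n50(lengths):
    """Return the N50 of a set of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly size.
    """
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating their lengths.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contig lengths given")


# Toy contig lengths (hypothetical, for illustration only).
toy_contigs = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_contigs)

# N50 values reported in the abstract, in base pairs.
PREVIOUS_N50_BP = 34_200   # earlier Illumina-only assembly (34.2 kb)
NEW_N50_BP = 432_384       # long-read assembly described here

fold_improvement = NEW_N50_BP / PREVIOUS_N50_BP

if __name__ == "__main__":
    print(f"toy N50: {toy_n50}")
    print(f"fold improvement in contig N50: {fold_improvement:.1f}")
```

In practice the lengths would be read from the assembly FASTA file rather than typed in by hand. Because the earlier N50 is reported only to the nearest 0.1 kb, the ratio is meaningful to about one decimal place.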

Conclusions: The availability of this new monk fruit genome assembly, as well as the annotations, will facilitate the discovery of new functional genes and the genetic improvement of monk fruit.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biosynthetic Pathways / genetics
  • Cucurbitaceae / anatomy & histology
  • Cucurbitaceae / genetics*
  • Fruit / anatomy & histology
  • Fruit / genetics*
  • Genome, Plant*
  • Molecular Sequence Annotation
  • Multigene Family
  • Transcriptome / genetics
  • Triterpenes / chemistry
  • Whole Genome Sequencing / methods*

Substances

  • Triterpenes