Developmental gene discovery in a hemimetabolous insect: de novo assembly and annotation of a transcriptome for the cricket Gryllus bimaculatus

PLoS One. 2013 May 6;8(5):e61479. doi: 10.1371/journal.pone.0061479. Print 2013.


Most genomic resources available for insects represent the Holometabola, insects that undergo complete metamorphosis, such as beetles and flies. In contrast, the Hemimetabola (direct-developing insects), representing the basal branches of the insect tree, have very few genomic resources. We have therefore created a large and publicly available transcriptome for the hemimetabolous insect Gryllus bimaculatus (cricket), a well-developed laboratory model organism whose potential for functional genetic experiments is currently limited by the absence of genomic resources. cDNA was prepared using mRNA obtained from adult ovaries containing all stages of oogenesis, and from embryo samples on each day of embryogenesis. Using 454 Titanium pyrosequencing, we sequenced over four million raw reads, and assembled them into 21,512 isotigs (predicted transcripts) and 120,805 singletons with an average coverage per base pair of 51.3. We manually annotated over 400 conserved genes in the transcriptome that are involved in embryonic patterning, gametogenesis, and signaling pathways. BLAST comparison of the transcriptome against the NCBI non-redundant protein database (nr) identified significant similarity to nr sequences for 55.5% of transcriptome sequences, and suggested that the transcriptome may contain 19,874 unique transcripts. For predicted transcripts without significant similarity to known sequences, we assessed their similarity to other orthopteran sequences, and determined that these transcripts contain recognizable protein domains, largely of unknown function. We created a searchable, web-based database to allow public access to all raw, assembled and annotated data. This database is, to our knowledge, the largest de novo assembled and annotated transcriptome resource available for any hemimetabolous insect. We therefore anticipate that these data will contribute significantly to more effective and higher-throughput deployment of molecular analysis tools in Gryllus.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Databases, Genetic
  • Embryonic Development / genetics
  • Female
  • Genes, Insect
  • Gryllidae / embryology
  • Gryllidae / genetics*
  • Gryllidae / metabolism
  • High-Throughput Nucleotide Sequencing
  • Insect Proteins / genetics
  • Insect Proteins / metabolism
  • Male
  • Molecular Sequence Annotation*
  • Oogenesis / genetics
  • Open Reading Frames
  • Phylogeny
  • Sequence Analysis, DNA
  • Signal Transduction
  • Transcriptome*


Substances

  • Insect Proteins

Grant support

This work was partially supported by Harvard Stem Cell Institute Seed Grant SG-0057-10-00, Ellison Medical Foundation New Scholar Award AG-NS-07010-10, and National Science Foundation (NSF) Grant IOS-0817678 to CE, an NSF Predoctoral Fellowship to BEC, a Fletcher Family award from Bowdoin College to HH, DFG Collaborative Research Centre 680 funds to SR, and JSPS KAKENHI 22124003/22370080/23687033 to TM. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.