Global identification of transcription start sites in the genome of Apis mellifera using 5'LongSAGE

J Exp Zool B Mol Dev Evol. 2011 Nov 15;316(7):500-14. doi: 10.1002/jez.b.21421. Epub 2011 Jun 21.

Abstract

The precise identification of the transcription start sites (TSSs) of genes in the honeybee genome will be helpful for inferring start codons and for determining promoter elements. The 5'SAGE approach provides a powerful tool for identifying TSSs in the sequenced genome. The main purpose of this study is to identify the actual TSSs of expressed genes as well as the usage of different TSSs in the Apis mellifera genome. We performed a 5'LongSAGE (5'LS) analysis for the adult drone head, and the TSSs of the expressed genes were determined by mapping the 5'LS tag sequences to the honeybee genome. A total of 8,280 unique 19 bp 5'LS tag sequences were identified that corresponded to 3,655 predicted genes. Out of these tags, 4,998 tags (60.4%) were mapped to a region from -1,000 bp to +100 bp of the start codon of 2,301 reference coding sequences. Notably, we observed that 28-47% of the 3,655 honeybee genes initiated transcription from alternative TSSs. The TSS consensus pattern of the honeybee genes, DT(rich) PyPu(G(rich))(T/A)(T(rich))(3), was obtained by aligning the sequences flanking the 5'LS-TSSs. We also identified three new genes in the regions downstream of 5'LS tags and validated 21 TSSs using RT-PCR amplification. Additionally, 17 genes identified by the 5'LS tags were associated with the Gene Ontology term "behavior." Mapping of the 5'LS tags on the genome not only provided direct evidence of expression for in silico predicted genes but also allowed for the identification of previously unrecognized, novel exons and alternative TSSs.
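The window criterion described in the abstract (a tag falling between -1,000 bp and +100 bp of a start codon) can be illustrated with a minimal sketch. This is not the authors' pipeline: the names `StartCodon`, `relative_offset`, `in_tss_window`, and `percent_mapped` are hypothetical, and the sketch assumes that each tag has already been reduced to a single genomic coordinate, that the window bounds are inclusive, and that "upstream" is interpreted relative to the gene's strand. The last helper only reproduces the reported proportion (4,998 of 8,280 tags).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StartCodon:
    """Hypothetical record for the start codon of a reference coding sequence."""
    gene_id: str
    position: int  # genomic coordinate of the start codon (assumed convention)
    strand: str    # "+" or "-"


def relative_offset(tag_pos: int, start: StartCodon) -> int:
    """Signed distance from the start codon to the tag.

    Negative values are upstream of the start codon, positive values are
    downstream, taking the gene's strand into account.
    """
    delta = tag_pos - start.position
    return delta if start.strand == "+" else -delta


def in_tss_window(tag_pos: int, start: StartCodon,
                  upstream: int = 1000, downstream: int = 100) -> bool:
    """True if the tag lies within [-upstream, +downstream] of the start codon."""
    return -upstream <= relative_offset(tag_pos, start) <= downstream


def percent_mapped(mapped: int, total: int) -> float:
    """Percentage of tags that fall inside the window, rounded to one decimal."""
    return round(100.0 * mapped / total, 1)


if __name__ == "__main__":
    plus_gene = StartCodon("geneA", 5000, "+")   # illustrative values only
    minus_gene = StartCodon("geneB", 5000, "-")
    print(in_tss_window(4200, plus_gene))   # 800 bp upstream on the + strand
    print(in_tss_window(5800, minus_gene))  # 800 bp upstream on the - strand
    print(percent_mapped(4998, 8280))       # proportion reported in the abstract
```

On the minus strand, upstream positions have larger genomic coordinates, which is why the sign of the offset is flipped before the window test.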

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Base Sequence
  • Bees / genetics*
  • Bees / metabolism
  • Chromosome Mapping
  • Databases, Nucleic Acid
  • Exons / genetics
  • Expressed Sequence Tags / metabolism
  • Gene Expression / genetics
  • Gene Expression Profiling / methods*
  • Genes, Insect / genetics
  • Genome, Insect / genetics*
  • Molecular Sequence Data
  • Promoter Regions, Genetic
  • RNA, Messenger / isolation & purification
  • Sequence Tagged Sites
  • Transcription Initiation Site*

Substances

  • RNA, Messenger