Genomic sequence around butterfly wing development genes: annotation and comparative analysis

PLoS One. 2011;6(8):e23778. doi: 10.1371/journal.pone.0023778. Epub 2011 Aug 31.


Background: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns are being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions.

Methodology/principal findings: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes).

Conclusions: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite more than 140 million years (MY) of divergence. Our results lay the groundwork for further studies of two new findings, one in coding and one in non-coding sequence: 1) the Alcohol dehydrogenase expansion, with higher similarity among the five tandemly repeated B. anynana paralogs than between them and the corresponding B. mori orthologs, and 2) the high conservation of non-coding sequence around the genes wingless and Ecdysone receptor, both involved in multiple developmental processes including wing pattern formation.
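
The paralog-versus-ortholog comparison in point 1 can be illustrated with a minimal sketch. This is not the authors' method: it assumes the sequences are already aligned to equal length, uses simple percent identity as the similarity measure, and the function names and toy sequences are hypothetical placeholders rather than real Adh data.

```python
from itertools import combinations, product
from statistics import mean


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned sequences, ignoring gapped columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where neither sequence has a gap character.
    columns = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def mean_within(seqs: list[str]) -> float:
    """Mean identity over all pairs inside one group (e.g. paralogs)."""
    return mean(percent_identity(a, b) for a, b in combinations(seqs, 2))


def mean_between(group_a: list[str], group_b: list[str]) -> float:
    """Mean identity over all cross-group pairs (e.g. paralogs vs. orthologs)."""
    return mean(percent_identity(a, b) for a, b in product(group_a, group_b))


if __name__ == "__main__":
    # Toy aligned sequences, invented for illustration only (NOT real data).
    toy_paralogs = ["ATGGCTAAAG", "ATGGCTAAAC", "ATGGCAAAAG"]
    toy_orthologs = ["ATGACTTAAG", "ATCGCTTTAG"]
    print("within paralogs:   ", round(mean_within(toy_paralogs), 1))
    print("paralogs vs. other:", round(mean_between(toy_paralogs, toy_orthologs), 1))
```

A within-group mean that exceeds the between-group mean corresponds to the pattern described for the B. anynana paralogs; a real analysis would start from a proper multiple alignment and would typically use a phylogenetic method rather than raw identity.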

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Alcohol Dehydrogenase / genetics
  • Animals
  • Base Composition / genetics
  • Base Sequence
  • Bombyx / genetics
  • Butterflies / genetics*
  • Butterflies / growth & development*
  • Chromosomes, Artificial, Bacterial / genetics
  • Computational Biology
  • Conserved Sequence / genetics
  • DNA Transposable Elements / genetics
  • DNA, Intergenic / genetics
  • Databases, Genetic
  • Expressed Sequence Tags
  • Gene Order / genetics
  • Genes, Developmental / genetics*
  • Genes, Insect / genetics*
  • MicroRNAs / genetics
  • Molecular Sequence Annotation*
  • Molecular Sequence Data
  • Open Reading Frames / genetics
  • Phylogeny
  • Repetitive Sequences, Nucleic Acid / genetics
  • Reproducibility of Results
  • Sequence Homology, Nucleic Acid
  • Synteny / genetics
  • Wings, Animal / growth & development*
  • Wings, Animal / metabolism*


Substances

  • DNA Transposable Elements
  • DNA, Intergenic
  • MicroRNAs
  • Alcohol Dehydrogenase

Associated data

  • GENBANK/AC239114
  • GENBANK/AC239115
  • GENBANK/AC239116
  • GENBANK/AC239117
  • GENBANK/AC239118
  • GENBANK/AC239119
  • GENBANK/AC239120
  • GENBANK/AC239121
  • GENBANK/AC239122
  • GENBANK/AC239123
  • GENBANK/AC239124
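
The eleven accessions above form a contiguous range, so the corresponding records can be addressed programmatically. The sketch below only builds request URLs for the NCBI E-utilities `efetch` endpoint and performs no network access; the helper names `accession_range` and `efetch_url` are hypothetical, and the choice of GenBank flat-file output (`rettype=gb`) is an assumption about what a reader would want.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities efetch service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def accession_range(prefix: str, first: int, last: int, width: int = 6) -> list[str]:
    """Return accessions prefix+number for every number in [first, last]."""
    return [f"{prefix}{n:0{width}d}" for n in range(first, last + 1)]


def efetch_url(accession: str, rettype: str = "gb") -> str:
    """Build an efetch URL for one nucleotide record as plain text."""
    query = urlencode(
        {"db": "nuccore", "id": accession, "rettype": rettype, "retmode": "text"}
    )
    return f"{EFETCH}?{query}"


# The accessions listed under "Associated data": AC239114 through AC239124.
accessions = accession_range("AC", 239114, 239124)

if __name__ == "__main__":
    for acc in accessions:
        print(efetch_url(acc))
```

Passing `rettype="fasta"` instead would request sequence-only output; anyone fetching these URLs in bulk should follow NCBI's usage guidelines on request rates.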