Sequence context of indel mutations and their effect on protein evolution in a bacterial endosymbiont

Genome Biol Evol. 2013;5(3):599-605. doi: 10.1093/gbe/evt033.

Abstract

Indel mutations play key roles in genome and protein evolution, yet we lack a comprehensive understanding of how indels impact evolutionary processes. Genome-wide analyses enabled by next-generation sequencing can clarify the context and effect of indels, thereby integrating a more detailed consideration of indels with our knowledge of nucleotide substitutions. To this end, we sequenced Blochmannia chromaiodes, an obligate bacterial endosymbiont of carpenter ants, and compared it with its close relative, B. pennsylvanicus. The genetic distance between these species is small enough for accurate whole-genome alignment but large enough to provide a meaningful spectrum of indel mutations. We found that indels are subject to purifying selection in coding regions and even in intergenic regions, both of which show a reduced rate of indel base pairs per kilobase compared with nonfunctional pseudogenes. Indels occur almost exclusively in repeat regions composed of homopolymers and multimeric simple sequence repeats, demonstrating the importance of sequence context for indel mutations. Despite purifying selection, some indels occur in protein-coding genes. Most have lengths that are multiples of three, indicating selective pressure to maintain the reading frame. The deleterious effect of frameshift-inducing indels is minimized either by compensation from a nearby indel that restores the reading frame or by the indel's location near the 3'-end of the gene. We observed amino acid divergence exceeding nucleotide divergence in regions affected by frameshift-inducing indels, suggesting that these indels may either drive adaptive protein evolution or initiate gene degradation. Our results shed light on how indel mutations impact processes of molecular evolution underlying endosymbiont genome evolution.
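
The two summary measures mentioned in the abstract, indel base pairs per kilobase and whether an indel length is a multiple of three, can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes a gapped pairwise alignment supplied as two equal-length strings with "-" marking gaps, and it normalizes by alignment length, which is an assumption made here for illustration. The function names find_indels and summarize_indels are hypothetical.

```python
from itertools import groupby


def find_indels(aln_a, aln_b):
    """Return (start_column, length) for each gap run in a pairwise alignment.

    Assumes aln_a and aln_b are equal-length aligned strings using '-' for gaps.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")

    # Label each column: 'a' = gap only in A, 'b' = gap only in B, None = no indel.
    labels = []
    for x, y in zip(aln_a, aln_b):
        if x == "-" and y != "-":
            labels.append("a")
        elif y == "-" and x != "-":
            labels.append("b")
        else:
            labels.append(None)

    # Collapse consecutive columns with the same label into one indel event.
    indels = []
    col = 0
    for label, run in groupby(labels):
        n = len(list(run))
        if label is not None:
            indels.append((col, n))
        col += n
    return indels


def summarize_indels(aln_a, aln_b):
    """Count indels, split them by reading-frame effect, and compute indel bp per kb.

    Normalizing by alignment length is an illustrative assumption.
    """
    indels = find_indels(aln_a, aln_b)
    indel_bp = sum(n for _, n in indels)
    in_frame = sum(1 for _, n in indels if n % 3 == 0)
    return {
        "count": len(indels),
        "in_frame": in_frame,                  # length is a multiple of three
        "frameshift": len(indels) - in_frame,  # would shift the reading frame
        "indel_bp_per_kb": 1000.0 * indel_bp / len(aln_a) if aln_a else 0.0,
    }


if __name__ == "__main__":
    # Toy alignment: one 1-bp gap and one 3-bp gap.
    print(summarize_indels("ATGAAA---CCCTGA", "ATGA-AGGGCCCTGA"))
```

Applying the same summary separately to coding, intergenic, and pseudogene alignments would allow the kind of rate comparison the abstract describes, although the region definitions and alignment method would need to come from the full paper.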

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Ants / microbiology*
  • Ants / physiology
  • Bacterial Proteins / genetics*
  • Base Sequence
  • DNA, Intergenic
  • Enterobacteriaceae / classification
  • Enterobacteriaceae / genetics*
  • Evolution, Molecular*
  • Genetic Variation
  • INDEL Mutation*
  • Microsatellite Repeats
  • Molecular Sequence Data
  • Phylogeny
  • Selection, Genetic
  • Symbiosis*

Substances

  • Bacterial Proteins
  • DNA, Intergenic

Associated data

  • GENBANK/JX966368
  • RefSeq/NC_020075