Homopolymer length variation in the Drosophila gene mastermind

J Mol Evol. 1993 Nov;37(5):483-95. doi: 10.1007/BF00160429.

Abstract

Runs of identical amino acids encoded by triplet repeats (homopolymers) are components of numerous proteins, yet their role is poorly understood. Large numbers of homopolymers are present in the Drosophila melanogaster mastermind (mam) protein surrounding several unique charged amino acid clusters. Comparison of mam sequences from D. virilis and D. melanogaster reveals a high level of amino acid conservation in the charged clusters. In contrast, significant divergence is found in repetitive regions resulting from numerous amino acid replacements and large insertions and deletions. It appears that repetitive regions are under less selective pressure than unique regions, consistent with the idea that homopolymers act as flexible spacers separating functional domains in proteins. Notwithstanding extensive length variation in intervening homopolymers, there is extreme conservation of the amino acid spacing of specific charge clusters. The results support a model where homopolymer length variability is constrained by natural selection.
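
As a rough illustration of what a homopolymer run means in sequence terms, the following Python sketch scans a protein sequence for runs of identical residues. The function name homopolymer_runs, the min_len threshold of 4, and the toy sequence are assumptions made for illustration only. None of them come from the paper, and the toy sequence is invented rather than taken from the mam protein.

```python
from itertools import groupby


def homopolymer_runs(protein, min_len=4):
    """Return (residue, start, length) for each run of identical residues.

    Only runs of at least min_len residues are reported; start is 0-based.
    The default threshold is an arbitrary assumption, not a value from the paper.
    """
    runs = []
    pos = 0
    for residue, group in groupby(protein):
        length = sum(1 for _ in group)  # size of this run of identical residues
        if length >= min_len:
            runs.append((residue, pos, length))
        pos += length
    return runs


# Invented toy sequence (NOT the real mam protein): short charged stretches
# separated by runs of Q, G and A.
toy = "MKDE" + "Q" * 6 + "RKED" + "G" * 5 + "A" * 7 + "LDE"

for residue, start, length in homopolymer_runs(toy):
    print(residue, start, length)
```

Applied to two aligned orthologous sequences, a scan of this kind would give the run lengths whose variation the abstract describes. The alignment itself is outside the scope of this sketch.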

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Chromosome Banding
  • Codon
  • Conserved Sequence
  • Drosophila / embryology
  • Drosophila / genetics*
  • Drosophila Proteins*
  • Drosophila melanogaster / embryology
  • Drosophila melanogaster / genetics
  • Genes, Insect / genetics*
  • Genome
  • Insect Hormones / genetics*
  • Molecular Sequence Data
  • Nuclear Proteins / genetics*
  • Polynucleotides / genetics*
  • RNA, Messenger / isolation & purification
  • Repetitive Sequences, Nucleic Acid*
  • Restriction Mapping
  • Sequence Homology, Amino Acid
  • Tissue Distribution

Substances

  • Codon
  • Drosophila Proteins
  • Insect Hormones
  • Nuclear Proteins
  • Polynucleotides
  • RNA, Messenger
  • mam protein, Drosophila

Associated data

  • GENBANK/M92914
  • GENBANK/X54251