Retroposed copies of the HMG genes: a window to genome dynamics

Genome Res. 2003 May;13(5):800-12. doi: 10.1101/gr.893803.


Retroposed copies (RPCs) of genes are functional (intronless paralogs) or nonfunctional (processed pseudogenes) copies derived from mRNA through a process of retrotransposition. Previous studies found that gene families involved in mRNA translation or nuclear function were more likely to have large numbers of RPCs. Here we characterize RPCs of the few families coding for the abundant high-mobility-group (HMG) proteins in humans. Using an algorithm we developed, we identified and studied 219 HMG RPCs. For slightly more than 10% of these RPCs, we found evidence indicating expression. Furthermore, eight of these are potentially new members of the HMG families of proteins. For three RPCs, the evidence indicated expression as part of other transcripts; in all of these, we found the presence of alternative splicing or multiple polyadenylation signals. RPC distribution among the HMGs was not even, with 33-65 each for HMGB1, HMGB3, HMGN1, and HMGN2, and 0-6 each for HMGA1, HMGA2, HMGB2, and HMGN3. Analysis of the sequences flanking the RPCs revealed that the junction between the target site duplications and the 5'-flanking sequences exhibited the same TT/AAAA consensus found for the L1 endonuclease, supporting an L1-mediated retrotransposition mechanism. Finally, because our algorithm included aligning RPC flanking sequences with the corresponding HMG genomic sequence, we were able to identify transcribed regions of HMG genes that were not part of the published mRNA sequences.
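
The flanking-sequence analysis described above can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes the RPC boundaries are known exactly, so that the target site duplication (TSD) is the direct repeat ending the upstream flank and starting the downstream flank, and it assumes the sequences are given on the strand where the consensus reads TT followed by AAAA. The function names, length limits, and example sequences are hypothetical.

```python
from typing import Optional


def find_tsd(upstream: str, downstream: str,
             min_len: int = 6, max_len: int = 30) -> Optional[str]:
    """Return the longest direct repeat that ends `upstream` and starts
    `downstream`, or None if no repeat of at least `min_len` bases exists.

    Assumption: the flanks abut the RPC exactly, so the two TSD copies sit
    immediately next to the insertion (hypothetical simplification).
    """
    upstream, downstream = upstream.upper(), downstream.upper()
    limit = min(max_len, len(upstream), len(downstream))
    # Try the longest candidate first so the first hit is the longest repeat.
    for k in range(limit, min_len - 1, -1):
        if upstream[-k:] == downstream[:k]:
            return upstream[-k:]
    return None


def junction_matches_consensus(upstream: str, tsd: str) -> bool:
    """Check the TT/AAAA pattern at the junction between the 5'-flanking
    sequence and the TSD: 'TT' just before the TSD, 'AAAA' starting it.

    Assumption: `upstream` ends with `tsd`, and strand orientation matches
    the consensus as quoted in the abstract.
    """
    upstream = upstream.upper()
    before_tsd = upstream[:-len(tsd)][-2:]
    return before_tsd == "TT" and tsd.upper().startswith("AAAA")


# Made-up example flanks, for illustration only.
up = "GCGCATTAAAAGTCCA"
down = "AAAAGTCCATGGCA"
tsd = find_tsd(up, down)
if tsd is not None:
    print(tsd, junction_matches_consensus(up, tsd))
```

In real data the RPC boundaries are rarely exact and TSDs may contain mismatches, so a practical implementation would need a tolerant alignment step rather than exact suffix/prefix matching.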

MeSH terms

  • Chromosome Mapping / methods
  • Computational Biology
  • Databases, Genetic
  • Evolution, Molecular
  • Gene Expression Regulation / genetics
  • Genes, Duplicate / genetics*
  • Genome, Human*
  • HMGA Proteins / biosynthesis
  • HMGA Proteins / genetics
  • HMGB Proteins / biosynthesis
  • HMGB Proteins / genetics
  • HMGN Proteins / biosynthesis
  • HMGN Proteins / genetics
  • High Mobility Group Proteins / biosynthesis
  • High Mobility Group Proteins / genetics*
  • Humans
  • Multigene Family
  • Pseudogenes / genetics
  • RNA, Messenger / biosynthesis
  • RNA, Messenger / genetics
  • Retroelements / genetics*
  • Sequence Homology, Nucleic Acid

Substances

  • HMGA Proteins
  • HMGB Proteins
  • HMGN Proteins
  • High Mobility Group Proteins
  • RNA, Messenger
  • Retroelements