The completion of the Mammalian Gene Collection (MGC)

Genome Res. 2009 Dec;19(12):2324-33. doi: 10.1101/gr.095976.109. Epub 2009 Sep 18.


Since its start, the Mammalian Gene Collection (MGC) has sought to provide at least one cDNA clone containing the full protein-coding sequence for every human and mouse gene with a RefSeq transcript, and for at least 6200 rat genes. The MGC cloning effort initially relied on random expressed sequence tag (EST) screening of cDNA libraries. Here, we summarize our recent progress using directed RT-PCR cloning and DNA synthesis. The MGC now contains clones with the entire protein-coding sequence for 92% of human and 89% of mouse genes with curated RefSeq (NM-accession) transcripts, and for 97% of human and 96% of mouse genes with curated RefSeq transcripts that have one or more PubMed publications, in addition to clones for more than 6300 rat genes. These high-quality MGC clones and their sequences are accessible without restriction to researchers worldwide.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Cloning, Molecular / methods*
  • Computational Biology / methods*
  • DNA / biosynthesis
  • DNA, Complementary / genetics*
  • Gene Library*
  • Genes / genetics*
  • Humans
  • Mammals / genetics*
  • Mice
  • National Institutes of Health (U.S.)
  • Rats
  • Reverse Transcriptase Polymerase Chain Reaction
  • United States

Substances

  • DNA, Complementary
  • DNA