From genome to proteome: developing expression clone resources for the human genome

Hum Mol Genet. 2006 Apr 15;15 Spec No 1:R31-43. doi: 10.1093/hmg/ddl048.


cDNA clones have long been valuable reagents for studying the structure and function of proteins. With recent access to the entire human genome sequence, it has become possible and highly productive to compare the sequences of mRNAs to their genes, in order to validate the sequences and protein-coding annotations of each (1,2). Thus, well-characterized collections of human cDNAs are now playing an essential role in defining the structure and function of human genes and proteins. In this review, we summarize the major collections of human cDNA clones, discuss some limitations common to most of these collections and describe several noteworthy proteomics applications, focusing on the detection and analysis of protein-protein interactions (PPI). These human cDNA collections contain principally two types of cDNA clones. The largest collections comprise cDNAs with full-length protein-coding sequences (FL-CDS). Some of these cDNA clones may represent the entire mRNA sequence, but many are missing considerable non-coding UTR sequence, usually at the 5' end. The second type, a 'full-ORF' (F-ORF) expression clone, is one in which the annotated protein-coding sequence, stripped of its 5' UTR and 3' UTR sequence, has been placed in a vector designed to facilitate transfer to other vectors for protein expression.
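
The distinction between an FL-CDS clone and an F-ORF clone can be illustrated with a minimal sketch. It assumes the annotated coding-sequence coordinates are already known and given as 0-based, end-exclusive positions on the mRNA. The function name `extract_full_orf`, the toy sequence and the coordinates are hypothetical and are not taken from the review or from any clone collection.

```python
# Illustrative sketch only: trims the 5' and 3' UTRs from an mRNA sequence,
# leaving the annotated protein-coding sequence (the "full ORF").
# All names and sequences here are made up for the example.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def extract_full_orf(mrna: str, cds_start: int, cds_end: int) -> str:
    """Return the coding sequence mrna[cds_start:cds_end] after basic checks.

    Coordinates are 0-based and end-exclusive (an assumption of this sketch).
    """
    orf = mrna[cds_start:cds_end].upper()
    if len(orf) < 6 or len(orf) % 3 != 0:
        raise ValueError("CDS length must be a multiple of three")
    if not orf.startswith("ATG"):
        raise ValueError("CDS does not begin with an ATG start codon")
    if orf[-3:] not in STOP_CODONS:
        raise ValueError("CDS does not end with a stop codon")
    # Reject in-frame stop codons before the final codon.
    for i in range(0, len(orf) - 3, 3):
        if orf[i:i + 3] in STOP_CODONS:
            raise ValueError(f"premature stop codon at CDS position {i}")
    return orf


# Toy mRNA: 6 nt of 5' UTR, a 9 nt coding sequence, 6 nt of 3' UTR.
toy_mrna = "GGCACC" + "ATGGCTTGA" + "TTTAAA"
full_orf = extract_full_orf(toy_mrna, 6, 15)
print(full_orf)
```

In this toy case the FL-CDS clone would correspond to `toy_mrna` (coding sequence plus whatever UTR sequence was captured), while the F-ORF clone would carry only `full_orf`.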

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Cloning, Molecular / methods*
  • DNA, Complementary / chemistry
  • DNA, Complementary / genetics
  • Gene Expression*
  • Genome, Human*
  • Humans
  • Models, Biological
  • Models, Genetic
  • Oligonucleotide Array Sequence Analysis / methods
  • Open Reading Frames
  • Proteome / genetics
  • Proteome / metabolism*


Substances

  • DNA, Complementary
  • Proteome