Integrating gene and protein expression data: pattern analysis and profile mining

Methods. 2005 Mar;35(3):303-14. doi: 10.1016/j.ymeth.2004.08.021. Epub 2005 Jan 12.


Proteomics and functional genomics are emerging research fields devoted to the study of the entire collection of proteins and mRNA transcripts (collectively known as gene products) that define a biological system. DNA microarrays are now a popular platform for measuring changes in messenger RNA transcript levels on a genome-wide scale, while gel-free shotgun profiling methods based on tandem mass spectrometry are increasingly being used to determine the identity, modification states, and relative abundance of large numbers of proteins. By defining the behavior of entire biological pathways and networks under various physiological states, these studies aim to extend traditional reductionist molecular genetic approaches to the biological roles of the vast array of uncharacterized gene products. A key goal is to determine how the information encoded by the myriad expressed gene products is integrated at the molecular, cellular, and even whole-organism level to create the dynamic biochemical processes and complex physiological controls that sustain life. While comparison of the complementary information contained in proteomic and mRNA data sets poses considerable analytical challenges, these efforts should provide added insight into the fundamental mechanisms underlying physiology, development, and the emergence of disease. Here, we outline several analytical approaches, methods, and tools that have proven helpful in meeting this challenge.
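
As a rough illustration of what comparing proteomic and mRNA data sets can involve, the sketch below matches two abundance tables on a shared gene identifier and computes a Spearman rank correlation over the genes present in both. It is not taken from the review: the gene names, the abundance values, and the function names (ranks, pearson, spearman_for_shared_genes) are all hypothetical, and rank correlation is only one of many possible comparison measures.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman_for_shared_genes(mrna, protein):
    """Keep genes measured in both data sets, then correlate their ranks."""
    shared = sorted(set(mrna) & set(protein))
    x = [mrna[g] for g in shared]
    y = [protein[g] for g in shared]
    return shared, pearson(ranks(x), ranks(y))


# Hypothetical abundance values keyed by a hypothetical gene identifier.
mrna = {"geneA": 2.0, "geneB": 5.5, "geneC": 1.2, "geneD": 8.1}
protein = {"geneA": 10.0, "geneB": 40.0, "geneC": 12.0, "geneE": 3.0}

shared, rho = spearman_for_shared_genes(mrna, protein)
print(shared, rho)
```

Only genes detected on both platforms enter the calculation, which reflects one practical difficulty of such comparisons: the two measurement types rarely cover the same set of gene products.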

Publication types

  • Review

MeSH terms

  • Algorithms
  • Animals
  • Computational Biology / methods*
  • Databases, Genetic
  • Databases, Protein
  • Genome
  • Genomics / methods*
  • Humans
  • Mass Spectrometry
  • Multigene Family
  • Oligonucleotide Array Sequence Analysis
  • Proteins / chemistry*
  • Proteomics / methods*
  • RNA, Messenger / metabolism
  • Regression Analysis
  • Statistics as Topic / methods*

Substances

  • Proteins
  • RNA, Messenger