Molecular classification of human carcinomas by use of gene expression signatures

Cancer Res. 2001 Oct 15;61(20):7388-93.


Classification of human tumors according to their primary anatomical site of origin is fundamental for the optimal treatment of patients with cancer. Here we describe the use of large-scale RNA profiling and supervised machine learning algorithms to construct a first-generation molecular classification scheme for carcinomas of the prostate, breast, lung, ovary, colorectum, kidney, liver, pancreas, bladder/ureter, and gastroesophagus, which collectively account for approximately 70% of all cancer-related deaths in the United States. The classification scheme was based on identifying gene subsets whose expression typifies each cancer class, and we quantified the extent to which these genes are characteristic of a specific tumor type by accurately and confidently predicting the anatomical site of tumor origin for 90% of 175 carcinomas, including 9 of 12 metastatic lesions. The predictor gene subsets include those whose expression is typical of specific types of normal epithelial differentiation, as well as other genes whose expression is elevated in cancer. This study demonstrates the feasibility of predicting the tissue origin of a carcinoma in the context of multiple cancer classes.
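The general idea of a signature-based classifier can be sketched on toy data. The example below is a minimal illustration and not the authors' method: the abstract does not name the algorithm or the gene-selection statistic, so the one-vs-rest signal-to-noise score, the mean-expression scoring rule, the synthetic data, and every function name here (make_synthetic, select_genes, predict) are assumptions made for illustration. It picks a small gene subset per class from training samples, then predicts the class of held-out samples and reports a margin between the best and second-best class as a rough confidence measure.

```python
import random
import statistics


def make_synthetic(n_classes=3, n_per_class=20, n_genes=50, n_markers=5, seed=0):
    """Toy expression data: each class over-expresses its own block of marker genes."""
    rng = random.Random(seed)
    samples, labels = [], []
    for c in range(n_classes):
        for _ in range(n_per_class):
            profile = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            for g in range(c * n_markers, (c + 1) * n_markers):
                profile[g] += 3.0  # hypothetical class-specific elevation
            samples.append(profile)
            labels.append(c)
    return samples, labels


def select_genes(samples, labels, top_k=5):
    """For each class, keep the top_k genes by a one-vs-rest signal-to-noise score."""
    chosen = {}
    n_genes = len(samples[0])
    for c in sorted(set(labels)):
        scores = []
        for g in range(n_genes):
            inside = [s[g] for s, y in zip(samples, labels) if y == c]
            outside = [s[g] for s, y in zip(samples, labels) if y != c]
            spread = statistics.stdev(inside) + statistics.stdev(outside)
            diff = statistics.mean(inside) - statistics.mean(outside)
            scores.append((diff / spread, g))
        chosen[c] = [g for _, g in sorted(scores, reverse=True)[:top_k]]
    return chosen


def predict(profile, chosen):
    """Score each class by mean expression of its signature genes.

    Returns (predicted_class, margin), where margin is the gap between the
    best and second-best class scores.
    """
    scored = sorted(
        ((statistics.mean(profile[g] for g in genes), c) for c, genes in chosen.items()),
        reverse=True,
    )
    return scored[0][1], scored[0][0] - scored[1][0]


# Hold out every second sample so accuracy is measured on unseen profiles.
samples, labels = make_synthetic()
pairs = list(zip(samples, labels))
train = pairs[0::2]
test = pairs[1::2]

chosen = select_genes([s for s, _ in train], [y for _, y in train])
predictions = [predict(s, chosen) for s, _ in test]
correct = sum(pred == y for (pred, _), (_, y) in zip(predictions, test))
accuracy = correct / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

In a scheme like this, predictions with a small margin could be reported as low-confidence rather than forced into a class, which is one way to read the abstract's distinction between accurate and confident predictions.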

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Carcinoma / classification*
  • Carcinoma / genetics*
  • Carcinoma / metabolism
  • Female
  • Gene Expression Profiling*
  • Gene Expression Regulation, Neoplastic
  • Humans
  • Male
  • Neoplasms / classification*
  • Neoplasms / genetics*
  • Neoplasms / metabolism
  • Oligonucleotide Array Sequence Analysis
  • Predictive Value of Tests
  • RNA, Neoplasm / genetics


Substances

  • RNA, Neoplasm