Feature selection for DNA methylation based cancer classification

Bioinformatics. 2001;17 Suppl 1:S157-64. doi: 10.1093/bioinformatics/17.suppl_1.s157.


Molecular portraits, such as mRNA expression or DNA methylation patterns, have been shown to correlate strongly with phenotypic parameters. These molecular patterns can be measured routinely on a genomic scale. However, class prediction based on these patterns is an underdetermined problem, because the dimensionality of the data is extremely high compared with the usually small number of available samples. This makes it necessary to reduce the dimensionality of the data. Here we demonstrate how phenotypic classes can be predicted by combining feature selection and discriminant analysis. By comparing several feature selection methods, we show that the right dimension reduction strategy is of crucial importance for classification performance. The techniques are demonstrated by methylation-pattern-based discrimination between acute lymphoblastic leukemia and acute myeloid leukemia.
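
The general idea of combining feature selection with a discriminant rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic (random numbers standing in for methylation measurements), the ranking uses a two-class Fisher criterion, and the classifier is a simple nearest-centroid rule chosen here for brevity. All function names and parameter values are hypothetical. Features are re-selected inside every leave-one-out fold so that the held-out sample never influences the selection.

```python
import numpy as np

def fisher_scores(X, y):
    """Two-class Fisher criterion per feature (labels must be 0 and 1)."""
    a, b = X[y == 0], X[y == 1]
    num = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    den = a.var(axis=0, ddof=1) + b.var(axis=0, ddof=1)
    return num / (den + 1e-12)  # small constant guards against zero variance

def select_top_k(X, y, k):
    """Indices of the k features with the highest Fisher score."""
    return np.argsort(fisher_scores(X, y))[::-1][:k]

def fit_centroids(X, y):
    """Class means, stacked as rows: row 0 for class 0, row 1 for class 1."""
    return np.stack([X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)])

def predict(centroids, X):
    """Assign each row of X to the class with the nearest centroid."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def loo_accuracy(X, y, k):
    """Leave-one-out accuracy; selection is repeated inside each fold."""
    hits = 0
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        idx = select_top_k(X[train], y[train], k)
        centroids = fit_centroids(X[train][:, idx], y[train])
        hits += int(predict(centroids, X[i:i + 1, idx])[0] == y[i])
    return hits / len(y)

# Synthetic stand-in data (assumption): few samples, many features,
# and only the first n_informative features differ between classes.
rng = np.random.default_rng(0)
n_samples, n_features, n_informative = 30, 500, 5
y = np.array([0, 1] * (n_samples // 2))
X = rng.normal(size=(n_samples, n_features))
X[:, :n_informative] += 3.0 * y[:, None]

acc_selected = loo_accuracy(X, y, k=n_informative)  # after feature selection
acc_all = loo_accuracy(X, y, k=n_features)          # no reduction at all
print(f"top-{n_informative} features: {acc_selected:.2f}, all features: {acc_all:.2f}")
```

Comparing `acc_selected` with `acc_all` for different values of `k`, or swapping `fisher_scores` for another ranking, is one way to explore how the choice of reduction strategy affects classification performance on a given data set.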

Publication types

  • Comparative Study

MeSH terms

  • Computational Biology
  • CpG Islands
  • DNA Methylation*
  • DNA, Neoplasm / chemistry
  • Humans
  • Leukemia, Myeloid, Acute / metabolism
  • Neoplasms / chemistry*
  • Neoplasms / classification*
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data
  • Precursor Cell Lymphoblastic Leukemia-Lymphoma / metabolism
  • Principal Component Analysis


Substances

  • DNA, Neoplasm