Genomic outlier detection in high-throughput data analysis

Methods Mol Biol. 2013:972:141-53. doi: 10.1007/978-1-60327-337-4_9.


In the analysis of high-throughput data, a very common goal is the detection of genes that are differentially expressed between two groups or classes. A recent finding in the prostate cancer literature demonstrates that searching for a different pattern of differential expression can reveal new candidate oncogenes. In this chapter, we describe this statistical problem, termed oncogene outlier detection, and review a variety of proposed solutions. A statistical model for the multiclass situation is described, and links with multiple testing concepts are established. Some new nonparametric procedures are described and compared with existing methods in simulation studies.
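As a rough illustration of what an outlier-oriented score can look like, the sketch below robustly standardizes each gene (median and MAD across all samples) and then takes a high percentile of the standardized values within the case group. This is an assumption made for illustration only: it follows the general idea of percentile-based outlier scores and is not claimed to be any of the procedures proposed or compared in the chapter. The function names, the simulated data, and all parameter values (`q`, `shift`, group sizes) are hypothetical.

```python
import numpy as np


def outlier_score(expr, is_case, q=90):
    """Percentile-based outlier score per gene (illustrative sketch).

    expr    : genes x samples array of expression values
    is_case : boolean mask over samples (True = case/tumor group)
    q       : percentile of standardized case values used as the score
    """
    expr = np.asarray(expr, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    # Robust centering and scaling, gene by gene, using all samples.
    med = np.median(expr, axis=1, keepdims=True)
    mad = np.median(np.abs(expr - med), axis=1, keepdims=True)
    mad = np.where(mad == 0, 1.0, mad)  # avoid division by zero
    z = (expr - med) / mad
    # A high percentile within the case group reacts to a subset of
    # strongly overexpressed samples even when the group mean barely moves.
    return np.percentile(z[:, is_case], q, axis=1)


def simulate(n_genes=200, n_ctrl=20, n_case=20, n_outlier=5, shift=10.0, seed=0):
    """Toy data: standard normal noise, with gene 0 overexpressed
    in only n_outlier of the case samples (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    expr = rng.normal(size=(n_genes, n_ctrl + n_case))
    is_case = np.arange(n_ctrl + n_case) >= n_ctrl
    expr[0, n_ctrl:n_ctrl + n_outlier] += shift
    return expr, is_case


expr, is_case = simulate()
scores = outlier_score(expr, is_case)
top_gene = int(np.argmax(scores))
```

In this toy setting a comparison of group means would be diluted by the 15 unaffected case samples, whereas a percentile-based score depends mainly on the upper tail of the case group. A simulation study of the kind mentioned above would repeat such draws many times and compare detection rates across methods.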

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Computer Simulation
  • Data Interpretation, Statistical*
  • Gene Expression Profiling / methods*
  • Genetic Association Studies / methods
  • Genomics
  • Humans
  • Models, Statistical
  • Multivariate Analysis
  • Neoplasms / genetics
  • Neoplasms / metabolism
  • Normal Distribution
  • Oligonucleotide Array Sequence Analysis
  • Oncogenes*
  • ROC Curve
  • Statistics, Nonparametric