Nonparametric methods for molecular biology

Methods Mol Biol. 2010;620:105-53. doi: 10.1007/978-1-60761-580-4_2.


In 2003, the completion of the Human Genome Project (1), together with advances in computational resources (2), was expected to launch an era in which the genetic and genomic contributions to many common diseases would be found. In the years that followed, however, researchers became increasingly frustrated as most reported 'findings' could not be replicated in independent studies (3). To improve the signal-to-noise ratio, it was suggested that the number of cases included be increased to tens of thousands (4), a requirement that would dramatically restrict the scope of personalized medicine. Similarly, there was little success in elucidating the gene-gene interactions involved in complex diseases or even in developing criteria for assessing their phenotypes. As a partial solution to these enigmata, we introduce here a class of statistical methods as the 'missing link' between advances in genetics and informatics. As a first step, we provide a unifying view of a plethora of nonparametric tests developed mainly in the 1940s, all of which can be expressed as U-statistics. We then extend this approach to reflect categorical and ordinal relationships between variables, resulting in a flexible and powerful approach to deal with the impact of (i) multiallelic genetic loci, (ii) poly-locus genetic regions, and (iii) oligo-genetic and oligo-genomic collaborative interactions on complex phenotypes.
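As a rough illustration of what it means for a rank test to be expressed as a U-statistic, the sketch below uses the Mann-Whitney test. That choice is an assumption made here for illustration, since the abstract does not name specific tests. The function names (kernel, u_statistic, mann_whitney_u, rank_sum_u) and the sample data are hypothetical. The code averages a pairwise comparison kernel over all cross-group pairs and checks that the result matches the familiar rank-sum form.

```python
from itertools import product


def kernel(x, y):
    """Pairwise comparison kernel: 1 if x > y, 0.5 if tied, 0 otherwise."""
    if x > y:
        return 1.0
    if x == y:
        return 0.5
    return 0.0


def u_statistic(xs, ys):
    """Two-sample U-statistic: the kernel averaged over all (x, y) pairs.

    Estimates P(X > Y) + 0.5 * P(X = Y).
    """
    total = sum(kernel(x, y) for x, y in product(xs, ys))
    return total / (len(xs) * len(ys))


def mann_whitney_u(xs, ys):
    """Mann-Whitney U for the first sample: the unnormalized kernel sum."""
    return u_statistic(xs, ys) * len(xs) * len(ys)


def rank_sum_u(xs, ys):
    """The same U computed from midranks in the pooled sample."""
    pooled = list(xs) + list(ys)

    def midrank(v):
        below = sum(1 for p in pooled if p < v)
        equal = sum(1 for p in pooled if p == v)
        return below + (equal + 1) / 2

    n1 = len(xs)
    rank_sum = sum(midrank(x) for x in xs)
    return rank_sum - n1 * (n1 + 1) / 2


# Illustrative data only (not taken from the chapter).
group_a = [1.2, 3.4, 2.2]
group_b = [0.5, 2.2, 1.0, 4.1]

print(u_statistic(group_a, group_b))
print(mann_whitney_u(group_a, group_b))
print(rank_sum_u(group_a, group_b))
```

The pairwise form is written for clarity rather than speed: it compares every cross-group pair, so its cost grows with the product of the two sample sizes, whereas the rank-based form is what most software computes.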

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Analysis of Variance
  • Animals
  • Biostatistics / methods*
  • Genome-Wide Association Study
  • Humans
  • Molecular Biology / methods*
  • Phenotype
  • Precision Medicine
  • Statistics, Nonparametric