Discovery of meaningful associations in genomic data using partial correlation coefficients

Bioinformatics. 2004 Dec 12;20(18):3565-74. doi: 10.1093/bioinformatics/bth445. Epub 2004 Jul 29.


Motivation: A major challenge of systems biology is to infer biochemical interactions from large-scale observations, such as transcriptomics, proteomics and metabolomics. We propose to use a partial correlation analysis to construct approximate Undirected Dependency Graphs from such large-scale biochemical data. This approach enables a distinction between direct and indirect interactions of biochemical compounds, thereby inferring the underlying network topology.
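The distinction between direct and indirect interactions can be illustrated with a minimal sketch, which is not the authors' program. It assumes a simulated three-variable chain a -> b -> c and an arbitrary fixed cutoff of 0.3 on the absolute correlation in place of a formal significance test. The names `pearson`, `partial_corr` and `udg_edges` are hypothetical. An edge is kept only if the zeroth-order correlation and every first-order partial correlation stay above the cutoff.

```python
import math
import random


def pearson(x, y):
    """Zeroth-order (ordinary Pearson) correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, conditioned on z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def udg_edges(columns, cutoff=0.3):
    """Return the pairs (i, j) whose correlation survives conditioning.

    The cutoff is an illustrative assumption, not a value from the paper.
    """
    n = len(columns)
    r = [[pearson(columns[i], columns[j]) for j in range(n)] for i in range(n)]
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            values = [r[i][j]]  # zeroth order
            for k in range(n):
                if k != i and k != j:
                    # first order: condition on each remaining variable in turn
                    values.append(partial_corr(r[i][j], r[i][k], r[j][k]))
            if min(abs(v) for v in values) > cutoff:
                edges.append((i, j))
    return edges


# Simulated chain: a influences b, and b influences c; a and c never interact directly.
random.seed(0)
a = [random.gauss(0, 1) for _ in range(2000)]
b = [v + 0.5 * random.gauss(0, 1) for v in a]
c = [v + 0.5 * random.gauss(0, 1) for v in b]

edges = udg_edges([a, b, c])
print(edges)
```

In this simulated chain, a and c are strongly correlated at zeroth order, but their partial correlation given b is close to zero, so the a-c edge is dropped while a-b and b-c are kept.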

Results: The method was first thoroughly evaluated with a large set of simulated data. The results indicate that the approach has good statistical power and a low False Discovery Rate even in the presence of noise in the data. We then applied the method to an existing data set of yeast gene expression. Several small gene networks were inferred and found to contain genes known to be collectively involved in particular biochemical processes. Some of these networks also contain uncharacterized ORFs, which leads to hypotheses about their functions.

Availability: Programs for MS-Windows and Linux that apply zeroth-, first-, second- and third-order partial correlation analysis can be downloaded at:

Supplementary information: Supplementary information can be found at: URL to be decided.

Publication types

  • Evaluation Study

MeSH terms

  • Algorithms*
  • Chromosome Mapping / methods*
  • Gene Expression Regulation / physiology*
  • Models, Genetic*
  • Models, Statistical
  • Oligonucleotide Array Sequence Analysis / methods*
  • Open Reading Frames
  • Saccharomyces cerevisiae / physiology
  • Saccharomyces cerevisiae Proteins / genetics
  • Saccharomyces cerevisiae Proteins / metabolism*
  • Signal Transduction / physiology*
  • Statistics as Topic


Substances

  • Saccharomyces cerevisiae Proteins