Reconstructing the temporal ordering of biological samples using microarray data

Paul M Magwene; Paul Lizardi; Junhyong Kim

doi:10.1093/bioinformatics/btg081

Bioinformatics. 2003 May 1;19(7):842-50. doi: 10.1093/bioinformatics/btg081.

Authors

Paul M Magwene¹, Paul Lizardi, Junhyong Kim

Affiliation

¹ Department of Ecology and Evolutionary Biology, Yale University School of Medicine, New Haven, CT, USA.

PMID: 12724294

Abstract

Motivation: Accurate time series for biological processes are difficult to estimate due to problems of synchronization, temporal sampling and rate heterogeneity. Methods are needed that can utilize multi-dimensional data, such as those resulting from DNA microarray experiments, in order to reconstruct time series from unordered or poorly ordered sets of observations.

Results: We present a set of algorithms for estimating temporal orderings from unordered sets of sample elements. The techniques we describe are based on modifications of a minimum-spanning tree calculated from a weighted, undirected graph. We demonstrate the efficacy of our approach by applying these techniques to an artificial data set as well as several gene expression data sets derived from DNA microarray experiments. In addition to estimating orderings, the techniques we describe also provide useful heuristics for assessing relevant properties of sample datasets such as noise and sampling intensity, and we show how a data structure called a PQ-tree can be used to represent uncertainty in a reconstructed ordering.
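As a rough illustration of the idea of deriving an ordering from a minimum spanning tree, the sketch below builds the tree over a set of samples and reads off its longest path as a candidate ordering. It is not the authors' implementation: the Euclidean distance, the use of the longest tree path as the ordering, and the function names `minimum_spanning_tree` and `diameter_path` are all assumptions made for this example. It also ignores branches off the main path, which is where a PQ-tree would be used to express uncertainty.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete graph whose edge weights are
    pairwise Euclidean distances (an assumed choice of weight).
    Returns an adjacency dict: node -> list of (neighbour, weight)."""
    n = len(points)
    adj = {i: [] for i in range(n)}
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known edge into the tree
    parent = [None] * n
    if n:
        best[0] = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] is not None:
            adj[u].append((parent[u], best[u]))
            adj[parent[u]].append((u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = euclidean(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return adj


def _farthest(adj, start):
    """Walk the tree from `start`; return the farthest node and the
    predecessor map needed to rebuild the path back to `start`."""
    dist, prev, stack = {start: 0.0}, {start: None}, [start]
    while stack:
        u = stack.pop()
        for v, w in adj[u]:
            if v not in dist:
                dist[v], prev[v] = dist[u] + w, u
                stack.append(v)
    return max(dist, key=dist.get), prev


def diameter_path(adj):
    """Longest weighted path in the tree, found with two traversals.
    The path has no inherent direction: either end may be 'earliest'."""
    if not adj:
        return []
    a, _ = _farthest(adj, 0)
    b, prev = _farthest(adj, a)
    path, node = [], b
    while node is not None:
        path.append(node)
        node = prev[node]
    return path


# Toy data (made up for this example): samples along a curve (t, t^2),
# listed in shuffled order so that the index carries no time information.
samples = [(3, 9), (0, 0), (5, 25), (1, 1), (4, 16), (2, 4)]
order = diameter_path(minimum_spanning_tree(samples))
recovered_times = [samples[i][0] for i in order]
```

On this toy input the tree is a simple chain, so the longest path visits every sample. An ordering obtained this way is only defined up to reversal, and on noisy or sparsely sampled data the tree will generally have side branches that this sketch leaves out.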

Availability: Academic implementations of the ordering algorithms are available as source code (in the programming language Python) on our web site, along with documentation on their use. The artificial 'jelly roll' data set upon which the algorithm was tested is also available from this web site. The publicly available gene expression data may be found at http://genome-www.stanford.edu/cellcycle/ and http://caulobacter.stanford.edu/CellCycle/.

Publication types

Comparative Study
Evaluation Study
Research Support, U.S. Gov't, Non-P.H.S.
Research Support, U.S. Gov't, P.H.S.
Validation Study

MeSH terms

Algorithms*
Caulobacter crescentus / genetics
Caulobacter crescentus / metabolism
Computer Simulation
Gene Expression Profiling / methods*
Models, Genetic*
Oligonucleotide Array Sequence Analysis / methods*
Saccharomyces cerevisiae / genetics
Saccharomyces cerevisiae / metabolism
Sample Size
Sequence Analysis, DNA / methods*
Time Factors*
Transcription, Genetic / genetics