Proc Natl Acad Sci U S A. 2007 Nov 20;104(47):18371-6.
doi: 10.1073/pnas.0709146104. Epub 2007 Nov 14.

A Tensor Higher-Order Singular Value Decomposition for Integrative Analysis of DNA Microarray Data From Different Studies

Free PMC article

Larsson Omberg et al. Proc Natl Acad Sci U S A. 2007.

Abstract

We describe the use of a higher-order singular value decomposition (HOSVD) in transforming a data tensor of genes × "x-settings" (that is, different settings of the experimental variable x) × "y-settings," which tabulates DNA microarray data from different studies, to a "core tensor" of "eigenarrays" × "x-eigengenes" × "y-eigengenes." Reformulating this multilinear HOSVD such that it decomposes the data tensor into a linear superposition of all outer products of an eigenarray, an x-eigengene, and a y-eigengene, that is, rank-1 "subtensors," we define the significance of each subtensor in terms of the fraction of the overall information in the data tensor that it captures. We illustrate this HOSVD with an integration of genome-scale mRNA expression data from three yeast cell cycle time courses, two of which are under exposure to either hydrogen peroxide or menadione. We find that significant subtensors represent independent biological programs or experimental phenomena. The picture that emerges suggests that the conserved genes YKU70, MRE11, AIF1, and ZWF1, and the processes of retrotransposition, apoptosis, and the oxidative pentose phosphate pathway that these genes are involved in, may play significant, yet previously unrecognized, roles in the differential effects of hydrogen peroxide and menadione on cell cycle progression. A genome-scale correlation between DNA replication initiation and RNA transcription, which is equivalent to a recently discovered correlation and might be due to a previously unknown mechanism of regulation, is independently uncovered.
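
As an illustration of the decomposition described in the abstract, the following is a minimal NumPy sketch, not the authors' code. It assumes a random stand-in tensor with hypothetical dimensions (50 genes × 6 x-settings × 3 y-settings) and assumes that the "fraction" of a subtensor is its squared core-tensor entry divided by the sum of all squared entries; the names `unfold`, `hosvd`, `U`, `Vx`, and `Vy` are chosen here for illustration only.

```python
import numpy as np

def unfold(tensor, mode):
    """Flatten a tensor into a matrix whose rows index the given mode."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def hosvd(tensor):
    """Return (core, factors) so that tensor = core x_0 U x_1 Vx x_2 Vy."""
    factors = []
    for mode in range(tensor.ndim):
        # Left singular vectors of each unfolding give that mode's basis.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        # Project the current mode onto its orthonormal basis.
        core = np.moveaxis(np.tensordot(u.T, core, axes=(1, mode)), 0, mode)
    return core, factors

# Hypothetical stand-in data: genes x x-settings x y-settings.
rng = np.random.default_rng(0)
data = rng.standard_normal((50, 6, 3))

core, (U, Vx, Vy) = hosvd(data)

# Columns of U play the role of eigenarrays; columns of Vx and Vy play the
# role of x- and y-eigengenes. The data tensor is the superposition of all
# rank-1 outer products, each weighted by one core-tensor entry.
reconstructed = np.einsum("abc,ia,jb,kc->ijk", core, U, Vx, Vy)

# Assumed significance measure: squared core entry over the total.
fractions = core**2 / np.sum(core**2)

# Indices (a, b, c) of the most significant subtensor and the subtensor itself.
a, b, c = np.unravel_index(np.argmax(fractions), fractions.shape)
top_subtensor = core[a, b, c] * np.einsum("i,j,k->ijk", U[:, a], Vx[:, b], Vy[:, c])
```

Core-tensor entries can be negative, so ranking by the squared entry rather than the signed value keeps the fractions nonnegative and summing to one.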

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Significant HOSVD subtensors, after rotation of the approximately degenerate subtensor spaces 𝒮(4, 2+3, 1), 𝒮(5+2, 1, 3), 𝒮(8+2, 4, 3), and 𝒮(3+7, 2, 3). (a) Bar chart of the fractions of the 11 most significant subtensors. The higher-order singular values corresponding to subtensors highlighted in gray are <0. The entropy of the data tensor is 0.27. (b) Line-joined graphs of the first (red), second (blue), third (green), and fourth (orange) x-eigengenes and the superposition of the second and third x-eigengenes (violet), which define the expression variation across time in these subtensors. The time points are color-coded according to their cell cycle classification in the control time course: M/G1 (yellow), G1 (green), S (blue), S/G2 (red), and G2/M (orange). The grid lines mark the dissipation of the response to α-factor in the control time course (dashed) and the start of exposure to either HP or MD, at ≈20 and 25 min, respectively. (c) Line-joined graphs of the first y-eigengene (red), and the second (blue) and third (green) rotated y-eigengenes, which define the expression variation across the oxidative stress conditions.
Fig. 2.
Associations by annotations of the eigenarrays and superpositions of eigenarrays that define expression variation across genes in all ten most significant subtensors. Bar chart of −log10(P value) for parallel (Right) and antiparallel (Left) enrichments of genes, which are expressed in response to environmental stress (red) or the pheromone (blue) or during the cell cycle (green), or of genes that are binding targets of oxidative stress activators (red), pheromone response (blue), or cell cycle (green) transcription factors, Stb5 (orange) or replication initiation proteins (violet).
Fig. 3.
Eigengenes and genes that are significant in the HP vs. MD-induced responses. (a) Raster display of the outer products of the fourth and second x-eigengenes with the third y-eigengene, $V_{x,4:}^{T} V_{y,3:}^{T}$ and $V_{x,2:}^{T} V_{y,3:}^{T}$, which define the expression variations across time and oxidative stress conditions in the ninth and tenth subtensors, 𝒮(8+2, 4, 3) and 𝒮(3+7, 2, 3), respectively. (b) Raster display of the expression of significant genes centered at the time- and condition-invariant expression levels of each gene.
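
The fractions and entropy mentioned in the caption of Fig. 1 can be sketched as follows. This is an assumption-laden illustration rather than the authors' definition: it takes the fraction of each subtensor to be its squared higher-order singular value over the sum of squares, and takes the entropy to be the Shannon entropy of those fractions normalized by the logarithm of the number of subtensors, so that it lies between 0 and 1. The function names and the input values are hypothetical.

```python
import math

def subtensor_fractions(singular_values):
    """Squared higher-order singular values, normalized to sum to one.

    Squaring means that negative singular values still give
    nonnegative fractions.
    """
    total = sum(v * v for v in singular_values)
    return [v * v / total for v in singular_values]

def normalized_entropy(fractions):
    """Shannon entropy of the fractions divided by log(number of fractions).

    Assumed convention: 0 when a single subtensor captures everything,
    1 when all subtensors contribute equally. Requires at least two fractions.
    """
    shannon = -sum(p * math.log(p) for p in fractions if p > 0)
    return shannon / math.log(len(fractions))

# Hypothetical higher-order singular values, one of them negative.
values = [9.0, -3.0, 2.0, 1.0]
fractions = subtensor_fractions(values)
entropy = normalized_entropy(fractions)
```

Under this convention, a low entropy indicates that a few subtensors capture most of the information in the data tensor.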
