Genome holography: deciphering function-form motifs from gene expression data

PLoS One. 2008 Jul 16;3(7):e2708. doi: 10.1371/journal.pone.0002708.

Abstract

Background: DNA chips allow simultaneous measurement of the genome-wide response of thousands of genes, i.e., system-level monitoring of gene-network activity. Advanced analysis methods have been developed to extract meaningful information from the vast amount of raw gene-expression data obtained from microarray measurements. These methods usually aim to distinguish between groups of subjects (e.g., cancer patients vs. healthy subjects) or to identify marker genes that help distinguish between those groups. We assumed that motifs related to the internal structure of operons and to gene-network regulation are also embedded in microarray data and can be deciphered with proper analysis.

Methodology/principal findings: The analysis presented here is based on investigating gene-gene correlations. We analyzed a database of gene expression of Bacillus subtilis exposed to sub-lethal levels of 37 different antibiotics. Using unsupervised analysis (dendrogram) of the matrix of normalized gene-gene correlations, we identified the operons, which form distinct clusters of genes in the sorted correlation matrix. Applying a dimension-reduction algorithm (Principal Component Analysis, PCA) to the matrices of normalized correlations reveals functional motifs. The genes are placed in a reduced 3-dimensional space spanned by the three leading PCA eigenvectors, ranked by their corresponding eigenvalues. We found that the organization of the genes in the reduced PCA space recovers motifs of the operon internal structure, such as the order of the genes along the genome, gene separation by non-coding segments, and translational start and end regions. In addition to the intra-operon structure, it is also possible to predict inter-operon relationships, operons sharing functional regulation factors, and more. In particular, we demonstrate the above in the context of the competence and sporulation pathways.
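
The pipeline described above (gene-gene correlations, dendrogram ordering, then PCA on the correlation matrix) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the function names (`simulate_expression`, `correlation_matrix`, `cluster_genes`, `pca_embedding`) are hypothetical, plain Pearson correlation stands in for the paper's normalized correlations, and average-linkage clustering is an assumed choice because the abstract does not specify the linkage.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list, fcluster


def simulate_expression(n_operons=3, genes_per_operon=4, n_conditions=37,
                        noise=0.2, seed=0):
    """Synthetic stand-in for real data: genes in one operon share a profile."""
    rng = np.random.default_rng(seed)
    profiles = rng.normal(size=(n_operons, n_conditions))
    labels = np.repeat(np.arange(n_operons), genes_per_operon)
    expr = profiles[labels] + noise * rng.normal(size=(labels.size, n_conditions))
    return expr, labels


def correlation_matrix(expr):
    """Pearson gene-gene correlations (rows = genes, columns = conditions)."""
    return np.corrcoef(expr)


def cluster_genes(corr, n_clusters):
    """Hierarchical clustering using each gene's row of correlations as features.

    Returns the dendrogram leaf order (for sorting the matrix) and flat
    cluster labels. Average linkage is an assumption, not from the paper.
    """
    tree = linkage(corr, method="average", metric="euclidean")
    order = leaves_list(tree)
    clusters = fcluster(tree, t=n_clusters, criterion="maxclust")
    return order, clusters


def pca_embedding(corr, n_components=3):
    """Project genes onto the leading principal components of the correlation rows."""
    centered = corr - corr.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:n_components]  # indices of the largest ones
    return centered @ eigvecs[:, top], eigvals[top]


if __name__ == "__main__":
    expr, labels = simulate_expression()
    corr = correlation_matrix(expr)
    order, clusters = cluster_genes(corr, n_clusters=3)
    sorted_corr = corr[np.ix_(order, order)]        # block structure along the diagonal
    coords, eigvals = pca_embedding(corr)
    print("dendrogram order:", order)
    print("cluster labels:  ", clusters)
    print("3-D coordinates shape:", coords.shape)
```

In this toy setting, genes that share a simulated operon fall into the same cluster and sit close together in the 3-dimensional PCA space. Recovering finer motifs such as gene order along the genome would require real expression data and the normalization used in the paper.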

Conclusions/significance: We demonstrated that, by analyzing gene-gene correlations in gene-expression data, it is possible to identify operons and to predict unknown internal structure of operons and gene-network regulation.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Amino Acid Motifs
  • Anti-Bacterial Agents / pharmacology
  • Bacillus subtilis / genetics
  • Computational Biology / methods
  • Data Interpretation, Statistical
  • Gene Expression Profiling
  • Gene Expression Regulation, Bacterial*
  • Gene Regulatory Networks
  • Genetic Markers
  • Holography
  • Models, Genetic
  • Oligonucleotide Array Sequence Analysis
  • Operon
  • Time Factors

Substances

  • Anti-Bacterial Agents
  • Genetic Markers