Recurrent transcriptional clusters in the genome of mouse pluripotent stem cells

Nucleic Acids Res. 2012 Oct;40(19):e153. doi: 10.1093/nar/gks663. Epub 2012 Jul 12.


A number of studies have shown that transcriptome analysis in terms of chromosomal location can reveal regions of non-random transcriptional activity within the genome. Genomic clusters of differentially expressed genes can identify genomic patterns of structural organization, underlying copy number variations or long-range epigenetic regulation such as X-chromosome inactivation. Here we apply an integrative bioinformatics analysis to a collection of 315 freely available mouse pluripotent stem cell samples to discover transcriptional clusters in the genome. We show that over half of the analysed samples (56.83%) carry whole- or partial-chromosome-spanning clusters, which recur in genomic regions previously implicated in chromosomal imbalances. Strikingly, we found that the presence of such large clusters is linked to the differential expression of a limited number of genes, common to all samples carrying clusters irrespective of the chromosome on which the cluster is found. We have used these genes to train and test classification models that can predict samples that carry large-scale clusters on any chromosome with over 90% accuracy. Our findings suggest that there is a common downstream activation in these cells that affects a limited number of nodes. We propose that this effect is linked to selective advantage and identify potential driver genes.
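The abstract does not say how the transcriptional clusters were detected. As an illustration only, the general idea of finding positional clusters of differentially expressed genes can be sketched with a sliding-window enrichment test. Everything below is an assumption made for this sketch and not the authors' method: the function names `hypergeom_tail` and `find_clusters`, the window size, the significance threshold, and the choice of a hypergeometric test. Multiple-testing correction is omitted for brevity.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N genes, K of which are flagged."""
    total = comb(N, n)
    upper = min(n, K)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def find_clusters(flags, window=10, alpha=0.01,
                  background_flagged=None, background_total=None):
    """Scan one chromosome for windows enriched in flagged genes.

    flags: list of bools, one per gene, ordered by chromosomal position;
           True marks a differentially expressed gene.
    background_flagged / background_total: optional genome-wide counts
           (hypothetical parameters). If omitted, the chromosome itself is
           the background, which cannot reveal a whole-chromosome cluster.
    Returns (start, end, flagged_count, p_value) for each significant window.
    """
    K = sum(flags) if background_flagged is None else background_flagged
    N = len(flags) if background_total is None else background_total
    hits = []
    for start in range(0, len(flags) - window + 1):
        k = sum(flags[start:start + window])
        p = hypergeom_tail(k, window, K, N)
        if p < alpha:  # uncorrected threshold, for illustration only
            hits.append((start, start + window, k, p))
    return hits


# Toy chromosome: 100 genes with a block of 10 flagged genes at positions 40-49.
toy_flags = [False] * 40 + [True] * 10 + [False] * 50
toy_hits = find_clusters(toy_flags)
```

Passing genome-wide counts as the background would be the natural choice when looking for whole-chromosome clusters of the kind the abstract describes, since a chromosome cannot be enriched relative to itself.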

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Chromosome Mapping
  • Cluster Analysis
  • Gene Expression Profiling
  • Genome
  • Genomics / methods*
  • Induced Pluripotent Stem Cells / metabolism
  • Mice
  • Pluripotent Stem Cells / metabolism*
  • Transcriptome*