The complexity of the mammalian transcriptome

J Physiol. 2006 Sep 1;575(Pt 2):321-32. doi: 10.1113/jphysiol.2006.115568. Epub 2006 Jul 20.


A comprehensive understanding of protein and regulatory networks is strictly dependent on the complete description of the transcriptome of cells. After the determination of the genome sequence of several mammalian species, gene identification is based on in silico predictions followed by evidence of transcription. Conservative estimates suggest that there are about 20,000 protein-encoding genes in the mammalian genome. In the last few years, the combination of full-length cDNA cloning, cap-analysis gene expression (CAGE) tag sequencing and tiling array experiments has unveiled unexpected additional complexities in the transcriptome. Here we describe the current view of the mammalian transcriptome, focusing on transcript diversity, the growing non-coding RNA world, the organization of transcriptional units in the genome and promoter structures. In-depth analysis of the brain transcriptome has been challenging due to the cellular complexity of this organ. Here we present a computational analysis of CAGE data from different regions of the central nervous system, suggesting distinctive mechanisms of brain-specific transcription.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Alternative Splicing / genetics
  • Alternative Splicing / physiology
  • Animals
  • Brain / cytology
  • Brain / physiology
  • Cloning, Molecular
  • DNA, Complementary
  • Expressed Sequence Tags
  • Gene Expression Profiling
  • Gene Expression Regulation / physiology*
  • Genetic Variation / genetics
  • Genetic Variation / physiology
  • Genome / genetics*
  • Genome / physiology
  • Genomics / methods
  • Humans
  • Mammals / genetics
  • Mammals / physiology
  • Protein Biosynthesis / physiology
  • RNA / genetics
  • Transcription, Genetic / physiology*


Substances

  • DNA, Complementary
  • RNA