Examples of the complex architecture of the human transcriptome revealed by RACE and high-density tiling arrays

Genome Res. 2005 Jul;15(7):987-97. doi: 10.1101/gr.3455305.

Abstract

Recently, we mapped the sites of transcription across approximately 30% of the human genome and elucidated the structures of several hundred novel transcripts. In this report, we describe a novel combination of techniques, including the rapid amplification of cDNA ends (RACE) and tiling array technologies, that was used to further characterize transcripts in the human transcriptome. This approach yields several important pieces of information about each array-detected transcribed region, including strand of origin, start and termination positions, and the exonic structures of spliced and unspliced coding and noncoding RNAs. The structures of transcripts from 14 transcribed loci, representing both known genes and unannotated transcripts taken from the several hundred randomly selected unannotated transcripts described in our previous work, are presented as examples of the complex organization of the human transcriptome. As a consequence of this complexity, it is not unusual for a single base pair to be part of an intricate network of multiple isoforms of overlapping sense and antisense transcripts, the majority of which are unannotated. Some of these transcripts follow the canonical splicing rules, whereas others combine the exons of different genes or represent other types of noncanonical transcripts. These results have important implications for the correlation of genotypes with phenotypes, the regulation of complex interlaced transcriptional patterns, and the definition of a gene.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Cell Line
  • Gene Expression Profiling
  • Humans
  • Jurkat Cells
  • Models, Genetic
  • Molecular Sequence Data
  • Nucleic Acid Amplification Techniques* / methods
  • Oligonucleotide Array Sequence Analysis* / methods
  • Protein Isoforms / genetics
  • Transcription, Genetic*
  • Tumor Cells, Cultured

Substances

  • Protein Isoforms

Associated data

  • GENBANK/AY927416
  • GENBANK/AY927417
  • GENBANK/AY927418
  • GENBANK/AY927419
  • GENBANK/AY927420
  • GENBANK/AY927421
  • GENBANK/AY927422
  • GENBANK/AY927423
  • GENBANK/AY927424
  • GENBANK/AY927425
  • GENBANK/AY927426
  • GENBANK/AY927427
  • GENBANK/AY927428
  • GENBANK/AY927429
  • GENBANK/AY927430
  • GENBANK/AY927431
  • GENBANK/AY927432
  • GENBANK/AY927433
  • GENBANK/AY927434
  • GENBANK/AY927435
  • GENBANK/AY927436
  • GENBANK/AY927437
  • GENBANK/AY927438
  • GENBANK/AY927439
  • GENBANK/AY927440
  • GENBANK/AY927441
  • GENBANK/AY927442
  • GENBANK/AY927443
  • GENBANK/AY927444
  • GENBANK/AY927445
  • GENBANK/AY927446
  • GENBANK/AY927447
  • GENBANK/AY927448
  • GENBANK/AY927449
  • GENBANK/AY927450
  • GENBANK/AY927451
  • GENBANK/AY927452
  • GENBANK/AY927453
  • GENBANK/AY927454
  • GENBANK/AY927455
  • GENBANK/AY927456
  • GENBANK/AY927457
  • GENBANK/AY927458
  • GENBANK/AY927459
  • GENBANK/AY927460
  • GENBANK/AY927461
  • GENBANK/AY927462
  • GENBANK/AY927463
  • GENBANK/AY927464
  • GENBANK/AY927465
  • GENBANK/AY927466
  • GENBANK/AY927467