Extensive genomic and transcriptional diversity identified through massively parallel DNA and RNA sequencing of eighteen Korean individuals

Nat Genet. 2011 Jul 3;43(8):745-52. doi: 10.1038/ng.872.

Abstract

Massively parallel sequencing technologies have identified a broad spectrum of human genome diversity. Here we deep-sequenced and correlated 18 genomes and 17 transcriptomes of unrelated Korean individuals. This allowed us to construct a genome-wide map of common and rare variants and to identify variants formed during DNA-RNA transcription. We identified 9.56 million genomic variants, 23.2% of which appear to be previously unidentified. From transcriptome sequencing, we discovered 4,414 transcripts not previously annotated. Finally, we revealed 1,809 sites of transcriptional base modification, where the transcribed sequence differs from the corresponding genomic sequence, and 580 sites of allele-specific expression. Our findings suggest that a considerable number of unexplored genomic variants remain to be identified in the human genome, and that the integrated analysis of genome and transcriptome sequencing is powerful for understanding the diversity and functional aspects of human genomic variants.

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Female
  • Gene Expression Profiling*
  • Genome, Human*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Korea
  • Male
  • Polymorphism, Single Nucleotide / genetics
  • Sequence Analysis, DNA*
  • Sequence Analysis, RNA*