Inconsistency and features of single nucleotide variants detected in whole exome sequencing versus transcriptome sequencing: A case study in lung cancer

Methods. 2015 Jul 15;83:118-27. doi: 10.1016/j.ymeth.2015.04.016. Epub 2015 Apr 23.

Abstract

Whole exome sequencing (WES) and RNA sequencing (RNA-Seq) are two main platforms used for next-generation sequencing (NGS). While WES is primarily used for DNA variant discovery and RNA-Seq mainly for measurement of gene expression, both can be used for detection of genetic variants, especially single nucleotide variants (SNVs). How consistently variants can be detected from WES and RNA-Seq has not been systematically evaluated. In this study, we examined the technical and biological inconsistencies in SNV detection using WES and RNA-Seq data from 27 pairs of tumor and matched normal samples. We analyzed SNVs in three categories: WES unique (detected only in WES), RNA-Seq unique (detected only in RNA-Seq), and shared (detected in both). We found a small overlap (average ∼14%) between the SNVs called in WES and RNA-Seq. The WES unique SNVs were mainly due to low coverage, low expression, or their location on the non-transcribed strand in RNA-Seq data, while the RNA-Seq unique SNVs were primarily due to their location outside the WES capture boundary regions (accounting for ∼71%), as well as low coverage of the regions, low coverage of the mutant alleles, or RNA editing. The shared SNVs had high locus-specific coverage in both WES and RNA-Seq and high gene expression levels. Additionally, WES unique and RNA-Seq unique SNVs showed different nucleotide substitution patterns, e.g., ∼55% of RNA-Seq unique variants were A:T→G:C, a hallmark of RNA editing. This study provides an important evaluation of the inconsistencies of somatic SNVs called in WES and RNA-Seq data.
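
The three-way categorization and the strand-collapsed substitution classes described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The names SNV, categorize, overlap_fraction, and substitution_class are hypothetical, and the overlap is assumed here to be shared calls divided by the union of calls from both platforms; the abstract does not state which denominator was used.

```python
from typing import Dict, Iterable, NamedTuple, Set


class SNV(NamedTuple):
    """One variant call, keyed by position and alleles (hypothetical record)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def categorize(wes: Iterable[SNV], rna: Iterable[SNV]) -> Dict[str, Set[SNV]]:
    """Split calls into WES unique, RNA-Seq unique, and shared sets."""
    wes_set, rna_set = set(wes), set(rna)
    return {
        "wes_unique": wes_set - rna_set,
        "rna_unique": rna_set - wes_set,
        "shared": wes_set & rna_set,
    }


def overlap_fraction(categories: Dict[str, Set[SNV]]) -> float:
    """Shared calls over the union of all calls (an assumed definition)."""
    total = sum(len(calls) for calls in categories.values())
    return len(categories["shared"]) / total if total else 0.0


COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref: str, alt: str) -> str:
    """Collapse strand-equivalent changes, e.g. A>G and T>C both give A:T>G:C."""
    if ref in "GT":  # report every class from the A or C strand
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}:{COMPLEMENT[ref]}>{alt}:{COMPLEMENT[alt]}"


if __name__ == "__main__":
    # Toy calls for illustration only; not data from the study.
    wes_calls = [SNV("chr1", 100, "A", "G"), SNV("chr1", 200, "C", "T")]
    rna_calls = [SNV("chr1", 100, "A", "G"), SNV("chr2", 50, "T", "C")]
    cats = categorize(wes_calls, rna_calls)
    print({name: len(calls) for name, calls in cats.items()})
    print(round(overlap_fraction(cats), 3))
    print(sorted(substitution_class(v.ref, v.alt) for v in cats["rna_unique"]))
```

Keying each call on chromosome, position, reference allele, and alternate allele means two calls count as shared only when both platforms report the same base change at the same locus. A looser match on position alone would be a different design choice and would give a larger overlap.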

Keywords: Allele frequency; RNA editing; RNA-Seq; Single nucleotide variants; Somatic mutations; Whole exome sequencing.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Exome / genetics*
  • Genome, Human
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Lung Neoplasms / genetics*
  • Polymorphism, Single Nucleotide / genetics
  • Transcriptome / genetics*