Comprehensive analysis of transcriptome variation uncovers known and novel driver events in T-cell acute lymphoblastic leukemia

PLoS Genet. 2013;9(12):e1003997. doi: 10.1371/journal.pgen.1003997. Epub 2013 Dec 19.

Abstract

RNA-seq is a promising technology to re-sequence protein coding genes for the identification of single nucleotide variants (SNVs), while simultaneously obtaining information on structural variations and gene expression perturbations. We asked whether RNA-seq is suitable for the detection of driver mutations in T-cell acute lymphoblastic leukemia (T-ALL). These leukemias are caused by a combination of gene fusions, over-expression of transcription factors and cooperative point mutations in oncogenes and tumor suppressor genes. We analyzed 31 T-ALL patient samples and 18 T-ALL cell lines by high-coverage paired-end RNA-seq. First, we optimized the detection of SNVs in RNA-seq data by comparing the results with exome re-sequencing data. We identified known driver genes with recurrent protein-altering variations, as well as several new candidates including H3F3A, PTK2B, and STAT5B. Next, we determined accurate gene expression levels from the RNA-seq data through normalizations and batch effect removal, and used these to classify patients into T-ALL subtypes. Finally, we detected gene fusions, several of which can explain the over-expression of key driver genes such as TLX1, PLAG1, LMO1, or NKX2-1, while others result in novel fusion transcripts encoding activated kinases (SSBP2-FER and TPM3-JAK2) or involving MLLT10. In conclusion, we present novel analysis pipelines for variant calling, variant filtering, and expression normalization on RNA-seq data, and successfully apply these to the detection of translocations, point mutations, INDELs, exon-skipping events, and expression perturbations in T-ALL.
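The comparison of RNA-seq SNV calls against exome calls mentioned in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: the `Variant` type, the `concordance` function, and the coordinates below are hypothetical, and the sketch simply assumes that exome calls serve as the reference set and that a variant matches when chromosome, position, reference allele, and alternate allele all agree.

```python
from typing import Dict, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical minimal SNV record used only for this illustration."""
    chrom: str
    pos: int
    ref: str
    alt: str


def concordance(rna_calls: Set[Variant], exome_calls: Set[Variant]) -> Dict[str, float]:
    """Compare RNA-seq calls with exome calls, treating the exome set as reference.

    precision = shared / RNA-seq calls, recall = shared / exome calls.
    Empty inputs yield 0.0 rather than raising a division error.
    """
    shared = rna_calls & exome_calls
    precision = len(shared) / len(rna_calls) if rna_calls else 0.0
    recall = len(shared) / len(exome_calls) if exome_calls else 0.0
    return {"shared": len(shared), "precision": precision, "recall": recall}


# Made-up calls for one sample (not data from the study).
rna = {
    Variant("chr1", 100, "A", "T"),
    Variant("chr2", 200, "G", "C"),
    Variant("chr3", 300, "C", "T"),
}
exome = {
    Variant("chr1", 100, "A", "T"),
    Variant("chr2", 200, "G", "C"),
    Variant("chr4", 400, "T", "G"),
    Variant("chr5", 500, "G", "A"),
}

print(concordance(rna, exome))
```

In practice, recall measured this way is bounded by expression, since RNA-seq can only reveal variants in transcripts that are actually present in the sample; restricting the exome set to sufficiently covered positions would be one way to account for that.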

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Base Sequence / genetics*
  • Cell Line, Tumor
  • Child
  • Child, Preschool
  • Exome / genetics
  • Female
  • Gene Expression Regulation, Leukemic*
  • Gene Fusion
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • INDEL Mutation / genetics
  • Infant
  • Male
  • Middle Aged
  • Polymorphism, Single Nucleotide
  • Precursor T-Cell Lymphoblastic Leukemia-Lymphoma / etiology
  • Precursor T-Cell Lymphoblastic Leukemia-Lymphoma / genetics*
  • Precursor T-Cell Lymphoblastic Leukemia-Lymphoma / pathology
  • Transcriptome / genetics*

Grant support

This work was supported by grants from the KU Leuven (PF/10/016 SymBioSys to JCo, SA; concerted action grant to JCo, PV, IW); the FWO-Vlaanderen (G.0546.11 to JCo, PV, SA, AU, FS); the Foundation against Cancer (2010-154 and 2012-168 to SA); an ERC starting grant (JCo); the Interuniversity Attraction Poles (IAP) granted by the Federal Office for Scientific, Technical and Cultural Affairs, Belgium (JCo); the Ministry of Health, Cancer Plan (JCo, PV, SA); and the European Community's Seventh Framework Programme (FP7, grant NGS-PTL 306242, to JCo and PV). KDK is a postdoctoral researcher of FWO-Vlaanderen and PV is a senior clinical investigator of FWO-Vlaanderen. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.