DE-kupl: exhaustive capture of biological variation in RNA-seq data through k-mer decomposition

Genome Biol. 2017 Dec 28;18(1):243. doi: 10.1186/s13059-017-1372-2.

Abstract

We introduce a k-mer-based computational protocol, DE-kupl, for capturing local RNA variation in a set of RNA-seq libraries, independently of a reference genome or transcriptome. DE-kupl extracts all k-mers with differential abundance directly from the raw data files. This enables the retrieval of virtually all variation present in an RNA-seq data set. This variation is subsequently assigned to biological events or entities such as differential long non-coding RNAs, splice and polyadenylation variants, introns, repeats, editing or mutation events, and exogenous RNA. Applying DE-kupl to human RNA-seq data sets identified multiple types of novel events, reproducibly across independent RNA-seq experiments.
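The core idea of the abstract, decomposing reads into k-mers and keeping those whose abundance differs between conditions, can be illustrated with a toy sketch. This is not DE-kupl's implementation: the function names `count_kmers` and `differential_kmers`, the small default `k`, the pseudocount, and the plain log2 fold-change threshold are all assumptions made for illustration, and the fold-change filter stands in for whatever statistical test the actual tool applies to replicated libraries.

```python
from collections import Counter
from math import log2


def count_kmers(reads, k):
    """Count every overlapping k-mer across a list of read sequences."""
    counts = Counter()
    for read in reads:
        # A read of length n yields n - k + 1 overlapping k-mers.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def differential_kmers(reads_a, reads_b, k=5, min_log2_fc=1.0, pseudocount=1.0):
    """Return {k-mer: log2 fold change (B vs A)} for k-mers passing the threshold.

    Hypothetical stand-in for a differential test: counts are scaled by the
    total k-mer count of each condition, and a pseudocount avoids division by
    zero for k-mers seen in only one condition.
    """
    counts_a = count_kmers(reads_a, k)
    counts_b = count_kmers(reads_b, k)
    total_a = sum(counts_a.values()) or 1
    total_b = sum(counts_b.values()) or 1

    selected = {}
    for kmer in counts_a.keys() | counts_b.keys():
        norm_a = (counts_a[kmer] + pseudocount) / total_a
        norm_b = (counts_b[kmer] + pseudocount) / total_b
        lfc = log2(norm_b / norm_a)
        if abs(lfc) >= min_log2_fc:
            selected[kmer] = lfc
    return selected


if __name__ == "__main__":
    condition_a = ["ACGTACGT"] * 3
    condition_b = ["ACGTACGT"] * 3 + ["TTTTTTTT"] * 3
    for kmer, lfc in sorted(differential_kmers(condition_a, condition_b, min_log2_fc=2.0).items()):
        print(kmer, round(lfc, 2))
```

Because nothing here consults a reference genome or transcriptome, a sequence present only in one condition (such as an exogenous RNA or a novel splice junction) surfaces as a k-mer with a large fold change. In a real analysis the selected k-mers would still need to be assembled and annotated before being assigned to biological events.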

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Computational Biology / methods*
  • Gene Expression Profiling
  • Gene Expression Regulation
  • Genetic Variation*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Polyadenylation
  • RNA / genetics*
  • RNA Splicing
  • RNA, Antisense
  • RNA, Long Noncoding / genetics
  • RNA, Messenger / genetics
  • Reproducibility of Results
  • Sequence Analysis, RNA
  • Software*
  • Transcriptome

Substances

  • RNA, Antisense
  • RNA, Long Noncoding
  • RNA, Messenger
  • RNA