Transposable elements (TEs) are interspersed repeat sequences that make up much of the human genome. Their expression has been implicated in development and disease. However, TE-derived RNA-seq reads are difficult to quantify. Past approaches have either excluded these reads or aggregated RNA expression at the level of subfamilies of similar TE copies, sacrificing quantitative accuracy or the genomic context necessary to understand the basis of TE transcription, respectively. As a result, the effects of TEs on gene expression and associated phenotypes are not well understood. Here, we present Software for Quantifying Interspersed Repeat Expression (SQuIRE), the first RNA-seq analysis pipeline that provides a quantitative and locus-specific picture of TE expression (https://github.com/wyang17/SQuIRE). SQuIRE is an accurate and user-friendly tool that can be used for a variety of species. We applied SQuIRE to RNA-seq data from normal mouse tissues and a Drosophila model of amyotrophic lateral sclerosis. In both model organisms, we recapitulated previously reported TE subfamily expression levels and revealed locus-specific TE expression. We also identified differences in TE transcription patterns relating to transcript type, gene expression and RNA splicing that would be lost with other approaches using subfamily-level analyses. Altogether, our findings illustrate the importance of studying TE transcription with locus-level resolution.
© The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.