how_are_we_stranded_here: quick determination of RNA-Seq strandedness

BMC Bioinformatics. 2022 Jan 22;23(1):49. doi: 10.1186/s12859-022-04572-7.


Background: Quality control checks are the first step in RNA-Sequencing analysis and enable the identification of common issues in the sequenced reads. Checks for sequence quality, contamination, and complexity are commonplace, and allow users to account for these issues in downstream steps. The strand-specificity of reads is frequently overlooked and is often unavailable even in published data, yet when it is unknown or incorrectly specified it can have detrimental effects on the reproducibility and accuracy of downstream analyses.

Results: To address these issues, we developed how_are_we_stranded_here, a Python library that quickly infers the strandedness of paired-end RNA-Sequencing data. Testing on both simulated and real RNA-Sequencing reads showed that it correctly measures strandedness, and that measurements outside the normal range may indicate sample contamination.
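
To illustrate the general idea of measuring strandedness, the following is a minimal sketch, not the library's implementation. It assumes that read pairs have already been counted by whether their orientation relative to the annotated gene is consistent with a forward-stranded or a reverse-stranded protocol. The function name, the `StrandCall` type, and all cutoff values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class StrandCall:
    forward_fraction: float  # share of pairs consistent with a forward-stranded layout
    reverse_fraction: float  # share of pairs consistent with a reverse-stranded layout
    label: str


def classify_strandedness(
    forward_pairs: int,
    reverse_pairs: int,
    stranded_cutoff: float = 0.9,   # hypothetical cutoff, not taken from the paper
    unstranded_low: float = 0.4,    # hypothetical lower bound of the unstranded band
    unstranded_high: float = 0.6,   # hypothetical upper bound of the unstranded band
) -> StrandCall:
    """Label a sample from counts of orientation-consistent read pairs."""
    total = forward_pairs + reverse_pairs
    if total == 0:
        raise ValueError("no read pairs with a determinable orientation")

    fwd = forward_pairs / total
    rev = reverse_pairs / total

    if fwd >= stranded_cutoff:
        label = "forward-stranded"
    elif rev >= stranded_cutoff:
        label = "reverse-stranded"
    elif unstranded_low <= fwd <= unstranded_high:
        label = "unstranded"
    else:
        # Neither clearly stranded nor clearly unstranded; a value in this
        # range would warrant a closer look at the sample.
        label = "ambiguous"
    return StrandCall(fwd, rev, label)


call = classify_strandedness(forward_pairs=30, reverse_pairs=970)
print(call.label, round(call.reverse_fraction, 2))
```

Under these assumed cutoffs, a sample whose fractions fall between the stranded and unstranded ranges is labelled ambiguous, which corresponds to the kind of out-of-range measurement the abstract associates with possible contamination.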

Conclusions: how_are_we_stranded_here is fast and user friendly, making it easy to implement in quality control pipelines prior to analysing RNA-Sequencing data. how_are_we_stranded_here is freely available at .

Keywords: Bioinformatics; Quality control; RNA-Sequencing.

MeSH terms

  • High-Throughput Nucleotide Sequencing*
  • RNA-Seq
  • Reproducibility of Results
  • Sequence Analysis, DNA
  • Sequence Analysis, RNA
  • Software*