A universal sequencing read interpreter

Sci Adv. 2023 Jan 4;9(1):eadd2793. doi: 10.1126/sciadv.add2793. Epub 2023 Jan 4.

Abstract

Massively parallel DNA sequencing has led to the rapid growth of highly multiplexed experiments in biology. These experiments produce unique sequencing results that require specific analysis pipelines to decode highly structured reads. However, no versatile framework has been developed that interprets sequencing reads and extracts their encoded information for downstream biological analysis. Here, we report INTERSTELLAR (interpretation, scalable transformation, and emulation of large-scale sequencing reads), which decodes data values encoded in theoretically any type of sequencing read and translates them into sequencing reads of another structure of choice. We demonstrated that INTERSTELLAR successfully extracted information from a range of short- and long-read sequencing data and translated reads from single-cell (sc)RNA-seq, scATAC-seq, and spatial transcriptomics so that they could be analyzed by different software tools developed for conceptually the same types of experiments. INTERSTELLAR will greatly facilitate the development of sequencing-based experiments and the sharing of data analysis pipelines.
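
The decode-and-translate idea can be sketched in a few lines of Python. This is only a toy illustration and not INTERSTELLAR's actual interface or configuration format: the source layout (8-nt cell barcode, fixed linker, 6-nt UMI), the destination layout (UMI followed by a 4-nt cell barcode), and the names `SOURCE_PATTERN`, `decode`, `translate`, and `dest_barcodes` are all assumptions made for the example.

```python
import re

# Hypothetical source layout: 8-nt cell barcode, fixed linker "ACGT", 6-nt UMI.
SOURCE_PATTERN = re.compile(r"^(?P<cell>[ACGT]{8})ACGT(?P<umi>[ACGT]{6})")


def decode(read):
    """Return the named segment values of a read, or None if it does not match."""
    match = SOURCE_PATTERN.match(read)
    return match.groupdict() if match else None


def translate(values, cell_map, dest_barcodes):
    """Re-encode decoded values in a hypothetical destination layout (UMI + cell).

    Each distinct source cell barcode is assigned the next unused destination
    barcode, so reads from the same cell keep sharing one barcode.
    """
    cell = values["cell"]
    if cell not in cell_map:
        if len(cell_map) >= len(dest_barcodes):
            raise ValueError("destination barcode space exhausted")
        cell_map[cell] = dest_barcodes[len(cell_map)]
    return values["umi"] + cell_map[cell]


# Toy reads written as barcode + linker + UMI + trailing sequence.
reads = [
    "AAAACCCC" + "ACGT" + "GGGTTT" + "TTTT",
    "GGGGTTTT" + "ACGT" + "ACACAC" + "TTTT",
    "AAAACCCC" + "ACGT" + "TGTGTG" + "TTTT",
    "NNNNNNNNNNNNNNNNNN",  # does not fit the source layout and is skipped
]

# Hypothetical barcode list accepted by the destination pipeline.
dest_barcodes = ["ACAC", "GTGT", "CACA"]

cell_map = {}
translated = []
for read in reads:
    values = decode(read)
    if values is None:
        continue
    translated.append(translate(values, cell_map, dest_barcodes))

print(translated)
```

In this sketch the first and third reads come from the same source cell barcode, so they receive the same destination barcode, which is the property a downstream tool would rely on when grouping reads by cell.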