RNA proximity sequencing data and analysis pipeline from a human neuroblastoma nuclear transcriptome

Sci Data. 2020 Jan 28;7(1):35. doi: 10.1038/s41597-020-0372-3.

Abstract

We have previously developed and described a method for measuring RNA co-locations within cells, called Proximity RNA-seq, which promises insights into RNA expression, processing, storage and translation. Here, we describe transcriptome-wide Proximity RNA-seq datasets obtained from human neuroblastoma SH-SY5Y cell nuclei. To aid future users of this method, we also describe and release our analysis pipeline, CloseCall, which maps cDNA to a custom transcript annotation and allocates cDNA-linked barcodes to barcode groups. CloseCall then performs Monte Carlo simulations on the data to identify pairs of transcripts that are co-barcoded more frequently than expected by chance. Furthermore, co-barcoding frequencies derived for individual transcripts, dubbed valency, serve as proxies for the RNA density or connectivity of each transcript. We outline how this pipeline was applied to these sequencing datasets and openly share the processed data outputs and access to a virtual machine that runs CloseCall. The resulting data specify the spatial organization of RNAs and generate hypotheses about potential regulatory relationships between RNAs.
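The Monte Carlo step described in the abstract can be illustrated with a minimal sketch. This is not CloseCall's implementation: the function names, the toy data and the null model (shuffling transcript labels across barcode groups while keeping group sizes and per-transcript totals fixed) are assumptions chosen only to show the general idea of comparing observed co-barcoding counts with counts obtained by chance.

```python
import random
from collections import Counter
from itertools import combinations


def count_pairs(groups):
    """Count, for each unordered transcript pair, the barcode groups containing both."""
    counts = Counter()
    for group in groups:
        # Use the set so a transcript seen twice in a group is counted once.
        for pair in combinations(sorted(set(group)), 2):
            counts[pair] += 1
    return counts


def shuffle_groups(groups, rng):
    """Assumed null model: redistribute transcripts at random, keeping group sizes."""
    pool = [transcript for group in groups for transcript in group]
    rng.shuffle(pool)
    shuffled, start = [], 0
    for group in groups:
        shuffled.append(pool[start:start + len(group)])
        start += len(group)
    return shuffled


def empirical_p_values(groups, n_sim=1000, seed=0):
    """Empirical p-value per observed pair: share of simulations with count >= observed."""
    rng = random.Random(seed)
    observed = count_pairs(groups)
    exceed = Counter()
    for _ in range(n_sim):
        simulated = count_pairs(shuffle_groups(groups, rng))
        for pair, obs in observed.items():
            if simulated[pair] >= obs:
                exceed[pair] += 1
    # Add-one correction so that no p-value is exactly zero.
    return {pair: (exceed[pair] + 1) / (n_sim + 1) for pair in observed}


# Hypothetical toy data: each inner list is one barcode group of transcript IDs.
toy_groups = [["A", "B"] for _ in range(10)] + [
    ["C", "D"], ["E", "F"], ["C", "F"], ["D", "E"], ["C", "E"], ["D", "F"],
]
toy_p = empirical_p_values(toy_groups, n_sim=200, seed=1)
for pair in sorted(toy_p, key=toy_p.get):
    print(pair, round(toy_p[pair], 3))
```

In this toy example, transcripts A and B share every one of their barcode groups, so their pair should receive a much smaller empirical p-value than pairs that co-occur only once. A real analysis would also need multiple-testing correction and a null model matched to the experimental design, neither of which is shown here.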

Publication types

  • Dataset

MeSH terms

  • Cell Line, Tumor
  • DNA Barcoding, Taxonomic
  • DNA, Complementary
  • Humans
  • Monte Carlo Method
  • Neuroblastoma / genetics*
  • RNA-Seq*
  • Transcriptome*

Substances

  • DNA, Complementary