Exploring dimension-reduced embeddings with Sleepwalk

Genome Res. 2020 May;30(5):749-756. doi: 10.1101/gr.251447.119. Epub 2020 May 19.

Abstract

Dimension-reduction methods, such as t-SNE or UMAP, are widely used when exploring high-dimensional data describing many entities, for example, RNA-seq data for many single cells. However, dimension reduction is commonly prone to introducing artifacts, and we hence need means to see where a dimension-reduced embedding is a faithful representation of the local neighborhood and where it is not. We present Sleepwalk, a simple but powerful tool that allows the user to interactively explore an embedding, using color to depict original or any other distances from all points to the cell under the mouse cursor. We show how this approach not only highlights distortions but also reveals otherwise hidden characteristics of the data, and how Sleepwalk's comparative modes help integrate multisample data and understand differences between embedding and preprocessing methods. Sleepwalk is a versatile and intuitive tool that unlocks the full power of dimension reduction and will be of value not only in single-cell RNA-seq but also in any other area with matrix-shaped big data.
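The core idea described above, coloring every point by its distance to the point under the cursor, can be illustrated with a minimal sketch. This is not Sleepwalk's own code or API. The function names `distances_to_focus` and `distance_colors`, the use of Euclidean distance, the grayscale mapping, and the `max_dist` cutoff are all assumptions made for illustration.

```python
import math

def distances_to_focus(features, focus):
    """Euclidean distance, in the original feature space, from every
    row of `features` to the row at index `focus` (the hovered point).
    Euclidean distance is an assumption; any other metric could be used."""
    ref = features[focus]
    return [math.dist(row, ref) for row in features]

def distance_colors(dists, max_dist):
    """Map each distance to a gray level: 0 (black) for the focus point,
    255 (white) for anything at or beyond `max_dist` (hypothetical cutoff)."""
    colors = []
    for d in dists:
        t = min(d / max_dist, 1.0)   # clamp to [0, 1]
        level = int(255 * t)
        colors.append((level, level, level))
    return colors

# Toy data: three "cells" described by three features each.
features = [[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [0.0, 0.0, 10.0]]
dists = distances_to_focus(features, focus=0)
colors = distance_colors(dists, max_dist=10.0)
```

In an interactive viewer, these colors would be recomputed whenever the cursor moves to a new point and then applied to the 2D embedding. Points that are drawn close to the focus point but come out light-colored are far away in the original space, which marks a distortion in the embedding.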

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Cerebellum / embryology
  • Cerebellum / metabolism
  • Gene Expression
  • Mice
  • RNA-Seq / methods*
  • Single-Cell Analysis
  • Software*

Associated data

  • figshare/10.6084/m9.figshare.7908059.v1
  • figshare/10.6084/m9.figshare.7910483.v1
  • figshare/10.6084/m9.figshare.7910504.v2