Tensor-Decomposition-Based Unsupervised Feature Extraction in Single-Cell Multiomics Data Analysis

Genes (Basel). 2021 Sep 18;12(9):1442. doi: 10.3390/genes12091442.

Abstract

Analysis of single-cell multiomics datasets is a novel topic and is considerably challenging because such datasets contain a large number of features with numerous missing values. In this study, we implemented a recently proposed tensor-decomposition (TD)-based unsupervised feature extraction (FE) technique to address this difficult problem. The technique can successfully integrate single-cell multiomics data composed of gene expression, DNA methylation, and accessibility. Although the last two have large dimensions, as many as ten million, containing only a few percentage of nonzero values, TD-based unsupervised FE can integrate three omics datasets without filling in missing values. Together with UMAP, which is used frequently when embedding single-cell measurements into two-dimensional space, TD-based unsupervised FE can produce two-dimensional embedding coincident with classification when integrating single-cell omics datasets. Genes selected based on TD-based unsupervised FE are also significantly related to reasonable biological roles.

Keywords: feature extraction; multiomics data; single-cell; tensor decomposition.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • DNA Methylation
  • Databases, Factual
  • Gene Expression Profiling / methods
  • Histones / genetics
  • Histones / metabolism
  • Single-Cell Analysis / methods*

Substances

  • Histones