Functional module detection through integration of single-cell RNA sequencing data with protein-protein interaction networks

BMC Genomics. 2020 Nov 2;21(1):756. doi: 10.1186/s12864-020-07144-2.


Background: Recent advances in single-cell RNA sequencing have allowed researchers to explore transcriptional function at a cellular level. In particular, single-cell RNA sequencing reveals that there exist clusters of cells with similar gene expression profiles, representing different transcriptional states.

Results: In this study, we present SCPPIN, a method for integrating single-cell RNA sequencing data with protein-protein interaction networks that detects active modules in cells of different transcriptional states. We achieve this by clustering RNA-sequencing data, identifying differentially expressed genes, constructing node-weighted protein-protein interaction networks, and finding the maximum-weight connected subgraphs with an exact Steiner-tree approach. As case studies, we investigate two RNA-sequencing data sets, one from human liver spheroids and one from human adipose tissue. With SCPPIN, we expand the output of an analysis of differentially expressed genes with information from protein interactions. We find that different transcriptional states have different significantly enriched subnetworks of the protein-protein interaction network, and that these subnetworks represent biological pathways. In these pathways, SCPPIN identifies proteins that are not differentially expressed but have a crucial biological function (e.g., as receptors), and it therefore reveals biology beyond a standard analysis of differentially expressed genes.
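The last two steps of the pipeline described above (node weighting and maximum-weight connected subgraph detection) can be illustrated with a minimal sketch. It is not the SCPPIN implementation: the scoring function `node_scores`, the threshold of 0.05, the gene names, and the p-values are all hypothetical, and the exact Steiner-tree solver is replaced by an exhaustive search that is only feasible for toy networks.

```python
import math
from itertools import combinations


def node_scores(p_values, threshold=0.05):
    """Hypothetical scoring (an assumption, not the paper's scheme):
    positive when p < threshold, negative otherwise."""
    return {gene: math.log10(threshold) - math.log10(p)
            for gene, p in p_values.items()}


def is_connected(nodes, adjacency):
    """Depth-first search restricted to the given node subset."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for neighbour in adjacency[current]:
            if neighbour in nodes and neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == nodes


def max_weight_connected_subgraph(scores, edges):
    """Exhaustive search over all node subsets (exponential time).
    A stand-in for an exact solver, usable only on tiny graphs."""
    adjacency = {node: set() for node in scores}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    best_nodes, best_weight = frozenset(), 0.0
    names = sorted(scores)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            if not is_connected(subset, adjacency):
                continue
            weight = sum(scores[n] for n in subset)
            if weight > best_weight:
                best_nodes, best_weight = frozenset(subset), weight
    return best_nodes, best_weight


# Toy data: made-up differential-expression p-values for one cell cluster
# and a made-up path-shaped interaction network A - B - C - D - E.
p_values = {"A": 1e-6, "B": 0.5, "C": 1e-5, "D": 0.9, "E": 0.2}
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]

module, weight = max_weight_connected_subgraph(node_scores(p_values), edges)
print(sorted(module), round(weight, 3))
```

In this toy network the detected module contains B even though B is not differentially expressed, because B connects the two strongly scoring nodes A and C. This mirrors how a connected-subgraph formulation can surface proteins that a list of differentially expressed genes alone would miss.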

Conclusions: The SCPPIN method introduced here can be used to systematically analyse differentially expressed genes in single-cell RNA sequencing data by integrating them with protein interaction data. The detected modules that characterise each cluster help to identify, and form hypotheses about, the biological function associated with those cells. Our analysis suggests the participation of unexpected proteins in these pathways, proteins that are undetectable from the single-cell RNA sequencing data alone. The techniques described here are applicable to other organisms and tissues.

MeSH terms

  • Cluster Analysis
  • Gene Expression Profiling
  • Gene Regulatory Networks
  • Humans
  • Protein Interaction Maps*
  • RNA* / genetics
  • Sequence Analysis, RNA

