BMC Bioinformatics. 2018 Jun 19;19(1):232.
doi: 10.1186/s12859-018-2217-z.

Evaluating methods of inferring gene regulatory networks highlights their lack of performance for single cell gene expression data

Free PMC article

Shuonan Chen et al. BMC Bioinformatics. 2018.

Abstract

Background: A fundamental principle in biology is that genes do not operate in isolation, and yet methods that infer regulatory networks from single cell gene expression data have been slow to emerge. With single cell sequencing methods now becoming accessible, general network inference algorithms that were initially developed for data collected from bulk samples may not be suitable for single cells. Meanwhile, although methods specific to single cell data are now emerging, whether they perform better than general methods is unknown. In this study, we evaluate the applicability of five general methods and three single cell methods for inferring gene regulatory networks from both experimental single cell gene expression data and in silico simulated data.

Results: Standard evaluation using ROC curves and Precision-Recall curves against reference sets sourced from the literature showed that most of the methods performed poorly when applied to either experimental or simulated single cell data. All network methods were applied to the same datasets using default settings. Comparisons of the learned networks highlighted the uniqueness of some predicted edges for each method. The fact that different methods infer networks that vary substantially reflects the underlying mathematical rationale and assumptions that distinguish network methods from each other.
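To make the ROC and Precision-Recall evaluation concrete, the following is a minimal sketch of how a ranked list of predicted edges could be scored against a reference set. It is an illustration under stated assumptions, not the authors' pipeline: the function names `roc_pr_points` and `auroc`, the toy gene names, and the toy scores are all hypothetical, edges are assumed to be written in one consistent orientation, and tied scores are not treated specially.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


def roc_pr_points(
    scores: Dict[Edge, float], reference: Set[Edge]
) -> List[Tuple[float, float, float, float]]:
    """Lower the threshold one edge at a time down the ranked list.

    Returns one (fpr, tpr, precision, recall) tuple per threshold.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_pos = sum(1 for edge in ranked if edge in reference)
    n_neg = len(ranked) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one reference and one non-reference edge")

    tp = fp = 0
    points = []
    for edge in ranked:
        if edge in reference:
            tp += 1  # a detected edge that is in the reference network
        else:
            fp += 1  # a detected edge that is not in the reference network
        recall = tp / n_pos
        points.append((fp / n_neg, recall, tp / (tp + fp), recall))
    return points


def auroc(points: List[Tuple[float, float, float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule, starting at (0, 0)."""
    area = 0.0
    prev_fpr, prev_tpr = 0.0, 0.0
    for fpr, tpr, _, _ in points:
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2
        prev_fpr, prev_tpr = fpr, tpr
    return area


# Hypothetical toy input: four candidate edges, two of them in the reference.
toy_scores = {("A", "B"): 0.9, ("A", "C"): 0.8, ("B", "C"): 0.3, ("C", "D"): 0.1}
toy_reference = {("A", "B"), ("B", "C")}

toy_points = roc_pr_points(toy_scores, toy_reference)
print(auroc(toy_points))  # 0.75
```

An AUROC near 0.5 corresponds to a ranking that is no better than a random guess, which is the situation the evaluation reports for most methods.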

Conclusions: This study provides a comprehensive evaluation of network modeling algorithms applied to experimental single cell gene expression data and in silico simulated datasets where the network structure is known. Comparisons demonstrate that most of the assessed network methods are not able to predict network structures from single cell expression data accurately, even if they were specifically developed for single cell data. Also, single cell methods, which usually depend on more elaborate algorithms, in general have less similarity to each other in the sets of edges detected. The results from this study emphasize the importance of developing more accurate, optimized network modeling methods that are compatible with single cell data. Newly developed single cell methods may uniquely capture particular features of potential gene-gene relationships, and caution should be taken when interpreting these results.
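The comparison of detected edge sets between methods can be illustrated with plain set operations. The sketch below is an assumption-laden example rather than the analysis used in the study: the method labels `m1`, `m2`, `m3`, the gene names, and the choice of Jaccard similarity as the overlap measure are all hypothetical, and edges are treated as undirected.

```python
from itertools import combinations


def normalize(edges):
    """Treat edges as undirected, so ('A', 'B') and ('B', 'A') are the same edge."""
    return {frozenset(edge) for edge in edges}


def jaccard(a, b):
    """Share of edges found by both methods among edges found by either."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical reference network and three hypothetical learned networks.
reference = normalize([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")])
nets = {
    "m1": normalize([("A", "B"), ("B", "C"), ("A", "D")]),
    "m2": normalize([("B", "A"), ("C", "D"), ("A", "C")]),
    "m3": normalize([("A", "B"), ("B", "D")]),
}

# Pairwise similarity between the learned networks.
for name_a, name_b in combinations(sorted(nets), 2):
    print(name_a, name_b, round(jaccard(nets[name_a], nets[name_b]), 3))

# Edges detected by every method, and the subset that is in the reference.
core = set.intersection(*nets.values())
core_true = core & reference

# Reference edges that no method recovered, even after pooling all methods.
missed = reference - set.union(*nets.values())
print(len(core), len(core_true), len(missed))
```

Here `core` plays the role of the edges commonly detected by all methods, and `missed` the reference edges that remain undetected even when the methods are combined.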

Keywords: Bayesian network; Correlation network; Gene regulatory network; Single cell genomics.

Conflict of interest statement

Ethics approval and consent to participate

No ethics approval was required for the study. All input data are publicly available through the citations supplied.

Consent for publication

Not applicable.

Competing interests

The authors declared that they have no competing interests.

Figures

Fig. 1
Study Workflow. Eight network reconstruction methods – including five general methods: partial correlation (Pcorr), Bayesian network (BN), GENIE3, ARACNE and CLR, and three single cell-specific methods: SCENIC, SCODE and PIDC – were applied to two single cell experimental datasets, and two simulated datasets that resemble single cell data. Evaluation of these methods was based on their ability to reconstruct a reference network, and this was assessed using PR, ROC curves, and other network analysis metrics
Fig. 2
ROC (top) and PR (bottom) curves for each method applied to the simulated datasets. The results obtained from the Sim1 dataset are shown on the left (a & c) and the Sim2 dataset is shown on the right (b & d). Diagonal black lines on the ROC curves are baselines indicating the prediction level equivalent to a random guess (a & b). The ROC curves show that as the threshold is lowered and more edges are detected, both the false positive and true positive rates increase, though not necessarily at the same rate. The PR curves show that as the detection threshold decreases, the number of detected edges increases, with a corresponding increase in recall (more true edges are detected) but a decrease in precision (more of the detected edges are not in the reference network)
Fig. 3
ROC (top) and PR (bottom) curves for each method applied to single cell experimental data. The results obtained from the ESC dataset are shown on the left side (a & c) and the HSC dataset is shown on the right side (b & d). Diagonal black lines on the ROC curves are baselines indicating the prediction level equivalent to a random guess (a & b). In (a) and (b), almost all the methods align with the diagonal black lines in the ROC curves, suggesting that the predictions are nearly equivalent to a random guess for all these methods. It is easier to observe the behavior of each method in the PR curves. Although the ROC curves indicate similar performance across all methods, for the HSC dataset the PR curves reveal that the methods differ in prediction accuracy when the total number of detected edges is small (where the curve is close to the y-axis), and thus provide another aspect of evaluation
Fig. 4
AUROC (top) and AUPR (bottom) scores demonstrate consistently poor performance for most of the methods and datasets. In both panels, the horizontal red lines represent the line of a random guess and baseline, which are the same (= 0.5) for AUROC across datasets (a) but differ for AUPR (b), since the baseline indicates the value of precision when all the reference edges were recovered, and this depends on the total number of genes in the datasets and the number of edges in the reference networks
Fig. 5
Intersection of reconstructed networks from general methods and reference networks outlines the ability of the different methods to identify the same true positives. These methods detected a core set of interactions in the learned networks for the ESC data (a), HSC data (b), and two simulated datasets, Sim1 (c) and Sim2 (d). In general, each method also detected edges that were unique to the method and the dataset, except for Pcorr in Sim1 (c, see text). Only a small set of edges in the reference networks was recovered by the intersection of three methods, while different methods also detected edges that were unique. Moreover, we show that even after combining all the methods, there were still edges in the reference network that were not detected, as indicated in the white section for each panel
Fig. 6
Intersection of learned networks from single cell methods and the reference network highlights the differences between the edges that are uniquely detected by each method. Although single cell methods detected a common set of interactions, there was far more inconsistency in their detections, compared to the results of general methods. Each method, however, was able to detect some ‘correct’ edges that are unique to each method (as indicated from the overlap between each colored ellipse and the white ellipse, also see Results). Similarly, only a handful of edges were commonly detected by all three methods for the HSC and Sim2 datasets (b & d), and no single edge was commonly detected for the ESC and Sim1 datasets (a & c)
Fig. 7
Investigating the similarity of the networks produced by the seven network methods. The PCA plots indicate how much each method is similar to each other in terms of edge detection ranking. We show that there is no consistency in the clustering of these methods, and any similarities amongst them vary based on the datasets. SCODE was notably a consistent outlier amongst all the methods
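The Fig. 4 caption notes that the AUPR baseline depends on the number of genes and the number of reference edges. One plausible reading, sketched below, is that the baseline equals the fraction of all possible gene pairs that appear in the reference network. This is an assumption for illustration only: the function name `aupr_baseline`, the exclusion of self-loops, and the example numbers are hypothetical and are not taken from the study.

```python
def aupr_baseline(n_genes: int, n_reference_edges: int, directed: bool = False) -> float:
    """Precision obtained when every possible edge is called, so that all
    reference edges are recovered (self-loops are assumed to be excluded)."""
    possible = n_genes * (n_genes - 1)  # ordered gene pairs
    if not directed:
        possible //= 2  # unordered gene pairs
    if possible == 0 or not 0 <= n_reference_edges <= possible:
        raise ValueError("inconsistent gene and reference edge counts")
    return n_reference_edges / possible


# Hypothetical example: 10 genes with 9 reference edges.
undirected_baseline = aupr_baseline(10, 9)               # 9 of 45 pairs
directed_baseline = aupr_baseline(10, 9, directed=True)  # 9 of 90 ordered pairs
print(undirected_baseline, directed_baseline)
```

Under this reading, a sparse reference network on many genes gives a baseline far below 0.5, which is why AUPR scores are compared against a dataset-specific line rather than a fixed value.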
