JCI Insight. 2022 Apr 8;7(7):e157255.
doi: 10.1172/jci.insight.157255.

Temporal transcriptomic analysis using TrendCatcher identifies early and persistent neutrophil activation in severe COVID-19

Xinge Wang et al. JCI Insight. 2022.

Abstract

Studying temporal gene expression shifts during disease progression provides important insights into the biological mechanisms that distinguish adaptive and maladaptive responses. Existing tools for the analysis of time course transcriptomic data are not designed to optimally identify distinct temporal patterns when analyzing dynamic differentially expressed genes (DDEGs). Moreover, there are not enough methods to assess and visualize the temporal progression of biological pathways mapped from time course transcriptomic data sets. In this study, we developed an open-source R package TrendCatcher (https://github.com/jaleesr/TrendCatcher), which applies the smoothing spline ANOVA model and break point searching strategy, to identify and visualize distinct dynamic transcriptional gene signatures and biological processes from longitudinal data sets. We used TrendCatcher to perform a systematic temporal analysis of COVID-19 peripheral blood transcriptomes, including bulk and single-cell RNA-Seq time course data. TrendCatcher uncovered the early and persistent activation of neutrophils and coagulation pathways, as well as impaired type I IFN (IFN-I) signaling in circulating cells as a hallmark of patients who progressed to severe COVID-19, whereas no such patterns were identified in individuals receiving SARS-CoV-2 vaccinations or patients with mild COVID-19. These results underscore the importance of systematic temporal analysis to identify early biomarkers and possible pathogenic therapeutic targets.
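The break point idea can be illustrated with a small sketch. This is not TrendCatcher's implementation (the package is written in R and fits a smoothing spline ANOVA model first); it only assumes that a fitted mean trajectory is already available and shows how break points, taken here as the time points where the direction of change flips, yield a trajectory label in the style used in the figure captions. The function name, the `tol` parameter, and the example values are hypothetical.

```python
def trajectory_pattern(times, fitted, unit="D", tol=0.0):
    """Label monotonic segments of a fitted trajectory.

    times  : ordered time points (e.g. days)
    fitted : fitted mean expression at each time point (assumed given)
    tol    : changes with absolute value <= tol are labelled "Flat"
    """
    if len(times) != len(fitted) or len(times) < 2:
        raise ValueError("need at least two matched time points")

    segments = []  # each entry: [start_time, end_time, direction]
    for i in range(1, len(times)):
        delta = fitted[i] - fitted[i - 1]
        if delta > tol:
            direction = "Up"
        elif delta < -tol:
            direction = "Down"
        else:
            direction = "Flat"
        if segments and segments[-1][2] == direction:
            # Same direction as the previous step: extend the segment.
            segments[-1][1] = times[i]
        else:
            # Direction changed: the previous time point is a break point.
            segments.append([times[i - 1], times[i], direction])

    return ", ".join(f"{s}{unit}-{e}{unit} {d}" for s, e, d in segments)


# Hypothetical fitted values for one gene over days 0 to 14.
days = [0, 1, 2, 4, 7, 10, 14]
fit = [1.0, 2.0, 3.0, 2.5, 2.0, 1.5, 1.2]
print(trajectory_pattern(days, fit))  # 0D-2D Up, 2D-14D Down
```

Genes sharing the same label can then be grouped, which is the sense in which a set of DDEGs "follows" a pattern.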

Keywords: Bioinformatics; COVID-19; Cell Biology; Immunology; Innate immunity.

Conflict of interest statement

Conflict of interest: The authors have declared that no conflict of interest exists.

Figures

Figure 1. Overview and benchmarking of TrendCatcher.
(A) TrendCatcher’s framework. TrendCatcher preprocesses input data, which includes creating cell type–specific “pseudobulk” data sets for temporal analysis when scRNA-Seq data is used. TrendCatcher’s core algorithm is composed of 5 main steps. TrendCatcher’s output includes 4 main types of visualizations and DDEGs identification (numbered 1–5). (B) TrendCatcher’s prediction ROC for a 7 time point–simulated data set compared with DESeq2, DESeq2Spline, and ImpulseDE2, with mixed trajectories. (C) TrendCatcher’s prediction performance (AUC) across different numbers of time points, from 3 to 11 time points. TrendCatcher’s AUC values across time points from 3 to 11 are 0.90, 0.92, 0.90, 0.89, and 0.88.
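For readers unfamiliar with the AUC metric reported in panels B and C, the following is a minimal sketch of how an area under the ROC curve can be computed from per-gene scores on a simulated data set with known labels. It uses the rank-based (Mann-Whitney) formulation; the labels and scores are made up for illustration and are not taken from the benchmark.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a true dynamic gene outscores a
    non-dynamic gene (ties count as one half).

    labels : 1 for simulated dynamic genes, 0 for non-dynamic genes
    scores : higher means "more likely dynamic" (e.g. 1 - adjusted P value)
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: two dynamic and two non-dynamic genes.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of dynamic from non-dynamic genes.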
Figure 2. Dynamic gene expression in peripheral blood following SARS-CoV-2 inoculation in a nonhuman primate model.
(A) Analysis of the 2 predominant trajectory patterns in the nonhuman primate peripheral blood RNA-Seq data from days 0 to 14. The top left figure represents 167 DDEGs following an up-down expression pattern, which peaked at day 2 and then slowly decreased until day 14. The top right figure represents their expression using a traditional Z score–normalized heatmap. The bottom left figure represents 263 DDEGs following a monotonic downregulated trajectory pattern, and their gene expression values were represented in the corresponding heatmap on the right. Gene expression values have been normalized and log2 transformed. (B and C) Top 3 GO enrichment analysis pathways using 167 DDEGs from trajectory pattern “0D-2D Up, 2D-14D Down” and 263 DDEGs from trajectory pattern “0D-14D Down”. The x axis represents the number of genes enriched in GO terms; the y axis represents the enriched GO terms; p.adjust represents adjusted P values using Holm-Bonferroni methods; and P values were generated by Fisher’s exact test. (D) TimeHeatmap of the top 15 dynamic pathways and their dynamic time windows visualizes the temporal patterns. Each column represents a time window. “0D-1D” represents days 0 and 1. The “%GO” column represents the percentage of DDEGs found in the corresponding pathway. The “nDDEG” column represents number of DDEGs found in the corresponding pathway. The number in each grid represents the Avg_log2FC of gene expressions compared with the previous time window. Color represents the Avg_log2FC of the DDEGs within each time window for the corresponding pathway.
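The enrichment statistics named in the caption can be sketched as follows. This is an illustration under stated assumptions, not the code used for the figures: it assumes the one-sided (over-representation) form of Fisher's exact test, which is the upper tail of the hypergeometric distribution, and applies the Holm-Bonferroni step-down adjustment to the resulting P values. Function names and the example counts are hypothetical.

```python
from math import comb


def enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test for over-representation.

    k : DDEGs annotated to the GO term
    n : total DDEGs tested
    K : background genes annotated to the GO term
    N : total background genes
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def holm_adjust(pvals):
    """Holm-Bonferroni adjusted P values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Smallest P value is multiplied by m, the next by m - 1, and so on.
        value = min(1.0, (m - rank) * pvals[idx])
        # Enforce monotonicity of the adjusted values.
        running_max = max(running_max, value)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical counts for three GO terms: (k, n, K, N).
terms = [(12, 167, 150, 15000), (5, 167, 300, 15000), (2, 167, 400, 15000)]
raw = [enrichment_p(*t) for t in terms]
print(holm_adjust(raw))
```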
Figure 3. Cell type–specific dynamic gene expression in peripheral blood mononuclear cells following SARS-CoV-2 infection in patients.
(A) UMAP visualization of scRNA-Seq PBMC data set (14) with annotated cell types from the original study. (B) TimeHeatmap of top dynamic biological pathway from plasma B cells. Each column represents a time window. Stage 0 represents uninfected baseline. The “%GO” column represents the percentage of DDEGs found in the corresponding pathway. The “nDDEG” column represents number of DDEGs found in the corresponding pathway. The number in each grid represents the Avg_log2FC of gene expressions compared with the previous time window. Color represents the Avg_log2FC of the DDEGs within each time window for the corresponding pathway. (C) Top GO enrichment comparison analysis using DDEGs from each cell type. The x axis represents cell types with the number of DDEGs shown in the brackets; the y axis represents the enriched GO terms; p.adjust represents adjusted P values using Holm-Bonferroni methods; and P values were generated by Fisher’s exact test. Dot size represents gene ratio.
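The number shown in each TimeHeatmap cell, the Avg_log2FC relative to the previous time window, can be sketched as below. The captions do not give the exact formula, so this is an assumption: the per-gene log2 ratio of mean expression between consecutive time points, with a pseudocount to avoid division by zero, averaged over the DDEGs of one pathway. The function name, the pseudocount, and the example values are hypothetical.

```python
from math import log2


def window_avg_log2fc(expr, pseudocount=1.0):
    """Average log2 fold change per time window for one pathway.

    expr : dict mapping gene name -> list of mean expression values,
           one per time point, all lists of equal length
    Returns one value per consecutive window (t-1 -> t).
    """
    genes = list(expr)
    if not genes:
        raise ValueError("expr must contain at least one gene")
    n_times = len(expr[genes[0]])
    if any(len(expr[g]) != n_times for g in genes):
        raise ValueError("all genes need the same number of time points")

    result = []
    for t in range(1, n_times):
        fold_changes = [
            log2((expr[g][t] + pseudocount) / (expr[g][t - 1] + pseudocount))
            for g in genes
        ]
        result.append(sum(fold_changes) / len(fold_changes))
    return result


# Hypothetical pathway with two DDEGs measured at three stages.
pathway = {"geneA": [1, 3, 7], "geneB": [3, 3, 1]}
print(window_avg_log2fc(pathway))  # [0.5, 0.0]
```

A positive value marks net upregulation of the pathway's DDEGs within that window and a negative value net downregulation.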
Figure 4. Temporal analysis of whole-blood RNA-Seq data in patients grouped according to disease severity.
(A) Venn diagram of DDEGs identified from 3 COVID-19 severity groups, including mild, moderate, and severe. (B) Top GO enrichment from shared DDEGs across 3 groups compared with top GO enrichment from DDEGs only identified in severe group. The x axis represents comparison groups with the number of DDEGs shown in the brackets; the y axis represents the enriched GO terms; p.adjust represents adjusted P values using Holm-Bonferroni methods; and P values were generated by Fisher’s exact test. Dot size represents gene ratio. (C) TimeHeatmap of the top dynamic pathways from the severe group. Each column represents a time window. “0W-1W” represents week 0 (healthy control) to week 1. The “%GO” column represents the percentage of DDEGs found in the corresponding pathway. The “nDDEG” column represents number of DDEGs found in the corresponding pathway. The number in each grid represents the Avg_log2FC of gene expressions compared with the previous time window. Color represents the Avg_log2FC of the DDEGs within each time window for the corresponding pathway. (D–G) LOESS curve fitting of DDEGs identified in the severe COVID-19 group of the neutrophil activation pathway, humoral immune response pathway, blood coagulation pathway, and respiratory burst pathway. Red curves represent the severe group, blue curves represent the moderate group, and green curves represent the mild group. The x axis represents time in weeks; the y axis represents the Avg_log2FC of gene expressions compared with the baseline.
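The LOESS curves in the last panels can be approximated with a short local regression sketch. This is a generic, simplified LOESS (local linear fits with tricube weights, no robustness iterations), not the routine used to draw the figure; the `frac` span parameter and the example data are assumptions for illustration.

```python
def loess_fit(x, y, frac=0.75):
    """Simplified LOESS: weighted local linear regression at each x.

    x, y : paired observations (e.g. week, Avg_log2FC vs. baseline)
    frac : fraction of points used to set each local bandwidth
    Returns the smoothed value at every input x.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired observations")
    k = max(2, min(n, int(round(frac * n))))

    fitted = []
    for x0 in x:
        # Bandwidth: distance to the k-th nearest neighbour of x0.
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1.0
        # Tricube weights; points at or beyond the bandwidth get weight 0.
        w = [(1.0 - min(1.0, abs(xi - x0) / h) ** 3) ** 3 for xi in x]

        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swy = sum(wi * yi for wi, yi in zip(w, y))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))

        denom = sw * swxx - swx * swx
        if abs(denom) < 1e-12:
            # Degenerate neighbourhood: fall back to the weighted mean.
            fitted.append(swy / sw)
        else:
            slope = (sw * swxy - swx * swy) / denom
            intercept = (swy - slope * swx) / sw
            fitted.append(intercept + slope * x0)
    return fitted


# Hypothetical Avg_log2FC values for one pathway across weeks 0 to 5.
weeks = [0, 1, 2, 3, 4, 5]
values = [0.0, 1.4, 1.9, 1.6, 1.8, 1.5]
print(loess_fit(weeks, values))
```

Fitting one curve per severity group on the same axes gives the kind of comparison the panels show.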
Figure 5. Temporal analysis of scRNA-Seq data of PBMCs from patients with either moderate or severe COVID-19.
(A) Dot plot showing GO enrichment comparison between severe COVID-19 and moderate COVID-19 for each cell type. Each panel represents 1 cell type. The x axis represents severity group; the y axis represents the enriched GO terms; p.adjust represents adjusted P values using Holm-Bonferroni methods; and P values were generated by Fisher’s exact test. Dot size represents gene count. (B) LOESS curve fitting on DDEGs identified from the IFN-I pathway using TrendCatcher from moderate COVID-19 and severe COVID-19. Blue indicates moderate group, and red indicates severe group. The x axis represents time in weeks; the y axis represents the Avg_log2FC of gene expressions compared with the baseline. (C–F) TimeHeatmaps of NK cells and CD8+ T cells from moderate and severe COVID-19. Each column represents a time window. “0W-1W” represents week 0 (healthy control) to week 1. The “%GO” column represents the percentage of DDEGs found in the corresponding pathway. The “nDDEG” column represents number of DDEGs found in the corresponding pathway. The number in each grid represents the Avg_log2FC of gene expressions compared with the previous time window. Color represents the Avg_log2FC of the DDEGs within each time window for the corresponding pathway compared with the previous time window.
Figure 6. Temporal analysis of PBMC scRNA-Seq data from human subjects receiving the SARS-CoV-2 mRNA vaccine.
(A) UMAP of the single-cell transcriptional profile of 1 patient on day 0. Cell types were autoannotated by SingleR. (B) Dot plot of comparison of the top GO terms enriched from cell type–specific DDEGs. The x axis represents cell type with the number of DDEGs shown in the brackets; the y axis represents the enriched GO terms; p.adjust represents adjusted P values using Holm-Bonferroni methods; and P values were generated by Fisher’s exact test. Dot size represents gene ratio. (C) TimeHeatmap of NK cells. Each column represents a time window. “0D-1D” represents day 0 (healthy control) to day 1. The “%GO” column represents the percentage of DDEGs found in the corresponding pathway. The “nDDEG” column represents number of DDEGs found in the corresponding pathway. The number in each grid represents the Avg_log2FC of gene expressions compared with the previous time window. Color represents the Avg_log2FC of the DDEGs within each time window for the corresponding pathway. The first dose was administered on day 1, and the second dose was administered on day 21.

