Step-by-Step Construction of Gene Co-expression Networks from High-Throughput Arabidopsis RNA Sequencing Data

Methods Mol Biol. 2018;1761:275-301. doi: 10.1007/978-1-4939-7747-5_21.

Abstract

The rapid increase in the availability of transcriptomics data generated by RNA sequencing represents both a challenge and an opportunity for biologists without bioinformatics training. The challenge is handling, integrating, and interpreting these data sets. The opportunity is to use this information to generate testable hypotheses about the molecular mechanisms controlling gene expression and biological processes (Fig. 1). A successful strategy for generating tractable hypotheses from transcriptomics data has been to build undirected network graphs based on patterns of gene co-expression. Many examples of new hypotheses derived from network analyses can be found in the literature, spanning different organisms, including plants, and specific fields such as root developmental biology.

To make the process of constructing a gene co-expression network more accessible to biologists, here we provide step-by-step instructions using published RNA-seq experimental data obtained from a public database. Similar strategies have been used in previous studies to advance root developmental biology. This guide includes basic instructions for the operation of widely used open-source platforms such as Bio-Linux, R, and Cytoscape. Even though the data used in this example were obtained from Arabidopsis thaliana, the workflow developed in this guide can be easily adapted to work with RNA-seq data from any organism.
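To make the idea of an undirected co-expression graph concrete, the following is a minimal Python sketch, not the chapter's own procedure (which relies on Bio-Linux, R, and Cytoscape). It assumes that normalized expression values are already available per gene across samples, uses Pearson correlation as the similarity measure, and keeps a gene pair as an edge when the absolute correlation reaches a cutoff. The gene names, the expression values, and the 0.9 threshold are made up for illustration only.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.9):
    """Return undirected edges (gene_a, gene_b, r) with |r| >= threshold.

    `expression` maps a gene name to its expression values across samples.
    The threshold is an illustrative assumption, not a recommended value.
    """
    edges = []
    # combinations() visits each unordered pair once, so the graph is undirected.
    for gene_a, gene_b in combinations(sorted(expression), 2):
        r = pearson(expression[gene_a], expression[gene_b])
        if abs(r) >= threshold:
            edges.append((gene_a, gene_b, round(r, 3)))
    return edges


# Hypothetical normalized expression values across five samples.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.1, 9.8],
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],
}

edges = coexpression_edges(toy_expression, threshold=0.9)

# Print a tab-separated edge list: source, target, correlation.
for gene_a, gene_b, r in edges:
    print(f"{gene_a}\t{gene_b}\t{r}")
```

A tab-separated edge list of this kind is one common way to hand a network to a graph viewer such as Cytoscape. In practice, the choice of correlation measure and cutoff depends on the data set and should be justified for each analysis.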

Keywords: Bio-Linux; Bioinformatics; Correlation; Cytoscape; DESeq2; Differential gene expression; FastQC; Gene co-expression network; HISAT2; Network generation; RNA-seq; Trimmomatic.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arabidopsis / genetics*
  • Computational Biology / methods
  • Databases, Nucleic Acid
  • Gene Expression Profiling* / methods
  • Gene Regulatory Networks*
  • High-Throughput Nucleotide Sequencing*
  • Sequence Analysis, RNA*
  • Software
  • Systems Biology / methods
  • Transcriptome*