Plant Cell Physiol. 2016 Jan;57(1):e5.
doi: 10.1093/pcp/pcv165. Epub 2015 Nov 6.

ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression

Free PMC article

Yuichi Aoki et al. Plant Cell Physiol.

Abstract

ATTED-II (http://atted.jp) is a coexpression database for plant species with parallel views of multiple coexpression data sets and network analysis tools. The user can efficiently find functional gene relationships and design experiments to identify gene functions by reverse genetics and general molecular biology techniques. Here, we report updates to ATTED-II (version 8.0), including new and updated coexpression data and analysis tools. ATTED-II now includes eight microarray- and six RNA sequencing-based coexpression data sets for seven dicot species (Arabidopsis, field mustard, soybean, barrel medick, poplar, tomato and grape) and two monocot species (rice and maize). Stand-alone coexpression analyses tend to have low reliability. Therefore, examining evolutionarily conserved coexpression is a more effective approach from the viewpoints of reliability and evolutionary importance. In contrast, the reliability of species-specific coexpression data remains poor. Our assessment scores for individual coexpression data sets indicated that the quality of the new coexpression data sets in ATTED-II is higher than for any previous coexpression data set. In addition, five species (Arabidopsis, soybean, tomato, rice and maize) in ATTED-II are now supported by both microarray- and RNA sequencing-based coexpression data, which has increased the reliability. Consequently, ATTED-II can now provide lineage-specific coexpression information. As an example of the use of ATTED-II to explore lineage-specific coexpression, we demonstrate monocot- and dicot-specific coexpression of cell wall genes. With the expanded coexpression data for multilevel evaluation, ATTED-II provides new opportunities to investigate lineage-specific evolution in plants.

Keywords: Arabidopsis; Comparative transcriptomics; Database; Evolution; Gene coexpression; Gene network.

Figures

Fig. 1
Hierarchical clustering of coexpression data. Data sets were hierarchically clustered by the complete linkage method. The pairwise similarities among all coexpression data sets are shown in Supplementary Fig. S1. Because COXSIM values are not exactly symmetric, the median COXSIM for a pair of data sets depends on which data set is the target and which is the reference. Therefore, the average of the two median COXSIM values (one data set as target and the other as reference, and vice versa) was used as the symmetric similarity between data sets, and 1 – similarity was used as the distance between data sets. The coexpression data set version is shown in parentheses under the data set ID. ‘CodonS’ is the codon score from Table 1. ‘Sample’ indicates the number of samples in the data set.
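The distance construction in the Fig. 1 caption can be sketched in a few lines of Python. This is a minimal illustration, not the ATTED-II implementation: the function names, the toy labels and the toy median COXSIM values are all hypothetical, and the clustering is a naive complete-linkage routine written with the standard library only.

```python
from itertools import combinations


def symmetric_distance(median_coxsim):
    """Average the two directed medians for each pair, then take 1 - similarity.

    median_coxsim[i][j] is assumed to hold the median COXSIM with data set i
    as the target and data set j as the reference (hypothetical layout).
    """
    n = len(median_coxsim)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        sim = (median_coxsim[i][j] + median_coxsim[j][i]) / 2.0
        dist[i][j] = dist[j][i] = 1.0 - sim
    return dist


def complete_linkage(dist, labels):
    """Naive agglomerative clustering; returns merges as (left, right, height)."""
    clusters = [(i,) for i in range(len(labels))]
    merges = []

    def cluster_dist(pair):
        # Complete linkage: distance between clusters is the largest
        # pairwise distance between their members.
        a, b = pair
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=cluster_dist)
        merges.append((
            tuple(labels[i] for i in a),
            tuple(labels[i] for i in b),
            cluster_dist((a, b)),
        ))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Toy, made-up directed medians for three placeholder data sets.
toy_labels = ["A", "B", "C"]
toy_medians = [
    [1.00, 0.75, 0.25],
    [0.50, 1.00, 0.25],
    [0.50, 0.25, 1.00],
]
toy_merges = complete_linkage(symmetric_distance(toy_medians), toy_labels)
```

In this toy input, A and B are joined first because their averaged similarity is highest; C joins at the larger of its distances to A and B, which is what complete linkage prescribes.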
Fig. 2
Number of guide genes for each supportability level. Supportability levels are represented as stars, where no star is the lowest and three stars is the highest. The numbers in the color-coded bars indicate the percentage of genes in each supportability level in each data set. Genes without any reference genes in the other data sets are shown as blank boxes.
Fig. 3
Example of lineage-specific coexpression. The genes encoding the four proteins cellulose synthase 6, glucoside hydrolase 3, glucoside hydrolase 9 and the protein of unknown function are represented as a coexpression network (also highlighted in Supplementary Fig. S3). Stronger coexpression (MR <500) is represented in bold. Blue and red edges represent monocot- and dicot-specific coexpression, respectively. All of the coexpression data for the eight data sets, with orthologous information, can be downloaded from ATTED-II (http://atted.jp/top_download.shtml).
Fig. 4
Update of NetworkDrawer with subnetwork analysis functions. The NetworkDrawer output shows a coexpression network for Arabidopsis (Ath-m) built from a set of three query genes (white nodes; CS6/At5g64740, GH3/At5g20950 and GH9/At5g49720) and automatically retrieved coexpressed genes (gray nodes). Orange lines indicate highly reliable coexpression supported by data from the other data sets. Red dotted lines indicate protein–protein interaction. After construction of the coexpression network, subnetworks are automatically detected. For each subnetwork, enrichment tests are then conducted for GO annotations and heptamer cis-elements in the proximal promoter region [–300, –1] under Bonferroni correction. Subnetworks with at least one significantly enriched factor are listed in the operation panel on the right. Genes in a subnetwork of interest are highlighted by yellow balloon marks, and the subnetwork can be selected manually in the same panel.
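The enrichment step in the Fig. 4 caption can be illustrated with a short sketch. The caption states only that enrichment tests are run under Bonferroni correction; the one-sided hypergeometric test used here, the significance threshold of 0.05, the function names and the gene identifiers are assumptions made for illustration and may differ from what NetworkDrawer actually does. The same routine would apply to cis-elements by treating "genes whose proximal promoter contains a given heptamer" as one annotation set.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K carry the annotation."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def bonferroni_enrichment(subnetwork, background, annotations, alpha=0.05):
    """Return {term: adjusted p-value} for terms that pass Bonferroni correction.

    annotations maps a term (e.g. a GO term or a heptamer) to the genes
    carrying it. All names and the alpha default are hypothetical.
    """
    background = set(background)
    subnetwork = set(subnetwork) & background
    n_tests = len(annotations)  # Bonferroni: multiply by the number of tests
    hits = {}
    for term, genes in annotations.items():
        genes = set(genes) & background
        p = hypergeom_sf(len(genes & subnetwork), len(background),
                         len(genes), len(subnetwork))
        p_adj = min(1.0, p * n_tests)
        if p_adj <= alpha:
            hits[term] = p_adj
    return hits


# Toy example with made-up gene IDs: 20 background genes, 4 in the subnetwork.
toy_background = [f"g{i}" for i in range(20)]
toy_subnetwork = ["g0", "g1", "g2", "g3"]
toy_annotations = {
    "term_1": ["g0", "g1", "g2", "g3"],                   # fully inside the subnetwork
    "term_2": ["g0"] + [f"g{i}" for i in range(10, 19)],  # mostly outside
}
toy_hits = bonferroni_enrichment(toy_subnetwork, toy_background, toy_annotations)
```

Multiplying each raw p-value by the number of tested terms is the plain Bonferroni adjustment; it is conservative, which suits a setting where many GO terms and heptamers are tested per subnetwork.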
