OrthoClust: an orthology-based network framework for clustering data across multiple species

Genome Biol. 2014 Aug 28;15(8):R100. doi: 10.1186/gb-2014-15-8-r100.

Abstract

Increasingly, high-dimensional genomics data are becoming available for many organisms. Here, we develop OrthoClust for simultaneously clustering data across multiple species. OrthoClust is a computational framework that integrates the co-association networks of individual species by utilizing the orthology relationships of genes between species. It outputs optimized modules that are fundamentally cross-species and can be either conserved or species-specific. We demonstrate the application of OrthoClust using the RNA-Seq expression profiles of Caenorhabditis elegans and Drosophila melanogaster from the modENCODE consortium. A potential application of cross-species modules is to infer putative analogous functions of uncharacterized elements, such as non-coding RNAs, based on guilt-by-association.
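As a rough illustration of the idea in the abstract, the sketch below builds a co-association network per species from expression profiles, couples the two networks through ortholog pairs, and labels each resulting module as conserved or species-specific. It is not the OrthoClust implementation: the abstract does not specify how modules are optimized, so this sketch swaps in plain connected components as a stand-in. The gene names, expression values, ortholog pair, correlation threshold, and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def coassociation_edges(species, profiles, threshold=0.9):
    """Link genes within one species whose profiles correlate at or above threshold."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        if pearson(profiles[g1], profiles[g2]) >= threshold:
            edges.append(((species, g1), (species, g2)))
    return edges


def modules(nodes, edges):
    """Connected components via union-find (assumed stand-in for module optimization)."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return list(groups.values())


def label(module):
    """In this sketch, a module spanning more than one species is called conserved."""
    species = {sp for sp, _ in module}
    return "conserved" if len(species) > 1 else "species-specific"


# Toy, made-up expression profiles (not modENCODE data).
worm = {"w1": [1, 2, 3, 4], "w2": [2, 4, 6, 8], "w3": [4, 1, 3, 2]}
fly = {"f1": [1, 3, 5, 7], "f2": [2, 6, 10, 14],
       "f3": [5, 5, 1, 9], "f4": [10, 10, 2, 18]}
orthologs = [("w1", "f1")]  # hypothetical worm-fly ortholog pair

nodes = [("worm", g) for g in worm] + [("fly", g) for g in fly]
edges = coassociation_edges("worm", worm) + coassociation_edges("fly", fly)
# Orthology relationships couple the two species-level networks.
edges += [(("worm", w), ("fly", f)) for w, f in orthologs]

result = sorted((label(m), sorted(m)) for m in modules(nodes, edges))
for kind, members in result:
    print(kind, members)
```

In this toy input, the ortholog link joins one worm gene pair and one fly gene pair into a single cross-species module, while the remaining genes form modules confined to one species. A real analysis would replace the connected-components step with an optimization that weighs within-species co-association against cross-species orthology coupling.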

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Base Sequence
  • Caenorhabditis elegans / genetics
  • Cluster Analysis
  • Computational Biology / methods*
  • Conserved Sequence*
  • Databases, Genetic
  • Drosophila melanogaster / genetics
  • Gene Expression Profiling
  • Sequence Analysis, RNA / methods*
  • Species Specificity