Biosphere: the interoperation of web services in microarray cluster analysis

Appl Bioinformatics. 2004;3(4):253-6. doi: 10.2165/00822942-200403040-00007.


The growing use of DNA microarrays in biomedical research has led to the proliferation of analysis tools. These software programs address different aspects of analysis (e.g. normalisation and clustering within and across individual arrays) as well as extended analysis methods (e.g. clustering, annotation and mining of multiple datasets). Therefore, microarray data analysis typically requires the interoperability of multiple software programs involving different analysis types and methods. Such interoperation is often hampered by the heterogeneity inherent in the software tools (which may function by implementing different interfaces and using different programming languages). To address this problem, we employed the simple object access protocol (SOAP)-based web service approach that provides a uniform programmatic interface to these heterogeneous software components. To demonstrate this approach in the microarray context, we created a web server application, Biosphere, which interoperates a number of web services that are geographically widely distributed. These web services include a clustering web service, which is a suite of different clustering algorithms for analysing microarray data; XEMBL, developed at the European Bioinformatics Institute (EBI) for retrieving EMBL Nucleotide Sequence Database sequence data; and three gene annotation web services: GetGO, GetHAPI and GetUMLS. GetGO allows retrieval of Gene Ontology (GO) annotation, and the other two web services retrieve annotation from the biomedical literature that is indexed based on the Medical Subject Headings (MeSH) terms. With these web services, Biosphere allows the users to do the following: (i) cluster gene expression data using seven different algorithms; (ii) visualise the clustering results that are grouped statistically in colour; and (iii) retrieve sequence, annotation and citation data for the genes of interest.
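As an illustration of the uniform programmatic interface described above, the following minimal sketch shows how a client might wrap a call to any of the services in a SOAP 1.1 envelope and read values back from a response. It is not taken from Biosphere: the service namespace, the operation name and the parameter and result element names are hypothetical, and a real client would take them from each service's WSDL description. No network call is made.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (standard).
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical namespace for an example clustering service (an assumption).
SERVICE_NS = "urn:example:clustering"


def build_request(operation, params):
    """Wrap an operation call and its parameters in a SOAP 1.1 envelope."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        # Parameters are sent as plain, unqualified child elements.
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_response(xml_text, result_tag):
    """Return the text of every result_tag element inside the SOAP body."""
    root = ET.fromstring(xml_text)
    body = root.find(f"{{{SOAP_ENV}}}Body")
    return [element.text for element in body.iter(result_tag)]


# Hypothetical operation and parameter names, for illustration only.
request_xml = build_request("clusterGenes", {"algorithm": "kmeans", "k": 3})

# A hand-written response standing in for what a service might return.
sample_response = (
    f'<soapenv:Envelope xmlns:soapenv="{SOAP_ENV}"><soapenv:Body>'
    "<clusterGenesResponse><geneId>g1</geneId><geneId>g2</geneId>"
    "</clusterGenesResponse></soapenv:Body></soapenv:Envelope>"
)
gene_ids = parse_response(sample_response, "geneId")
```

Because every service is addressed through the same envelope structure, the same two helper functions could in principle be reused for a clustering call, a sequence lookup or an annotation lookup, with only the operation and element names changing.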

Availability: Biosphere and its web services described in Web Service Description Language (WSDL) can be accessed at

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Cluster Analysis*
  • Database Management Systems*
  • Databases, Protein*
  • Gene Expression Profiling / methods*
  • Information Storage and Retrieval / methods
  • Internet*
  • Natural Language Processing
  • Oligonucleotide Array Sequence Analysis / methods*
  • Systems Integration
  • User-Computer Interface*