GO-At: in silico prediction of gene function in Arabidopsis thaliana by combining heterogeneous data

Plant J. 2010 Feb;61(4):713-21. doi: 10.1111/j.1365-313X.2009.04097.x. Epub 2009 Nov 27.

Abstract

Despite recent advances, accurate gene function prediction remains an elusive goal, with very few methods directly applicable to the plant Arabidopsis thaliana. In this study, we present GO-At (Gene Ontology prediction in A. thaliana), a method that combines five data types (co-expression, sequence, phylogenetic profile, interaction and gene neighbourhood) to predict gene function in Arabidopsis. Using a simple yet powerful two-step approach, GO-At first generates a list of genes ranked in descending order of probability of functional association with the query gene. Next, a prediction score is automatically assigned to each function in this list, based on the assumption that functions appearing most frequently at the top of the list are most likely to represent the function of the query gene. In this way, the second step provides an effective alternative to simply taking the 'best hit' from the first list, and achieves success rates of up to 79%. GO-At is applicable across all three GO categories (molecular function, biological process and cellular component) and can assign functions at multiple levels of annotation detail. Furthermore, we demonstrate GO-At's ability to predict functions of uncharacterized genes by identifying ten putative golgins/Golgi-associated proteins amongst 8219 genes of previously unknown cellular component, and we present independent evidence to support our predictions. A web-based implementation of GO-At (http://www.bioinformatics.leeds.ac.uk/goat) is available, providing a unique resource for plant researchers to make predictions for uncharacterized genes and predict novel functions in Arabidopsis.
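The two-step approach described in the abstract can be illustrated with a minimal Python sketch. It is not the GO-At implementation: the abstract does not state how the five data types are combined or how the prediction score is computed, so the noisy-OR combination in `rank_genes` and the probability-weighted voting in `score_functions` are assumptions chosen for illustration. The function names, the `top_k` parameter, the gene identifiers and all probabilities are hypothetical toy values.

```python
from collections import defaultdict


def rank_genes(evidence):
    """Step 1 (sketch): rank candidate genes by association with the query gene.

    `evidence` maps gene -> {data_type: probability of functional association}.
    ASSUMPTION: per-data-type probabilities are combined with a noisy-OR;
    the abstract does not say how GO-At combines its five data types.
    """
    combined = {}
    for gene, probs in evidence.items():
        p_no_link = 1.0
        for p in probs.values():
            p_no_link *= 1.0 - p
        combined[gene] = 1.0 - p_no_link
    # Descending order of combined probability.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


def score_functions(ranked, annotations, top_k=10):
    """Step 2 (sketch): score each function seen near the top of the list.

    ASSUMPTION: each of the top_k genes votes for its annotated terms with a
    weight equal to its combined probability; scores are normalised to sum to 1.
    """
    scores = defaultdict(float)
    for gene, prob in ranked[:top_k]:
        for term in annotations.get(gene, ()):
            scores[term] += prob
    total = sum(scores.values())
    if total == 0:
        return []
    return sorted(
        ((term, score / total) for term, score in scores.items()),
        key=lambda item: item[1],
        reverse=True,
    )


# Toy data: hypothetical genes and made-up association probabilities.
toy_evidence = {
    "gene_a": {"co-expression": 0.5, "sequence": 0.5},
    "gene_b": {"interaction": 0.6},
    "gene_c": {"phylogenetic profile": 0.2, "gene neighbourhood": 0.25},
}
# GO:0005634 = nucleus, GO:0005794 = Golgi apparatus (cellular component terms).
toy_annotations = {
    "gene_a": ["GO:0005634"],
    "gene_b": ["GO:0005794"],
    "gene_c": ["GO:0005794"],
}

ranked = rank_genes(toy_evidence)
predictions = score_functions(ranked, toy_annotations, top_k=3)
print("best hit:", ranked[0][0], toy_annotations[ranked[0][0]])
print("top-scoring function:", predictions[0][0])
```

In this toy case the single best hit (`gene_a`) carries a different annotation from the function supported by the other highly ranked genes, which is the situation in which pooling over the top of the list can differ from a 'best hit' transfer.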

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arabidopsis / genetics*
  • Computational Biology / methods*
  • Databases, Protein
  • Gene Expression Profiling / methods
  • Genes, Plant
  • Internet
  • Phylogeny
  • Protein Interaction Mapping / methods
  • User-Computer Interface