Structure- and sequence-based function prediction for non-homologous proteins

J Struct Funct Genomics. 2012 Jun;13(2):111-23. doi: 10.1007/s10969-012-9126-6. Epub 2012 Jan 22.


Structural genomics projects have been accumulating an increasing number of protein structures, many of which remain functionally uncharacterized. In parallel with experimental methods, computational methods are expected to contribute significantly to the functional elucidation of such proteins. However, conventional computational methods that transfer function from homologous proteins are of little help for these uncharacterized structures because they show no apparent structural or sequence similarity to known proteins. Here, we briefly review two avenues of computational function prediction: structure-based methods and sequence-based methods. The focus is on our recent developments of local structure-based and sequence-based methods, which can effectively extract functional information from distantly related proteins. Two structure-based methods, Pocket-Surfer and Patch-Surfer, identify known ligand binding sites similar to pocket regions in a query protein without using global protein fold similarity information. Two sequence-based methods, Protein Function Prediction (PFP) and Extended Similarity Group (ESG), make use of weakly similar sequences that are conventionally discarded in homology-based function annotation. We hope that computational methods, combined with experimental methods, will make a leading contribution to the functional elucidation of these protein structures.
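
The idea of using weakly similar sequences instead of discarding them can be illustrated with a small sketch. This is not the authors' implementation: the function name `score_annotations`, the input format, the weighting `-log10(E) + offset`, and the default `offset=2.0` are all assumptions chosen for illustration. Each hit contributes its annotation terms with a weight that shrinks as the E-value grows, so hits above a conventional significance cutoff still add a small amount of evidence.

```python
import math
from collections import defaultdict


def score_annotations(hits, offset=2.0):
    """Rank candidate function terms from sequence hits, weak ones included.

    hits:   list of (e_value, terms) pairs; `terms` holds the annotation
            labels of the hit sequence (hypothetical input format).
    offset: assumed shift that lets hits with E-values above 1 contribute;
            with offset=2.0, hits up to E-value 100 get a positive weight.
    """
    scores = defaultdict(float)
    for e_value, terms in hits:
        # Clamp to avoid log10(0) when a search tool reports E-value 0.0.
        e_value = max(e_value, 1e-300)
        weight = -math.log10(e_value) + offset
        if weight <= 0:
            continue  # too weak to contribute under this offset
        for term in set(terms):  # count each term once per hit
            scores[term] += weight
    # Highest-scoring terms first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    example_hits = [
        (1e-3, ["kinase"]),                 # conventionally significant hit
        (10.0, ["kinase", "transferase"]),  # weak hit, normally discarded
        (50.0, ["transferase"]),            # weak hit, normally discarded
    ]
    for term, score in score_annotations(example_hits):
        print(f"{term}\t{score:.2f}")
```

In this toy input, the two weak hits would be dropped by a strict E-value cutoff, yet together they are the only support for the second term. How such evidence is actually weighted and propagated in the reviewed methods is described in the full article, not here.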

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Binding Sites
  • Computational Biology / methods
  • Databases, Protein*
  • Internet
  • Molecular Sequence Annotation
  • Protein Conformation
  • Proteins / analysis*
  • Proteins / chemistry
  • Reproducibility of Results
  • Sequence Analysis, Protein / methods*
  • Sequence Homology, Amino Acid
  • Software*
  • Structure-Activity Relationship


Substances

  • Proteins