Prediction of beta-strand packing interactions using the signature product

J Mol Model. 2006 Feb;12(3):355-61. doi: 10.1007/s00894-005-0052-4. Epub 2005 Dec 7.


The prediction of beta-sheet topology requires the consideration of long-range interactions between beta-strands that are not necessarily consecutive in sequence. Since these interactions are difficult to simulate using ab initio methods, we propose a supplementary method able to assign beta-sheet topology using only sequence information. We envision using the results of our method to reduce the three-dimensional search space of ab initio methods. Our method is based on the signature molecular descriptor, which has been used previously to predict protein-protein interactions successfully, and to develop quantitative structure-activity relationships for small organic drugs and peptide inhibitors. Here, we show how the signature descriptor can be used in a Support Vector Machine to predict whether or not two beta-strands will pack adjacently within a protein. We then show how these predictions can be used to order beta-strands within beta-sheets. Using the entire Protein Data Bank (PDB) with ten-fold cross-validation, we have achieved 74.0% accuracy in packing prediction and 75.6% accuracy in the prediction of edge strands. For the case of beta-strand ordering, we are able to predict the correct ordering for 51.3% of the beta-sheets. Furthermore, using a simple confidence metric, we can identify those sheets for which accurate predictions can be obtained. For the top 25% highest confidence predictions, we are able to achieve 95.7% accuracy in beta-strand ordering. [Figure: see text].
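
The step from pairwise packing predictions to a strand ordering can be illustrated with a minimal sketch. The abstract does not specify the ordering procedure or the confidence metric, so everything below is an assumption made for illustration: the function name `order_strands`, the input format (a probability for each strand pair), the exhaustive search over orderings, and the confidence defined as the log-score gap between the best and second-best ordering. It is not the authors' implementation.

```python
from itertools import permutations
from math import log


def order_strands(pair_prob):
    """Pick the most likely linear ordering of the strands in one sheet.

    pair_prob maps frozenset({i, j}) to an estimated probability that
    strands i and j pack adjacently (hypothetical classifier output).
    Returns (best_order, confidence).
    """
    strands = sorted({s for pair in pair_prob for s in pair})
    scored = []
    for perm in permutations(strands):
        # An ordering and its reverse describe the same sheet; keep one.
        if perm[0] > perm[-1]:
            continue
        # Score = sum of log-probabilities over neighbouring strand pairs.
        # Probabilities are clamped so that log(0) is never evaluated.
        score = sum(
            log(max(pair_prob[frozenset((a, b))], 1e-9))
            for a, b in zip(perm, perm[1:])
        )
        scored.append((score, perm))
    scored.sort(reverse=True)
    best_score, best_order = scored[0]
    # Assumed confidence: margin between the best and runner-up orderings.
    if len(scored) > 1:
        confidence = best_score - scored[1][0]
    else:
        confidence = float("inf")
    return best_order, confidence


# Toy example with three strands and made-up probabilities.
toy = {
    frozenset((1, 2)): 0.9,
    frozenset((1, 3)): 0.8,
    frozenset((2, 3)): 0.2,
}
best, conf = order_strands(toy)
print(best, round(conf, 3))
```

In this toy case strand 1 has high packing scores with both neighbours, so it is placed in the middle and strands 2 and 3 become edge strands. Exhaustive search is only practical for small sheets, since the number of orderings grows factorially with the number of strands.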

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Computational Biology
  • Databases, Nucleic Acid
  • Molecular Sequence Data
  • Peptides / chemistry
  • Protein Folding*
  • Protein Structure, Secondary
  • Proteins / chemistry*
  • Proteins / metabolism*


Substances

  • Peptides
  • Proteins