Prediction of loop regions in protein sequence

J Bioinform Comput Biol. 2008 Oct;6(5):1035-47. doi: 10.1142/s0219720008003758.

Abstract

We propose an algorithm that takes a protein sequence as input and decomposes the protein chain into a regular part, comprising secondary structure elements, and a nonregular part corresponding to loop regions. We analyzed loop regions in a dataset of 3,769 globular domains and determined the optimal parameters for this prediction: the threshold separating regular from nonregular regions and the window size for the averaging procedure. Two scales were used: the expected number of contacts in the globular state, and an entropy scale given by the number of degrees of freedom of the angles phi, psi, and chi for each amino acid. Comparison with known methods shows that our method performs comparably to the well-known ALB method, which is based on physical properties of amino acids (the percentage of true predictions is 64% against 66%), and predicts regular and nonregular regions less accurately than PSIPRED (Protein Structure Prediction Server) run without alignment of homologous proteins (the percentage of true predictions is 73%). A potential advantage of the suggested approach is that the predicted set of loops can be used to find patterns of rigid and flexible loops, which are candidates for structural or functional roles and for antigenic determinants.
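The windowed averaging and thresholding described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the handling of chain ends, the rule that a smoothed value below the threshold marks a loop, and the two-letter toy scale are all assumptions made for illustration. The paper's actual scale values, window size, and threshold are not given in the abstract and are not reproduced here.

```python
from typing import Dict, List


def window_average(values: List[float], window: int) -> List[float]:
    """Centered moving average; the window is truncated at the chain ends
    (an assumption, the abstract does not say how ends are treated)."""
    half = window // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def predict_loops(sequence: str, scale: Dict[str, float],
                  window: int, threshold: float) -> str:
    """Label each residue 'L' (nonregular, loop) or 'R' (regular).

    Assumption: a smoothed scale value below the threshold marks a loop,
    as one might expect for a scale of expected contacts.
    """
    values = [scale[aa] for aa in sequence]  # per-residue scale values
    smoothed = window_average(values, window)
    return "".join("L" if v < threshold else "R" for v in smoothed)


# Hypothetical toy scale with made-up numbers, NOT values from the paper.
TOY_SCALE = {"G": 1.0, "A": 3.0}

print(predict_loops("AAAGGGAAA", TOY_SCALE, window=3, threshold=2.0))
# -> RRRLLLRRR
```

With a real scale, the window size and threshold would be tuned against loop annotations from known structures, which is the role the dataset of 3,769 domains plays in the abstract.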

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Computer Simulation
  • Models, Chemical*
  • Models, Molecular*
  • Protein Conformation
  • Proteins / chemistry*
  • Proteins / ultrastructure*
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*

Substances

  • Proteins