ParCrys: a Parzen window density estimation approach to protein crystallization propensity prediction

Bioinformatics. 2008 Apr 1;24(7):901-7. doi: 10.1093/bioinformatics/btn055. Epub 2008 Feb 19.

Abstract

The ability to rank proteins by their likely success in crystallization is useful in current structural biology efforts, particularly in high-throughput structural genomics initiatives. We present ParCrys, a Parzen window approach to estimating a protein's propensity to produce diffraction-quality crystals. The Protein Data Bank (PDB) provided the training data, while the TargetDB and PepcDB databases supplied the feature selection data and a test set independent of both feature selection and training. ParCrys outperforms OB-Score, SECRET and CRYSTALP on the data examined, with accuracy and Matthews correlation coefficient values of 79.1% and 0.582, respectively (74.0% and 0.227, respectively, on data with a 'real-world' ratio of positive:negative examples). ParCrys predictions and associated data are available from www.compbio.dundee.ac.uk/parcrys.
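
For readers unfamiliar with the two techniques named in the abstract, the following is a minimal sketch of a Gaussian-kernel Parzen window density estimate used as a ranking score, together with the Matthews correlation coefficient (MCC) used for evaluation. It is not the ParCrys implementation: the abstract does not state the features, kernel or window width, so the two-dimensional feature vectors, the Gaussian kernel, the width `h` and all function names here are illustrative assumptions.

```python
import math

def parzen_density(x, samples, h):
    """Parzen window density estimate at point x.

    Assumption: an isotropic Gaussian kernel of width h; the abstract
    does not say which kernel or width ParCrys uses.
    """
    d = len(x)
    norm = (2.0 * math.pi * h * h) ** (d / 2.0)
    total = 0.0
    for s in samples:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
        total += math.exp(-sq_dist / (2.0 * h * h))
    return total / (len(samples) * norm)

def rank_by_propensity(candidates, positives, h=1.0):
    """Sort candidate feature vectors by estimated density under the
    positive (crystallized) training examples, highest first."""
    return sorted(candidates,
                  key=lambda x: parzen_density(x, positives, h),
                  reverse=True)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.
    Returns 0.0 when the denominator is zero (a common convention)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical 2-D feature vectors, for illustration only.
positives = [(0.0, 0.0), (0.2, -0.1), (-0.1, 0.1)]
candidates = [(3.0, 3.0), (0.1, 0.0)]
ranked = rank_by_propensity(candidates, positives, h=0.5)
```

Here `ranked` places the candidate lying close to the positive training examples first. Unlike accuracy, the MCC uses all four confusion-matrix counts, so a classifier cannot score well on it simply by favouring the majority class. This is one reason accuracy and MCC can move differently between a balanced test set and one with a 'real-world' ratio of positive:negative examples, as the figures quoted above illustrate.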

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Computer Simulation
  • Crystallization / methods*
  • Models, Chemical*
  • Models, Molecular*
  • Molecular Sequence Data
  • Protein Conformation
  • Proteins / chemistry*
  • Proteins / ultrastructure*
  • Sequence Analysis, Protein / methods*
  • Software

Substances

  • Proteins