Targeting novel folds for structural genomics

Proteins. 2002 Jul 1;48(1):44-52. doi: 10.1002/prot.10129.


The ultimate goal of structural genomics is to obtain the structure of every protein encoded in a genome in order to determine gene function. Because of cost and time limitations, it remains impractical to solve the structure of every gene product experimentally. Up to a point, reasonably accurate three-dimensional structures can be deduced for proteins with homologous sequences by using comparative modeling. Beyond this, fold recognition or threading methods can be used for proteins showing little homology to any protein of known fold, although this is relatively time-consuming and limited by the library of template folds currently available. Therefore, it is appropriate to develop methods that can increase our knowledge base, expanding our fold libraries by earmarking potentially "novel" folds for experimental structure determination. How can we sift through proteomic data rapidly and yet reliably identify novel folds as targets for structural genomics? We have analyzed a number of simple methods that discriminate between "novel" and "known" folds. We propose that simple alignment of secondary structure elements, using predicted secondary structure, could be more selective than both a simple fold recognition method (GenTHREADER) and standard sequence alignment at finding novel folds when sequences show no detectable homology to proteins with known structures.
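
To illustrate the general idea of aligning secondary structure elements, the following is a minimal sketch and not the authors' method. It assumes per-residue predicted secondary structure strings in a three-state alphabet (H = helix, E = strand, C = coil), collapses them into helix and strand elements, and scores a query against a library of known folds by global dynamic-programming alignment. The function names, the scoring scheme, the gap penalty, and the 0.6 threshold are all hypothetical choices made for illustration; the abstract does not specify any of them.

```python
from itertools import groupby
from typing import Dict, List, Tuple

Element = Tuple[str, int]  # (element type "H" or "E", length in residues)


def to_elements(ss: str) -> List[Element]:
    """Collapse a per-residue H/E/C string into helix and strand elements (coil is dropped)."""
    return [(state, len(list(run))) for state, run in groupby(ss) if state in "HE"]


def element_score(a: Element, b: Element) -> float:
    """Hypothetical pair score: -1 for a type mismatch, else the length ratio in (0, 1]."""
    (type_a, len_a), (type_b, len_b) = a, b
    if type_a != type_b:
        return -1.0
    return min(len_a, len_b) / max(len_a, len_b)


def align_score(query: List[Element], template: List[Element], gap: float = -0.5) -> float:
    """Global (Needleman-Wunsch style) alignment score over two element lists."""
    n, m = len(query), len(template)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + element_score(query[i - 1], template[j - 1]),
                dp[i - 1][j] + gap,  # query element left unaligned
                dp[i][j - 1] + gap,  # template element left unaligned
            )
    return dp[n][m]


def normalized_score(query_ss: str, template_ss: str) -> float:
    """Alignment score divided by the larger element count, so 1.0 means identical elements."""
    q, t = to_elements(query_ss), to_elements(template_ss)
    longest = max(len(q), len(t))
    if longest == 0:
        return 0.0
    return align_score(q, t) / longest


def is_candidate_novel(query_ss: str, library: Dict[str, str], threshold: float = 0.6) -> bool:
    """Flag the query as a potential novel fold if no library entry scores at or above the threshold."""
    best = max((normalized_score(query_ss, t) for t in library.values()), default=float("-inf"))
    return best < threshold
```

In this sketch a query would be earmarked as a candidate only when its best normalized score against every known fold falls below the threshold. In practice any such cutoff would need to be calibrated on sequences of known structure to balance sensitivity against specificity.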

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Genomics / methods
  • Protein Folding
  • Protein Structure, Secondary*
  • Protein Structure, Tertiary
  • Proteins / chemistry*
  • Sensitivity and Specificity
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*

Substances

  • Proteins