Identification of homologous core structures

Proteins. 1999 Apr 1;35(1):70-9.


Using a large database of protein structure-structure alignments, we test a new method for distinguishing homologous and "analogous" structural neighbors. The homologous neighbors included in the test set show no detectable sequence similarity, but they may be well superimposed and show functional similarity or other evidence of evolutionary relationship. Analogous neighbors also show no sequence similarity and may be well superimposed, but they have different functions, and their structural similarity may be the result of convergent evolution. Confirming the results of other analyses, we find that remote homologs and analogs are not well distinguished by measures of pairwise structural similarity, including the percentage of identical residues and the root-mean-square (RMS) superposition residual. We show, however, that structure-structure alignments of analogous neighbors rarely superimpose the particular substructure that is shared among homologous neighbors. We call this characteristic substructure the homologous core structure (HCS), and we show that a cross-validated test for the presence of the HCS correctly identifies 75% of remote homologs with a false-positive rate of 16% among analogs, significantly better than discrimination by RMS or other measures of pairwise similarity. The HCS describes conservation of spatial structure within a protein family in much the way that a sequence motif describes sequence conservation. We suggest that it may be used in the same way, to identify homologous neighbors at greater evolutionary distance than is possible by pairwise comparison.
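The idea of testing a neighbor for presence of the HCS can be illustrated with a minimal sketch. This is not the procedure used in the paper, whose details the abstract does not give. It assumes that each structure-structure alignment is reduced to the set of reference-structure residue positions it superimposes, that the core is the set of positions aligned in most known homologs, and that a neighbor "contains" the core when its alignment covers most of it. The function names and the cutoffs `min_fraction` and `threshold` are hypothetical choices made for illustration.

```python
from typing import List, Set


def core_positions(homolog_alignments: List[Set[int]],
                   min_fraction: float = 0.8) -> Set[int]:
    """Reference positions superimposed in at least `min_fraction`
    of the homologous neighbors (assumed stand-in for the HCS)."""
    counts = {}
    for aligned in homolog_alignments:
        for pos in aligned:
            counts[pos] = counts.get(pos, 0) + 1
    n = len(homolog_alignments)
    return {pos for pos, c in counts.items() if c / n >= min_fraction}


def core_coverage(aligned: Set[int], core: Set[int]) -> float:
    """Fraction of core positions superimposed by one alignment."""
    if not core:
        return 0.0
    return len(core & aligned) / len(core)


def has_core(aligned: Set[int], core: Set[int],
             threshold: float = 0.9) -> bool:
    """Hypothetical decision rule: the neighbor is called a putative
    homolog if its alignment covers at least `threshold` of the core."""
    return core_coverage(aligned, core) >= threshold


# Toy data: residue positions of the reference structure that each
# neighbor's alignment superimposes (invented for illustration).
homologs = [set(range(10, 40)), set(range(12, 42)), set(range(8, 38))]
core = core_positions(homologs)

candidate_a = set(range(11, 39))   # overlaps the shared region
candidate_b = set(range(30, 60))   # superimposes mostly elsewhere

print(has_core(candidate_a, core), has_core(candidate_b, core))
```

In a cross-validated setting, the core would be derived without the neighbor being tested, so that a homolog is never scored against a core it helped define. The sketch omits that step, along with any choice of cutoffs from data.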

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Evolution, Molecular
  • Models, Molecular
  • Molecular Sequence Data
  • Protein Conformation*
  • Sequence Homology, Amino Acid