Is highly approximate knowledge of a protein's backbone structure sufficient to successfully identify its family, superfamily, and tertiary fold? To explore this question, backbone dihedral angles were extracted from the known three-dimensional structures of 2,439 proteins and mapped into 36 labeled 60° × 60° bins, called mesostates. Using this coarse-grained mapping, a protein's conformation can be approximated by a linear sequence of mesostates. These linear strings can then be aligned and assessed by conventional sequence-comparison methods. We report that the mesostate sequence is sufficient to recognize a protein's family, superfamily, and fold with good fidelity.
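To make the mapping concrete, the following minimal Python sketch bins (phi, psi) pairs into a 6 × 6 grid of 60° × 60° cells and emits one character per residue. Several details are assumptions for illustration and are not taken from the paper: the bin edges are placed at multiples of 60° starting from -180°, the 36 labels (`A`-`Z`, then `0`-`9`) are hypothetical, and the function names `mesostate` and `mesostate_string` are our own.

```python
import string

BIN_WIDTH = 60                      # degrees per bin along each axis
BINS_PER_AXIS = 360 // BIN_WIDTH    # 6 bins for phi, 6 for psi

# Hypothetical labels: 26 uppercase letters followed by 10 digits (36 total).
LABELS = string.ascii_uppercase + string.digits


def wrap(angle):
    """Wrap an angle in degrees into the half-open interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def bin_index(angle):
    """Return the 0-based bin index of an angle, clamped to the last bin."""
    idx = int((wrap(angle) + 180.0) // BIN_WIDTH)
    return min(idx, BINS_PER_AXIS - 1)  # guard against float rounding at the edge


def mesostate(phi, psi):
    """Map one (phi, psi) pair, in degrees, to a single-character label."""
    col = bin_index(phi)
    row = bin_index(psi)
    return LABELS[row * BINS_PER_AXIS + col]


def mesostate_string(dihedrals):
    """Convert an iterable of (phi, psi) pairs into a linear mesostate string."""
    return "".join(mesostate(phi, psi) for phi, psi in dihedrals)


if __name__ == "__main__":
    # Two illustrative residues: roughly helical, then roughly extended.
    print(mesostate_string([(-60.0, -45.0), (-120.0, 130.0)]))
```

Because the output is an ordinary string over a 36-letter alphabet, it can in principle be passed to the same kinds of alignment and scoring tools used for amino-acid sequences, provided a substitution scheme suited to the chosen labels is supplied.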
Copyright 2005 Wiley-Liss, Inc.