Background: The task of computing highly accurate structural alignments of proteins in very short computation time is still challenging. This is partly due to the complexity of protein structures. Therefore, instead of manipulating coordinates directly, matrices of inter-atomic distances, sets of vectors between protein backbone atoms, and other reduced representations are used. These decrease the effort of comparing large sets of coordinates, but protein structural alignment still remains computationally expensive.
Results: We represent the topology of a protein structure through a structural profile that expresses the global effective connectivity of each residue. We have shown recently that this representation allows explicitly expressing the relationship between protein structure and protein sequence. Based on this very condensed vectorial representation, we develop a structural alignment framework that recognizes structural similarities with accuracy comparable to established alignment tools. Furthermore, our algorithm has favourable scaling of computation time with chain length. Since the algorithm is independent of the details of the structural representation, our framework can be applied to sequence-to-sequence and sequence-to-structure comparison within the same setup, and it is therefore more general than other existing tools.
Conclusion: We show that protein comparison based on a vectorial representation of protein structure performs comparably to established algorithms based on coordinates. The conceptually new approach presented in this publication might assist to unify the view on protein comparison by unifying structure and sequence descriptions in this context. The framework discussed here is implemented in the 'SABERTOOTH' alignment server, freely accessible at http://www.fkp.tu-darmstadt.de/sabertooth/.