Comparative performance of structural aligners in functional domain annotation

J Struct Biol. 2026 Mar 3;218(2):108308. doi: 10.1016/j.jsb.2026.108308. Online ahead of print.

Abstract

Accurate protein domain annotation is essential for inferring protein function, and databases such as Pfam provide sequence-derived signatures for thousands of domain families. Because protein structure is more evolutionarily conserved than sequence, structure-based searches can detect homologous relationships even at low sequence identity (typically below 30%), where pairwise sequence aligners often lose sensitivity. Here, we leverage AlphaFold-derived structures of Pfam domain instances to systematically evaluate structure-based versus sequence-based methods for Pfam annotation. We benchmarked three structural aligners (Reseek, Foldseek, TM-align) against sequence-based methods (MMseqs, HMMER) using both exhaustive all-against-all searches and a split-family design that enables direct comparison of pairwise and profile-based ranking performance. We also evaluated residue-level alignment accuracy using Pfam multiple sequence alignments as reference and investigated whether profile-derived information can improve structural hit ranking. In all-against-all searches, Reseek achieved the highest sensitivity up to the first false positive (AUC = 0.85), outperforming Foldseek (0.81), TM-align (0.76), and MMseqs (0.46). In split-family evaluation, HMMER remained superior (maximum F1 = 0.991), highlighting the continued strength of sequence-profile approaches for family-level annotation. Performance varied substantially across domain families, with average sequence identity emerging as the strongest predictor of success. Structural aligners consistently produced more accurate residue-level mappings than pairwise sequence methods. Finally, incorporating profile-derived information via rescoring improved structural annotation performance for short domains, suggesting a path toward profile-informed structure-based domain annotation.
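
The two ranking metrics named above can be illustrated with a minimal sketch. It assumes each search hit has already been labeled as a true or false positive (for example, same Pfam family versus different family). The function names, the input layout, and the handling of tied scores are hypothetical choices made for illustration, not the authors' implementation. In particular, averaging the per-query value over all queries is one common way to obtain an AUC-like summary, and whether the reported AUC was computed exactly this way is an assumption.

```python
from typing import List, Tuple


def sensitivity_to_first_fp(ranked_is_tp: List[bool], total_tp: int) -> float:
    """Fraction of a query's true homologs ranked above its first false positive.

    ranked_is_tp: hit labels for one query, best-scoring hit first.
    total_tp: number of true homologs of this query in the database.
    """
    if total_tp == 0:
        return 0.0
    found = 0
    for is_tp in ranked_is_tp:
        if not is_tp:
            break  # stop at the first false positive
        found += 1
    return found / total_tp


def max_f1(scored_hits: List[Tuple[float, bool]], total_tp: int) -> float:
    """Best F1 over all cutoffs of a (score, is_true_positive) hit list.

    Higher scores are assumed to be better. Each hit is treated as its own
    cutoff, so tied scores are not grouped (a simplification).
    """
    if total_tp == 0:
        return 0.0
    best = 0.0
    tp = 0
    fp = 0
    for _, is_tp in sorted(scored_hits, key=lambda h: h[0], reverse=True):
        if is_tp:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / total_tp
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


if __name__ == "__main__":
    # One query with four true homologs; the third-ranked hit is a false positive.
    print(sensitivity_to_first_fp([True, True, False, True], total_tp=4))
    # Four scored hits against three true family members.
    print(max_f1([(0.9, True), (0.8, True), (0.7, False), (0.6, True)], total_tp=3))
```

Sensitivity up to the first false positive rewards methods that rank every true homolog ahead of any error, whereas maximum F1 reports the best trade-off between precision and recall at a single score cutoff, which is closer to how a family-level annotation threshold would be used in practice.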

Keywords: AlphaFold; Domain annotation; Pfam; Structural alignment.