The function of a protein is often fulfilled via molecular interactions on its surfaces, so identifying the functional surface(s) of a protein is helpful for understanding its function. Here, we introduce the concept of a split pocket, which is a pocket that is split by a cognate ligand. We use a geometric approach that is site-specific. Specifically, we first compute a set of all pockets in the protein with its ligand(s) and a set of all pockets with the ligand(s) removed and then compare the two sets of pockets to identify the split pocket(s) of the protein. To reduce the search space and expedite the process of surface partitioning, we design probe radii according to the physicochemical textures of molecules. Our method achieves a success rate of 96% on a benchmark test set. We conduct a large-scale computation to identify approximately 19,000 split pockets from 11,328 structures (1.16 million potential pockets); for each pocket, we obtain residue composition, solvent-accessible area, and molecular volume. With this database of split pockets, our method can be used to predict the functional surfaces of unbound structures. Indeed, the functional surface of an unbound protein may often be found from its similarity to remotely related bound forms that belong to distinct folds. Finally, we apply our method to identify glucose-binding proteins, including unbound structures. Our study demonstrates the power of geometric and evolutionary matching for studying protein functional evolution and provides a framework for classifying protein functions by local spatial patterns of functional surfaces.
Copyright 2009 Wiley-Liss, Inc.
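The comparison step described above (pockets computed with the ligand versus pockets computed with the ligand removed) can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: each pocket is represented as the set of protein atom indices lining it, the pocket lists come from some external pocket-detection step, and a pocket counts as "split" when it exists in the ligand-free structure, does not survive unchanged in the ligand-bound structure, and touches at least one ligand-contacting atom. The names `find_split_pockets` and `ligand_contact_atoms` are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set

# Assumption: a pocket is the set of protein atom indices that line it.
Pocket = FrozenSet[int]


def find_split_pockets(
    pockets_with_ligand: Iterable[Pocket],
    pockets_without_ligand: Iterable[Pocket],
    ligand_contact_atoms: Set[int],
) -> List[Pocket]:
    """Return ligand-free pockets that the ligand appears to split.

    Illustrative criterion only (not the published algorithm): a pocket
    qualifies if it is absent from the ligand-bound pocket set and shares
    at least one atom with the ligand-contacting atoms.
    """
    bound = set(pockets_with_ligand)
    split = []
    for pocket in pockets_without_ligand:
        if pocket in bound:
            continue  # pocket is unchanged by the ligand
        if pocket & ligand_contact_atoms:
            split.append(pocket)  # changed and in contact with the ligand
    return split


# Toy data: one large pocket is divided in two when the ligand is present.
without_ligand = [frozenset({1, 2, 3, 4, 5, 6}), frozenset({10, 11, 12})]
with_ligand = [frozenset({1, 2, 3}), frozenset({4, 5, 6}), frozenset({10, 11, 12})]
contacts = {3, 4}

result = find_split_pockets(with_ligand, without_ligand, contacts)
```

In this toy input, the six-atom pocket is reported because it is present only without the ligand and contains the contact atoms, while the three-atom pocket is skipped because it is identical in both sets.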