The three-dimensional fold of a new protein sequence can often be inferred directly from sequence homology to a protein of known structure. The function of a new protein sequence is more difficult to predict, however, since homologues can have different molecular and cellular functions. To develop and automate computational methods for determining molecular function, we have analyzed ligand-binding specificity in two related families of binding proteins. One of these families includes Escherichia coli lactose repressor and ribose-binding protein, and the other includes E. coli sulfate- and phosphate-binding proteins. These proteins have similar folds but varying specificity, binding many different small molecules, including mono- and disaccharides, purines, oxyanions, ferric iron, and polyamines. Starting from template structural alignments, alignments of over 90 sequences per family were generated by iterative database searches with hidden Markov models. Phylogenetic trees were made of full-length sequences and of subsets of residues lining the binding cleft, to determine whether subbranches of the trees correlate with ligand-binding preference. Automated analyses of residues in the binding pocket were also used to predict ligand-binding function for many uncharacterized database sequences and to identify specific side chain-ligand contacts in proteins without solved structures. Our results demonstrate the utility of anchoring functional annotation within a protein family context.
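The idea of comparing only the residues that line the binding cleft can be illustrated with a minimal sketch. It is not the procedure used in this work, which relied on hidden Markov model alignments and phylogenetic trees; a simple nearest-neighbor comparison is substituted here for brevity. The sequences, the pocket column indices, and all function names (`pocket_signature`, `signature_distance`, `predict_ligand`) are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional

# Toy aligned sequences (invented for illustration, not real proteins).
# All rows have the same length; '-' marks an alignment gap.
ALIGNMENT: Dict[str, str] = {
    "known_sugar_binder": "MKDAG-TRSWLE",
    "known_anion_binder": "MKSAGQTHSYLE",
    "query": "MRDAG-TRTWLE",
}

# Hypothetical alignment columns (0-based) assumed to line the binding cleft.
POCKET_COLUMNS: List[int] = [2, 7, 9]

# Ligand labels for the characterized sequences (assumed for this toy case).
KNOWN_LIGANDS: Dict[str, str] = {
    "known_sugar_binder": "ribose",
    "known_anion_binder": "sulfate",
}


def pocket_signature(aligned_seq: str, pocket_columns: List[int]) -> str:
    """Return the residues found at the given alignment columns."""
    return "".join(aligned_seq[i] for i in pocket_columns)


def signature_distance(a: str, b: str) -> float:
    """Fraction of pocket positions at which two signatures differ."""
    if len(a) != len(b):
        raise ValueError("signatures must have equal length")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)


def predict_ligand(
    query: str,
    alignment: Dict[str, str],
    known_ligands: Dict[str, str],
    pocket_columns: List[int],
) -> Optional[str]:
    """Assign the ligand of the characterized sequence whose pocket
    signature is closest to that of the query (nearest neighbor)."""
    query_sig = pocket_signature(alignment[query], pocket_columns)
    best_name: Optional[str] = None
    best_dist: Optional[float] = None
    for name in known_ligands:
        ref_sig = pocket_signature(alignment[name], pocket_columns)
        dist = signature_distance(query_sig, ref_sig)
        if best_dist is None or dist < best_dist:
            best_name, best_dist = name, dist
    return known_ligands[best_name] if best_name is not None else None


if __name__ == "__main__":
    print(predict_ligand("query", ALIGNMENT, KNOWN_LIGANDS, POCKET_COLUMNS))
```

In this toy case the query differs from the sugar binder at two positions overall but shares all three assumed pocket residues with it, so the pocket-only comparison assigns it the same ligand. A tree built on pocket residues, as described above, generalizes this comparison from a single nearest neighbor to whole subbranches.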