The increasing pace at which fully sequenced genomes are becoming available makes it desirable to systematically discover and characterize protein sequences belonging to biologically significant structural classes. One example is the protein phosphatases, which modulate the reversible protein phosphorylation events underlying the whole gamut of cellular biology. Software that can be downloaded to run on a personal computer, or accessed on a server via the Web, is readily available and allows appropriate sequences to be collected and analyzed. The process outlined here has been successfully used to describe the genomic complement of protein phosphatase catalytic subunits in the model plant Arabidopsis thaliana. The methods are general, however, and can be readily adapted to any desired class of protein from any organism of interest.