The ankyrin repeat is one of the most common modular protein-protein interaction motifs in nature. To understand the structural determinants of this family of proteins and to extract the consensus information that defines the architecture of this motif, we designed a series of idealized ankyrin repeat proteins containing one, two, three, or four repeats, using statistical analysis of approximately 4,000 ankyrin repeat sequences from the PFAM database. Biophysical studies and x-ray crystal structures of the three- and four-repeat constructs (3ANK and 4ANK), determined at 1.26 and 1.5 Å resolution, respectively, demonstrate that these proteins are well folded, monomeric, and highly thermostable, and that they adopt a very regular, tightly packed ankyrin repeat fold. Mapping the degree of amino acid conservation at each position onto the 4ANK structure shows that most nonconserved residues cluster on the surface of the molecule that has been designated as the binding site in naturally occurring ankyrin repeat proteins. Thus, the consensus amino acid sequence contains all the information required to define the ankyrin repeat fold. Our results suggest that statistical analysis combined with the consensus sequence approach can be an effective method for designing proteins with complex topologies. These generic ankyrin repeat proteins can serve as prototypes for dissecting the rules of molecular recognition mediated by ankyrin repeats and for engineering proteins with novel biological functions.
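
The consensus approach described above can be illustrated with a minimal sketch, assuming a pre-aligned set of repeat sequences of equal length. The abstract does not specify the statistic used, so this is not the authors' procedure: the function name `consensus_and_conservation`, the use of the most frequent residue as the consensus, the use of its frequency as the conservation score, and the four-residue toy alignment are all hypothetical choices made for illustration. The toy sequences are not real ankyrin repeat data.

```python
from collections import Counter


def consensus_and_conservation(alignment):
    """Return (consensus string, per-column conservation scores in [0, 1]).

    Assumption: `alignment` is a list of equal-length, pre-aligned sequences,
    with '-' marking gaps. Conservation is taken here as the frequency of the
    most common residue in each column, which is one simple choice among many.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    conservation = []
    for column in zip(*alignment):
        # Ignore gap characters when counting residues in this column.
        residues = [r for r in column if r != "-"]
        if not residues:
            consensus.append("-")
            conservation.append(0.0)
            continue
        top_residue, top_count = Counter(residues).most_common(1)[0]
        consensus.append(top_residue)
        conservation.append(top_count / len(residues))
    return "".join(consensus), conservation


# Hypothetical toy alignment (NOT real ankyrin repeat sequences).
toy_alignment = ["GATP", "GKTP", "GATA", "GRTP"]
consensus, scores = consensus_and_conservation(toy_alignment)
print(consensus)  # GATP
print(scores)     # [1.0, 0.5, 1.0, 0.75]
```

In this toy case the second column has the lowest score, which is the kind of variable position that, in the abstract's terms, would be expected to map to the binding surface rather than to the structural core.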