Sporamin, the major soluble protein of the sweet potato tuberous root, is encoded by a multigene family. Forty-nine essentially full-length sporamin cDNAs isolated from a tuberous root cDNA library have been classified by cross-hybridization, restriction endonuclease cleavage pattern, and ribonuclease cleavage mapping. All the cDNAs fall into one of two distinct homology groups, subfamilies A and B, which correspond to the polypeptide classes sporamin A and B, respectively. At least five different sequences are detected among the 22 sporamin A cDNAs and among the 27 sporamin B cDNAs. Comparison of the coding-region nucleotide sequences of three members each of the sporamin A and B subfamilies (four from cDNAs and two from genomic clones) indicates that intra-subfamily homologies (94 to 98%) are much higher than inter-subfamily homologies (82 to 84%), and that deletions or insertions of one or two codons at three locations characterize each subfamily. A large proportion of the base substitutions in the coding region are accompanied by amino acid substitutions. In contrast to the coding region, most of the structural differences among the members in the 5' and 3' noncoding regions are deletions or insertions.