Large-scale cDNA analysis offers several major advantages for genome investigations in rice. Isolated and partially characterized cDNA clones have contributed not only to the construction of an RFLP linkage map and physical maps of the chromosomes but also to investigations of the expression mechanisms of various isozymes and gene families. The ultimate aim of our large-scale cDNA analysis is to catalogue all the expressed genes of this important cereal, including tissue-specific, developmental stage-specific, and stress-specific genes. As of August 1996, the Rice Genome Research Program (RGP) has isolated and partially sequenced more than 29,000 cDNA clones from various tissues and calluses of rice (Nipponbare, a japonica variety). The sequence data were translated into amino acid sequences in the three possible reading frames, and the similarity of these amino acid sequences to known proteins registered in PIR was examined. About 25% of the clones showed significant similarity to known proteins. Some of these clones showed library-specific distributions, indicating that the composition of the clones in each library reflects, to some extent, the regulation of gene expression specific to differentiation, growth conditions, or environmental stress. To further characterize the cDNA clones, including those with no similarity to known proteins, the nucleotide sequence similarities of 24,728 clones were analyzed and the clones were classified into around 10,000 independent groups, suggesting that between one third and one half of the expressed genes in rice have already been captured. These results provide useful information on gene expression and regulation in rice.
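
The three-frame translation step mentioned above can be illustrated with a minimal sketch. It assumes the standard genetic code and a forward-strand read; the function name `translate_three_frames` and the example sequence are hypothetical, and the abstract does not state which software was actually used. The similarity search against PIR is not shown.

```python
# Illustrative sketch only: translate a partial cDNA read in the three
# forward reading frames using the standard genetic code (an assumption).

BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # codons starting with T
    "LLLLPPPPHHQQRRRR"  # codons starting with C
    "IIIMTTTTNNKKSSRR"  # codons starting with A
    "VVVVAAAADDEEGGGG"  # codons starting with G
)
# Map each codon (TTT, TTC, ...) to its one-letter amino acid; '*' is a stop.
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def translate_three_frames(sequence):
    """Return the translations of frames 0, 1 and 2 as a list of strings.

    Codons containing ambiguous bases (e.g. N) are reported as 'X';
    trailing bases that do not fill a codon are ignored.
    """
    seq = sequence.upper().replace("U", "T")
    translations = []
    for offset in range(3):
        residues = []
        for i in range(offset, len(seq) - 2, 3):
            residues.append(CODON_TABLE.get(seq[i:i + 3], "X"))
        translations.append("".join(residues))
    return translations


if __name__ == "__main__":
    # Hypothetical nine-base read, used only to show the call.
    for frame, protein in enumerate(translate_three_frames("ATGGCCTAA")):
        print(frame, protein)
```

Each of the three translated strings would then be compared against a protein database. Translating all three frames is useful for single-pass partial sequences because the correct frame is not known in advance and sequencing errors can shift it.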