Soybean is a complex matrix containing several potentially bioactive components. The objective was to develop a statistical model that predicts the in vitro anticancer potential of soybean varieties by correlating their protein composition and bioactive components, measured after simulated gastrointestinal enzyme digestion, with their effect on mouse leukemia cells. The IC50 values of the hydrolysates of seven soy genotypes (NB1-NB7) on L1210 leukemia cells ranged from 3.5 to 6.2 mg/mL. Depending on genotype, each gram of soy hydrolysate contained 2.7-6.6 µmol of total daidzein, 3.0-4.7 µmol of total genistein, 0.5-1.3 µmol of glycitein, 2.1-2.8 µmol of total saponins, 0.1-0.2 µmol of lunasin, and 0.1-0.6 µmol of Bowman-Birk inhibitor (BBI). The IC50 values calculated from a partial least-squares (PLS) analysis model correlated well with the experimental data (R² = 0.99). Isoflavones and β-conglycinin contributed positively to the cytotoxicity of soy on L1210 leukemia cells. Lunasin and BBI were potent L1210 cell inhibitors (IC50 = 13.9 and 22.5 µM, respectively), but made only modest contributions to the activity of defatted soy flour hydrolysates because of their relatively low concentrations. In conclusion, the data demonstrated that β-conglycinins are among the major protein components that inhibit leukemia cell growth in vitro. Furthermore, it was feasible to differentiate soybean varieties on the basis of the biological effect of their components using a statistical model and a cell-based assay.
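
The statement that lunasin and BBI contribute only modestly because of their low concentrations can be checked with a short unit conversion that uses only the ranges quoted above. Because mg/mL equals g/L, multiplying a content in µmol per gram of hydrolysate by a hydrolysate dose in g/L gives a concentration in µM. The sketch below rests on several assumptions: the function and variable names are illustrative, the reported contents refer to the same hydrolysate mass that was dosed in the cell assay, and pairing the lowest content with the lowest dose (and the highest with the highest) gives only outer bounds, since the abstract does not report per-genotype values.

```python
# Values quoted in the abstract (ranges across genotypes NB1-NB7).
# mg/mL is numerically equal to g/L, so these are also doses in g/L.
HYDROLYSATE_IC50_G_PER_L = (3.5, 6.2)

# name: ((min, max) content in umol per g hydrolysate, IC50 of the pure compound in uM)
COMPONENTS = {
    "lunasin": ((0.1, 0.2), 13.9),
    "BBI": ((0.1, 0.6), 22.5),
}


def concentration_range_um(content_umol_per_g, dose_g_per_l):
    """Outer bounds on a component's concentration (uM) when the
    hydrolysate is dosed at its own IC50: umol/g * g/L = umol/L = uM."""
    low = content_umol_per_g[0] * dose_g_per_l[0]
    high = content_umol_per_g[1] * dose_g_per_l[1]
    return low, high


def summarize():
    """Return {name: (low_uM, high_uM, upper bound as a fraction of own IC50)}."""
    rows = {}
    for name, (content, ic50_um) in COMPONENTS.items():
        low, high = concentration_range_um(content, HYDROLYSATE_IC50_G_PER_L)
        rows[name] = (low, high, high / ic50_um)
    return rows


if __name__ == "__main__":
    for name, (low, high, frac) in summarize().items():
        print(f"{name}: {low:.2f}-{high:.2f} uM, at most {frac:.0%} of its own IC50")
```

Under these assumptions, lunasin reaches at most about 1.2 µM and BBI at most about 3.7 µM when the hydrolysate is dosed at its IC50, well below their individual IC50 values of 13.9 and 22.5 µM. This is a rough consistency check, not a reconstruction of the PLS model.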