Based on a set of six vector properties, the partial correlation diagram is calculated for a set of 28 S-alkylcysteine diazomethyl- and chloromethyl-ketone derivatives. Within each class, the compounds with the greatest antileukemic activity correspond to high partial correlations. A periodic classification is performed based on information entropy. The first four characteristics denote the group, and the last two indicate the period. Compounds in the same period and, especially, in the same group show similar properties. The most active substances are located at the bottom right of the classification. Nine classes are distinguished. The principal component analysis of the homologous compounds shows five subclasses, which are included in the periodic classification. Linear fits of both antileukemic activities and of the stability are good and agree with the principal component analysis. The variables that appear in the models are those that show positive loadings in the principal component analysis. The most important properties to explain the antileukemic activities (minus the logarithm of the 50% inhibitory concentration against Molt-3 T-lineage and against Nalm-6 B-lineage acute lymphoblastic leukemia) and the stability k are ACD logD, surface tension and the number of violations of Lipinski's rule of five. After leave-m-out cross-validation, the most predictive model for cysteine diazomethyl- and chloromethyl-ketone derivatives is provided.
Keywords: information entropy; partial correlation diagram; periodic classification; principal component analysis.
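
As an illustration of the information entropy on which the periodic classification rests, the following minimal Python sketch computes the Shannon entropy of a partition of compounds into classes. It assumes the standard Shannon definition, which the abstract does not spell out; the function name `information_entropy` and the toy class sizes are hypothetical and are not taken from the study.

```python
import math


def information_entropy(class_sizes):
    """Shannon entropy, in bits, of a partition given by its class sizes.

    Assumption: the standard definition H = -sum(p_i * log2(p_i)), where p_i
    is the fraction of compounds that fall in class i.
    """
    total = sum(class_sizes)
    if total <= 0:
        raise ValueError("class_sizes must contain at least one compound")
    entropy = 0.0
    for size in class_sizes:
        if size > 0:  # empty classes contribute nothing
            p = size / total
            entropy -= p * math.log2(p)
    return entropy


# Toy example (hypothetical data): four compounds split into two equal classes.
print(information_entropy([2, 2]))
# A single class carries no information.
print(information_entropy([4]))
```

A partition into more, evenly populated classes yields a higher entropy, whereas placing every compound in one class yields zero.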