Mining the NCI anticancer drug discovery databases: genetic function approximation for the QSAR study of anticancer ellipticine analogues

J Chem Inf Comput Sci. Mar-Apr 1998;38(2):189-99. doi: 10.1021/ci970085w.


The U.S. National Cancer Institute (NCI) conducts a drug discovery program in which approximately 10,000 compounds are screened every year in vitro against a panel of 60 human cancer cell lines from different organs of origin. Since 1990, approximately 63,000 compounds have been tested and their patterns of activity profiled. Recently, we analyzed the antitumor activity patterns of 112 ellipticine analogues using a hierarchical clustering algorithm. The cluster tree showed, qualitatively, a striking coherence between molecular structures and activity patterns. In the present study, we further investigate the quantitative structure-activity relationships (QSAR) of these compounds, in particular with respect to the influence of p53 status and the CNS cell selectivity of the activity patterns. Independent variables (i.e., chemical structural descriptors of the ellipticine analogues) were calculated with the Cerius2 molecular modeling package. Important structural descriptors, including partial atomic charges on the ellipticine ring-forming atoms, were identified by the recently developed genetic function approximation (GFA) method. For our data set, the GFA method gave better correlation and cross-validation results than did classical stepwise linear regression: R² and cross-validated R² (CVR²) were usually approximately 0.3 higher. A procedure for improving the performance of GFA is proposed, and the relative advantages and disadvantages of using GFA for QSAR studies are discussed.
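To make the two statistics quoted in the abstract concrete, the following is a minimal sketch of how R² and a leave-one-out cross-validated R² might be computed for a linear QSAR model, together with a heavily simplified genetic search over descriptor subsets. It is not the authors' GFA implementation: the function names, the fixed subset size `k`, the mutation rate, and the use of cross-validated R² as the fitness score are all assumptions made for illustration, and only linear terms are considered.

```python
import numpy as np

def fit_r2(X, y):
    """Ordinary least squares with an intercept; returns R^2 on the fitted data."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def loo_cv_r2(X, y):
    """Leave-one-out cross-validated R^2, computed as 1 - PRESS / SS_total."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i              # drop compound i, refit, predict it
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += (y[i] - A[i] @ coef) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def genetic_subset_search(X, y, k=3, pop_size=30, generations=200,
                          mutation_rate=0.3, seed=0):
    """Toy steady-state genetic search for a k-descriptor linear model.

    Assumption: fitness is the leave-one-out CV R^2 defined above.
    """
    rng = np.random.default_rng(seed)
    p = X.shape[1]

    def score(subset):
        return loo_cv_r2(X[:, sorted(subset)], y)

    # Random initial population of k-descriptor subsets.
    pop = [frozenset(rng.choice(p, size=k, replace=False).tolist())
           for _ in range(pop_size)]
    scores = [score(s) for s in pop]

    for _ in range(generations):
        # Crossover: draw the child's descriptors from the union of two parents.
        i, j = rng.choice(pop_size, size=2, replace=False)
        pool = sorted(pop[i] | pop[j])
        child = set(rng.choice(pool, size=k, replace=False).tolist())
        # Mutation: swap one descriptor for one not currently in the model.
        if rng.random() < mutation_rate:
            child.remove(int(rng.choice(sorted(child))))
            outside = [d for d in range(p) if d not in child]
            child.add(int(rng.choice(outside)))
        child = frozenset(child)
        # Replace the worst model if the child is new and scores higher.
        worst = min(range(pop_size), key=lambda m: scores[m])
        child_score = score(child)
        if child not in pop and child_score > scores[worst]:
            pop[worst], scores[worst] = child, child_score

    best = max(range(pop_size), key=lambda m: scores[m])
    return sorted(pop[best]), scores[best]

# Synthetic stand-in for a descriptor table (NOT data from the study).
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.normal(size=(60, 12))
y_demo = (2.0 * X_demo[:, 0] - 1.5 * X_demo[:, 4] + 1.0 * X_demo[:, 7]
          + demo_rng.normal(scale=0.3, size=60))
subset, cv_r2 = genetic_subset_search(X_demo, y_demo)
print("selected descriptors:", subset)
print("R^2:", fit_r2(X_demo[:, subset], y_demo), "CV R^2:", cv_r2)
```

For an ordinary least-squares model, the leave-one-out value can never exceed the fitted R², which is why the two are reported side by side when comparing descriptor selection methods.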

MeSH terms

  • Algorithms
  • Antineoplastic Agents / chemistry*
  • Antineoplastic Agents / pharmacology*
  • Antineoplastic Agents, Phytogenic / chemistry
  • Antineoplastic Agents, Phytogenic / pharmacology
  • Cluster Analysis
  • Databases, Factual*
  • Drug Screening Assays, Antitumor
  • Ellipticines / chemistry*
  • Ellipticines / pharmacology*
  • Humans
  • National Institutes of Health (U.S.)
  • Regression Analysis
  • Structure-Activity Relationship
  • Tumor Cells, Cultured
  • United States

Substances

  • Antineoplastic Agents
  • Antineoplastic Agents, Phytogenic
  • Ellipticines