A decoction-inspired genetic algorithm and PPO reinforcement learning for intelligent molecular discovery: Anti-colorectal cancer candidates from the tiao-pi AnChang formula

Comput Biol Med. 2026 May 1:207:111630. doi: 10.1016/j.compbiomed.2026.111630. Epub 2026 Mar 20.

Abstract

Background: Colorectal cancer (CRC) remains among the leading causes of cancer-related mortality worldwide. The TiaoPi AnChang Decoction (TPACD), a traditional Chinese herbal formula, has shown potential anti-CRC activity in preclinical studies (in vitro and in vivo); however, its active chemical basis and key target interactions remain unclear. The complex, multicomponent nature of TPACD poses significant challenges for systematically identifying and optimizing compounds.

Purpose: This study aimed to establish an intelligent molecular discovery framework that integrates a decoction-inspired genetic algorithm (EGA) with proximal policy optimization (PPO)-based reinforcement learning. The framework simulates the dynamic transformation and recombination processes underlying traditional herbal decoctions in order to generate and prioritize candidate molecules in silico for further anti-CRC evaluation.
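For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective that defines the method. It is not the authors' implementation: the function name, the plain-list inputs, and the default clipping value of 0.2 are illustrative assumptions, and the abstract does not state how rewards or advantages were computed.

```python
import math

def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of PPO (to be maximized).

    logp_new / logp_old: log-probabilities of the sampled actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    All names and the default clip_eps are assumptions for illustration.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(lp_new - lp_old)  # probability ratio new / old
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Taking the minimum removes the incentive to move the ratio
        # outside the interval [1 - clip_eps, 1 + clip_eps].
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

In a molecular generation setting, an action would typically be one edit or one token of a molecule, and the advantage would be derived from a reward such as a predicted property score; those choices are assumptions here, not details given in the abstract.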

Study design: A parallel molecular generation strategy was developed, comprising three algorithmic pathways, namely, the basic molecular generation algorithm (BMGA), enhanced molecular generation algorithm (EMGA), and intelligent molecular generation algorithm (IMGA), which represent progressive levels of structural diversity, biological relevance, and optimization capacity.

Methods: The EGA simulated the selection-transformation-recombination principles of decoction through adaptive mutation, fragment recombination, and a diversity-preserving selection procedure. PPO-based reinforcement learning further refined the molecular properties via reward-guided exploration of the chemical space. The BMGA enabled baseline molecular generation; the EMGA incorporated affinity-based selection; and the IMGA achieved multiobjective optimization, integrating drug-likeness, novelty, and toxicity filtering. The candidate molecules were validated by affinity prediction, quantitative estimate of drug-likeness (QED) scoring, and molecular docking.
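The three genetic operators named above can be illustrated with a toy sketch. This is not the published EGA. Molecules are stood in for by tuples of abstract fragment labels, the fitness function is a placeholder for a predicted-affinity score, and every name and parameter (FRAGMENTS, adaptive_rate, min_dist, the linear decay schedule) is a hypothetical choice made for illustration.

```python
import random

# Hypothetical fragment vocabulary; a real system would use chemical substructures.
FRAGMENTS = ["A", "B", "C", "D", "E", "F"]

def fitness(mol):
    """Placeholder score (number of distinct fragments), standing in for affinity."""
    return len(set(mol))

def adaptive_rate(generation, n_generations, high=0.5, low=0.05):
    """Adaptive mutation: decay the rate linearly from `high` to `low`."""
    frac = generation / max(1, n_generations - 1)
    return high + (low - high) * frac

def mutate(mol, rate, rng):
    """Replace each fragment by a random one with probability `rate`."""
    return tuple(rng.choice(FRAGMENTS) if rng.random() < rate else f for f in mol)

def recombine(a, b, rng):
    """Fragment recombination as single-point crossover of two parents."""
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:]

def distance(a, b):
    """Hamming distance between two equal-length fragment tuples."""
    return sum(x != y for x, y in zip(a, b))

def select_diverse(pool, k, min_dist):
    """Greedily keep the best candidates that differ enough from those already kept."""
    kept = []
    # Inner sort gives a deterministic order among candidates with equal fitness.
    for mol in sorted(sorted(set(pool)), key=fitness, reverse=True):
        if all(distance(mol, other) >= min_dist for other in kept):
            kept.append(mol)
        if len(kept) == k:
            break
    return kept

def evolve(pop_size=10, length=6, n_generations=20, min_dist=2, seed=0):
    rng = random.Random(seed)

    def new_mol():
        return tuple(rng.choice(FRAGMENTS) for _ in range(length))

    population = [new_mol() for _ in range(pop_size)]
    for g in range(n_generations):
        rate = adaptive_rate(g, n_generations)
        children = []
        for _ in range(2 * pop_size):
            a, b = rng.sample(population, 2)
            children.append(mutate(recombine(a, b, rng), rate, rng))
        population = select_diverse(population + children, pop_size, min_dist)
        # If the diversity filter left too few survivors, top up with random molecules.
        while len(population) < pop_size:
            population.append(new_mol())
    return population
```

Calling `evolve()` returns a list of `pop_size` fragment tuples. Swapping `fitness` for an affinity predictor, or for a weighted combination of several property scores, would correspond loosely to the EMGA and IMGA levels described above, though the abstract does not specify how those scores were combined.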

Results: All three methods generated chemically valid and pharmacologically enhanced molecules relative to the original TPACD components. The EMGA provided improved biological affinity and exploration efficiency, whereas the IMGA achieved the highest overall performance, with superior QED, docking stability, and predicted CRC inhibition potential.

Conclusion: The proposed EGA-PPO framework effectively connects traditional decoction principles with modern AI-based drug design. The tri-algorithmic system (BMGA-EMGA-IMGA) provides a scalable and interpretable strategy for de novo molecular discovery, offering promising leads for natural product-derived CRC therapeutics.

Keywords: Colorectal cancer; Genetic algorithm; Molecular generation; Proximal policy optimization; TiaoPi AnChang decoction.

MeSH terms

  • Algorithms*
  • Antineoplastic Agents* / chemistry
  • Colorectal Neoplasms* / drug therapy
  • Colorectal Neoplasms* / genetics
  • Drug Discovery* / methods
  • Drugs, Chinese Herbal* / chemistry
  • Drugs, Chinese Herbal* / pharmacology
  • Genetic Algorithms
  • Humans
  • Machine Learning*

Substances

  • Drugs, Chinese Herbal
  • Antineoplastic Agents