Prediction of drug absorption using multivariate statistics

J Med Chem. 2000 Oct 19;43(21):3867-77. doi: 10.1021/jm000292e.


Literature data on compounds that are well absorbed or poorly absorbed in humans were used to build a statistical pattern recognition model of passive intestinal absorption. Robust outlier detection was used to analyze the well-absorbed compounds, some of which were intermingled with the poorly absorbed compounds in the model space. Outliers were identified as being actively transported. The descriptors chosen for the model were polar surface area (PSA) and AlogP98 (a calculated log P), based on consideration of the physical processes involved in membrane permeability and the interrelationships and redundancies between available descriptors. These descriptors are straightforward for a medicinal chemist to interpret, which enhances the utility of the model. Molecular weight, while often used in passive absorption models, was shown to be superfluous, as it is already a component of both PSA and AlogP98. Extensive validation of the model on hundreds of known orally delivered drugs, "drug-like" molecules, and Pharmacopeia, Inc. compounds, which had been assayed for Caco-2 cell permeability, demonstrated a good rate of successful predictions (74-92%, depending on the dataset and exact criterion used).
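To make the idea of a two-descriptor pattern recognition model concrete, the sketch below fits a confidence region to well-absorbed compounds in (PSA, AlogP98) space and classifies a new compound by its squared Mahalanobis distance from the centroid. This is an illustration under stated assumptions, not the model described in the abstract: the training values are invented placeholders, the 95% cutoff is an arbitrary choice, the function names are hypothetical, and the robust outlier-detection step is not implemented.

```python
import math

# Hypothetical (PSA in square angstroms, AlogP98) pairs for well-absorbed
# compounds. These numbers are invented for illustration only.
TRAINING = [
    (40.0, 2.0), (60.0, 3.0), (80.0, 1.0), (50.0, 4.0),
    (90.0, 2.5), (70.0, 0.5), (30.0, 3.5), (100.0, 1.5),
]

def fit_region(points):
    """Return the centroid and the inverse 2x2 sample covariance matrix."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inv

def mahalanobis_sq(point, mean, inv):
    """Squared Mahalanobis distance of a 2-D point from the centroid."""
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    return (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))

def chi2_quantile_2df(level):
    """Chi-square quantile for 2 degrees of freedom (closed form)."""
    return -2.0 * math.log(1.0 - level)

def predicted_well_absorbed(point, mean, inv, level=0.95):
    """True if the point lies inside the confidence ellipse at `level`."""
    return mahalanobis_sq(point, mean, inv) <= chi2_quantile_2df(level)

mean, inv = fit_region(TRAINING)
print(predicted_well_absorbed((65.0, 2.25), mean, inv))   # near the centroid
print(predicted_well_absorbed((200.0, -2.0), mean, inv))  # very polar, hydrophilic
```

Using two degrees of freedom keeps the cutoff in closed form, which is why no statistics library is needed. A robust variant would estimate the centroid and covariance so that a few atypical compounds do not distort the region, and the cutoff would be tuned against measured absorption data.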

MeSH terms

  • Biological Transport
  • Caco-2 Cells
  • Cell Membrane Permeability
  • Humans
  • Intestinal Absorption*
  • Models, Biological
  • Multivariate Analysis
  • Pharmaceutical Preparations / metabolism*
  • Reproducibility of Results


Substances

  • Pharmaceutical Preparations