QSARpy: A new flexible algorithm to generate QSAR models based on dissimilarities. The log Kow case study

Sci Total Environ. 2018 Oct 1;637-638:1158-1165. doi: 10.1016/j.scitotenv.2018.05.072. Epub 2018 May 14.


Several methods exist to develop QSAR models automatically. Some are based on indices of the presence of atoms, others on the most similar compounds, and others on molecular descriptors. Here we introduce QSARpy v1.0, a new QSAR modeling tool based on a different approach: dissimilarity. The tool fragments the molecules of the training set to extract fragments, called modulators, that can be associated with a difference in the property/activity value. If the target molecule shares part of its structure with a molecule of the training set, and the differences between them can be explained by one or more modulators, the property/activity value of the training-set molecule is adjusted using the value associated with the modulator(s). The tool is tested here on the n-octanol/water partition coefficient (Kow, usually expressed in logarithmic units as log Kow), a key parameter in risk assessment because it is a measure of hydrophobicity. Because log Kow is so widely used, estimation methods for it are very useful for reducing testing costs. Using QSARpy v1.0, we obtained a new model that predicts log Kow with good accuracy (RMSE 0.43 and R² 0.94 for the external test set), comparing favorably with other programs. QSARpy is freely available on request.
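The modulator idea described above can be illustrated with a minimal Python sketch. It is not the QSARpy implementation: molecules are reduced to hypothetical multisets of fragment labels (`core`, `X`, `Y`) instead of real chemical structures, the property values are invented for illustration, and the function names `extract_modulators` and `predict` are assumptions of this sketch. A modulator is taken here to be a single fragment that alone separates two training molecules, with its value set to the mean property difference over all such pairs.

```python
from collections import Counter
from itertools import permutations
from statistics import mean

# Toy training set (assumption): fragment multisets with invented property values.
TRAINING = {
    "mol_a": (Counter({"core": 1}), 2.0),
    "mol_b": (Counter({"core": 1, "X": 1}), 2.5),
    "mol_c": (Counter({"core": 2}), 3.0),
    "mol_d": (Counter({"core": 2, "X": 1}), 3.7),
    "mol_e": (Counter({"core": 1, "Y": 1}), 1.2),
}


def extract_modulators(training):
    """Map each single-fragment difference to its mean property change."""
    deltas = {}
    for (frags_1, value_1), (frags_2, value_2) in permutations(training.values(), 2):
        extra = frags_2 - frags_1    # fragments present only in molecule 2
        missing = frags_1 - frags_2  # fragments present only in molecule 1
        # Keep pairs where molecule 2 is molecule 1 plus exactly one fragment.
        if not missing and sum(extra.values()) == 1:
            fragment = next(iter(extra))
            deltas.setdefault(fragment, []).append(value_2 - value_1)
    return {fragment: mean(values) for fragment, values in deltas.items()}


def predict(target, training, modulators):
    """Adjust a training molecule's value by the modulators explaining the difference.

    Returns (training molecule name, predicted value), or None when no
    training molecule differs from the target only by known modulators.
    """
    best = None
    for name, (frags, value) in training.items():
        if not (frags & target):
            continue  # no shared structure at all
        added = list((target - frags).elements())
        removed = list((frags - target).elements())
        if any(f not in modulators for f in added + removed):
            continue  # difference cannot be explained by known modulators
        adjusted = (value
                    + sum(modulators[f] for f in added)
                    - sum(modulators[f] for f in removed))
        n_changes = len(added) + len(removed)
        # Prefer the training molecule needing the fewest modulators;
        # ties are broken arbitrarily by iteration order (a simplification).
        if best is None or n_changes < best[0]:
            best = (n_changes, name, adjusted)
    return None if best is None else (best[1], best[2])


modulators = extract_modulators(TRAINING)
target = Counter({"core": 1, "X": 1, "Y": 1})  # not in the training set
print(modulators)
print(predict(target, TRAINING, modulators))
```

In this toy run the target is predicted from `mol_b`, which differs from it only by the `Y` fragment, so the training value is shifted by the `Y` modulator. How the actual tool fragments structures, weights modulators, and chooses among candidate training molecules is not specified in the abstract and is not modeled here.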

Keywords: Hydrophobicity; Log Kow; QSAR; QSPR; Substructure-based models.

MeSH terms

  • Algorithms*
  • Hydrophobic and Hydrophilic Interactions
  • Models, Chemical*
  • Quantitative Structure-Activity Relationship*
  • Water

