The signature molecular descriptor. 1. Using extended valence sequences in QSAR and QSPR studies

J Chem Inf Comput Sci. 2003 May-Jun;43(3):707-20. doi: 10.1021/ci020345w.

Abstract

We present a new descriptor, named signature, based on the extended valence sequence. The signature of an atom is a canonical representation of the atom's environment up to a predefined height h. The signature of a molecule is a vector of occurrence numbers of atomic signatures. Two QSAR and QSPR models based on signature are compared with models obtained using popular 2D molecular descriptors taken from commercially available software (Molconn-Z). One data set contains the 50% inhibitory concentrations (IC50) for 121 HIV-1 protease inhibitors, while the second contains 12865 octanol/water partition coefficients (Log P). For both data sets, the models created with signature performed comparably to those built from the commercially available descriptors, both in correlating the data and in predicting test set values not used in the parametrization. While probing signature's QSAR and QSPR performance, we demonstrate that for any given molecule of diameter D, there is a molecular signature of height h ≤ D+1 from which any 2D descriptor can be computed. As a consequence of this finding, any QSAR or QSPR involving 2D descriptors can be replaced with a relationship involving occurrence numbers of atomic signatures.
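
The two definitions in the abstract (an atomic signature as a canonical description of an atom's environment up to height h, and a molecular signature as a vector of occurrence counts) can be illustrated with a small sketch. This is not the authors' algorithm: it is a simplified stand-in that assumes a molecule is given as a list of element symbols plus an adjacency list, ignores bond orders and hydrogens, and canonicalizes by sorting child strings. The function names `atom_signature` and `molecular_signature` are hypothetical.

```python
from collections import Counter

def atom_signature(atoms, adj, root, h):
    """Simplified canonical string for the environment of `root` up to height h.

    Assumption: the environment is unfolded as a tree in which a branch never
    steps straight back to its parent; the published descriptor may treat
    rings and bond types differently.
    """
    def expand(node, parent, depth):
        label = atoms[node]
        if depth == 0:
            return label
        # Sorting the child strings makes the result independent of atom order.
        children = sorted(
            expand(nbr, node, depth - 1) for nbr in adj[node] if nbr != parent
        )
        return label + "".join("(" + child + ")" for child in children)

    return expand(root, None, h)

def molecular_signature(atoms, adj, h):
    """Occurrence counts of the height-h atomic signatures in the molecule."""
    return Counter(atom_signature(atoms, adj, i, h) for i in range(len(atoms)))

# Heavy-atom skeleton of ethanol: C-C-O
atoms = ["C", "C", "O"]
adj = {0: [1], 1: [0, 2], 2: [1]}
sig_h0 = molecular_signature(atoms, adj, 0)  # element counts only
sig_h1 = molecular_signature(atoms, adj, 1)  # atoms plus first neighbors
```

At height 0 the counts reduce to element counts; increasing h distinguishes atoms by progressively larger neighborhoods, which is the sense in which the count vector can serve as a descriptor.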

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • HIV Protease Inhibitors / chemistry
  • HIV Protease Inhibitors / pharmacology
  • Inhibitory Concentration 50
  • Models, Chemical*
  • Molecular Structure
  • Octanols / chemistry
  • Quantitative Structure-Activity Relationship*
  • Software
  • Solubility
  • Water / chemistry

Substances

  • HIV Protease Inhibitors
  • Octanols
  • Water