ToxML, a data exchange standard with content controlled vocabulary used to build better (Q)SAR models

SAR QSAR Environ Res. 2013;24(6):429-38. doi: 10.1080/1062936X.2013.783506. Epub 2013 Apr 27.


Developing accurate quantitative structure-activity relationship (QSAR) models requires high-quality, validated data. International regulations such as REACH in Europe will now accept (Q)SAR-based evaluations for risk assessment. The number of toxicity datasets available to those wishing to share knowledge, or to use them for data mining and modelling, is continually expanding. The challenge is that these datasets currently come in a multitude of different data formats. The difficulty of comparing or combining disparate data applies to both public and proprietary sources. The ToxML project addresses the need for a common data exchange standard that allows these data to be represented and communicated in a well-structured electronic format. It is an open standard based on Extensible Markup Language (XML). Supporting information for overall toxicity endpoint data can be included within ToxML files, which makes it possible to assess the quality and detail of the data used in a model. The data file model allows experimental data to be aggregated to the compound level in the detail needed to support (Q)SAR work. The standard is published on a website together with tools to view, edit and download it.
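
As a rough illustration of what compound-level aggregation from an XML exchange file might look like, the following Python sketch groups study results by compound and endpoint. The element and attribute names (Compounds, Compound, Study, Result, endpoint, source, call) and all sample values are hypothetical and chosen for illustration only; they are not taken from the published ToxML schema.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical ToxML-like fragment. Element names, attribute names and
# values are made up for illustration; they are NOT the real ToxML schema.
SAMPLE = """
<Compounds>
  <Compound id="C1">
    <Study endpoint="skin sensitisation" source="lab-1">
      <Result call="positive"/>
    </Study>
    <Study endpoint="skin sensitisation" source="lab-2">
      <Result call="positive"/>
    </Study>
    <Study endpoint="mutagenicity" source="lab-1">
      <Result call="negative"/>
    </Study>
  </Compound>
  <Compound id="C2">
    <Study endpoint="mutagenicity" source="lab-3">
      <Result call="positive"/>
    </Study>
  </Compound>
</Compounds>
"""


def aggregate_calls(xml_text):
    """Group (call, source) pairs by compound id and endpoint.

    Keeping the source next to each call preserves the supporting
    information a modeller would need to judge data quality.
    """
    root = ET.fromstring(xml_text.strip())
    summary = {}
    for compound in root.findall("Compound"):
        by_endpoint = defaultdict(list)
        for study in compound.findall("Study"):
            result = study.find("Result")
            if result is None:
                continue  # skip studies that report no result
            by_endpoint[study.get("endpoint")].append(
                (result.get("call"), study.get("source"))
            )
        summary[compound.get("id")] = dict(by_endpoint)
    return summary


if __name__ == "__main__":
    for compound_id, endpoints in aggregate_calls(SAMPLE).items():
        print(compound_id, endpoints)
```

Because every record carries its endpoint and source in the same structure, data from different providers could in principle be merged by running the same routine over each file, which is the kind of interoperability a common exchange format is meant to provide.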

MeSH terms

  • Databases, Factual*
  • Electronic Data Processing / methods*
  • Europe
  • Information Dissemination / methods*
  • Quantitative Structure-Activity Relationship*
  • Risk Assessment
  • Toxicology / methods*
  • Vocabulary, Controlled*