J Cheminform. 2019 Jan 3;11(1):1.
doi: 10.1186/s13321-018-0323-6.

A retrosynthetic analysis algorithm implementation

Free PMC article

Ian A Watson et al. J Cheminform. 2019.

Abstract

The need for synthetic route design arises frequently in discovery-oriented chemistry organizations. While finding solutions to this problem has traditionally been the domain of human experts, several computational approaches, aided by algorithmic advances and the availability of large reaction collections, have recently been reported. Herein we present our own implementation of a retrosynthetic analysis method and demonstrate its capabilities in an attempt to identify synthetic routes for a collection of approved drugs. Our results indicate that the method, which leverages reaction transformation rules learned from a large patent reaction dataset, can identify multiple theoretically feasible synthetic routes and thus support the everyday efforts of research chemists.

Keywords: Chemical synthesis; Reaction informatics; Retrosynthetic analysis; Synthetic route design.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
(a) A reaction example from US Patent 04703036 and (b) the reversed version of the same reaction. The reaction core is highlighted in red on the product side
Fig. 2
A heterocycle-forming reaction from US patent number US20030149264A1. The reaction core consists of the numbered atoms 1, 2, 3, 4 and 5
Fig. 3
Reaction clusters defined using SMILES-like signatures highlighted in red (radius 0), red and green (radius 1), and red, green and purple (radius 2). Note the effect of the atom properties used on the resulting classification
Fig. 4
The RTSA-train process
Fig. 5
Pseudocode describing the high-level RTSA-Design process. Note that step 2 is optional and could be omitted. Alternatively, the process may be supplemented with a recursive RTSA search for each synthon not found to be available, leading to a multi-step synthesis route
Fig. 6
Fraction of reaction examples covered as a function of the number of reaction signatures for different signature radii. Note the steep curves, indicating that a large fraction of the reactions examined is covered by relatively few reaction signatures
Fig. 7
Example RRTs extracted using signatures of radius 0 and support 10,000
Fig. 8
Examples of reverse reaction templates exemplifying the 4 signatures of radius 0 with the highest support in Dataset 1. Note that the templates are reversed, i.e. the original products are on the left-hand side of the figure
Fig. 9
DrugBankID:DB00619 and four potential synthesis routes identified by RTSA using reverse reaction templates of signature radius 2 and frequency 100 extracted from Dataset 1. Note that the origin of the reverse reaction template used to define the route is shown below each pair of synthons
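
The recursive search mentioned in the Fig. 5 caption (re-running the analysis on each synthon that is not available, so that a multi-step route emerges) can be illustrated with a minimal sketch. This is not the authors' RTSA code: molecules are reduced to plain string labels, the reverse templates are a hypothetical lookup table from a product to its possible synthon sets, and the names `find_routes`, `templates`, `available` and `max_depth` are assumptions made for illustration only.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

# Toy stand-ins (assumptions): a molecule is a string label, and a reverse
# template table maps a product label to the synthon tuples it can be cut into.
Step = Tuple[str, Tuple[str, ...]]      # (product, synthons)
Route = List[Step]                      # ordered list of disconnections
ReverseTemplates = Dict[str, List[Tuple[str, ...]]]


def find_routes(target: str, templates: ReverseTemplates,
                available: Set[str], max_depth: int = 3) -> List[Route]:
    """Enumerate routes from `target` back to available starting materials."""
    if target in available:
        return [[]]                     # nothing to make: one empty route
    if max_depth == 0:
        return []                       # depth budget spent: no route
    routes: List[Route] = []
    for synthons in templates.get(target, []):
        # Recurse into every synthon; an empty list means it cannot be made.
        options = [find_routes(s, templates, available, max_depth - 1)
                   for s in synthons]
        if any(not opts for opts in options):
            continue
        # Combine one sub-route per synthon into a full route for the target.
        for combo in product(*options):
            route: Route = [(target, synthons)]
            for sub_route in combo:
                route.extend(sub_route)
            routes.append(route)
    return routes


# Hypothetical example data, not taken from the paper.
templates = {"amide": [("acid", "amine")], "acid": [("ester",)]}
available = {"amine", "ester"}
routes = find_routes("amide", templates, available)
for route in routes:
    print(route)
```

The depth limit bounds the recursion so that templates that regenerate an earlier intermediate cannot loop forever; a real implementation would apply reaction transformations to molecular structures rather than look up string labels.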

