Review
Curr Top Med Chem. 2017;17(15):1709-1726.
doi: 10.2174/1568026617666161116143440.

Bioinformatics and Drug Discovery

Xuhua Xia
Free PMC article

Abstract

Bioinformatic analysis can not only accelerate drug target identification and drug candidate screening and refinement, but also facilitate characterization of side effects and predict drug resistance. High-throughput data such as genomic, epigenetic, genome architecture, cistromic, transcriptomic, proteomic, and ribosome profiling data have all made significant contributions to mechanism-based drug discovery and drug repurposing. The accumulation of protein and RNA structures, together with the development of homology modeling and protein structure simulation and with large structure databases of small molecules and metabolites, has paved the way for more realistic protein-ligand docking experiments and more informative virtual screening. I present the conceptual framework that drives the collection of these high-throughput data, summarize the utility and potential of mining these data in drug discovery, outline a few inherent limitations in the data and in the software used to mine them, point out new ways to refine the analysis of these diverse types of data, and highlight commonly used software and databases relevant to drug discovery.

Keywords: Drug candidate; Drug screening; Drug target; Epigenetics; Genomics; Proteomics; Structure; Transcriptomics.

Figures

Fig. (1)
Major types of high-throughput data and their key information relevant to drug discovery. Metabolomic data belong to cheminformatics and are not included.
Fig. (2)
A general framework of epigenetic effects on gene expression, through 1) DNA methylation and histone acetylation/deacetylation, 2) alteration of DNA-binding proteins and consequent protein-DNA and protein-protein interactions, and 3) alteration of long-distance interactions such as enhancer-promoter interactions. LM: laboratory method; BQ: sample bioinformatic questions.
Fig. (3)
Numerical illustration of applying Idd in Eq. (1) in phenotypic screening to two sets of transcriptomic data (a) and (b). Gn, Gp and Gd refer to gene expression of normal cells, disease cells before drug application, and disease cells after drug application, respectively.
Fig. (4)
Allocation of shared reads in a gene family of three paralogous genes, A, B and C, each with three idealized segments: a strongly homologous first segment that is identical in B and C, a conserved, identical middle segment, and a diverged third segment. Reads and the gene segments they match are shown in the same color. (The color version of the figure is available in the electronic copy of the article.)
