Scientific progress depends on formulating testable hypotheses informed by the literature. In many domains, however, this model is strained because the number of research papers exceeds what researchers can read. Here, we developed a computational approach that reads PubMed abstracts to suggest new hypotheses from the biomedical literature. The approach was tested experimentally on the tumor suppressor p53 by ranking its most likely kinases based on all available abstracts. Many of the best-ranked kinases were found to bind and phosphorylate p53 (P value = 0.005), suggesting six likely p53 kinases so far. One of these, NEK2, was studied in detail. A known mitosis promoter, NEK2 was shown to phosphorylate p53 at Ser315 in vitro and in vivo and to functionally inhibit p53. These bona fide validations of text-based predictions of p53 phosphorylation, and the discovery of an inhibitory p53 kinase of pharmaceutical interest, suggest that automated reasoning using a large body of literature can generate valuable molecular hypotheses and has the potential to accelerate scientific discovery.
Keywords: automated hypothesis generation; kinase; literature text mining; p53 inhibition; protein–protein interaction.
Copyright © 2018 the Author(s). Published by PNAS.