A Cloud-Based Metabolite and Chemical Prioritization System for the Biology/Disease-Driven Human Proteome Project

J Proteome Res. 2018 Dec 7;17(12):4345-4357. doi: 10.1021/acs.jproteome.8b00378. Epub 2018 Aug 21.


Targeted metabolomics and biochemical studies complement the ongoing investigations led by the Human Proteome Organization (HUPO) Biology/Disease-Driven Human Proteome Project (B/D-HPP). However, it is challenging to identify and prioritize metabolite and chemical targets. Literature-mining-based approaches have been proposed for targeted proteomics studies, but text mining methods for metabolite and chemical prioritization are hindered by the large number of synonyms and nonstandardized names for each entity. In this study, we developed a cloud-based literature mining and summarization platform that maps metabolites and chemicals in the literature to unique identifiers and summarizes the copublication trends of metabolites/chemicals and B/D-HPP topics using Protein Universal Reference Publication-Originated Search Engine (PURPOSE) scores. We successfully prioritized metabolites and chemicals associated with the B/D-HPP targeted fields and validated the results against expert-curated associations and with enrichment analyses. Compared with existing algorithms, our system achieved better precision and recall in retrieving chemicals related to B/D-HPP-focused areas. Our cloud-based platform enables queries on all biological terms in multiple species, which will contribute to B/D-HPP and targeted metabolomics/chemical studies.
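The two steps the abstract describes, mapping synonymous names to one identifier and then tallying copublication with a topic, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the synonym table, the `CHEM:` identifiers, the record layout, and the function names are hypothetical, and the ranking uses raw copublication counts rather than the PURPOSE score, whose formula is not given in this abstract.

```python
from collections import Counter

# Hypothetical synonym table: lowercase surface form -> unique identifier.
# The identifiers are placeholders, not entries from any real database.
SYNONYMS = {
    "glucose": "CHEM:0001",
    "d-glucose": "CHEM:0001",
    "dextrose": "CHEM:0001",
    "lactate": "CHEM:0002",
    "lactic acid": "CHEM:0002",
}


def normalize(mentions):
    """Map raw mentions to the set of identifiers found in one abstract."""
    return {SYNONYMS[m.lower()] for m in mentions if m.lower() in SYNONYMS}


def copublication_counts(records, topic):
    """Count, per identifier, the abstracts that also mention the topic."""
    counts = Counter()
    for rec in records:
        if topic in rec["topics"]:
            # Using a set means each abstract counts once per identifier,
            # even if several synonyms of the same chemical appear in it.
            counts.update(normalize(rec["mentions"]))
    return counts


# Toy records standing in for indexed abstracts (illustrative data only).
records = [
    {"topics": {"diabetes"}, "mentions": ["Dextrose", "lactic acid"]},
    {"topics": {"diabetes"}, "mentions": ["D-glucose", "glucose"]},
    {"topics": {"liver"}, "mentions": ["lactate"]},
]

counts = copublication_counts(records, "diabetes")
ranked = counts.most_common()  # identifiers ordered by copublication count
```

Without the normalization step, "Dextrose", "D-glucose", and "glucose" would be counted as three separate entities, which is the synonym problem the abstract identifies as the main obstacle to chemical prioritization.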

Keywords: Biology/Disease-Driven Human Proteome Project; Biomedical Entity Search Tool (BEST); Finding Associated Concepts with Text Analysis (FACTA+); Protein Universal Reference Publication-Originated Search Engine (PURPOSE); chemicals; literature mining; metabolomics.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cloud Computing*
  • Data Mining / methods
  • Humans
  • Metabolomics*
  • Proteome*
  • Search Engine


Substances

  • Proteome