Analysis of COVID-19 clinical trials: A data-driven, ontology-based, and natural language processing approach

PLoS One. 2020 Sep 30;15(9):e0239694. doi: 10.1371/journal.pone.0239694. eCollection 2020.


With the COVID-19 pandemic disrupting and threatening the lives of millions, researchers and clinicians have been conducting clinical trials at an unprecedented rate to learn more about the virus and about potential drugs, treatments, and vaccines against the infection. As a result of this influx of clinical trials, researchers, clinicians, and the lay public face, now more than ever, a significant challenge in keeping up to date with the rapid rate of discoveries and advances. To remedy this problem, this research mined the corpus to extract COVID-19 related clinical trials, produce unique reports that summarize findings, and make the metadata available via Application Programming Interfaces (APIs). Unique reports were created for each drug/intervention, Medical Subject Heading (MeSH) term, and Human Phenotype Ontology (HPO) term. These reports, which have been run over multiple time points, along with APIs to access the metadata, are freely available. The pipeline, reports, association of COVID-19 clinical trials with MeSH and HPO terms, insights, public repository, APIs, and correlations produced are all novel in this work. These freely available resources present up-to-date, relevant biological information and insights in a robust, accessible manner, illustrating their potential to help researchers overcome COVID-19 and save hundreds of thousands of lives.
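
The per-term reports described above can be pictured as a simple grouping step. The following is a minimal sketch, not the author's pipeline: the record layout, the field names (`id`, `interventions`, `mesh_terms`), the function names, and the toy records are all assumptions made for illustration, and no real trial data is used.

```python
from collections import defaultdict

# Toy records invented for illustration only; they are not real trial data,
# and the field names are assumptions rather than the paper's schema.
TRIALS = [
    {"id": "TRIAL-001", "interventions": ["Drug A"],
     "mesh_terms": ["Pneumonia, Viral"]},
    {"id": "TRIAL-002", "interventions": ["Drug A", "Drug B"],
     "mesh_terms": ["Coronavirus Infections"]},
    {"id": "TRIAL-003", "interventions": ["Drug B"],
     "mesh_terms": ["Pneumonia, Viral", "Coronavirus Infections"]},
]


def group_trials_by(trials, field):
    """Map each value found under `field` to the sorted IDs of trials listing it."""
    groups = defaultdict(set)
    for trial in trials:
        for value in trial.get(field, []):
            groups[value].add(trial["id"])
    return {value: sorted(ids) for value, ids in groups.items()}


def build_report(trials, field):
    """Return one summary row per term, most frequently listed terms first."""
    groups = group_trials_by(trials, field)
    rows = [
        {"term": term, "n_trials": len(ids), "trial_ids": ids}
        for term, ids in groups.items()
    ]
    # Sort by descending trial count, then alphabetically for stable output.
    return sorted(rows, key=lambda row: (-row["n_trials"], row["term"]))


if __name__ == "__main__":
    for row in build_report(TRIALS, "mesh_terms"):
        print(row["term"], row["n_trials"], row["trial_ids"])
```

Calling `build_report(TRIALS, "interventions")` instead would give one row per drug/intervention. Re-running the same grouping on snapshots taken at different dates is one plausible way to obtain reports over multiple time points.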

MeSH terms

  • Betacoronavirus
  • Biological Ontologies*
  • COVID-19
  • Clinical Trials as Topic*
  • Computational Biology
  • Coronavirus Infections / therapy*
  • Humans
  • Internet
  • Medical Subject Headings
  • Natural Language Processing*
  • Pandemics
  • Phenotype
  • Pneumonia, Viral / therapy*
  • SARS-CoV-2
  • Software

Grant support

The author received no specific funding for this work.