A Narrative Literature Review of Natural Language Processing Applied to the Occupational Exposome

Int J Environ Res Public Health. 2022 Jul 13;19(14):8544. doi: 10.3390/ijerph19148544.

Abstract

The evolution of the Exposome concept revolutionised research in exposure assessment and epidemiology by introducing the need for a more holistic approach to exploring the relationship between the environment and disease. At the same time, further and more dramatic changes have occurred in the working environment, adding to its already dynamic nature. Natural Language Processing (NLP) refers to a collection of methods for identifying, reading, extracting and ultimately transforming large collections of language. In this work, we aim to give an overview of how NLP has successfully been applied thus far in Exposome research.

Methods: We conduct a literature search on PubMed, Scopus and Web of Science for scientific articles published between 2011 and 2021. We use both quantitative and qualitative methods to screen papers and provide insights into the inclusion and exclusion criteria. We outline our approach for article selection and provide an overview of our findings. This is followed by a more detailed insight into selected articles.

Results: Overall, 6420 articles were screened for suitability for this review, of which 37 were reviewed in depth. Finally, we discuss future avenues of research and outline challenges in existing work.

Conclusions: Our results show that (i) there has been an increase in published articles that focus on applying NLP to exposure and epidemiology research, (ii) most work uses existing NLP tools and (iii) traditional machine learning is the most popular approach.

Keywords: exposome; exposure research; machine learning; natural language processing.

Publication types

  • Review
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Exposome*
  • Machine Learning
  • Narration
  • Natural Language Processing*
  • PubMed

Grants and funding

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 874703.