Big data hurdles in precision medicine and precision public health

BMC Med Inform Decis Mak. 2018 Dec 29;18(1):139. doi: 10.1186/s12911-018-0719-2.


Background: Nowadays, trendy research in biomedical sciences juxtaposes the term 'precision' to medicine and public health with companion words like big data, data science, and deep learning. Technological advancements permit the collection and merging of large heterogeneous datasets from different sources, from genome sequences to social media posts or from electronic health records to wearables. Additionally, complex algorithms supported by high-performance computing allow one to transform these large datasets into knowledge. Despite such progress, many barriers still exist against achieving precision medicine and precision public health interventions for the benefit of the individual and the population.

Main body: The present work focuses on analyzing both the technical and societal hurdles related to the development of prediction models of health risks, diagnoses and outcomes from integrated biomedical databases. Methodological challenges that need to be addressed include improving semantics of study designs: medical record data are inherently biased, and even the most advanced deep learning's denoising autoencoders cannot overcome the bias if not handled a priori by design. Societal challenges to face include evaluation of ethically actionable risk factors at the individual and population level; for instance, usage of gender, race, or ethnicity as risk modifiers, not as biological variables, could be replaced by modifiable environmental proxies such as lifestyle and dietary habits, household income, or access to educational resources.

Conclusions: Data science for precision medicine and public health warrants an informatics-oriented formalization of the study design and interoperability throughout all levels of the knowledge inference process, from the research semantics, to model development, and ultimately to implementation.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Big Data*
  • Databases, Factual
  • Delivery of Health Care*
  • Electronic Health Records
  • Humans
  • Precision Medicine*
  • Public Health*
  • Social Media