EpiJSON: A unified data-format for epidemiology

Epidemics. 2016 Jun:15:20-6. doi: 10.1016/j.epidem.2015.12.002. Epub 2015 Dec 29.

Abstract

Epidemiology relies on data, but the divergent ways in which data are recorded and transferred, both within and between outbreaks, together with the expanding range of data types, are creating an increasingly complex problem for the discipline. There is a need for a consistent, interpretable and precise way to transfer data while maintaining its fidelity. We introduce 'EpiJSON', a new, flexible and standards-compliant format for the interchange of epidemiological data using JavaScript Object Notation (JSON). This format is designed to enable the widest range of epidemiological data to be unambiguously held and transferred between people, software and institutions. In this paper, we provide a full description of the format and a discussion of the design decisions made. We introduce a schema enabling automatic checks of the validity of data stored as EpiJSON, which can serve as a basis for the development of additional tools. We also present the R package 'repijson', which provides tools for converting between this format, line-list data and pre-existing analysis tools. An example is given to illustrate how EpiJSON can be used to store line-list data. EpiJSON, designed around modern standards for the interchange of information on the internet, is simple to implement, read and check. As such, it provides an ideal new standard for transferring epidemiological and other data to the fast-growing open-source platform for the analysis of disease outbreaks.
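
As a rough illustration of the idea of storing line-list data as JSON, the following Python sketch converts a small, made-up line list into a nested document and applies a minimal structural check. The key names used here ("metadata", "records", "attributes", "events") and the helper functions are assumptions chosen for illustration; they are not taken from this abstract, and the full paper and its schema should be consulted for the actual EpiJSON layout. The R package 'repijson' is not used here.

```python
import json

# Hypothetical line list: one row per case. Column names are illustrative only.
line_list = [
    {"id": "A", "sex": "F", "onset": "2015-01-03", "admission": "2015-01-05"},
    {"id": "B", "sex": "M", "onset": "2015-01-04", "admission": None},
]


def row_to_record(row, attribute_cols, event_cols):
    """Turn one line-list row into a record holding attributes and dated events."""
    attributes = [
        {"name": col, "type": "string", "value": row[col]}
        for col in attribute_cols
        if row.get(col) is not None
    ]
    # Keep only the event columns that have a date, then number them from 1.
    dated = [col for col in event_cols if row.get(col) is not None]
    events = [
        {"id": i, "name": col, "date": row[col]}
        for i, col in enumerate(dated, start=1)
    ]
    return {"id": row["id"], "attributes": attributes, "events": events}


def line_list_to_document(rows, attribute_cols, event_cols):
    """Wrap all records in a top-level object (assumed layout, not the official one)."""
    records = [row_to_record(r, attribute_cols, event_cols) for r in rows]
    return {"metadata": [], "records": records}


def check_document(doc):
    """Minimal structural check; a real workflow would validate against the published schema."""
    problems = []
    for key in ("metadata", "records"):
        if not isinstance(doc.get(key), list):
            problems.append(f"'{key}' must be a list")
    records = doc.get("records")
    for rec in records if isinstance(records, list) else []:
        for key in ("id", "attributes", "events"):
            if key not in rec:
                problems.append(f"record is missing '{key}'")
    return problems


document = line_list_to_document(line_list, ["sex"], ["onset", "admission"])
text = json.dumps(document, indent=2)  # serialised form that could be exchanged
```

Because the result is plain JSON, any JSON-aware tool can read it back with `json.loads(text)`, and a stricter check could be swapped in for `check_document` without changing the conversion step.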

Keywords: Communications standards; Databases; Epidemics; Outbreaks; Software.

MeSH terms

  • Datasets as Topic*
  • Disease Outbreaks / prevention & control*
  • Epidemiologic Methods*
  • Humans
  • Software*