Background: Pathogen genomics has become increasingly important in infectious disease epidemiology and public health. The Strengthening the Reporting of Molecular Epidemiology for Infectious Diseases (STROME-ID) guidelines were developed to outline a minimum set of criteria that should be reported in genomic epidemiology studies to facilitate assessment of study quality. We evaluate such reporting practices, using tuberculosis as an example.
Methods: For this systematic review, we initially searched MEDLINE, Embase Classic, and Embase on May 3, 2017, using the search terms "tuberculosis" and "genom* sequencing". We updated this initial search on April 23, 2019, and also included a search of bioRxiv at this time. We included studies in English, French, or Spanish that recruited patients with microbiologically confirmed tuberculosis and used whole genome sequencing for typing of strains. Non-human studies, conference abstracts, and literature reviews were excluded. For each included study, the number and proportion of fulfilled STROME-ID criteria were recorded by two reviewers. A comparison of the mean proportion of fulfilled STROME-ID criteria before and after publication of the STROME-ID guidelines (in 2014) was done using a two-tailed t test. Quasi-Poisson regression and tobit regression were used to examine associations between study characteristics and the number and proportion of fulfilled STROME-ID criteria. This study was registered with PROSPERO, CRD42017064395.
Findings: 976 titles and abstracts were identified by our primary search, with an additional 16 studies identified in bioRxiv. 114 full texts (published between 2009 and 2019) were eligible for inclusion. The mean proportion of STROME-ID criteria fulfilled was 50% (SD 12; range 16-75). The proportion of criteria fulfilled was similar before and after STROME-ID publication (51% [SD 11] vs 46%, p=0·26). The number of criteria reported (among those applicable to all studies) was not associated with impact factor, h-index, country of affiliation of senior author, or sample size of isolates. Similarly, the proportion of criteria fulfilled was not associated with these characteristics, with the exception of a sample size of isolates of 277 or more (the highest quartile). In terms of reproducibility, 100 (88%) of the 114 studies reported which bioinformatic tools were used, but only 33 (33%) of these 100 reported corresponding version numbers. Sequencing data were available for 86 (75%) studies.
Interpretation: The reporting of STROME-ID criteria in genomic epidemiology studies of tuberculosis between 2009 and 2019 was low, with implications for assessment of study quality. The considerable proportion of studies without bioinformatics version numbers or sequencing data available highlights a key concern for reproducibility.
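The reproducibility percentages in the Findings can be re-derived from the reported counts. The short Python sketch below is illustrative only: the helper `pct` and the variable names are hypothetical, and it assumes that the 33% for version numbers is taken over the 100 studies that named their tools (33 of all 114 studies would instead round to 29%).

```python
def pct(count, total):
    """Return count/total as a percentage rounded to the nearest integer."""
    return round(100 * count / total)

# Counts as reported in the Findings
included = 114          # eligible full texts
tools_reported = 100    # studies naming the bioinformatic tools used
versions_reported = 33  # studies also giving tool version numbers
data_available = 86     # studies with sequencing data available

print(pct(tools_reported, included))           # tools named, out of all studies
# Assumed denominator: the studies that named their tools
print(pct(versions_reported, tools_reported))  # version numbers given
print(pct(data_available, included))           # sequencing data available
```

Under these assumptions the three values match the 88%, 33%, and 75% stated in the Findings.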