Using Open-Source Intelligence to Detect Early Signals of COVID-19 in China: Descriptive Study

JMIR Public Health Surveill. 2020 Sep 18;6(3):e18939. doi: 10.2196/18939.


Background: The coronavirus disease (COVID-19) outbreak in China was first reported to the World Health Organization (WHO) on December 31, 2019, and the first cases were officially identified around December 8, 2019. Although the origin of COVID-19 has not been confirmed, approximately half of the early cases were linked to a seafood market in Wuhan. However, the first two documented patients did not visit the seafood market. News reports, social media, and informal sources may provide information about outbreaks prior to formal notification.

Objective: The aim of this study was to identify early signals of pneumonia or severe acute respiratory illness (SARI) in China prior to official recognition of the COVID-19 outbreak in December 2019 using open-source data.

Methods: To capture early reports, we searched an open-source epidemic observatory, EpiWatch, for SARI or pneumonia-related illnesses in China from October 1, 2019. The searches were conducted using Google and the Chinese search engine Baidu.

Results: There was an increase in reports following the official notification of COVID-19 to the WHO on December 31, 2019, and a report that appeared on December 26, 2019, was retracted. A report of severe pneumonia on November 22, 2019, in Xiangyang was identified, and a potential index patient was retrospectively identified on November 17.

Conclusions: The lack of reports of SARI outbreaks prior to December 31, 2019, with a retracted report on December 26, suggests media censorship, given that formal reports indicate that cases began appearing on December 8. However, the findings also support a relatively recent origin of COVID-19 in November 2019. The case reported on November 22 was transferred to Wuhan approximately one incubation period before the first identified cases on December 8; this case should be further investigated, as only half of the early cases were exposed to the seafood market in Wuhan. Another case of COVID-19 has since been retrospectively identified in Hubei on November 17, 2019, suggesting that the infection was present prior to December.

Keywords: COVID-19; biosecurity; epidemiology; infectious disease; surveillance.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Betacoronavirus
  • COVID-19
  • China / epidemiology
  • Coronavirus
  • Coronavirus Infections / diagnosis
  • Coronavirus Infections / epidemiology*
  • Coronavirus Infections / virology
  • Disclosure
  • Disease Outbreaks*
  • Documentation
  • Humans
  • Pandemics
  • Pneumonia
  • Pneumonia, Viral / diagnosis
  • Pneumonia, Viral / epidemiology*
  • Pneumonia, Viral / virology
  • Retrospective Studies
  • SARS-CoV-2
  • Search Engine
  • Severe Acute Respiratory Syndrome / diagnosis
  • Severe Acute Respiratory Syndrome / epidemiology*
  • Severe Acute Respiratory Syndrome / virology