Observed and estimated prevalence of Covid-19 in Italy: How to estimate the total cases from medical swabs data

Sci Total Environ. 2020 Oct 8;142799. doi: 10.1016/j.scitotenv.2020.142799. Online ahead of print.

Abstract

During the Covid-19 pandemic in Italy, official data have been collected with medical swabs following a pure convenience criterion which, at least in an early phase, favoured the testing of patients showing evident symptoms. However, there is evidence of a very high proportion of asymptomatic patients. In this situation, in order to estimate the real number of infected people (and the lethality rate), it would be necessary to run a properly designed sample survey, through which it would be possible to calculate the probability of inclusion and hence draw sound probabilistic inference. Unfortunately, the survey run by the Italian Statistical Institute encountered many difficulties in the field. Some researchers have proposed estimates of the total prevalence based on various approaches, including epidemiological models, time series and the analysis of data collected in countries that faced the epidemic earlier. In this paper, we propose to estimate the prevalence of Covid-19 in Italy by reweighting the available official data published by the Istituto Superiore di Sanità so as to obtain a more representative sample of the Italian population. Reweighting is a procedure commonly used to artificially modify the sample composition so as to obtain a distribution more similar to that of the population. We use post-stratification of the official data, with age and gender as post-stratification variables, to derive the weights needed for reweighting the sample results, thus obtaining more reliable estimates of prevalence and lethality. Specifically, for Italy, we obtain a prevalence of 9%. The proposed methodology represents a reasonable approximation while waiting for more reliable data from a properly designed national sample survey, and it could be further improved if more data were made available.

Keywords: Convenience sampling; Covid-19; Lethality; Post-stratification; Prevalence.
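
The post-stratification idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Stratum` class, the function names and all the counts below are hypothetical, chosen only to show how weighting each age-by-gender cell by its population share changes the estimate compared with the raw positive rate among those swabbed. It also assumes that, within each cell, the swabbed people are representative of that cell, which a convenience sample does not guarantee.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    """One post-stratification cell (here: gender x age class)."""
    name: str
    population: int  # N_h: known population size of the cell
    tested: int      # n_h: people swabbed in the cell
    positive: int    # positives among the swabbed


def naive_prevalence(strata):
    """Unweighted positive rate: total positives over total swabs."""
    return sum(s.positive for s in strata) / sum(s.tested for s in strata)


def poststratified_prevalence(strata):
    """Weight each cell's positive rate by its population share N_h / N."""
    total_population = sum(s.population for s in strata)
    estimate = 0.0
    for s in strata:
        if s.tested == 0:
            raise ValueError(f"no swabs in stratum {s.name!r}")
        share = s.population / total_population  # N_h / N
        rate = s.positive / s.tested             # positive rate within the cell
        estimate += share * rate
    return estimate


# Hypothetical toy data: older cells are heavily over-represented among swabs.
strata = [
    Stratum("F 0-49", population=4000, tested=50, positive=5),
    Stratum("M 0-49", population=4000, tested=50, positive=5),
    Stratum("F 50+", population=1000, tested=200, positive=80),
    Stratum("M 50+", population=1000, tested=200, positive=100),
]

print(f"naive:           {naive_prevalence(strata):.3f}")
print(f"post-stratified: {poststratified_prevalence(strata):.3f}")
```

With these made-up counts the naive rate is 0.380 and the post-stratified estimate is 0.170, because the over-sampled older cells, which have higher positive rates, are scaled back to their actual share of the population. The same weights could in principle be applied to deaths to obtain a reweighted lethality figure.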