Imputation of precipitation data in northeast Brazil

An Acad Bras Cienc. 2023 Jun 5;95(2):e20210737. doi: 10.1590/0001-3765202320210737. eCollection 2023.

Abstract

This article evaluates four statistical methods of multiple imputation for filling in missing daily precipitation data in Northeast Brazil (NEB). We used a daily database collected by 94 rain gauges distributed across NEB from January 1, 1986 to December 31, 2015. The methods were: random sampling from the observed values; predictive mean matching; Bayesian linear regression; and the bootstrap expectation maximization algorithm (BootEM). To compare these methods, missing data were first excluded from the original series. Three scenarios were then created for each method, in which 10%, 20% and 30% of the data were removed at random. The BootEM method presented the best statistical results: the average bias between the complete and imputed series ranged between -0.91 and 1.30 mm/day, and the Pearson correlation was 0.96, 0.91 and 0.86 for 10%, 20% and 30% missing data, respectively. We conclude that this is an adequate method for the reconstruction of historical precipitation data in NEB.
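
The evaluation protocol described in the abstract (remove a fixed share of a complete series at random, impute, then compare against the known values) can be illustrated with a minimal Python sketch. It is not the authors' code. It implements only the simplest of the four methods, random sampling from the observed values, and it runs on a synthetic series rather than the NEB gauge data. The function names, the synthetic rainfall generator, the seed, and the choice to compute bias and Pearson correlation over the removed positions only are all assumptions made for illustration.

```python
import numpy as np

def mask_at_random(series, fraction, rng):
    """Return a copy of `series` with `fraction` of its values set to NaN."""
    masked = series.astype(float).copy()
    n_remove = int(round(fraction * series.size))
    idx = rng.choice(series.size, size=n_remove, replace=False)
    masked[idx] = np.nan
    return masked

def impute_random_sample(masked, rng):
    """Fill each gap with a value drawn at random from the observed values."""
    filled = masked.copy()
    gaps = np.isnan(filled)
    observed = filled[~gaps]
    filled[gaps] = rng.choice(observed, size=int(gaps.sum()), replace=True)
    return filled

def evaluate(complete, imputed, gaps):
    """Mean bias (imputed minus complete) and Pearson r over removed positions."""
    bias = float(np.mean(imputed[gaps] - complete[gaps]))
    r = float(np.corrcoef(complete[gaps], imputed[gaps])[0, 1])
    return bias, r

rng = np.random.default_rng(0)

# Synthetic stand-in for a complete daily series in mm/day (assumption):
# roughly 30% wet days with gamma-distributed amounts, zero otherwise.
n_days = 3650
wet = rng.random(n_days) < 0.3
complete = np.where(wet, rng.gamma(shape=0.8, scale=10.0, size=n_days), 0.0)

results = {}
for fraction in (0.10, 0.20, 0.30):
    masked = mask_at_random(complete, fraction, rng)
    gaps = np.isnan(masked)
    imputed = impute_random_sample(masked, rng)
    results[fraction] = evaluate(complete, imputed, gaps)
    bias, r = results[fraction]
    print(f"{fraction:.0%} removed: bias = {bias:+.2f} mm/day, r = {r:.2f}")
```

Because random sampling ignores both timing and neighbouring gauges, its correlation with the removed values should be near zero here. That makes it a natural baseline against which methods such as BootEM can be compared under the same masking scenarios.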

MeSH terms

  • Bayes Theorem
  • Bias
  • Brazil
  • Linear Models
  • Research Design*