[Methods for handling incomplete data in health research: a critical look]

Maylée Cañizares; Isabel Barroso; Karen Alfonso

doi:10.1016/s0213-9111(04)72000-2

Gac Sanit. 2004 Jan-Feb;18(1):58-63. doi: 10.1016/s0213-9111(04)72000-2.

[Article in Spanish]

Authors

Maylée Cañizares¹, Isabel Barroso, Karen Alfonso

Affiliation

¹ Instituto Nacional de Higiene, Epidemiología y Microbiología, Havana, Cuba. mcperez@yahoo.com

PMID: 14980174
DOI: 10.1016/s0213-9111(04)72000-2

Abstract

Objective: To illustrate methods for handling incomplete data in health research.

Methods: Two strategies for handling missing data are presented: complete-case analysis and imputation. Three imputation methods were used: mean imputation, regression imputation, and multiple imputation. These strategies are illustrated in the context of logistic regression through an example using data from the "Second Cuban national survey on risk factors and non-communicable diseases", carried out in 2001.
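As a rough illustration of the single-value strategies named above, the following sketch applies complete-case analysis, mean imputation, and regression imputation to a toy dataset. The records, variable meanings, and function names are hypothetical and are not taken from the survey or from the authors' analysis; the regression step is a simple least-squares line rather than the logistic model used in the article.

```python
from statistics import mean

# Hypothetical toy records: (age, systolic blood pressure); None marks a missing value.
records = [(30, 118.0), (40, 124.0), (50, None), (60, 136.0), (70, None)]


def complete_cases(rows):
    """Complete-case analysis: drop every record with a missing value."""
    return [(x, y) for x, y in rows if y is not None]


def mean_impute(rows):
    """Mean imputation: replace each missing value with the observed mean."""
    observed_mean = mean(y for _, y in rows if y is not None)
    return [(x, observed_mean if y is None else y) for x, y in rows]


def regression_impute(rows):
    """Regression imputation: predict each missing y from x using a
    least-squares line fitted to the complete cases."""
    obs = complete_cases(rows)
    x_bar = mean(x for x, _ in obs)
    y_bar = mean(y for _, y in obs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in obs)
    s_xx = sum((x - x_bar) ** 2 for x, _ in obs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return [(x, intercept + slope * x if y is None else y) for x, y in rows]


cc = complete_cases(records)         # 3 of the 5 records survive
mi = mean_impute(records)            # both gaps filled with the same value
ri = regression_impute(records)      # gaps filled with age-specific predictions
```

In this toy case complete-case analysis discards two of five records, mean imputation assigns the same value to a 50-year-old and a 70-year-old, and regression imputation assigns values that follow the fitted age trend.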

Results: Mean imputation and regression imputation gave similar results, overestimating the odds ratios by 10%. Complete-case analysis differed most from the reference odds ratios, with a variation of between 2% and 65%. These three methods distorted the relationship between age and hypertension. Multiple imputation produced the estimates closest to the reference values, with a variation of less than 16%, and was the only procedure that preserved the relationship between age and hypertension.
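The abstract does not say how the multiply imputed estimates were combined or how the percentage variation was computed. A minimal sketch, assuming the standard Rubin's rules applied on the log-odds scale and a simple relative difference from the reference odds ratio, might look as follows. All numbers and function names are hypothetical.

```python
import math
from statistics import mean, variance


def pool_log_odds(estimates, variances):
    """Combine m imputation-specific log-odds estimates with Rubin's rules.

    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    q_bar = mean(estimates)            # pooled point estimate
    u_bar = mean(variances)            # within-imputation variance
    b = variance(estimates)            # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b    # total variance
    return q_bar, total


def percent_difference(estimate, reference):
    """Relative difference of an odds ratio from a reference value, in percent."""
    return 100.0 * abs(estimate - reference) / reference


# Hypothetical log odds ratios and squared standard errors from m = 3 imputed datasets.
log_ors = [0.60, 0.70, 0.80]
squared_ses = [0.04, 0.04, 0.04]

pooled_log_or, total_var = pool_log_odds(log_ors, squared_ses)
pooled_or = math.exp(pooled_log_or)    # back-transform to the odds-ratio scale
```

Pooling on the log scale and exponentiating afterwards keeps the combined odds ratio positive; the total variance is larger than the within-imputation variance because it also reflects uncertainty about the imputed values.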

Conclusions: Selecting a method for handling missing data is difficult, since the same procedure can give precise estimates in some circumstances and not in others. Complete-case analysis should be used with caution because of the substantial loss of information it entails. Mean imputation and regression imputation produce unreliable estimates under missing-at-random (MAR) mechanisms.
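One way to see why mean imputation can mislead under MAR is a small deterministic example in which missingness depends only on an observed covariate (age). The data below are hypothetical and deliberately noise-free; the sketch covers mean imputation only and does not reproduce the article's analysis.

```python
from statistics import mean

# Hypothetical, noise-free relationship between age and systolic blood pressure.
ages = [30, 40, 50, 60, 70]
true_sbp = [100 + 0.6 * age for age in ages]

# MAR mechanism: the outcome is missing whenever the (observed) age is 60 or more.
observed = [sbp if age < 60 else None for age, sbp in zip(ages, true_sbp)]

# Mean imputation fills every gap with the mean of the observed values.
observed_mean = mean(y for y in observed if y is not None)
imputed = [observed_mean if y is None else y for y in observed]

true_mean = mean(true_sbp)        # mean if nothing were missing
imputed_mean = mean(imputed)      # mean after mean imputation
```

Because the missing values belong to the oldest participants, who have the highest blood pressure in this toy setup, the imputed mean falls below the true mean, and the imputed records no longer follow the age trend.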

Publication types

English Abstract
Review

MeSH terms

Cuba / epidemiology
Health Surveys
Humans
Logistic Models
Models, Theoretical
Research / statistics & numerical data*
Risk Factors