On the Distribution of Summary Statistics for Missing Data

Commun Stat Theory Methods. 2019;48(5):1149-1165. doi: 10.1080/03610926.2018.1425447. Epub 2018 Jan 24.


Under the assumption that missing values occur randomly in a matrix, formulae are developed for the expected value and variance of six statistics that summarize the number and location of the missing values. For a seventh statistic, a regression model fitted to simulated data yields an estimate of the expected value. The results can be used to develop methods that control the Type I error rate and approximate power and sample size for multilevel and longitudinal studies with missing data.
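As a minimal illustration of the kind of result described above, the following Python sketch compares a closed-form expected value with a simulation estimate for one simple summary statistic. Everything specific in it is an assumption made for illustration and is not taken from the paper: the statistic (the number of rows with no missing value), the missingness model (a fixed number of missing cells placed uniformly at random, without replacement), the matrix dimensions, and the function names. It is not claimed to be one of the six statistics treated in the paper.

```python
import math
import random


def expected_complete_rows(n_rows, n_cols, n_missing):
    """Closed-form expected number of rows containing no missing cell.

    Assumes (for illustration only) that exactly n_missing cells are
    chosen uniformly at random, without replacement, from the matrix.
    """
    total = n_rows * n_cols
    # A given row is complete when every missing cell falls among the
    # total - n_cols cells outside that row (a hypergeometric probability).
    p_complete = math.comb(total - n_cols, n_missing) / math.comb(total, n_missing)
    # Linearity of expectation over the n_rows indicator variables.
    return n_rows * p_complete


def simulated_complete_rows(n_rows, n_cols, n_missing, n_reps=20000, seed=0):
    """Monte Carlo estimate of the same expectation."""
    rng = random.Random(seed)
    total = n_rows * n_cols
    running_sum = 0
    for _ in range(n_reps):
        # Draw the positions of the missing cells as flat indices.
        missing_cells = rng.sample(range(total), n_missing)
        # Rows that contain at least one missing cell.
        rows_hit = {cell // n_cols for cell in missing_cells}
        running_sum += n_rows - len(rows_hit)
    return running_sum / n_reps


# Hypothetical dimensions: 10 rows, 4 columns, 6 missing cells.
exact = expected_complete_rows(10, 4, 6)
approx = simulated_complete_rows(10, 4, 6)
print(f"closed form: {exact:.4f}, simulation: {approx:.4f}")
```

The two values should agree up to Monte Carlo error. When no closed form is available, as for the seventh statistic mentioned above, simulated values of this kind are what a regression model would be fitted to.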

Keywords: longitudinal; missing data; multilevel; multinomial; power.