Statistical issues in longitudinal data analysis for treatment efficacy studies in the biomedical sciences

Mol Ther. 2010 Sep;18(9):1724-30. doi: 10.1038/mt.2010.127. Epub 2010 Jun 29.


Longitudinally collected outcomes are increasingly common in cell biology and gene therapy research. In this article, we review the current practice of statistical analysis of longitudinal data in these fields and recommend the "best performing" statistical method among those available in most statistical packages. A survey of papers published in Molecular Therapy indicates that longitudinal data are properly analyzed in only a small fraction of articles; the most popular approach was to analyze the data at each measurement time point separately using an analysis of variance (ANOVA) model with Tukey's post hoc tests. We show, first, that such a cross-sectional ANOVA approach does not use all the power that the longitudinal design of a study provides and, second, using a simulation study, that Tukey's post hoc tests applied at each measurement time separately could result in a false-positive rate as high as 30%. We recommend mixed effects model analysis instead. We also discuss the complexities of multiple comparison adjustment in post hoc testing that result from the within-experimental-unit correlation present in longitudinal data. We recommend resampling as a method whose multiple comparison adjustment can readily be limited to only the comparisons of interest, thereby avoiding an undue sacrifice of power.
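The inflation of false positives from testing each time point separately can be illustrated with a small simulation. The sketch below is not the simulation reported in the article: it is a simplified stand-in that assumes two groups, a known unit variance (so a z-test replaces ANOVA with Tukey's tests), and a subject-level random intercept that induces a within-subject correlation `rho`. The function names, the group size, the number of time points, and the value of `rho` are all illustrative assumptions.

```python
import random
from statistics import NormalDist, mean


def independent_bound(n_times, alpha=0.05):
    """Chance of at least one false positive across n_times independent tests."""
    return 1 - (1 - alpha) ** n_times


def familywise_rate(n_times, n_per_group=10, rho=0.5, alpha=0.05,
                    n_sims=2000, seed=0):
    """Simulate null data (no group difference) and return the fraction of
    simulated studies in which at least one time point tests 'significant'.

    Assumption: total variance is 1 and known, split into a subject random
    intercept (variance rho) and independent noise (variance 1 - rho).
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = (2 / n_per_group) ** 0.5  # SE of a difference of two group means
    hits = 0
    for _ in range(n_sims):
        groups = []
        for _g in range(2):
            subjects = []
            for _s in range(n_per_group):
                intercept = rng.gauss(0, rho ** 0.5)
                subjects.append([intercept + rng.gauss(0, (1 - rho) ** 0.5)
                                 for _t in range(n_times)])
            groups.append(subjects)
        # Test each time point separately, with no multiplicity adjustment.
        for t in range(n_times):
            diff = (mean(s[t] for s in groups[0])
                    - mean(s[t] for s in groups[1]))
            if abs(diff / se) > z_crit:
                hits += 1
                break  # one false positive is enough for this study
    return hits / n_sims


if __name__ == "__main__":
    print("1 time point :", familywise_rate(1))
    print("7 time points:", familywise_rate(7))
    print("independence bound, 7 tests:", round(independent_bound(7), 3))
```

With a single time point the simulated rate should sit near the nominal 5%, and it grows as more time points are tested separately. Under full independence, seven tests at the 5% level give 1 - 0.95^7, roughly 0.30; positive within-subject correlation lowers that figure, which is one reason the appropriate adjustment is less straightforward for longitudinal data than for independent tests.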

Publication types

  • Review

MeSH terms

  • Analysis of Variance*
  • Biomedical Research / methods*
  • Models, Statistical