A Nonparametric, Multiple Imputation-Based Method for the Retrospective Integration of Data Sets

Multivariate Behav Res. 2015;50(4):383-97. doi: 10.1080/00273171.2015.1022641.


Complex research questions often cannot be addressed adequately with a single data set. One sensible alternative to the high cost and effort associated with the creation of large new data sets is to combine existing data sets containing variables related to the constructs of interest. The goal of the present research was to develop a flexible, broadly applicable approach to the integration of disparate data sets that is based on nonparametric multiple imputation and the collection of data from a convenient, de novo calibration sample. We demonstrate proof of concept for the approach by integrating three existing data sets containing items related to the extent of problematic alcohol use and associations with deviant peers. We discuss both necessary conditions for the approach to work well and potential strengths and weaknesses of the method compared to other data set integration approaches.
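The general idea of linking data sets through a calibration sample and nonparametric multiple imputation can be sketched in a few lines. The abstract does not specify the authors' procedure, so the following is only a minimal illustration under stated assumptions: one study recorded a single item `x`, another recorded a different item `y`, and a calibration sample answered both. Missing `y` values are filled in by a simple nearest-donor hot-deck draw, and the resulting estimates are combined with Rubin's rules. All function names, variable names, and data values are hypothetical.

```python
import random
import statistics


def hot_deck_impute(observed_x, calibration, rng):
    """Impute y for each observed x by drawing from calibration donors.

    calibration is a list of (x, y) pairs from respondents who answered
    both items. Donors are the calibration cases whose x is closest to
    the observed x (an assumed matching rule, chosen for illustration).
    """
    imputed = []
    for x in observed_x:
        min_dist = min(abs(cx - x) for cx, _ in calibration)
        donors = [cy for cx, cy in calibration if abs(cx - x) == min_dist]
        imputed.append(rng.choice(donors))
    return imputed


def pool_means(completed_sets):
    """Combine per-imputation means with Rubin's rules."""
    m = len(completed_sets)
    estimates = [statistics.mean(s) for s in completed_sets]
    within = [statistics.variance(s) / len(s) for s in completed_sets]
    q_bar = statistics.mean(estimates)           # pooled point estimate
    u_bar = statistics.mean(within)              # within-imputation variance
    b = statistics.variance(estimates) if m > 1 else 0.0  # between-imputation
    return q_bar, u_bar + (1 + 1 / m) * b


def integrate_mean(study_a_x, study_b_y, calibration, m=20, seed=0):
    """Estimate the mean of y across both studies from m imputed data sets."""
    rng = random.Random(seed)
    completed = []
    for _ in range(m):
        y_a = hot_deck_impute(study_a_x, calibration, rng)
        completed.append(y_a + list(study_b_y))
    return pool_means(completed)


# Hypothetical toy data: (x, y) pairs from the calibration sample.
calibration = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 2.0),
               (2.0, 3.0), (2.0, 4.0), (3.0, 4.0)]
study_a_x = [0.0, 1.0, 2.0, 3.0, 1.0]   # study A observed only x
study_b_y = [0.0, 2.0, 4.0, 3.0]        # study B observed only y

pooled_mean, total_variance = integrate_mean(study_a_x, study_b_y, calibration)
print(pooled_mean, total_variance)
```

This sketch treats the calibration sample as fixed. A more careful implementation would also propagate uncertainty about the donor pool itself, for example by resampling the calibration cases before each imputation.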

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adolescent
  • Adult
  • Behavioral Research / methods*
  • Child
  • Humans
  • Psychometrics / methods
  • Reproducibility of Results
  • Retrospective Studies*
  • Statistics, Nonparametric*
  • Young Adult