Optimal multivariate matching before randomization

Robert Greevy; Bo Lu; Jeffrey H Silber; Paul Rosenbaum

doi:10.1093/biostatistics/5.2.263

Biostatistics. 2004 Apr;5(2):263-75. doi: 10.1093/biostatistics/5.2.263.

Authors

Robert Greevy¹, Bo Lu, Jeffrey H Silber, Paul Rosenbaum

Affiliation

¹ Department of Statistics, The Wharton School, University of Pennsylvania, 400 Jon M. Huntsman Hall, 3730 Walnut Street, Philadelphia, PA 19104-6340, USA.

PMID: 15054030
DOI: 10.1093/biostatistics/5.2.263

Abstract

Although blocking or pairing before randomization is a basic principle of experimental design, the principle is almost invariably applied to at most one or two blocking variables. Here, we discuss the use of optimal multivariate matching prior to randomization to improve covariate balance for many variables at the same time, presenting an algorithm and a case-study of its performance. The method is useful when all subjects, or large groups of subjects, are randomized at the same time. Optimal matching divides a single group of 2n subjects into n pairs to minimize covariate differences within pairs (the so-called nonbipartite matching problem); then one subject in each pair is picked at random for treatment, the other being assigned to control. Using the baseline covariate data for 132 patients from an actual, unmatched, randomized experiment, we construct 66 pairs matching for 14 covariates. We then create 10,000 unmatched and 10,000 matched randomized experiments by repeatedly randomizing the 132 patients, and compare the covariate balance with and without matching. By every measure, every one of the 14 covariates was substantially better balanced when randomization was performed within matched pairs. Even after covariance adjustment for chance imbalances in the 14 covariates, matched randomizations provided more accurate estimates than unmatched randomizations, the increase in accuracy being equivalent to, on average, a 7% increase in sample size. In randomization tests of no treatment effect, matched randomizations using the signed rank test had substantially higher power than unmatched randomizations using the rank sum test, even when only 2 of 14 covariates were relevant to a simulated response. Unmatched randomizations experienced rare disasters which were consistently avoided by matched randomizations.
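The procedure described in the abstract (pair 2n subjects to minimize within-pair covariate differences, then randomize within pairs) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and not the authors' implementation: the eight-subject data set is invented, the distance is a squared Euclidean distance on standardized covariates, and the optimal pairing is found by an exhaustive bitmask search that is only practical for very small samples. A study of realistic size would need a dedicated nonbipartite matching algorithm.

```python
import random
from functools import lru_cache

def distance_matrix(rows):
    """Squared Euclidean distance on standardized covariates (an assumed choice)."""
    cols = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        sd = (sum((x - mean) ** 2 for x in col) / (len(col) - 1)) ** 0.5
        cols.append([(x - mean) / sd if sd > 0 else 0.0 for x in col])
    z = list(zip(*cols))
    n = len(z)
    return [[sum((a - b) ** 2 for a, b in zip(z[i], z[j])) for j in range(n)]
            for i in range(n)]

def optimal_pairs(dist):
    """Minimum-total-distance pairing of all subjects, by exhaustive search.

    The bitmask marks subjects that are still unmatched. Cost grows
    exponentially with the number of subjects, so this is a toy solver.
    """
    n = len(dist)
    if n % 2:
        raise ValueError("need an even number of subjects")

    @lru_cache(maxsize=None)
    def best(mask):
        if mask == 0:
            return 0.0, ()
        i = (mask & -mask).bit_length() - 1  # lowest-numbered unmatched subject
        rest = mask & ~(1 << i)
        best_cost, best_set = float("inf"), ()
        for j in range(i + 1, n):
            if (rest >> j) & 1:
                cost, pairs = best(rest & ~(1 << j))
                if cost + dist[i][j] < best_cost:
                    best_cost, best_set = cost + dist[i][j], ((i, j),) + pairs
        return best_cost, best_set

    return best((1 << n) - 1)

def randomize_within_pairs(pairs, rng):
    """Pick one subject per pair at random for treatment; the other is control."""
    treated, control = [], []
    for i, j in pairs:
        if rng.random() < 0.5:
            i, j = j, i
        treated.append(i)
        control.append(j)
    return treated, control

def abs_mean_diff(treated, control, rows, k):
    """Absolute treated-minus-control difference in the mean of covariate k."""
    t = sum(rows[i][k] for i in treated) / len(treated)
    c = sum(rows[i][k] for i in control) / len(control)
    return abs(t - c)

# Hypothetical baseline data: (age, baseline score) for 8 subjects.
covariates = [(10, 1), (50, 5), (11, 1), (51, 5), (30, 3), (31, 3), (70, 9), (71, 9)]
n = len(covariates)
total_distance, pairs = optimal_pairs(distance_matrix(covariates))

# Compare average age imbalance under matched and unmatched randomization.
rng = random.Random(0)
draws = 2000
matched_avg = unmatched_avg = 0.0
for _ in range(draws):
    t, c = randomize_within_pairs(pairs, rng)
    matched_avg += abs_mean_diff(t, c, covariates, 0) / draws
    t = rng.sample(range(n), n // 2)          # unmatched: any 4 of 8 treated
    c = [i for i in range(n) if i not in t]
    unmatched_avg += abs_mean_diff(t, c, covariates, 0) / draws

print("pairs:", pairs)
print("mean |age difference|, matched vs unmatched:", matched_avg, unmatched_avg)
```

In this toy data set the subjects fall into four tight clusters, so the matched design can never produce an age imbalance larger than the within-pair differences, whereas an unmatched draw can place both older subjects in the same arm. This mirrors, on a very small scale, the comparison of repeated matched and unmatched randomizations described above.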

Publication types

Comparative Study
Research Support, U.S. Gov't, Non-P.H.S.
Research Support, U.S. Gov't, P.H.S.

MeSH terms

Adult
Algorithms*
Angiotensin-Converting Enzyme Inhibitors / pharmacology
Anthracyclines / adverse effects
Anthracyclines / therapeutic use
Antineoplastic Agents / adverse effects
Antineoplastic Agents / therapeutic use
Cardiac Output / drug effects
Child
Enalapril / pharmacology
Humans
Matched-Pair Analysis*
Neoplasms / drug therapy
Randomized Controlled Trials as Topic / methods*
Research Design*

Substances

Angiotensin-Converting Enzyme Inhibitors
Anthracyclines
Antineoplastic Agents
Enalapril

Grants and funding

R01 HL-50424/HL/NHLBI NIH HHS/United States