A strategy for assembling samples of adult twin pairs in the United States

Stat Med. 1993 Sep 30;12(18):1693-702. doi: 10.1002/sim.4780121805.

Abstract

In this paper we develop a methodology for identifying large numbers of U.S. adult twin pairs. Data for this study derive from the U.S. Department of Defense and the Vietnam Era Twin (VET) Registry. The Department of Defense identified potential male twins (n = 10,002) using a computerized record linkage algorithm that matched records on the same last name, the same date of birth, and the same first five digits of the Social Security number. Twinship was confirmed by comparison with the VET Registry. We developed a logistic regression model that predicts the probability that a paired record identifies twins from three variables: the absolute difference between the last four digits of the two Social Security numbers, the age at issuance of the Social Security number, and the frequency of occurrence of the last name. We used the estimated coefficients from this regression model to assign each matched record a predicted probability of being a twin pair. There is a close correspondence between the observed and expected numbers of twins when evaluated across deciles of predicted probability; the value of Harrell's c index (c = 0.68 ± 0.0004) indicates the overall predictive accuracy of the regression equation. The results of this study demonstrate the feasibility of identifying adult male-male twin pairs from any large computerized database that contains name, date of birth, and Social Security number. However, the selection criteria used to create the computer database must be clearly specified to avoid constructing a biased sample of twins.
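The matching rule described in the abstract can be illustrated with a minimal Python sketch. It is not the Department of Defense algorithm: the record layout (`last_name`, `dob`, `ssn` as a nine-digit string), the function names, and the sample records are all assumptions made up for illustration. The sketch groups records on the three matching fields and then computes two of the three predictors named above for each candidate pair. Age at issuance of the Social Security number is left out because nothing in the abstract says how it was derived, and no regression coefficients are shown because the abstract does not report them.

```python
from collections import Counter, defaultdict
from itertools import combinations


def blocking_key(record):
    """Matching fields: last name, date of birth, first five SSN digits."""
    return (record["last_name"].upper(), record["dob"], record["ssn"][:5])


def candidate_pairs(records):
    """Return every pair of records that share the same blocking key."""
    blocks = defaultdict(list)
    for record in records:
        blocks[blocking_key(record)].append(record)
    pairs = []
    for block in blocks.values():
        pairs.extend(combinations(block, 2))
    return pairs


def pair_features(pair, name_counts):
    """Compute two of the predictors named in the abstract for one pair."""
    a, b = pair
    return {
        # Absolute difference between the last four SSN digits.
        "ssn_last4_abs_diff": abs(int(a["ssn"][5:]) - int(b["ssn"][5:])),
        # How often the shared last name occurs in the whole file.
        "last_name_frequency": name_counts[a["last_name"].upper()],
    }


# Made-up records for illustration only (hypothetical layout and values).
records = [
    {"last_name": "Smith", "dob": "1948-03-02", "ssn": "123456789"},
    {"last_name": "Smith", "dob": "1948-03-02", "ssn": "123456790"},
    {"last_name": "Smith", "dob": "1950-07-11", "ssn": "123451111"},
    {"last_name": "Jones", "dob": "1948-03-02", "ssn": "123456800"},
]

name_counts = Counter(r["last_name"].upper() for r in records)
pairs = candidate_pairs(records)
features = [pair_features(p, name_counts) for p in pairs]
print(features)
```

In this toy file only the first two records share all three matching fields, so they form the single candidate pair. In practice the feature rows would be passed to a fitted logistic regression to obtain a predicted probability of twinship for each pair.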

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Adult
  • Data Interpretation, Statistical
  • Humans
  • Information Systems
  • Male
  • Medical Record Linkage*
  • Probability
  • Registries / statistics & numerical data*
  • Sampling Studies*
  • Twins / statistics & numerical data*
  • United States