There is growing interest in the secondary use of health care data to evaluate medication safety in pregnancy. Tree-based scan statistics (TBSS) offer an innovative approach to help identify potential safety signals; they use hierarchically organized outcomes, generally based on existing clinical coding systems that group outcomes by organ system. When assessing teratogenicity, such groupings often lack a sound embryologic basis, given the etiologic heterogeneity of congenital malformations. The study objective was to enhance the grouping of congenital malformations used in scanning approaches by implementing hierarchical clustering analysis (HCA) and to pilot test an HCA-enhanced TBSS approach for medication safety surveillance in pregnancy in 2 test cases using >4.2 million mother-child dyads from 2 nationwide US databases. Hierarchical clustering analysis identified (1) malformation combinations belonging to the same organ system and already grouped in existing classifications, (2) known combinations across different organ systems not previously grouped, (3) unknown combinations not previously grouped, and (4) malformations seemingly standing on their own. Testing the approach with valproate and topiramate identified expected signals and a signal for an HCA cluster missed by traditional classification. Augmenting existing classifications with clusters identified through exploration of large data sets may be promising when defining phenotypes for surveillance and causal inference studies.
Keywords: clustering; malformations; pregnancy; surveillance; teratogen; topiramate; valproate.
© The Author(s) 2024. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.