Applying data science methodologies with artificial intelligence variant reinterpretation to map and estimate genetic disorder prevalence utilizing clinical data

Am J Med Genet A. 2024 May;194(5):e63505. doi: 10.1002/ajmg.a.63505. Epub 2024 Jan 2.

Abstract

Data science methodologies can be used to ascertain and analyze clinical genetic data, which are often unstructured and rarely used outside of patient encounters. Genetic variants from all genetic test results returned to a large pediatric healthcare system over a 5-year period were obtained and reinterpreted using the previously validated Franklin© Artificial Intelligence (AI). Using PowerBI©, the variants were then matched to patients in the electronic health record, linked to demographic data to generate a variant data table, and mapped by ZIP code. Three thousand and sixty-five variants were identified, and 98% were matched to patients with geographic data. Franklin© changed the interpretation for 24% of variants. One hundred and fifty-six clinically actionable variant reinterpretations were made. A total of 739 Mendelian genetic disorders were identified, and their prevalence was estimated. Mapping of variants demonstrated hot-spots for pathogenic genetic variation, such as PEX6-associated Zellweger Spectrum Disorder. Seven patients with Bardet-Biedl syndrome and seven patients with Rett syndrome were identified as amenable to newly FDA-approved therapeutics. Using readily available software, we developed a database and Exploratory Data Analysis (EDA) methodology that enabled us to systematically reinterpret variants, estimate variant prevalence, identify conditions amenable to new treatments, and localize geographies enriched for pathogenic variants.
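
The matching and mapping steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, which used PowerBI© and Franklin©. The field names (`patient_id`, `lab_call`, `ai_call`, `zip`), the placeholder ZIP codes, and every record are assumptions made up for illustration and are not data from the study. The sketch joins variants to patients, computes the share of variants that matched and the share whose interpretation changed, and counts pathogenic variants per ZIP code and gene.

```python
from collections import Counter

# Toy, made-up records for illustration only; not data from the study.
variants = [
    {"patient_id": "P1", "gene": "PEX6", "lab_call": "VUS", "ai_call": "Likely pathogenic"},
    {"patient_id": "P2", "gene": "PEX6", "lab_call": "Pathogenic", "ai_call": "Pathogenic"},
    {"patient_id": "P3", "gene": "MECP2", "lab_call": "Pathogenic", "ai_call": "Pathogenic"},
    {"patient_id": "P9", "gene": "BBS1", "lab_call": "VUS", "ai_call": "VUS"},  # no patient match
]
# Hypothetical patient lookup keyed by patient ID; ZIP codes are placeholders.
patients = {
    "P1": {"zip": "00001"},
    "P2": {"zip": "00001"},
    "P3": {"zip": "00002"},
}

PATHOGENIC = {"Pathogenic", "Likely pathogenic"}


def build_variant_table(variants, patients):
    """Join each variant to its patient's ZIP code; unmatched variants are dropped."""
    table = []
    for v in variants:
        p = patients.get(v["patient_id"])
        if p is not None:
            table.append({**v, "zip": p["zip"]})
    return table


def summarize(variants, patients):
    """Return (match rate, reinterpretation rate, pathogenic counts per ZIP and gene)."""
    table = build_variant_table(variants, patients)
    match_rate = len(table) / len(variants)
    # A variant counts as reinterpreted when the two classifications differ.
    changed = sum(1 for v in variants if v["lab_call"] != v["ai_call"])
    reinterpreted_rate = changed / len(variants)
    by_zip = Counter(
        (row["zip"], row["gene"]) for row in table if row["ai_call"] in PATHOGENIC
    )
    return match_rate, reinterpreted_rate, by_zip


match_rate, reinterpreted_rate, by_zip = summarize(variants, patients)
print(f"matched: {match_rate:.0%}, reinterpreted: {reinterpreted_rate:.0%}")
for (zip_code, gene), n in by_zip.most_common():
    print(zip_code, gene, n)
```

In this sketch, a ZIP code with a high count for a single gene would be a candidate hot-spot. A real analysis would also need a population denominator for each ZIP code, which the sketch leaves out.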

Keywords: data science; demography; laboratory testing; mapping; population genetics; underserved populations.

MeSH terms

  • ATPases Associated with Diverse Cellular Activities
  • Artificial Intelligence*
  • Child
  • Data Science*
  • Genetic Testing / methods
  • Humans
  • Prevalence

Substances

  • PEX6 protein, human
  • ATPases Associated with Diverse Cellular Activities