Background: The search for genetic factors underlying autism spectrum disorders (ASD) has led to the identification of hundreds of genes containing thousands of variants that differ in mode of inheritance, effect size, frequency, and function. A major challenge is to assess, in an unbiased and systematic manner, the collective evidence for the functional relevance of these variants.
Methods: Here, we describe a scoring algorithm for prioritization of candidate genes based on the cumulative strength of evidence for each ASD-associated variant cataloged in AutDB (also known as SFARI Gene). We retrieved data from 889 publications to generate a dataset of 2187 rare and 711 common variants distributed across 461 genes implicated in ASD. Each individual variant was manually annotated with multiple attributes extracted from the original report, followed by score assignment using a set of standardized parameters yielding a single score for each gene.
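As an illustration of the general idea (per-variant scores accumulated into a single score per gene), the following is a minimal Python sketch. It is not the published algorithm: the attribute names (`kind`, `de_novo`, `loss_of_function`), the weights, and the placeholder gene labels are all hypothetical and chosen only to show the aggregation step.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One annotated variant. Field names are hypothetical, for illustration."""
    gene: str
    kind: str                      # "rare" or "common"
    de_novo: bool = False
    loss_of_function: bool = False


# Hypothetical weights; the actual standardized parameters are not given here.
BASE_WEIGHT = {"rare": 1.0, "common": 0.5}
DE_NOVO_BONUS = 1.0
LOF_BONUS = 1.0


def score_variant(v: Variant) -> float:
    """Assign a score to a single variant from its annotated attributes."""
    score = BASE_WEIGHT[v.kind]
    if v.de_novo:
        score += DE_NOVO_BONUS
    if v.loss_of_function:
        score += LOF_BONUS
    return score


def score_genes(variants) -> dict:
    """Sum the variant scores to obtain one cumulative score per gene."""
    totals = defaultdict(float)
    for v in variants:
        totals[v.gene] += score_variant(v)
    return dict(totals)


def rank_genes(variants) -> list:
    """Return (gene, score) pairs, highest score first; ties broken by name."""
    scores = score_genes(variants)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy input with placeholder gene labels (not real data).
toy_variants = [
    Variant("GENE_A", "rare", de_novo=True, loss_of_function=True),
    Variant("GENE_A", "common"),
    Variant("GENE_B", "rare"),
]
ranking = rank_genes(toy_variants)
```

Summation is used here only because the text describes a cumulative strength of evidence; a real scheme could cap, normalize, or weight contributions differently.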
Results: There was a wide variation in scores; SHANK3, CHD8, and ADNP had distinctly higher scores than all other genes in the dataset. Our gene scores were significantly correlated with other recently published rankings of ASD genes (Spearman's R = 0.40-0.63; p < 0.0001), providing support for our scoring algorithm.
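For readers unfamiliar with the statistic, a Spearman rank correlation between two sets of gene scores can be computed as the Pearson correlation of their ranks. The sketch below is a generic, dependency-free Python implementation with average ranks for ties; the function names are illustrative, and it does not reproduce the comparison datasets or the p-value calculation.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1   # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

In practice, `spearman` would be called on the two score lists restricted to the genes present in both rankings, in the same gene order.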
Conclusions: This freely available resource is the first to aggregate, on one platform, both common and rare variants identified in various study types (simplex, multiplex, multigenerational, and consanguineous families), and it incorporates their putative functional consequences to arrive at a genetically and biologically driven ranking scheme. This work represents a major step in moving from simply cataloging autism variants to using data-driven approaches to gain insight into their significance.
Keywords: Autistic disorder; Autosomal recessive; Common variants; Genetic variation; Rare variants.