Estimating the effects of copy-number variants on intelligence using hierarchical Bayesian models

Genet Epidemiol. 2020 Nov;44(8):825-840. doi: 10.1002/gepi.22344. Epub 2020 Aug 11.


It is challenging to estimate the phenotypic impact of the structural genome changes known as copy-number variations (CNVs), since many unique CNVs are nonrecurrent and most are too rare to be studied individually. In recent work, we found that genomic annotations aggregated across CNVs, specifically intolerance to mutation as measured by the pLI score (probability of being loss-of-function intolerant), can be strong predictors of intelligence quotient (IQ) loss. However, this aggregation method estimates individual CNV effects only indirectly. Here, we propose the use of hierarchical Bayesian models to directly estimate the individual effects of rare CNVs on measures of intelligence. Annotation information on the impact of major mutations in genomic regions is extracted from genomic databases and used to define prior information for the approach, which we call HBIQ. We applied HBIQ to the analysis of CNV deletions and duplications from three datasets and identified several genomic regions containing CNVs with significant deleterious effects on IQ, some of which validate previously known associations. We also show that HBIQ identified several CNVs as deleterious even though they have a pLI score of zero, and that the converse also holds. Furthermore, we show that our new model yields higher out-of-sample concordance (78%) than our previous approach for predicting the consequences of carrying known recurrent CNVs.
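The idea of using annotation-derived priors to estimate individual CNV effects can be illustrated with a minimal sketch. This is not the HBIQ model itself, whose exact specification is not given in the abstract. It assumes a simple linear model with Gaussian noise, a Gaussian prior on each region's effect centered at a hypothetical slope `gamma` times that region's pLI score, and fixed values for the noise and prior standard deviations (`sigma`, `tau`); a full hierarchical model would place priors on those quantities rather than fixing them. The function name, the toy data, and all numeric values are illustrative assumptions.

```python
import numpy as np

def posterior_cnv_effects(X, y, prior_mean, tau, sigma):
    """Closed-form posterior for beta in y = X @ beta + noise.

    Assumed model (illustrative, not the published HBIQ specification):
      beta_j ~ Normal(prior_mean[j], tau**2)   annotation-informed prior
      noise  ~ Normal(0, sigma**2)
    X[i, j] is 1 if individual i carries a CNV in region j, else 0.
    Returns the posterior mean vector and covariance matrix of beta.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    prior_mean = np.asarray(prior_mean, dtype=float)
    n_regions = X.shape[1]
    # Posterior precision = data precision + prior precision.
    precision = X.T @ X / sigma**2 + np.eye(n_regions) / tau**2
    cov = np.linalg.inv(precision)
    # Posterior mean is a precision-weighted blend of data and prior.
    mean = cov @ (X.T @ y / sigma**2 + prior_mean / tau**2)
    return mean, cov

# Toy data: 4 individuals, 2 regions. Region 0 has two carriers,
# region 1 has none (too rare to be observed in this sample).
X = [[1, 0], [1, 0], [0, 0], [0, 0]]
y = [-10.0, -6.0, 0.0, 0.0]      # hypothetical IQ deviations
pli = np.array([0.0, 1.0])       # hypothetical pLI scores per region
gamma = -3.0                     # hypothetical IQ change per unit pLI
mean, cov = posterior_cnv_effects(X, y, gamma * pli, tau=1.0, sigma=1.0)
print(mean)
```

In this toy setting, the estimate for region 0 is pulled from the raw carrier average toward its prior mean, while region 1, which has no carriers, simply returns its annotation-based prior mean. That is the sense in which a region with a pLI score of zero can still receive a clearly deleterious estimate when the data support one.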

Keywords: copy-number variation; hierarchical Bayesian model; informative prior distributions; intelligence; rare variant analysis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Bayes Theorem
  • Child
  • Chromosomes, Human, Pair 16 / genetics
  • Chromosomes, Human, Pair 22 / genetics
  • Cohort Studies
  • DNA Copy Number Variations / genetics*
  • Genome
  • Humans
  • Intelligence / genetics*
  • Intelligence Tests
  • Linear Models
  • Models, Genetic*
  • Principal Component Analysis
  • Sample Size