Integration of genetic, transcriptomic, and clinical data provides insight into 16p11.2 and 22q11.2 CNV genes

Genome Med. 2021 Oct 29;13(1):172. doi: 10.1186/s13073-021-00972-1.


Background: Deletions and duplications of the multigenic 16p11.2 and 22q11.2 copy number variant (CNV) regions are associated with brain-related disorders including schizophrenia, intellectual disability, obesity, bipolar disorder, and autism spectrum disorder (ASD). The contribution of individual CNV genes to each of these phenotypes is unknown, as is their contribution to other, potentially subtler, health effects in carriers. Hypothesizing that DNA copy number exerts most of its effects through RNA expression, we attempted a novel in silico fine-mapping approach in individuals who do not carry the CNVs, using both GWAS and biobank data.

Methods: We first asked whether the expression level of any individual gene in the CNV region alters risk for known CNV-associated behavioral phenotypes. Using transcriptomic imputation, we performed association testing for CNV genes within large genotyped cohorts for schizophrenia, IQ, BMI, bipolar disorder, and ASD. Second, we used a biobank containing electronic health data to compare the medical phenome of CNV carriers with that of controls among 700,000 individuals, in order to investigate the full spectrum of health effects of the CNVs. Third, we used genotypes for over 48,000 individuals within the biobank to perform phenome-wide association studies between imputed expression of individual 16p11.2 and 22q11.2 genes and over 1500 health traits.
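
The first step above, testing imputed expression against a trait, can be illustrated with a minimal sketch. It is not the authors' pipeline: the SNP weights, the simulated genotypes and trait, and the helper names impute_expression and simple_linear_regression are all assumptions made for illustration. The idea is that a gene's expression is predicted as a weighted sum of allele dosages, and the trait is then regressed on that predicted value.

```python
import math
import random


def impute_expression(genotypes, weights):
    """Predicted expression: weighted sum of allele dosages for each individual."""
    return [sum(g * w for g, w in zip(row, weights)) for row in genotypes]


def simple_linear_regression(x, y):
    """Ordinary least squares of y on x; returns (slope, standard error, t statistic)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se, slope / se


# Everything below is simulated; none of it comes from the study.
rng = random.Random(0)
n_individuals, n_snps = 500, 5

# Hypothetical per-SNP weights, standing in for a trained expression model.
weights = [0.4, -0.2, 0.1, 0.3, -0.5]

# Allele dosages (0, 1, or 2 copies of the effect allele) per individual.
genotypes = [
    [rng.choice([0, 1, 2]) for _ in range(n_snps)]
    for _ in range(n_individuals)
]

expression = impute_expression(genotypes, weights)

# Simulated quantitative trait with an assumed effect of predicted expression.
true_effect = 0.5
trait = [true_effect * e + rng.gauss(0, 1) for e in expression]

slope, se, t_stat = simple_linear_regression(expression, trait)
print(f"slope={slope:.3f} se={se:.3f} t={t_stat:.2f}")
```

In a real analysis the weights would come from a reference panel with both genotype and expression data, and the regression would typically include covariates such as ancestry components; both are omitted here for brevity.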

Results: Using large genotyped cohorts, we found individual genes within 16p11.2 associated with schizophrenia (TMEM219, INO80E, YPEL3), BMI (TMEM219, SPN, TAOK2, INO80E), and IQ (SPN). Conditional analysis identified upregulation of INO80E as the driver of the schizophrenia association, and downregulation of SPN and INO80E as increasing BMI. We identified both novel and previously observed over-represented traits within the electronic health records of 16p11.2 and 22q11.2 CNV carriers. In the phenome-wide association study, we found seventeen significant gene-trait pairs, including psychosis (NPIPB11, SLX1B) and mood disorders (SCARF2), and overall enrichment of mental traits.
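
The logic of a conditional analysis can be sketched as follows. This is an illustration under stated assumptions, not the study's method or data: gene_a and gene_b are hypothetical genes with correlated predicted expression, only gene_a affects the simulated trait, and the helper names are invented here. Tested alone, gene_b looks associated because it tracks gene_a; once both genes are fitted jointly, its coefficient shrinks toward zero while gene_a keeps its signal.

```python
import random


def center(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def marginal_slope(x, y):
    """Slope from regressing y on a single predictor x."""
    xc, yc = center(x), center(y)
    return dot(xc, yc) / dot(xc, xc)


def joint_slopes(x1, x2, y):
    """Slopes from regressing y on x1 and x2 together (two-predictor normal equations)."""
    a, b, yc = center(x1), center(x2), center(y)
    s11, s22, s12 = dot(a, a), dot(b, b), dot(a, b)
    s1y, s2y = dot(a, yc), dot(b, yc)
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Simulated data; gene_a and gene_b are hypothetical, not genes from the study.
rng = random.Random(1)
n = 2000
gene_a = [rng.gauss(0, 1) for _ in range(n)]
# gene_b is correlated with gene_a but has no direct effect on the trait.
gene_b = [0.8 * a + 0.6 * rng.gauss(0, 1) for a in gene_a]
trait = [0.5 * a + rng.gauss(0, 1) for a in gene_a]

marginal_b = marginal_slope(gene_b, trait)
joint_a, joint_b = joint_slopes(gene_a, gene_b, trait)
print(f"gene_b alone: {marginal_b:.3f}")
print(f"joint fit: gene_a={joint_a:.3f}, gene_b={joint_b:.3f}")
```

Real conditional analyses also report standard errors and work with more than two genes at once; the two-gene closed form is used here only to keep the sketch short.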

Conclusions: Our results demonstrate how integration of genetic and clinical data aids in understanding CNV gene function and implicates pleiotropy and multigenicity in CNV biology.

Keywords: Copy number variants; Electronic health records; Phenome-wide association studies; Psychiatric traits; Transcriptome imputation.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Autism Spectrum Disorder / genetics
  • Autistic Disorder / genetics*
  • Chromosome Deletion*
  • Chromosome Disorders*
  • Chromosomes, Human, Pair 16 / genetics*
  • DNA Copy Number Variations*
  • DiGeorge Syndrome / genetics*
  • Genotype
  • Humans
  • Intellectual Disability / genetics
  • Phenotype
  • Psychotic Disorders / genetics
  • Scavenger Receptors, Class F / genetics
  • Schizophrenia / genetics
  • Transcriptome*
  • Tumor Suppressor Proteins / genetics


Substances

  • SCARF2 protein, human
  • Scavenger Receptors, Class F
  • Tumor Suppressor Proteins

Supplementary concepts

  • 16p11.2 Deletion Syndrome