Autism spectrum disorder (ASD) is a neurodevelopmental condition with substantial phenotypic and etiological heterogeneity. Although 10%-20% of ASD cases are attributable to copy number variation (CNV), the causative genomic loci and their constituent genes remain unclear. We developed SNATCNV, a tool that outperforms existing tools, and used it to identify 47 recurrent ASD CNV regions from 19,663 cases and 6,479 controls documented in the AutDB database. Analysis of ASD CNV gene content using FANTOM5 shows that the constituent coding genes and long non-coding RNAs (lncRNAs) have brain-enriched expression patterns. Notably, such enrichment is not observed for regions identified using other tools. We also find evidence of sexual dimorphism, one locus uniquely comprising a single lncRNA gene, and correlation of CNVs with distinct clinical and behavioral traits. Finally, we analyze a large schizophrenia dataset to further demonstrate that SNATCNV is an effective, publicly available tool for defining genomic loci and causative genes in multiple CNV-associated conditions.
Keywords: autism spectrum disorder; copy number variation; gene prioritization; lncRNA prioritization; long non-coding RNA; resource for brain enrichment; tissue-specific expression.
Copyright © 2020 The Author(s). Published by Elsevier Inc. All rights reserved.