SOX7: Novel Autistic Gene Identified by Analysis of Multi-Omics Data

bioRxiv [Preprint]. 2023 May 28:2023.05.26.542456. doi: 10.1101/2023.05.26.542456.

Abstract

Background: Genome-wide association studies and next-generation sequencing analyses of DNA have identified thousands of mutations associated with autism spectrum disorder (ASD). However, more than 99% of the identified mutations are non-coding, so it is unclear which of them are functional and therefore potentially causal. Transcriptomic profiling using total RNA-sequencing has been one of the most widely used approaches to link protein levels to genetic information at the molecular level. The transcriptome captures molecular genomic complexity that the DNA sequence alone does not. Some mutations alter a gene's DNA sequence but do not necessarily change its expression and/or protein function. To date, few common variants have been reliably associated with ASD diagnosis status, despite consistently high estimates of heritability. In addition, there are no reliable biomarkers for diagnosing ASD, and no established molecular mechanisms that define its severity.

Objectives: To integrate DNA and RNA testing in order to identify true causal genes and propose useful biomarkers for ASD.

Methods: We performed gene-based association studies with an adaptive test using genome-wide association study (GWAS) summary statistics from two large GWAS datasets (ASD 2019 data: 18,382 ASD cases and 27,969 controls [discovery data]; ASD 2017 data: 6,197 ASD cases and 7,377 controls [replication data]) obtained from the Psychiatric Genomics Consortium (PGC). In addition, we tested the genes identified in the gene-based GWAS for differential expression in an RNA-seq dataset (GSE30573: 3 cases and 3 controls) using the DESeq2 package.
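
The adjusted p-values reported for the RNA-seq analysis come from a multiple-testing correction. As a minimal sketch of how such a correction works, the code below implements the Benjamini-Hochberg procedure, which is the default adjustment in DESeq2. Whether the authors used that default is an assumption, since the abstract does not name the method. The function name `bh_adjust` and the input p-values are hypothetical and are not the study's data.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest so that the
    # adjusted values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes (illustration only).
example = [0.01, 0.04, 0.03, 0.005]
print(bh_adjust(example))
```

Because each adjusted value depends on how many genes were tested and on the rank of each p-value, the adjusted values in the Results cannot be recomputed from the few raw p-values quoted in the abstract.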

Results: We identified 5 genes significantly associated with ASD in the ASD 2019 data (KIZ-AS1, p=8.67×10⁻¹⁰; KIZ, p=1.16×10⁻⁹; XRN2, p=7.73×10⁻⁹; SOX7, p=2.22×10⁻⁷; PINX1-DT, also known as LOC101929229, p=2.14×10⁻⁶). Among these 5 genes, SOX7 (p=0.00087), LOC101929229 (p=0.009), and KIZ-AS1 (p=0.059) were replicated in the ASD 2017 data. KIZ (p=0.06) was close to the boundary of replication in the ASD 2017 data. SOX7 (p=0.0017, adjusted p=0.0085), LOC101929229 (p=5.83×10⁻⁷, adjusted p=1.18×10⁻⁵), and KIZ (p=0.00099, adjusted p=0.0055) showed significant expression differences between cases and controls in the RNA-seq data. SOX7 encodes a member of the SOX (SRY-related HMG-box) family of transcription factors, which play a pivotal role in determining cell fate and identity in many lineages. The encoded protein may act as a transcriptional regulator after forming a complex with other proteins, leading to autism.

Conclusion: SOX7, a member of the SOX transcription factor family, could be associated with ASD. This finding may inform new diagnostic and therapeutic strategies for ASD.

Publication types

  • Preprint