Proteome-Wide Assessment of Clustering of Missense Variants in Neurodevelopmental Disorders Versus Cancer

medRxiv [Preprint]. 2024 Feb 4:2024.02.02.24302238. doi: 10.1101/2024.02.02.24302238.


Missense de novo variants (DNVs) and missense somatic variants contribute to neurodevelopmental disorders (NDDs) and cancer, respectively. Proteins with statistical enrichment based on analyses of these variants exhibit convergence in the differing NDD and cancer phenotypes. Herein, the question of why some of the same proteins are identified in both phenotypes is examined through investigation of clustering of missense variation at the protein level. Our hypothesis is that missense variation is present in different protein locations in the two phenotypes leading to the distinct phenotypic outcomes. We tested this hypothesis in 1D protein space using our software CLUMP. Furthermore, we newly developed 3D-CLUMP that uses 3D protein structures to spatially test clustering of missense variation for proteome-wide significance. We examined missense DNVs in 39,883 parent-child sequenced trios with NDDs and missense somatic variants from 10,543 sequenced tumors covering five TCGA cancer types and two COSMIC pan-cancer aggregates of tissue types. There were 57 proteins with proteome-wide significant missense variation clustering in NDDs when compared to cancers and 79 proteins with proteome-wide significant missense clustering in cancers compared to NDDs. While our main objective was to identify differences in patterns of missense variation, we also identified a novel NDD protein BLTP2. Overall, our study is innovative, provides new insights into differential missense variation in NDDs and cancer at the protein-level, and contributes necessary information toward building a framework for thinking about prognostic and therapeutic aspects of these proteins.

Publication types

  • Preprint