Genomics 2 Proteins portal: A resource and discovery tool for linking genetic screening outputs to protein sequences and structures

bioRxiv [Preprint]. 2024 Jan 2:2024.01.02.573913. doi: 10.1101/2024.01.02.573913.

Abstract

Recent advances in AI-based methods have revolutionized the field of structural biology. Concomitantly, high-throughput sequencing and functional genomics technologies have enabled the detection and generation of variants at an unprecedented scale. However, efficient tools and resources are needed to link these two disparate data types - to "map" variants onto protein structures, to better understand how the variation causes disease and thereby design therapeutics. Here we present the Genomics 2 Proteins Portal (G2P; g2p.broadinstitute.org/): a human proteome-wide resource that maps 19,996,443 genetic variants onto 42,413 protein sequences and 77,923 structures, with a comprehensive set of structural and functional features. Additionally, the G2P portal generalizes the capability of linking genomics to proteins beyond databases by allowing users to interactively upload protein residue-wise annotations (variants, scores, etc.) as well as the protein structure to establish the connection. The portal serves as an easy-to-use discovery tool for researchers and scientists to hypothesize the structure-function relationship between natural or synthetic variations and their molecular phenotype.

Publication types

  • Preprint