A large portion of human proteins are referred to as missing proteins, defined as protein-coding genes that lack experimental evidence at the protein level. This can be due to temporally restricted expression or expression in tissues that are difficult to sample, or because the genes do not actually encode functional proteins. In the present investigation, an integrated omics approach was used to identify and explore missing proteins. Transcriptomics data from three different sources, namely the Human Protein Atlas (HPA), the GTEx consortium, and the FANTOM5 consortium, were used as a starting point to identify genes selectively expressed in specialized tissues. Complementing this analysis with immunohistochemistry-based profiling of more specific tissues allowed further exploration of cell-type-specific expression patterns. More detailed tissue profiling was performed for >300 genes on complementary tissues. The analysis identified tissue-specific expression of nine proteins previously listed as missing proteins (POU4F1, FRMD1, ARHGEF33, GABRG1, KRTAP2-1, BHLHE22, SPRR4, AVPR1B, and DCLK3), as well as numerous proteins with evidence of existence at the protein level that previously lacked information on spatial resolution and cell-type-specific expression pattern. We here present a comprehensive strategy for identifying missing proteins by combining transcriptomics with antibody-based proteomics. The analyzed proteins provide interesting targets for organ-specific research in health and disease.
Keywords: antibodies; immunohistochemistry; missing proteins; protein localization; proteomics; tissue profiling; transcriptomics.