Finding protein evidence (PE) for protein-coding genes is a primary task of Phase I of the Chromosome-Centric Human Proteome Project (C-HPP). Currently, neXtProt lists 2948 coding genes at PE levels 2-4, which are deemed missing proteins in the human proteome. Because most samples prepared and analyzed within the C-HPP framework have focused on detergent-soluble proteins, we posit that cytoplasmic detergent-insoluble proteins (DIPs), a natural cellular component, represent a source for finding missing proteins. We optimized a workflow and separated cytoplasmic DIPs from three human lung and three human hepatoma cell lines by differential-speed centrifugation. We verified that the detergent-soluble proteins (DSPs) could be sufficiently depleted and that cytoplasmic DIP isolation was partially reproducible, with Spearman r > 0.70 in two independent SILAC MS experiments. Using label-free MS, we identified 4524 and 4156 DIPs from lung and liver cells, respectively. Among them, a total of 23 missing proteins (22 PE2 and 1 PE4) were identified by MS, 18 of which had translation evidence; in addition, six PE5 proteins were identified by MS, three of which had translation evidence. We showed that cytoplasmic DIPs were not enriched in transmembrane proteins and were chromosome-, cell type-, and tissue-specific. Furthermore, we demonstrated that DIPs were distinct from DSPs in their structural and physicochemical features. In conclusion, we found 23 missing proteins and 6 PE5 proteins in the cytoplasmic insoluble proteome, which is biologically and physicochemically different from the soluble proteome, suggesting that cytoplasmic DIPs carry comprehensive and valuable information for finding PE of missing proteins. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the data set identifier PXD001694.
Keywords: Chromosome-Centric Human Proteome Project; detergent-insoluble proteins; missing proteins; neXtProt.