To confirm the existence of a missing protein, at least two unique peptides of that protein, each 9-40 amino acids long, must be identified in bottom-up mass-spectrometry-based proteomic experiments. However, a unique peptide of a missing protein, even one identified with high confidence, could coincide with a peptide of a commonly observed protein because of isobaric substitutions, mass modifications, alternative splice isoforms, or single amino acid variants (SAAVs). Identified variant peptides (SAAV-containing peptides) could likewise map to peptides of other proteins for the same reasons. We therefore conducted a thorough comparative analysis of the data sets in the PeptideAtlas Tiered Human Integrated Search Proteome (THISP, 2017-03 release), including neXtProt (2017-01 release), to systematically investigate whether unique peptides of missing proteins (PE2-4), unique peptides of dubious proteins, and variant peptides are affected by isobaric substitutions, which would make their identifications doubtful. In this study, we considered 11 isobaric substitutions. We found that <5% of the unique peptides of missing proteins and >6% of variant peptides became shared with peptides of PE1 proteins after isobaric substitutions.
Keywords: HPP; isobaric substitutions; mass spectrometry; missing proteins; single amino acid variants; variant peptides.
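
The check described in the abstract, whether a unique peptide becomes shared with a PE1 peptide after an isobaric substitution, can be illustrated with a minimal Python sketch. It is not the pipeline used in the study. The rule list contains only a few well-known equal-mass equivalences (I/L, N/GG, Q/AG, Q/GA) chosen for illustration and is not the study's set of 11 substitutions. The function names and the toy peptide sequences are hypothetical, and applying a single substitution per variant is a simplifying assumption.

```python
from typing import Iterable, List, Set

# Illustrative equal-mass rewrites only; NOT the 11 substitutions of the study.
ISOBARIC_RULES = [
    ("I", "L"), ("L", "I"),      # isomeric residues
    ("N", "GG"), ("GG", "N"),    # Asn has the same mass as Gly-Gly
    ("Q", "AG"), ("AG", "Q"),    # Gln has the same mass as Ala-Gly
    ("Q", "GA"), ("GA", "Q"),    # ... and as Gly-Ala
]


def single_substitution_variants(peptide: str) -> Set[str]:
    """Return all sequences reachable by applying one rule at one position."""
    variants: Set[str] = set()
    for src, dst in ISOBARIC_RULES:
        start = peptide.find(src)
        while start != -1:
            variants.add(peptide[:start] + dst + peptide[start + len(src):])
            start = peptide.find(src, start + 1)  # also catches overlapping hits
    variants.discard(peptide)
    return variants


def shared_after_substitution(peptide: str, pe1_peptides: Iterable[str]) -> List[str]:
    """List PE1 peptides that coincide with an isobaric variant of `peptide`."""
    return sorted(single_substitution_variants(peptide) & set(pe1_peptides))


if __name__ == "__main__":
    # Toy sequences (hypothetical), not taken from neXtProt or PeptideAtlas.
    pe1 = {"AGNIK", "PEPTIDEK"}
    print(shared_after_substitution("AGNLK", pe1))
```

A non-empty result means the candidate peptide is indistinguishable by mass from a peptide of a commonly observed protein under the listed rules, so its identification would need further scrutiny.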