Sequence variant analysis (SVA) is critical in therapeutic protein development because it confirms the absence of genetic mutations in a production clone and of high-level misincorporations during cell culture. While software for searching for sequence variants in mass spectrometry data is available, effectively distinguishing true positives from the large number of false positives among the hits reported in error-tolerant search mode remains a challenge. This verification must be done manually and can take several days or even weeks. We report here the use of a Perl-based script that evaluates every identified hit against orthogonal criteria to remove false positives from the search results of PepFinder™ (also known as MassAnalyzer). Our data show that the false positives in the PepFinder™ output were reduced ∼4-fold without loss of accuracy in the detection of true identifications, representing a more than 70% reduction in time compared with the manual data verification process.
Keywords: Algorithm; automation; false positive removal; sequence variant analysis.