Background: Computational methods have been widely used for the prediction of protein subcellular localization. However, these predictions are rarely validated experimentally and, as a result, remain questionable. Experimental validation of the predicted localizations is therefore needed to assess their accuracy, so that such methods can be used with confidence to annotate proteins of unknown localization. Previously, we published a method called ngLOC that predicts the localization of proteins targeted to ten different subcellular organelles. In this short report, we assess the accuracy of these predictions by experimental validation.
Findings: We experimentally validated the predicted subcellular localizations of 114 human proteins corresponding to nine different organelles in normal breast and breast cancer cell lines using live cell imaging/confocal microscopy. Target genes were cloned into expression vectors as GFP fusions and cotransfected with an RFP-tagged organelle-specific gene marker into normal breast epithelial and breast cancer cell lines. The subcellular localization of each target protein was confirmed by colocalization with the co-expressed organelle-specific protein marker. About 82.5% of the predicted subcellular localizations coincided with the experimentally validated localizations. Agreement was highest for endoplasmic reticulum proteins and lowest for the cytoplasmic location. When the cytoplasmic location was excluded, the average prediction accuracy increased to 90.4%. In addition, no difference in protein subcellular localization was observed between normal and cancer breast cell lines.
Conclusions: The experimentally validated accuracy of the ngLOC method, with (82.5%) or without (90.4%) the cytoplasmic location, is close to the method's prediction accuracy of 89%. These results demonstrate that the ngLOC method can be very useful for large-scale annotation of proteins whose subcellular localization is unknown.
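The concordance figures above are simple ratios of agreeing predictions to validated proteins. The following minimal Python sketch shows how such figures could be computed from paired predicted and observed localizations, overall, per location, and with one location excluded. The function names and the toy records are hypothetical and are not data from this study. The count of 94 agreeing proteins is only an inference from the reported 82.5% of 114 and is not stated in the text.

```python
from collections import defaultdict


def concordance(pairs, exclude=()):
    """Percent of proteins whose predicted location equals the observed one.

    pairs   : iterable of (predicted, observed) location labels
    exclude : predicted locations to leave out of the calculation
    """
    kept = [(pred, obs) for pred, obs in pairs if pred not in exclude]
    if not kept:
        raise ValueError("no proteins left after exclusion")
    agree = sum(1 for pred, obs in kept if pred == obs)
    return 100.0 * agree / len(kept)


def per_location(pairs):
    """Percent agreement for each predicted location separately."""
    counts = defaultdict(lambda: [0, 0])  # location -> [agreeing, total]
    for pred, obs in pairs:
        counts[pred][1] += 1
        if pred == obs:
            counts[pred][0] += 1
    return {loc: 100.0 * a / n for loc, (a, n) in counts.items()}


# Hypothetical toy records (NOT study data): (predicted, observed)
toy = [
    ("ER", "ER"),
    ("ER", "ER"),
    ("nucleus", "nucleus"),
    ("mitochondria", "mitochondria"),
    ("cytoplasm", "cytoplasm"),
    ("cytoplasm", "nucleus"),  # a discordant prediction
]

overall = concordance(toy)
without_cytoplasm = concordance(toy, exclude={"cytoplasm"})
by_location = per_location(toy)

# Consistency check on the reported figure, assuming 94 of 114 agreed
# (94 is inferred from the percentage, not given in the abstract).
inferred_overall = round(100.0 * 94 / 114, 1)

print(f"overall: {overall:.1f}%")
print(f"excluding cytoplasm: {without_cytoplasm:.1f}%")
print(f"inferred 94/114: {inferred_overall}%")
```

Excluding a location removes its proteins from both the numerator and the denominator, which is why dropping a poorly predicted class such as the cytoplasm raises the overall figure.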