Peroxisomes are small subcellular compartments responsible for a range of essential metabolic processes. Efforts to predict peroxisomal protein import are challenged by species variation and by the sparsity of sequence data sets with experimentally confirmed localization. We present a predictor of peroxisomal import based on the presence of the dominant peroxisomal targeting signal one (PTS1), a seemingly well-conserved but highly unspecific motif. The signal appears to rely on subtle dependencies with the preceding residues. We evaluate prediction accuracy against two alternative predictor services, PEROXIP and the PTS1 PREDICTOR. We test the integrity of prediction on a range of prokaryotic and eukaryotic proteomes lacking peroxisomes. Similarly, we test the accuracy on peroxisomal proteins known not to overlap with the training data. The model identified a number of proteins within the RIKEN IPS7 mouse protein dataset as potentially novel peroxisomal proteins. Three of these (Dhrs2, Serhl, and Ehhadh) were confirmed in vitro using immunofluorescent detection of myc-epitope-tagged proteins in transiently transfected BHK-21 cells. The final model has a specificity superior to both alternatives, and an accuracy better than PEROXIP and on par with PTS1 PREDICTOR. Thus, the model we present should prove invaluable for labeling PTS1-targeted proteins with high confidence. We use the predictor to screen several additional eukaryotic genomes to revise previously estimated numbers of peroxisomal proteins. The predictor is available at http://pprowler.itee.uq.edu.au.
(c) 2007 Wiley-Liss, Inc.