Our current knowledge of biology derives mostly from studying model organisms and cell lines, yet only a small fraction of all described species has been studied extensively. Although these model organisms are amenable to genetic manipulation, this narrow focus blinds researchers to the true variability of life. Groundbreaking discoveries are often achieved by analyzing "noncanonical" species; for example, the characterization of Taq polymerase from Thermus aquaticus ultimately led to a revolution in the field of molecular biology. Brazil possesses rich biodiversity, and a considerable fraction of Brazilian groups use current proteomic techniques to explore this natural treasure trove. However, in our opinion, exploring this rich "proteomosphere" requires much more than the widely adopted peptide-spectrum match approach. Here, we provide a critical overview of the available strategies for the analysis of proteomic data from "noncanonical" biological samples (e.g., proteins from unsequenced genomes or genomes with high levels of polymorphism), and demonstrate some limitations of existing approaches for large-scale protein identification and quantitation. An understanding of the premises behind these computational tools is necessary to deal properly with their limitations and to draw accurate conclusions.
© 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.