Calpains are intracellular Ca²⁺-dependent Cys proteases that play important roles in a wide range of biological phenomena via the limited proteolysis of their substrates. Genetic defects in calpain genes cause lethality and/or functional deficits in many organisms, including humans. Despite their biological importance, the mechanisms underlying the action of calpains, particularly their substrate specificities, remain largely unknown. Studies show that certain sequence preferences influence calpain substrate recognition, and some properties of amino acids have been successfully related to substrate specificity and to the calpains' 3D structure. The full spectrum of this substrate specificity, however, has not been clarified using standard sequence analysis algorithms such as the position-specific scoring matrix (PSSM) method. More advanced bioinformatics techniques were used recently to identify the substrate specificities of calpains and to develop a predictor for calpain cleavage sites, demonstrating the potential of combining empirical data acquisition and machine learning. This review discusses the calpains' substrate specificities and introduces the benefits of bioinformatics applications. In conclusion, machine learning has led to the development of useful predictors for calpain cleavage sites, although the accuracy of the predictions still needs improvement. Machine learning has also elucidated properties of calpains' substrate specificities, including a preference for sequences over secondary structures and a difference in substrate specificity between two similar conventional calpains that has never been indicated biochemically.
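
For readers unfamiliar with the position-specific scoring-matrix method mentioned above, the following is a minimal sketch of the general technique, not the procedure used in any study covered by this review. The function names, the uniform background frequency, the pseudocount, and the toy windows are all assumptions made for illustration; the windows are invented and are not real calpain cleavage data.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(windows, pseudocount=1.0):
    """Build a log-odds matrix from equal-length, aligned sequence windows.

    Returns one dict per position, mapping each residue to a log2 score.
    Assumption: a uniform background frequency of 1/20 per residue.
    """
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        # Start every count at the pseudocount so unseen residues stay finite.
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for w in windows:
            counts[w[pos]] += 1
        total = sum(counts.values())
        pssm.append(
            {aa: math.log2((counts[aa] / total) / background) for aa in AMINO_ACIDS}
        )
    return pssm


def score_window(pssm, window):
    """Sum the per-position log-odds scores for a candidate window."""
    if len(window) != len(pssm):
        raise ValueError("window length must match the matrix length")
    return sum(column[aa] for column, aa in zip(pssm, window))


# Hypothetical aligned windows spanning a cleavage site (invented data).
toy_windows = ["PLKSPA", "TLRSPE", "ALKTPD", "PLKSPS"]
toy_pssm = build_pssm(toy_windows)
similar = score_window(toy_pssm, "PLKSPA")    # resembles the training windows
unrelated = score_window(toy_pssm, "GGGGGG")  # residues never seen in training
```

Because each position is scored independently, a matrix of this kind cannot capture dependencies between positions, which is one plausible reason such methods leave part of the specificity unexplained. The sketch only handles the 20 standard residues; any other character in a window raises a `KeyError`.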