Evolutionary robotics (ER) aims at automatically designing robots or controllers of robots without having to describe their inner workings. To reach this goal, ER researchers primarily employ phenotypes that can lead to an infinite number of robot behaviors and fitness functions that only reward the achievement of the task-and not how to achieve it. These choices make ER particularly prone to premature convergence. To tackle this problem, several papers recently proposed to explicitly encourage the diversity of the robot behaviors, rather than the diversity of the genotypes as in classic evolutionary optimization. Such an approach avoids the need to compute distances between structures and the pitfalls of the noninjectivity of the phenotype/behavior relation; however, it also introduces new questions: how to compare behavior? should this comparison be task specific? and what is the best way to encourage diversity in this context? In this paper, we review the main published approaches to behavioral diversity and benchmark them in a common framework. We compare each approach on three different tasks and two different genotypes. The results show that fostering behavioral diversity substantially improves the evolutionary process in the investigated experiments, regardless of genotype or task. Among the benchmarked approaches, multi-objective methods were the most efficient and the generic, Hamming-based, behavioral distance was at least as efficient as task specific behavioral metrics.