In vivo rabbit skin irritation data registered in the European New Chemicals Database (NCD) and in an ECETOC database were evaluated to characterise the distribution of irritation potential among chemicals and to assess the variability of the animal test. These databases allowed experimental variability and, to a limited extent, within-laboratory variability to be determined, but not between-laboratory variability. Our evaluation suggests that experimental variability is small. The prevalence of skin irritation in the NCD data was analysed using two classification systems: the system currently used in Europe and the Globally Harmonised System (GHS). This analysis revealed that, of 3121 chemicals tested, fewer than 10% showed an irritation potential in rabbits that would require an appropriate hazard label, and 64% did not cause any irritation. Furthermore, it appears that in practical use the European classification system introduces a bias towards overclassification. Based on these findings, we conclude that the classification systems should be refined to take prevalence into account. Additionally, prevalence should be incorporated into the design and analysis of validation studies for in vitro test methods and into the definition of testing strategies.
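
One way to see why prevalence matters for validation studies is to compute the predictive values of a test at different prevalences using Bayes' theorem. The following is a minimal sketch, not the authors' analysis: the function name `predictive_values` is hypothetical, the 90% sensitivity and specificity are illustrative assumptions rather than values from any validated in vitro method, and the 10% prevalence is only a rounded stand-in for the "fewer than 10%" reported above, compared against an assumed balanced 50% validation set.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' theorem.

    All three arguments are proportions in the range 0..1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # P(irritant | positive result)
    npv = true_neg / (true_neg + false_neg)  # P(non-irritant | negative result)
    return ppv, npv


# Hypothetical test with 90% sensitivity and 90% specificity (assumed values),
# evaluated at an assumed real-world prevalence (10%) and a balanced set (50%).
for prevalence in (0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

Under these assumptions, the same test that gives a positive predictive value of 0.90 on a balanced set gives only about 0.50 at 10% prevalence, so half of its positive calls would be false positives.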