One of the most important biological properties of consumer products, and also of many raw materials, is their local compatibility with mucous membranes. To date, standardized in vivo tests are accepted by public health authorities as valid for estimating the irritation potential of chemicals and as suitable for risk assessment. Nevertheless, the controversy over animal tests, and particularly over the Draize rabbit eye test, is growing in both the public and scientific domains. Over the last decade, the international cosmetics industry has made efforts to validate suitable in vitro tests. One of the most important in vitro tests is the HET-CAM, the hen's egg test on the chorioallantoic membrane of fertilized chicken eggs. This paper describes the efforts to establish the HET-CAM protocol and the defined prediction model (PM) used in the COLIPA (The European Cosmetic, Toiletry and Perfumery Association) study on alternatives to the Draize rabbit eye test. Furthermore, the HET-CAM results from the completed phase I of that study are discussed in detail. Prior to the COLIPA validation study, the HET-CAM was prevalidated with about 100 test substances covering a broad spectrum of chemical structures and physical forms and representing the range of chemicals used in the cosmetics industry. This prevalidation was performed under a strict in-house agreement in one company to test each chemical in the HET-CAM before any requested animal test was carried out. There was a high concordance of the HET-CAM results with in vivo data from the Draize test, especially for slightly irritating test articles. Based on these promising data, the HET-CAM protocol was adopted as the final standard operating procedure (SOP) in the international COLIPA validation study, in which 55 coded chemicals were tested in four different laboratories. The HET-CAM has been established and proven to be a robust test with a good prediction of irritation potential.
Using a strict assignment between well-defined in vivo and in vitro irritation categories, together with the defined PM, the in vivo irritation potential of 29 out of 55 test articles (about 52%) was correctly predicted with the HET-CAM in at least three laboratories. The quality of prediction differed across the four categories of irritation severity: 90% of the slightly irritating chemicals but only 53% of the severely irritating articles were correctly predicted. The necessity of defining a "gold standard" for validation purposes and the conflict with heterogeneous in vivo data are also highlighted in this article. It is further discussed whether such heterogeneous responses, and especially persistent slight effects on the cornea, can be evaluated properly with the help of additional data such as physicochemical data and biological information on the test substance.