The in vivo rabbit test is the benchmark against which new approach methodologies for skin irritation are usually compared. No alternative method offers a complete replacement of animal use for this endpoint for all regulatory applications. Variability in the animal reference data may be a limiting factor in identifying a replacement. We established a curated data set of 2624 test records, representing 990 substances, each tested at least twice, to characterize the reproducibility of the in vivo assay. Methodological deviations from guidelines were noted, and multiple data sets with differing tolerances for deviations were created. Conditional probabilities were used to evaluate the reproducibility of the in vivo method in identification of U.S. Environmental Protection Agency or Globally Harmonized System hazard categories. Chemicals classified as moderate irritants at least once were classified as mild or non-irritants at least 40% of the time when tested repeatedly. Variability was greatest between mild and moderate irritants, which both had less than a 50% likelihood of being replicated. Increased reproducibility was observed when a binary categorization between corrosives/moderate irritants and mild/non-irritants was used. This analysis indicates that variability present in the rabbit skin irritation test should be considered when evaluating nonanimal alternative methods as potential replacements.
Keywords: Hazard classification; In vitro; In vivo; New approach methodologies; Reference data; Skin irritation; Validation; Variability.
Copyright © 2021 The Author(s). Published by Elsevier Inc. All rights reserved.
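As an illustration of the conditional-probability analysis mentioned in the abstract, the sketch below estimates how likely a repeat test is to give category B when another test of the same substance gave category A. It is a minimal sketch under stated assumptions, not the authors' actual procedure: the pairwise formulation, the function name `conditional_probabilities`, the category labels, and the example records are all hypothetical and are not taken from the curated data set.

```python
from collections import Counter, defaultdict
from itertools import permutations


def conditional_probabilities(records):
    """Estimate P(other test = B | one test = A) from repeated test records.

    `records` is an iterable of (substance_id, category) pairs, one pair per
    test record. This pairwise formulation is an assumption made for
    illustration, not necessarily the method used in the study.
    """
    # Group the observed categories by substance.
    by_substance = defaultdict(list)
    for substance, category in records:
        by_substance[substance].append(category)

    # Count ordered pairs of distinct tests of the same substance.
    # Substances tested only once contribute no pairs.
    pair_counts = Counter()
    for categories in by_substance.values():
        for given, other in permutations(categories, 2):
            pair_counts[(given, other)] += 1

    # Total number of pairs that start from each "given" category.
    totals = Counter()
    for (given, _), count in pair_counts.items():
        totals[given] += count

    # Normalize the counts into conditional probabilities.
    return {
        (given, other): count / totals[given]
        for (given, other), count in pair_counts.items()
    }


if __name__ == "__main__":
    # Hypothetical records for illustration only (not real test data).
    example_records = [
        ("substance_1", "moderate"),
        ("substance_1", "mild"),
        ("substance_2", "moderate"),
        ("substance_2", "moderate"),
        ("substance_3", "mild"),
        ("substance_3", "non-irritant"),
        ("substance_3", "mild"),
    ]
    probabilities = conditional_probabilities(example_records)
    for (given, other), p in sorted(probabilities.items()):
        print(f"P({other} | {given}) = {p:.2f}")
```

The binary categorization described in the abstract could be explored with the same function by first mapping each category label to one of two groups (corrosive/moderate versus mild/non-irritant) before counting pairs.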