The validity of four sets of grading scales (grading 'systems'), two artist-rendered and two photographic, designed for gauging the severity of contact lens-related ocular pathology was assessed in terms of precision and reliability. Thirteen observers each graded 30 images of each of the three contact lens complications (corneal staining, conjunctival redness and papillary conjunctivitis) that were common to all four grading systems, interpolating or extrapolating to the nearest 0.1 increment. The entire procedure was repeated approximately two weeks later, yielding a database of 9360 individual grading estimates. Analysis of variance revealed statistically significant differences in both precision and reliability between systems, observers and conditions (p < 0.03 for system reliability; p = 0.0001 for all other combinations). The artist-rendered systems generally afforded lower grading estimates and better grading reliability than the photographic systems. Corneal staining was graded less reliably than conjunctival redness and papillary conjunctivitis. Grading reliability was generally unaffected by the severity of the condition being assessed. Notwithstanding these differences, all four grading systems are validated for clinical use, and practitioners can initially expect to use them with average 95% confidence limits of +/- 1.2 grading scale units (observer range +/- 0.7 to +/- 2.5 grading scale units). In view of the significant between-system differences revealed in this study, it is advisable to use the same grading system consistently. It may be possible to reduce between-observer differences by applying personalised correction factors to normalise grading estimates.
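The size of the data set follows directly from the study design, and the reliability figures can be illustrated with a minimal sketch. Only the design counts below come from the abstract. The function names `retest_limits` and `correction_offset` are hypothetical, and both definitions are assumptions made for illustration: 95% limits taken as 1.96 times the standard deviation of test-retest differences, and a personalised correction taken as a simple additive offset towards a reference grade. The study may have computed either quantity differently.

```python
from statistics import mean, stdev

# Study design as stated in the abstract.
OBSERVERS = 13
IMAGES_PER_CONDITION = 30
CONDITIONS = 3  # corneal staining, conjunctival redness, papillary conjunctivitis
SYSTEMS = 4     # two artist-rendered, two photographic
SESSIONS = 2    # initial grading plus a repeat about two weeks later

total_estimates = (
    OBSERVERS * IMAGES_PER_CONDITION * CONDITIONS * SYSTEMS * SESSIONS
)


def retest_limits(first, second, z=1.96):
    """Half-width of the 95% limits for repeated grading.

    Assumed definition (not taken from the study): z times the sample
    standard deviation of the differences between the two sessions.
    """
    diffs = [b - a for a, b in zip(first, second)]
    return z * stdev(diffs)


def correction_offset(observer_grades, reference_grades):
    """Hypothetical personalised correction factor.

    Mean amount to add to an observer's grades so that they match the
    reference grades (for example, the mean grade of all observers).
    """
    return mean(r - o for o, r in zip(observer_grades, reference_grades))
```

With real data, `first` and `second` would hold one observer's grades for the same images at the two sessions, and adding the value returned by `correction_offset` to that observer's grades would shift them towards the reference.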