The reliability of a diagnostic test depends on the reproducibility of its result. Many clinical diagnostic tests can be quantified with established ranges and standard deviations. Other tests are more subjective, such as those that depend on the analysis of a visual image, and carry a greater possibility of variance in the result. To study this variance, the authors analyzed the performance of expert pathologists in the interpretation of cutaneous melanocytic tumors. A panel of eight expert pathologists was convened to review anatomic pathology specimens from melanocytic tumors. Each pathologist submitted five specimens, from which 37 were selected for review. Only one slide was used for each case. Each pathologist interpreted all specimens independently, without consulting the other panel members. In addition to receiving standard diagnostic terms, each specimen was designated as benign, malignant, or indeterminate. Statistical analysis was used to determine the degree of concordance. The combined kappa statistic for the eight observers and three possible outcomes (benign, malignant, or indeterminate) was 0.50. A kappa statistic of this magnitude is conventionally interpreted as moderate agreement. In 62% of the specimens, there was unanimous agreement or only one discordant designation. The remaining 38% had two or more discordant interpretations. No single pathologist had a disproportionate number of discordant designations. This study mimics the consultation practice of anatomic pathology and shows the variability and discordance in diagnostic language and in the designation of biological behavior. The results suggest that the criteria for the diagnosis of melanomas and melanocytic nevi need to be refined and more consistently applied.
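As an illustration of how a combined kappa for several observers and three outcomes can be computed, the following is a minimal sketch using Fleiss' kappa. The abstract does not state which multi-rater kappa formulation was used, so the choice of Fleiss' kappa is an assumption. The function name `fleiss_kappa` and the counts in `hypothetical_counts` are invented for illustration and are not the study's data.

```python
from typing import List


def fleiss_kappa(counts: List[List[int]]) -> float:
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] is the number of raters who assigned specimen i to
    category j. Every specimen must be rated by the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two raters are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every specimen must have the same number of ratings")

    n_categories = len(counts[0])
    total_ratings = n_subjects * n_raters

    # Proportion of all ratings that fall in each category.
    p_cat = [
        sum(row[j] for row in counts) / total_ratings
        for j in range(n_categories)
    ]

    # Observed agreement per specimen: fraction of rater pairs that agree.
    p_obs = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    mean_obs = sum(p_obs) / n_subjects

    # Agreement expected by chance from the category proportions.
    p_exp = sum(p * p for p in p_cat)
    if p_exp == 1.0:
        return float("nan")  # undefined when only one category is ever used

    return (mean_obs - p_exp) / (1.0 - p_exp)


# Hypothetical counts (NOT the study's data): 8 raters, 3 categories
# ordered as [benign, malignant, indeterminate], one row per specimen.
hypothetical_counts = [
    [8, 0, 0],  # unanimous benign
    [0, 8, 0],  # unanimous malignant
    [7, 0, 1],  # one discordant designation
    [2, 4, 2],  # several discordant designations
    [1, 6, 1],
]

if __name__ == "__main__":
    print(f"Fleiss' kappa: {fleiss_kappa(hypothetical_counts):.2f}")
```

Values near 1 indicate near-complete agreement and values near 0 indicate agreement no better than chance; a value of about 0.50, as reported here, falls in the range commonly labeled moderate.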