Although many studies have examined the reliability of competence to stand trial (CST) evaluations, few shed light on "field reliability," or agreement among forensic evaluators in routine practice. We reviewed 216 cases from Hawaii, which requires three separate evaluations from independent clinicians for each felony defendant referred for CST evaluation. Results revealed moderate agreement. In 71% of initial CST evaluations, all evaluators agreed about a defendant's competence or incompetence (kappa = .65). Agreement was somewhat lower (61%, kappa = .57) in re-evaluations of defendants who were originally found incompetent and sent for restoration services. We also examined the decisions judges made about a defendant's CST. When evaluators disagreed, judges tended to make decisions consistent with the majority opinion. However, when judges disagreed with the majority opinion, they more often did so to find a defendant incompetent than competent, suggesting a generally conservative approach. Overall, results reveal moderate agreement among independent evaluators in routine practice, but we discuss the potential for standardized training and methodology to further improve the field reliability of CST evaluations.