Field tests are a practical method to assess aerobic fitness, but they demonstrate greater error variability than laboratory tests. The principal goals of this study were to identify potential sources of systematic error in 2 commonly used field tests (Cooper's 12-minute run [12MR] and the multistage shuttle run [MSR]) and estimate the reliability of the 2 tests from these data. In addition, criterion-related validity evidence for field tests was evaluated via Bland-Altman plots. To assess trends across test protocols and test trials, 60 subjects (mean age = 21.8 ± 3.6 years) completed 6 test trials, including 3 trials of each field test. Of these 60 individuals, 21 volunteers completed an incremental treadmill run and expired gas analysis (TR) that was used to establish criterion-related validity evidence for the 2 field tests. G-study analysis of the field test data returned a high reliability coefficient (ϕ = 0.96), with the largest amount of systematic error variance (4.3%) attributable to an interaction between subjects and test occasions. The MSR predicted V̇O2max scores lower than those measured in the laboratory setting (p < 0.01), whereas 12MR and TR scores were not different (p > 0.05). However, Bland-Altman plots showed the 12MR to underestimate V̇O2max scores at lower V̇O2max values and overestimate V̇O2max scores at higher values, a trend not observed in the MSR data. These data suggest high overall reliability for V̇O2max field tests in young, healthy individuals. Nevertheless, test administrators must use caution when attempting to use field test data to predict criterion V̇O2max scores. The MSR appears to be a more useful tool than the 12MR because of a consistent mean bias across fitness levels.
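
The contrast drawn above between a consistent mean bias and a bias that changes with fitness level can be illustrated with a minimal Bland-Altman sketch. Everything in it is an assumption made for illustration: the function name `bland_altman`, the use of an ordinary least-squares slope of the differences on the pairwise means as a rough indicator of proportional bias, and all of the scores, which are hypothetical values in mL·kg⁻¹·min⁻¹ and not data from this study.

```python
from statistics import mean, stdev


def bland_altman(field, criterion):
    """Return (bias, (lower_loa, upper_loa), slope) for paired scores.

    bias  : mean of (field - criterion) differences
    loa   : bias +/- 1.96 * SD of the differences (95% limits of agreement)
    slope : least-squares slope of the differences regressed on the pairwise
            means; a value near 0 suggests the bias is constant across the
            measurement range (illustrative check, not the study's method)
    """
    if len(field) != len(criterion) or len(field) < 3:
        raise ValueError("need at least 3 paired observations")

    diffs = [f - c for f, c in zip(field, criterion)]
    means = [(f + c) / 2 for f, c in zip(field, criterion)]

    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)

    m_bar = mean(means)
    sxx = sum((m - m_bar) ** 2 for m in means)
    sxy = sum((m - m_bar) * (d - bias) for m, d in zip(means, diffs))
    slope = sxy / sxx

    return bias, loa, slope


# Hypothetical criterion (laboratory) scores, NOT taken from the study.
criterion = [35, 40, 45, 50, 55]

# Hypothetical field test A: reads 3 units low at every fitness level.
constant_offset = [32, 37, 42, 47, 52]

# Hypothetical field test B: reads low for less fit subjects and high for
# fitter subjects, so the mean bias is zero but the error depends on level.
level_dependent = [33, 39, 45, 51, 57]

bias_a, loa_a, slope_a = bland_altman(constant_offset, criterion)
bias_b, loa_b, slope_b = bland_altman(level_dependent, criterion)
```

In this toy example, test A has a nonzero mean bias but a zero slope, so a single correction would apply across the range, whereas test B has no mean bias yet a positive slope, so a nonsignificant mean difference can hide level-dependent error.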