Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) provides quantitative metrics (e.g., Ktrans and ve) via pharmacokinetic models. We assessed inter-algorithm variability in these metrics across 11 published DCE-MRI algorithms, each implementing the Tofts-Kermode or extended Tofts pharmacokinetic model. Digital reference objects (DROs) with known Ktrans and ve values were used to assess performance at varying noise levels. Additionally, DCE-MRI data from 15 patients with head and neck squamous cell carcinoma, acquired at 3 time points during chemoradiotherapy, were used to compare Ktrans and ve kinetic trends across algorithms. Algorithms performed well (less than 3% average error) when no noise was present in the DRO. With noise, 87% of Ktrans and 84% of ve algorithm-DRO combinations were generally in the correct order. Low Krippendorff's alpha values showed that the algorithms did not agree on whether a patient fell above or below a given algorithm's median, either at each time point or for differences in values between time points. Most algorithms produced a significant Spearman correlation between ve of the primary gross tumor volume and time. Algorithmic differences in Ktrans and ve values over time indicate limitations in combining or comparing data from distinct DCE-MRI model implementations. Careful cross-algorithm quality assurance is needed, as DCE-MRI results may not be interpretable across differing software.
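
For orientation, the standard Tofts model referred to above relates tissue concentration to plasma concentration as C_t(t) = Ktrans ∫ C_p(τ) exp(-(Ktrans/ve)(t - τ)) dτ, and the extended Tofts model adds a plasma term vp C_p(t). The sketch below is a minimal illustration of that forward model only. It is not any of the 11 evaluated implementations, and the function name, the uniform time sampling, the rectangular-rule convolution, and the units (minutes, 1/min) are all assumptions made for the example.

```python
import numpy as np


def tofts_concentration(t, cp, ktrans, ve, vp=0.0):
    """Tissue concentration under the standard (vp=0) or extended Tofts model.

    Hypothetical helper for illustration only.
    Assumes t is uniformly sampled (minutes), cp is the plasma concentration
    sampled at t, ktrans is in 1/min, and ve, vp are volume fractions.
    """
    t = np.asarray(t, dtype=float)
    cp = np.asarray(cp, dtype=float)
    dt = t[1] - t[0]                      # uniform sampling is assumed
    kep = ktrans / ve                     # efflux rate constant (1/min)
    kernel = np.exp(-kep * (t - t[0]))    # impulse response exp(-kep * t)
    # Rectangular-rule approximation of the convolution integral
    conv = np.convolve(cp, kernel)[: t.size] * dt
    return vp * cp + ktrans * conv


# Example: a constant plasma concentration has a closed-form solution,
# C_t(t) = ve * (1 - exp(-kep * t)), which the sketch should approximate.
t = np.linspace(0.0, 5.0, 5001)
cp = np.ones_like(t)
ct = tofts_concentration(t, cp, ktrans=0.25, ve=0.5)
ct_ext = tofts_concentration(t, cp, ktrans=0.25, ve=0.5, vp=0.1)
```

Even in this simple form, choices such as the integration rule, the time resolution, and how the plasma input function is obtained affect the fitted Ktrans and ve, which is one plausible source of the inter-algorithm differences described above.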