This paper describes the use of a data-driven autoregressive integrated moving average (ARIMA) model to predict human body core temperature during physical activity. We also propose a bootstrap technique that quantifies the reliability of these predictions in the form of prediction intervals. We investigate the model's predictive capabilities and associated reliability using two distinct datasets, both obtained in the field under different environmental conditions. One dataset is used to develop the model; the other, which contains an example of heat illness, is used to test it. We demonstrate that accurate and reliable predictions of an extreme core temperature value of 39.5 °C can be made 20 minutes ahead of time, even when the predictive model is developed on a different individual whose core temperatures remain within healthy physiological limits. This result suggests that data-driven models can be made portable across different core temperature levels and across different individuals. We also show that the bootstrap prediction intervals cover the actual core temperature and that they exhibit intuitively expected behavior as a function of the prediction horizon and the core temperature variability.
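
To make the idea of bootstrap prediction intervals concrete, the following is a minimal sketch and not the implementation used in this paper. It assumes a simplified ARIMA(p, 1, 0) model, that is, an autoregressive model fitted by least squares to first differences with no moving average terms, and a residual bootstrap that resamples the fitted residuals to simulate future paths. The function names (`fit_ar_on_differences`, `simulate_path`, `bootstrap_forecast`), the model order, the number of bootstrap replicates, and the synthetic series are all assumptions introduced for illustration. The sketch also ignores uncertainty in the estimated coefficients, which a fuller treatment might include.

```python
import numpy as np


def fit_ar_on_differences(series, order):
    """Least-squares fit of an AR(order) model to the first differences."""
    diffs = np.diff(series)
    # Each row holds the lagged differences [d[t-1], ..., d[t-order]].
    rows = [diffs[t - order:t][::-1] for t in range(order, len(diffs))]
    X = np.column_stack([np.ones(len(rows)), np.array(rows)])
    y = diffs[order:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    # Center the residuals so that resampled shocks have zero mean.
    return coef, residuals - residuals.mean()


def simulate_path(series, coef, shocks):
    """Iterate the fitted model forward, adding one shock per step."""
    order = len(coef) - 1
    recent = list(np.diff(series)[-order:])  # oldest to newest
    level = series[-1]
    path = []
    for shock in shocks:
        step = coef[0] + np.dot(coef[1:], recent[::-1]) + shock
        level = level + step  # integrate the predicted difference
        path.append(level)
        recent = recent[1:] + [step]
    return np.array(path)


def bootstrap_forecast(series, order=2, horizon=20, n_boot=500,
                       alpha=0.05, seed=0):
    """Point forecast plus (1 - alpha) residual-bootstrap intervals.

    Requires order >= 1. The horizon is counted in samples.
    """
    series = np.asarray(series, dtype=float)
    coef, residuals = fit_ar_on_differences(series, order)
    rng = np.random.default_rng(seed)
    # Point forecast: the path obtained with all future shocks set to zero.
    point = simulate_path(series, coef, np.zeros(horizon))
    # Bootstrap replicates: future shocks drawn from the fitted residuals.
    paths = np.array([
        simulate_path(series, coef, rng.choice(residuals, size=horizon))
        for _ in range(n_boot)
    ])
    lower = np.quantile(paths, alpha / 2, axis=0)
    upper = np.quantile(paths, 1 - alpha / 2, axis=0)
    return point, lower, upper


# Synthetic series for demonstration only (not measured core temperature):
# a 37 degree baseline plus an integrated AR(1) process.
demo_rng = np.random.default_rng(1)
demo_diffs = np.zeros(300)
for t in range(1, 300):
    demo_diffs[t] = 0.5 * demo_diffs[t - 1] + demo_rng.normal(0.0, 0.02)
demo_series = 37.0 + np.cumsum(demo_diffs)

point, lower, upper = bootstrap_forecast(demo_series, order=2, horizon=20)
```

With one sample per minute, `horizon=20` would correspond to a 20-minute-ahead prediction; the sampling rate here is an assumption. In this sketch the interval `upper - lower` tends to widen as the horizon grows and as the residual variance increases, which is the qualitative behavior one would expect of such intervals.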