Forecasting Corn Yield With Machine Learning Ensembles
- PMID: 32849688
- PMCID: PMC7411227
- DOI: 10.3389/fpls.2020.01120
Abstract
The emergence of new technologies to synthesize and analyze big data with high-performance computing has increased our capacity to predict crop yields more accurately. Recent research has shown that machine learning (ML) can provide reasonable predictions faster and with more flexibility than crop simulation modeling. However, a single machine learning model can be outperformed by a "committee" of models (machine learning ensembles) that can reduce prediction bias, variance, or both, and can better capture the underlying distribution of the data. Yet many aspects remain to be investigated with regard to prediction accuracy, time of the prediction, and scale. The earlier in the growing season the prediction is made, the better, but this has not been thoroughly investigated because previous studies used all available data to predict yields. This paper provides a machine learning-based framework to forecast corn yields in three US Corn Belt states (Illinois, Indiana, and Iowa) considering complete and partial in-season weather knowledge. Several ensemble models are designed using a blocked sequential procedure to generate out-of-bag predictions. The forecasts are made at the county level and aggregated to the agricultural district and state levels. Results show that the proposed optimized weighted ensemble and the average ensemble are the most precise models, with a relative root mean square error (RRMSE) of 9.5%. Stacked LASSO makes the least biased predictions (mean bias error, MBE, of 53 kg/ha), while the other ensemble models also outperform the base learners in terms of bias. In contrast, although random k-fold cross-validation is replaced by the blocked sequential procedure, stacked ensembles do not perform as well as weighted ensemble models on time series data sets, because they require the data to be IID to perform favorably. A comparison with forecasts reported in the literature shows that the proposed ensemble model performs acceptably.
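The ensemble ingredients named above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the blocked sequential procedure means each validation year is predicted only from earlier years, that RRMSE is RMSE divided by the mean observed yield, that MBE is the mean of prediction minus observation, and that the optimized weights are non-negative and sum to one. All function names are hypothetical.

```python
import numpy as np

def blocked_sequential_splits(years, n_val_years=3):
    """Yield (train_idx, val_idx) pairs; each validation year uses only earlier years for training."""
    years = np.asarray(years)
    for val_year in np.unique(years)[-n_val_years:]:
        yield np.flatnonzero(years < val_year), np.flatnonzero(years == val_year)

def rrmse(y_true, y_pred):
    """Relative RMSE in percent (assumed definition: RMSE / mean observed value)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 100.0 * np.sqrt(np.mean((y_pred - y_true) ** 2)) / np.mean(y_true)

def mbe(y_true, y_pred):
    """Mean bias error (assumed sign convention: prediction minus observation)."""
    return float(np.mean(np.asarray(y_pred, float) - np.asarray(y_true, float)))

def project_to_simplex(v):
    """Euclidean projection onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def optimize_weights(oob_preds, y_true, n_iter=2000):
    """Projected gradient descent on the MSE of the weighted average of base-model predictions."""
    P = np.asarray(oob_preds, float)          # shape (n_samples, n_models)
    y = np.asarray(y_true, float)
    n, k = P.shape
    w = np.full(k, 1.0 / k)                   # start from the simple average ensemble
    lipschitz = 2.0 * np.linalg.norm(P, 2) ** 2 / n
    for _ in range(n_iter):
        grad = 2.0 * P.T @ (P @ w - y) / n
        w = project_to_simplex(w - grad / lipschitz)
    return w
```

In this sketch the columns of `oob_preds` would hold each base learner's predictions collected over the validation years of `blocked_sequential_splits`, so the weights are fitted only on predictions made for years the models had not seen.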
Results from the scenario with partial in-season weather knowledge reveal that yield forecasts with an RRMSE of 9.2% can be made as early as June 1st. Moreover, the proposed model performs better than individual models and benchmark ensembles at the agricultural district and state levels as well as at the county level. To find the marginal effect of each input feature on the forecasts made by the proposed ensemble model, a methodology is suggested that serves as the basis for computing feature importance for the ensemble model. The findings suggest that weather features corresponding to weeks 18-24 (May 1st to June 1st) are the most important input features.
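The abstract does not spell out how the marginal effect of a feature is computed. One common way to obtain it for a weighted ensemble is a partial-dependence curve: fix one feature at a grid of values, average the ensemble prediction over the data, and summarize how much the curve varies. The sketch below assumes that approach; the function names and the use of the curve's standard deviation as an importance score are illustrative choices, not taken from the paper.

```python
import numpy as np

def ensemble_predict(models, weights, X):
    """Weighted average of base-model predictions; each model is a callable X -> y_hat."""
    preds = np.column_stack([model(X) for model in models])
    return preds @ np.asarray(weights, float)

def partial_dependence(models, weights, X, feature, n_grid=20):
    """Average ensemble prediction as one feature is swept over its observed range."""
    X = np.asarray(X, float)
    grid = np.linspace(X[:, feature].min(), X[:, feature].max(), n_grid)
    curve = np.empty(n_grid)
    for i, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value             # hold the feature fixed for every sample
        curve[i] = ensemble_predict(models, weights, X_mod).mean()
    return grid, curve

def importance_from_curve(curve):
    """Spread of the partial-dependence curve; a flat curve gives zero importance."""
    return float(np.std(curve))
```

Ranking features by `importance_from_curve` would then give an ordering comparable in spirit to the weekly weather ranking reported above, under the stated assumptions.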
Keywords: US Corn Belt; corn yields; ensemble; forecasting; machine learning.
Copyright © 2020 Shahhosseini, Hu and Archontoulis.
