Front Plant Sci. 2021 Aug 2;12:709008.
doi: 10.3389/fpls.2021.709008. eCollection 2021.

Corn Yield Prediction With Ensemble CNN-DNN

Free PMC article

Mohsen Shahhosseini et al. Front Plant Sci.

Abstract

We investigate the predictive performance of two novel CNN-DNN machine learning ensemble models in predicting county-level corn yields across the US Corn Belt (12 states). The data set combines management, environment, and historical corn yield data from 1980 to 2019. Two scenarios for ensemble creation are considered: homogenous and heterogenous ensembles. In homogenous ensembles, the base CNN-DNN models are all the same, but they are generated with a bagging procedure so that they exhibit a certain level of diversity. Heterogenous ensembles are created from different base CNN-DNN models which share the same architecture but have different hyperparameters. Three ensemble creation methods were used to build several ensembles for each scenario: the Basic Ensemble Method (BEM), the Generalized Ensemble Method (GEM), and stacked generalization. Results indicated that both designed ensemble types (heterogenous and homogenous) outperform ensembles created from five individual ML models (linear regression, LASSO, random forest, XGBoost, and LightGBM). Furthermore, by introducing improvements over the heterogenous ensembles, the homogenous ensembles provide the most accurate yield predictions across US Corn Belt states. The homogenous ensemble model could make 2019 yield predictions with a root mean square error (RMSE) of 866 kg/ha, equivalent to a relative RMSE of 8.5%, and could explain about 77% of the spatio-temporal variation in corn grain yields. This predictive power can be leveraged to design a reliable corn yield prediction tool, which will in turn assist agronomic decision makers.

Keywords: CNN-DNN; US Corn Belt; heterogenous ensemble; homogenous ensemble; yield prediction.
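
The two simpler combination rules named in the abstract, BEM and GEM, and the reported error metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes BEM is an unweighted average of the base predictions, that GEM follows the classical formulation in which weights summing to one are derived from the matrix of mean error products between base models, and that relative RMSE is RMSE divided by the mean observed value. All function names and the toy numbers are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean square error, in the units of the target (e.g. kg/ha)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def relative_rmse(observed, predicted):
    """RMSE as a percentage of the mean observed value (assumed definition)."""
    return 100.0 * rmse(observed, predicted) / (sum(observed) / len(observed))

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.
    Raises ZeroDivisionError if the matrix is singular."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def bem_weights(k):
    """BEM (assumed): every base model gets the same weight 1/k."""
    return [1.0 / k] * k

def gem_weights(observed, base_predictions):
    """GEM (assumed classical form): weights that sum to one and minimise
    the ensemble mean squared error on the data they are fitted to."""
    n = len(observed)
    errors = [[o - p for o, p in zip(observed, preds)] for preds in base_predictions]
    # c[i][j] is the mean product of the errors of base models i and j.
    c = [[sum(ei * ej for ei, ej in zip(e_i, e_j)) / n for e_j in errors]
         for e_i in errors]
    raw = solve(c, [1.0] * len(errors))  # proportional to the row sums of inverse(c)
    total = sum(raw)
    return [w / total for w in raw]

def combine(weights, base_predictions):
    """Weighted sum of the base model predictions, sample by sample."""
    return [sum(w * p for w, p in zip(weights, column))
            for column in zip(*base_predictions)]

# Toy data: two hypothetical base models and four observations.
observed = [10.0, 11.0, 12.0, 13.0]
base_predictions = [[9.0, 12.0, 11.0, 14.0], [8.0, 9.0, 14.0, 15.0]]
bem = combine(bem_weights(2), base_predictions)
gem = combine(gem_weights(observed, base_predictions), base_predictions)
print(rmse(observed, bem), rmse(observed, gem))
```

On the data used to fit the weights, GEM cannot do worse than BEM, because equal weights are one of the candidates it optimises over. In practice the weights would be fitted on validation predictions rather than on the test year. Under the assumed definition of relative RMSE, the reported figures (866 kg/ha at 8.5%) would correspond to a mean observed 2019 yield of roughly 10,200 kg/ha.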

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
The architecture of the proposed base network. prcp, t_max, and gdd represent precipitation, maximum temperature, and growing degree days, respectively. S1, S2, …, and S10 are 10 soil variables, each measured at 10 depth levels. Y_hat represents the final corn yield prediction made by the model.
FIGURE 2
Homogenous ensemble creation with bagging architecture. k data sets (D1, D2, …, Dk) were generated with bootstrap sampling from the original data set (D) and the same base network is trained on each of them. The ensemble creation combines the predictions made by the base networks.
FIGURE 3
Heterogenous ensemble creation. k networks with the same architecture but with different hyperparameters are created using the original data set (D).
FIGURE 4
Prediction error (relative RMSE) of the homogenous model compared with the benchmark, with data from the year 2019 taken as the test data.
FIGURE 5
Train and test loss vs. epochs for some of the trained CNN-DNN models. Similar behavior was observed for all trained models; the examples shown are representative.
FIGURE 6
Prediction error (relative RMSE) of some of the designed ensembles across all US Corn Belt states, with data from the year 2019 taken as the test data.
FIGURE 7
Relative percentage error of the Homogenous GEM predictions shown on a choropleth map of the US Corn Belt.
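
The two ensemble constructions described in the captions of Figures 2 and 3 can be sketched as follows. This is only an illustration of the data flow, not the authors' code. A one-dimensional least-squares line stands in for the CNN-DNN base network, a ridge penalty stands in for "different hyperparameters", and the base predictions are combined with a plain average. Every name and value here is hypothetical.

```python
import random

def fit_line(data, ridge=0.0):
    """Stand-in base learner: 1-D least squares on (x, y) pairs,
    with an optional ridge penalty on the slope as its only hyperparameter."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    sxx = sum((x - mx) ** 2 for x, _ in data)
    sxy = sum((x - mx) * (y - my) for x, y in data)
    denom = sxx + ridge
    # A sample whose x values are all identical falls back to predicting the mean.
    slope = sxy / denom if denom > 0 else 0.0
    return slope, my - slope * mx

def predict(model, x):
    slope, intercept = model
    return slope * x + intercept

def bootstrap_sample(data, rng):
    """Draw len(data) rows from data with replacement (one D_i in Figure 2)."""
    return [rng.choice(data) for _ in data]

def homogeneous_ensemble(data, k, seed=0):
    """Figure 2: the same base learner trained on k bootstrap samples of D."""
    rng = random.Random(seed)
    return [fit_line(bootstrap_sample(data, rng)) for _ in range(k)]

def heterogeneous_ensemble(data, ridge_values):
    """Figure 3: the same learner with different hyperparameters, each trained on D."""
    return [fit_line(data, ridge=r) for r in ridge_values]

def average_prediction(models, x):
    """Combine the base predictions with an unweighted average."""
    return sum(predict(m, x) for m in models) / len(models)

# Toy data set D lying exactly on y = 2x + 1.
data = [(float(x), 2.0 * x + 1.0) for x in range(8)]
homo = homogeneous_ensemble(data, k=5, seed=0)
hetero = heterogeneous_ensemble(data, ridge_values=[0.0, 42.0])
print(average_prediction(homo, 5.0), average_prediction(hetero, 5.0))
```

In the homogeneous case diversity comes only from the resampled training data, while in the heterogeneous case it comes only from the hyperparameter settings. The unweighted average could be replaced by a weighted combination or a stacked meta-learner without changing either construction.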
