Machine learning algorithms for soybean yield forecasting in the Brazilian Cerrado

J Sci Food Agric. 2022 Jul;102(9):3665-3672. doi: 10.1002/jsfa.11713. Epub 2021 Dec 27.

Abstract

Background: We evaluated different machine learning (ML) models for predicting soybean productivity up to 1 month in advance for the Matopiba agricultural frontier (States of Maranhão, Tocantins, Piauí, and Bahia). We collected meteorological data from the NASA-POWER platform and soybean yield data from the SIDRA/IBGE database for the period 2008 to 2017. The ML models evaluated were random forest (RF), artificial neural networks, support vector machines with a radial basis function kernel (SVM_RBF), a linear model and polynomial regression. Model performance was assessed by cross-validation, with precision measured by R², accuracy by root mean square error (RMSE), and trend (bias) by the mean error of the estimate (EME).
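The three indices named above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes R² is the coefficient of determination (the paper may instead use the squared correlation), assumes EME is the mean of predicted minus observed, and uses made-up toy yields rather than data from the study. In a cross-validation setting the same functions would be applied to the held-out predictions.

```python
import math

def rmse(observed, predicted):
    """Root mean square error: typical size of the prediction error."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)

def mean_error(observed, predicted):
    """Mean error (bias). Sign convention assumed: predicted minus observed."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)

def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Toy yields in kg/ha, invented for illustration only.
observed = [2500.0, 2700.0, 2900.0, 3100.0]
predicted = [2600.0, 2650.0, 2950.0, 3000.0]

print(f"R2   = {r_squared(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.2f} kg/ha")
print(f"EME  = {mean_error(observed, predicted):.2f} kg/ha")
```

A positive EME under this convention would indicate overestimation on average, and a negative one underestimation.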

Results: The RF algorithm achieved the highest precision and accuracy, with R² of 0.81, RMSE of 176.93 kg ha⁻¹ and trend (EME) of 1.99 kg ha⁻¹. The SVM_RBF algorithm showed the lowest performance, with R² of 0.74, RMSE of 213.58 kg ha⁻¹ and EME of -15.06 kg ha⁻¹. The average yield values predicted by the models were within the expected range for the region, which has a historical average of 2730 kg ha⁻¹.
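To put the reported errors in proportion, the RMSE values can be expressed as a percentage of the regional historical average. The sketch below is a derived illustration, not a result stated by the authors. It assumes the historical average is 2730 kg per hectare (reading the point in the printed figure as a thousands separator), and the name `relative_rmse` is hypothetical.

```python
# Assumption: historical average yield of 2730 kg/ha.
HISTORICAL_MEAN = 2730.0

# RMSE values as reported in the abstract (kg/ha).
reported_rmse = {"RF": 176.93, "SVM_RBF": 213.58}

def relative_rmse(rmse_value, mean_yield):
    """RMSE as a percentage of the mean yield."""
    return 100.0 * rmse_value / mean_yield

relative = {
    name: round(relative_rmse(value, HISTORICAL_MEAN), 2)
    for name, value in reported_rmse.items()
}

for name, pct in relative.items():
    print(f"{name}: {pct}% of the historical average")
```

Under that assumption, the best and worst models differ by a little over one percentage point of the average yield.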

Conclusion: All models had acceptable precision, accuracy and trend indices, so any of the algorithms can be applied to predict soybean yield, provided the particularities of the region under study are taken into account. Such models are a useful tool for agricultural planning and decision making in soy-producing regions such as the Brazilian Cerrado. © 2021 Society of Chemical Industry.

Keywords: Cerrado; agrometeorology; crop model; forecasting; machine learning techniques.

MeSH terms

  • Algorithms
  • Brazil
  • Fabaceae*
  • Glycine max*
  • Machine Learning
  • Support Vector Machine