Rainfall-Induced Landslide Prediction Using Machine Learning Models: The Case of Ngororero District, Rwanda

Int J Environ Res Public Health. 2020 Jun 10;17(11):4147. doi: 10.3390/ijerph17114147.


Landslides are among the most unpredictable and destructive natural disasters. Early warning systems for such disasters can therefore alert people and save lives. Some recent early warning models use the Internet of Things to monitor environmental parameters and predict these disasters. Other models use machine learning techniques (MLT) to analyse rainfall data along with some internal parameters to predict these hazards. The prediction capability of the existing models and systems is limited in terms of accuracy. In this research paper, two prediction modelling approaches, namely random forest (RF) and logistic regression (LR), are proposed. These approaches use rainfall datasets as well as various other internal and external parameters for landslide prediction and hence improve the accuracy. Moreover, the prediction performance of these approaches is further improved using antecedent cumulative rainfall data. The models are evaluated using the area under the receiver operating characteristic curve (ROC-AUC) and the false negative rate (FNR), which measures the share of landslide cases that a model did not report. When antecedent rainfall data were included in the prediction, both models (RF and LR) performed better, with AUCs of 0.995 and 0.997, respectively. The results showed that landslide occurrence correlates more strongly with antecedent precipitation than with one-day rainfall. In terms of incorrect predictions, RF and LR improved FNR to 10.58% and 5.77%, respectively. Among the various internal factors used for prediction, slope angle had the highest impact. Comparing the two models, the LR model performs better in terms of FNR and could be preferred for landslide prediction and early warning. The LR model's incorrect prediction rate (FNR) was 9.61% without antecedent precipitation data and 3.84% with it.
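As an illustration of the quantities named in the abstract, the following minimal Python sketch shows one way an antecedent cumulative rainfall feature, the FNR, and the ROC-AUC could be computed. It is not the authors' implementation: the function names, the 3-day window, the choice to exclude the current day from the antecedent sum, and all sample values are hypothetical assumptions made for this example only.

```python
from typing import List, Sequence


def antecedent_rainfall(daily_mm: Sequence[float], window: int) -> List[float]:
    """Rainfall summed over the `window` days before each day.

    Assumption: the current day is excluded, so the feature is distinct
    from one-day rainfall. The abstract does not specify the window.
    """
    return [sum(daily_mm[max(0, i - window):i]) for i in range(len(daily_mm))]


def false_negative_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """FNR = FN / (FN + TP): share of actual landslides the model missed."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return fn / (fn + tp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive case outscores a random negative.

    Ties count as half a win (the rank-based definition of AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
daily = [0.0, 12.0, 30.0, 5.0, 0.0, 48.0]   # daily rainfall in mm
labels = [0, 0, 1, 1, 1, 0]                  # 1 = landslide occurred
predicted = [0, 0, 1, 0, 1, 0]               # thresholded model output
scores = [0.1, 0.2, 0.9, 0.4, 0.8, 0.5]      # model probabilities

features = antecedent_rainfall(daily, window=3)
fnr = false_negative_rate(labels, predicted)
auc = roc_auc(labels, scores)
```

A low FNR matters for early warning because each false negative is a landslide that occurred without an alert, which is why the abstract weighs FNR alongside AUC when comparing RF and LR.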

Keywords: antecedent rainfall; landslide; logistic regression; prediction; rainfall; random forest.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Disasters*
  • Landslides*
  • Logistic Models
  • Machine Learning
  • Rwanda