Forest fire susceptibility assessment under small sample scenario: A semi-supervised learning approach using transductive support vector machine

J Environ Manage. 2024 Apr 26:359:120966. doi: 10.1016/j.jenvman.2024.120966. Online ahead of print.

Abstract

Forest fires threaten global ecosystems, socio-economic structures, and public safety. Accurately assessing forest fire susceptibility is critical for effective environmental management. Supervised learning methods dominate this assessment, but they rely on a substantial dataset of forest fire occurrences for model training, and precise forest fire location data remain difficult to obtain. Semi-supervised learning offers a viable solution because it trains on both a limited set of collected samples and unlabeled data containing environmental factors. Our study employed the transductive support vector machine (TSVM), a key semi-supervised learning method, to assess forest fire susceptibility in scenarios with limited samples, and compared its performance against widely used supervised learning methods. The study area is Dayu County, Jiangxi Province, China, which is known for its extensive forest cover and frequent fire incidents. We generated and analyzed forest fire susceptibility maps and evaluated the prediction accuracies of the supervised and semi-supervised learning methods across various small sample scenarios (e.g., 4, 8, 12, 16, 20, 24, 28, and 32 samples). Our findings indicate that, with limited samples, TSVM achieves higher prediction accuracy than supervised learning and yields more plausible forest fire susceptibility maps. For instance, at sample sizes of 4, 16, and 28, TSVM achieves prediction accuracies of approximately 0.8037, 0.9257, and 0.9583, respectively, whereas random forest, the best-performing supervised method, achieves approximately 0.7424, 0.8916, and 0.9431 at the same sample sizes. We also discuss three further aspects: TSVM parameter configuration, the impact of unlabeled sample size, and performance with typical sample sizes. Overall, our findings support semi-supervised learning as a promising alternative to supervised learning for forest fire susceptibility assessment and mapping, particularly in scenarios with small sample sizes.
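
The accuracy gap between TSVM and random forest at the three quoted sample sizes can be worked out directly from the figures above. The short Python sketch below uses only the accuracies stated in this abstract; the variable names are illustrative and do not come from the study, and results for the other sample sizes are not reported here.

```python
# Accuracies as quoted in the abstract, keyed by number of labeled samples.
# Names such as tsvm_acc and rf_acc are illustrative, not from the study.
tsvm_acc = {4: 0.8037, 16: 0.9257, 28: 0.9583}
rf_acc = {4: 0.7424, 16: 0.8916, 28: 0.9431}

# Absolute accuracy gap (TSVM minus random forest) at each sample size,
# rounded to four decimals to match the precision of the reported values.
gaps = {n: round(tsvm_acc[n] - rf_acc[n], 4) for n in sorted(tsvm_acc)}

for n, gap in gaps.items():
    print(f"{n:>2} samples: TSVM {tsvm_acc[n]:.4f}, "
          f"RF {rf_acc[n]:.4f}, gap {gap:.4f}")
```

On these three figures the gap narrows from roughly 0.06 at 4 samples to roughly 0.015 at 28 samples, so the advantage of TSVM is largest when labeled samples are scarcest.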

Keywords: Forest fire; Limited sample; Semi-supervised learning; Spatial prediction; Supervised learning; Unlabeled data.