Evaluating remote sensing datasets and machine learning algorithms for mapping plantations and successional forests in Phnom Kulen National Park of Cambodia

Minerva Singh; Damian Evans; Jean-Baptiste Chevance; Boun Suy Tan; Nicholas Wiggins; Leaksmy Kong; Sakada Sakhoeun

doi:10.7717/peerj.7841

PeerJ. 2019 Oct 22:7:e7841. doi: 10.7717/peerj.7841. eCollection 2019.

Authors

Minerva Singh¹, Damian Evans², Jean-Baptiste Chevance³, Boun Suy Tan⁴, Nicholas Wiggins⁵, Leaksmy Kong², Sakada Sakhoeun³

Affiliations

¹ Imperial College, Centre of Environmental Policy, London, United Kingdom.
² École Française d'Extrême-Orient, Paris, France.
³ Phnom Kulen Program, Archaeology and Development Foundation, London, United Kingdom.
⁴ Angkor International Research and Documentation Centre, Siem Reap, Cambodia.
⁵ School of Earth and Environmental Sciences, University of Queensland, St Lucia, Australia.

Abstract

This study develops a modelling framework that uses multi-sensor imagery to classify forest and land use types in the Phnom Kulen National Park (PKNP) in Cambodia. Three remote sensing datasets (Landsat optical data, ALOS L-band data and a LiDAR-derived Canopy Height Model (CHM)) were used in conjunction with three machine learning (ML) techniques (Support Vector Machines (SVM), Random Forests (RF) and Artificial Neural Networks (ANN)). These ML methods were implemented on (a) Landsat spectral data, (b) Landsat spectral bands and ALOS backscatter data, and (c) Landsat spectral bands, ALOS backscatter data and LiDAR CHM data. The Landsat-ALOS combination produced more accurate classification results (95% overall accuracy with SVM) than the Landsat-only bands for all ML models. Inclusion of the LiDAR CHM (a proxy for vertical canopy height) improved the overall accuracy to 98%. The research establishes that the majority of PKNP is dominated by cashew plantations and that the nearly intact forests are concentrated in the more inaccessible parts of the park. The findings demonstrate how different remote sensing (RS) datasets can be used in conjunction with different ML models to map plantations and forests that have undergone varying levels of degradation.
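
The overall accuracy figures quoted above are conventionally the share of validation samples that a classifier labels correctly, i.e. the diagonal of the confusion matrix divided by its total. The sketch below illustrates that arithmetic only. The class names and counts are hypothetical and are not taken from the study, and `overall_accuracy` is an illustrative helper rather than the authors' code.

```python
from typing import Sequence


def overall_accuracy(confusion: Sequence[Sequence[int]]) -> float:
    """Return the fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    # Diagonal cells hold the samples whose predicted class matches the reference class.
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical class names and counts, for illustration only (not data from the study).
# Rows are reference classes, columns are predicted classes.
CLASSES = ["intact forest", "degraded forest", "cashew plantation"]
EXAMPLE = [
    [48, 2, 0],
    [1, 45, 4],
    [0, 3, 47],
]

if __name__ == "__main__":
    print(f"overall accuracy: {overall_accuracy(EXAMPLE):.1%}")
```

With these made-up counts, 140 of 150 samples fall on the diagonal. Overall accuracy alone can hide weak performance on a rare class, so per-class accuracies are usually reported alongside it.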

Keywords: ALOS PALSAR; Deforestation; Landsat; LiDAR; Machine learning; Plantations; Remote sensing; SE Asia; Support vector machines; Tropical forests.

Grants and funding

This project is funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 639828). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.