Forest Foresight Global Research

Our team has carried out extensive research, both in-house and outsourced to other institutions, to build an optimally performing model and to ensure its quality over time.

Internship Report - Predicting Deforestation in Laos

This study addresses deforestation, which threatens biodiversity and ecosystem services. Because early intervention is needed to prevent deforestation events and safeguard the forest, the study focuses on deforestation prediction using the eXtreme Gradient Boosting (XGBoost) model within the Forest Foresight (FF) program. The objectives include optimising model parameters, assessing feature importance, evaluating predictions in deforestation-free areas, and analysing seasonal patterns. The study uses data covering the tropical belt, organised into 10-by-10-degree tiles with a spatial resolution of 400 by 400 meters and a monthly temporal resolution from January 2020 to June 2023. The primary focus is on the Lao People’s Democratic Republic (Laos); however, to address the divergent results observed in Laos and to offer broader insights into the effectiveness of the tested method on a global scale, the analysis also includes Gabon, Colombia, Peru, and Bolivia. The Global Forest Watch (GFW) Integrated Alerts, which integrate the GLAD and RADD alert systems, are used as ground truth data. Features include static and dynamic variables derived from GFW and GLAD data, as well as accessibility, terrain, anthropogenic activity, and climate variables. The model is evaluated using precision, recall, and F0.5 scores and is compared against a baseline model. Optimising model hyper-parameters had only a modest effect on performance, with consistency observed across different landscapes. However, variations in training duration and the use of a dynamic threshold underscored the necessity for different approaches per country. Correlated and uninformative features were identified for Laos, although XGBoost proved robust in handling them. Predicting deforestation in non-deforested areas remained challenging, suggesting the need for alternative methodologies. While seasonal deforestation patterns were captured, detecting trend changes proved difficult, though using a dynamic threshold somewhat mitigated this. Future investigations may explore alternative algorithms, such as deep learning and spatial models, alongside an emphasis on high-confidence alerts and forecasting deforestation extents.
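
For readers unfamiliar with the F0.5 score used in these reports, the following is a minimal sketch of how it can be computed from confusion-matrix counts. It uses the standard F-beta definition; the function name `f_beta` and the example counts are illustrative assumptions and do not come from the report.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F-beta score from true positives, false positives and false negatives.

    With beta < 1 (here 0.5), precision counts more heavily than recall.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: precision = 0.75, recall = 0.60
score = f_beta(tp=60, fp=20, fn=40)
print(round(score, 3))
```

Because beta is below 1, a model with higher precision than recall scores better on F0.5 than on the balanced F1 score, which suits a setting where false alarms are costly.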

Internship Report - ForestForesight Finetuner [Dutch]

This research addresses the following question: To what extent is it possible to achieve an F0.5 score higher than 0.62, using either the current machine learning model or novel machine learning techniques, when predicting deforestation in Brazil over the next six months? During the investigation, various classification models were examined and trained on spatiotemporal data. The algorithms studied included Support Vector Machine, K-Nearest Neighbors, Decision Tree, and Gaussian Naive Bayes; the report describes their performance, parameter configurations, and the data sets used. In addition to these models, AutoML was evaluated. AutoML is a software methodology that automatically searches for optimal models and their corresponding parameters. All models were subsequently validated using the F0.5 score. The findings indicate that none of the machine learning techniques tested achieved an F0.5 score exceeding 0.62 for a specific region in Brazil.

Internship Report - Identifying key deforestation drivers

Protecting forests across the globe is crucial, now more than ever. To address this, ForestForesight was developed to prevent illegal deforestation by generating highly accurate predictions with machine learning models and by collaborating with a network of local stakeholders who take action. This research explores high-detail features that can serve as proxies for deforestation and could be implemented in the ForestForesight model. A literature review identified 29 deforestation indicators across 10 themes. An online search revealed 76 open-source datasets and repositories, offering over 100 high-detail datasets with global coverage. From these, 15 promising datasets were selected, preprocessed, implemented, and tested in the models. Results showed no significant change in prediction accuracy with the newly implemented features. The new features did prove important in several models, though they did not come close to the importance of the GLAD integrated alerts datasets. Implementing new features can lead to marginal improvements and contribute valuable explanations of why deforestation is occurring. It is unlikely, however, that adding features unrelated to integrated alerts will lead to significant improvements in model accuracy, as their predictive power is too low at the high resolution of 0.004 degrees.
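
To give a sense of scale for the 0.004-degree resolution, the sketch below converts a pixel size in degrees to an approximate ground distance. It assumes a spherical Earth with a mean radius of 6,371 km; the function name `pixel_size_m` and the example latitude are illustrative assumptions and do not come from the report.

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumed mean Earth radius (spherical approximation)


def pixel_size_m(deg: float, latitude_deg: float = 0.0) -> tuple[float, float]:
    """Return the (north-south, east-west) ground size in meters of a pixel
    spanning `deg` degrees at the given latitude."""
    meters_per_degree = math.pi * EARTH_RADIUS_M / 180
    north_south = deg * meters_per_degree
    # East-west extent shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


ns, ew = pixel_size_m(0.004)                     # at the equator
ns18, ew18 = pixel_size_m(0.004, latitude_deg=18)  # hypothetical latitude of 18 degrees
print(round(ns), round(ew18))
```

Under these assumptions a 0.004-degree pixel is roughly 445 m on a side at the equator and somewhat narrower east to west at higher latitudes.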

Experiments - Amounts, Classes and Isolated Pixels

This document describes three experiments and their results: the Amounts and Likelihood Adjusted Quantity Analysis, the prediction of deforestation classes, and the analysis of isolated pixels. Unfortunately, none of these methods was effective enough to be suitable for implementation in the FF project.

Academic Consultancy Project - Analysis of Residuals from ForestForesight XGBoost model in Kalimantan

Preservation of tropical rainforests is of the utmost importance, as these forests are among the most valuable biomes for global flora and fauna diversity. This project studies the spatial patterns of the false positives (FPs) produced by the World Wildlife Fund's (WWF-NL) Forest Foresight model in order to help improve its performance. The primary goal is to understand and mitigate the significant number of false positive predictions produced by the model, which is critical for maintaining these essential ecosystems. To achieve this goal, the project focuses on identifying spatial patterns in false positives and evaluating both internal and external deforestation drivers that might account for these patterns. The methods used revolve around the assessment of spatial distribution, variation, and correlation. To determine these properties of the residuals, three scenarios were considered: default, isolated, and patches. In these scenarios, a hotspot analysis, kernel density estimation, and variogram estimation were performed to identify the spatial distribution and variation. A temporal analysis was also performed to generalize the results across different dates and to explore directions for future research. Lastly, the input drivers and external drivers were correlated against the FP predictions to explain the patterns in the residuals. The study revealed distinct spatial patterns of false positives, especially in the patch scenarios, and their spatial variation was shown to be non-random. Correlations between false positives and input drivers can be seen in areas where deforestation has previously occurred, while palm oil plantations and logging concessions would offer valuable additional information if added to the model. In conclusion, this report highlights the importance of an integrated approach to properly assess deforestation, taking into account spatial and temporal variability. The knowledge gained from this study will play a key role in upgrading the FF model to become a more powerful tool for decision-making.
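
As an illustration of the variogram estimation mentioned above, here is a minimal sketch of an empirical semivariogram. The summary does not state which estimator the project used; this example assumes the classical Matheron estimator, and the function name, coordinates, and values are hypothetical.

```python
import math
from itertools import combinations


def empirical_semivariogram(points, values, bin_width, n_bins):
    """Classical (Matheron) estimator per distance bin:
    gamma(h) = sum((z_i - z_j)^2) / (2 * N(h)),
    where N(h) is the number of point pairs whose distance falls in the bin.
    Bins without any pairs are returned as None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for (p, zp), (q, zq) in combinations(zip(points, values), 2):
        bin_index = int(math.dist(p, q) // bin_width)
        if bin_index < n_bins:
            sums[bin_index] += (zp - zq) ** 2
            counts[bin_index] += 1
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]


# Hypothetical pixel centres with a false-positive indicator (1 = FP, 0 = not FP)
points = [(0, 0), (1, 0), (2, 0)]
values = [0, 1, 0]
print(empirical_semivariogram(points, values, bin_width=1, n_bins=3))
```

If the semivariance rises with distance before levelling off, nearby residuals are more alike than distant ones, which is one way to check whether spatial variation is non-random.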

Miscellaneous Findings

During research for our program, we have made some miscellaneous findings that are worth sharing:

 

[Images: 28.jpg, 29.jpg, 30.jpg]