Our team has carried out extensive research, both in-house and outsourced to other institutions, to build the best-performing model we can and to maintain its quality over time.
Internship Report - Predicting Deforestation in Laos
This study addresses deforestation, which threatens biodiversity and ecosystem services. Because early intervention is needed to prevent deforestation events and safeguard forests, the study focuses on deforestation prediction using the eXtreme Gradient Boosting (XGBoost) model within the Forest Foresight (FF) program. The objectives are to optimise model parameters, assess feature importance, evaluate predictions in deforestation-free areas, and analyse seasonal patterns. The study uses data covering the tropical belt, organised into 10 by 10-degree tiles with a spatial resolution of 400 by 400 meters and a monthly temporal resolution from January 2020 to June 2023. The primary focus is the Lao People’s Democratic Republic (Laos); however, to address the divergent results observed in Laos and to offer broader insight into how well the tested method works globally, the analysis also includes Gabon, Colombia, Peru, and Bolivia. The Global Forest Watch (GFW) Integrated Alerts, which integrate the GLAD and RADD alert systems, are used as ground truth data. Features include static and dynamic variables derived from GFW and GLAD data, as well as accessibility, terrain, anthropogenic activity, and climate variables. The model is evaluated using precision, recall, and F0.5 scores and is compared against a baseline model. Optimising model hyperparameters had only a modest effect on performance, and this held consistently across different landscapes. However, variations in training duration and the use of a dynamic threshold underscored the need for a different approach per country. Correlated and uninformative features were identified for Laos, but XGBoost proved robust in handling them. Predicting deforestation in non-deforested areas remained challenging, suggesting the need for alternative methodologies.
While seasonal deforestation patterns were captured, detecting trend changes proved difficult, though using a dynamic threshold somewhat mitigated this. Future investigations may explore alternative algorithms, such as deep learning and spatial models, alongside an emphasis on high confidence alerts and forecasting deforestation extents.
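The precision, recall, and F0.5 metrics named in this summary can be illustrated with a minimal sketch. The function name `f_beta` and the example counts are hypothetical and are not taken from the report; the formula is the standard F-beta definition, where beta = 0.5 weights precision more heavily than recall.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5):
    """Return (precision, recall, F-beta) from confusion-matrix counts.

    tp: pixels correctly predicted as deforested
    fp: pixels predicted as deforested that were not
    fn: deforested pixels the model missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        # Avoid dividing by zero when nothing was predicted correctly.
        return precision, recall, 0.0
    b2 = beta ** 2
    score = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, score


# Hypothetical counts, for illustration only.
precision, recall, f05 = f_beta(tp=60, fp=20, fn=40)
print(f"precision={precision:.2f} recall={recall:.2f} F0.5={f05:.3f}")
```

With beta below 1, a false alarm lowers the score more than a missed event does, which suits a setting where each predicted alert may trigger action on the ground.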
Internship Report - ForestForesight Finetuner [Dutch]
This research addresses the following question: To what extent is it possible to achieve an F0.5 score higher than 0.62 using the current machine learning model or novel machine learning techniques for predicting deforestation in Brazil over the next six months? During the investigation, various classification models were examined and trained using spatiotemporal data. The algorithms studied included Support Vector Machine, K-Nearest Neighbors, Decision Tree, and Gaussian Naive Bayes; the report describes their performance, parameter configurations, and the datasets used. AutoML, an approach that automatically searches for optimal models and their corresponding parameters, was also evaluated. All models were validated using the F0.5 score. The findings indicate that no machine learning technique achieved an F0.5 score above 0.62 for a specific region in Brazil.
Internship Report - Identifying key deforestation drivers
Protecting forests across the globe is crucial, now more than ever. ForestForesight was developed to prevent illegal deforestation by generating highly accurate predictions with machine learning models and by collaborating with a network of local stakeholders who take action. This research explores high-detail features that can serve as proxies for deforestation and be implemented in the ForestForesight model. A literature review identified 29 deforestation indicators across 10 themes. An online search revealed 76 open-source datasets and repositories, offering over 100 high-detail datasets with global coverage. From these, 15 promising datasets were selected, preprocessed, implemented, and tested in the models. The newly implemented features produced no significant change in prediction accuracy. They did prove important in several models, though their importance did not come close to that of the GLAD integrated alerts datasets. Implementing new features can lead to marginal improvements and help explain why deforestation is occurring. It is unlikely, however, that adding features unrelated to integrated alerts will significantly improve model accuracy, as their predictive power is too low at the high resolution of 0.004 degrees.
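To relate the 0.004-degree resolution to a ground distance, the sketch below converts a grid cell size in degrees to approximate meters. It assumes a spherical Earth with a mean radius of 6,371 km, and the function name `cell_size_m` is hypothetical. Under that assumption a 0.004-degree cell is roughly 445 m on a side at the equator, which is in the same range as the 400 by 400 meter grid mentioned for the Laos study, presumably the same grid.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation (assumption)


def cell_size_m(resolution_deg: float, latitude_deg: float):
    """Approximate (north-south, east-west) size in meters of one grid cell."""
    meters_per_degree = math.pi * EARTH_RADIUS_M / 180.0
    north_south = resolution_deg * meters_per_degree
    # Meridians converge towards the poles, so east-west size shrinks with cos(latitude).
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# Cell size at the equator and at 18 degrees north (an illustrative latitude within Laos).
for lat in (0.0, 18.0):
    ns, ew = cell_size_m(0.004, lat)
    print(f"lat={lat:4.1f}: {ns:.0f} m north-south, {ew:.0f} m east-west")
```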
Experiments - Amounts, Classes and Isolated Pixels
This document describes three experiments and their results: the Amounts and Likelihood Adjusted Quantity Analysis, the prediction of deforestation classes, and the analysis of isolated pixels. Unfortunately, none of these methods was effective enough to be suitable for implementation in the FF project.
Miscellaneous Findings
During research for our program, we have made some miscellaneous findings that are worth sharing: