ff_prep: data preparation
The ff_prep function prepares data for training, validating, and predicting deforestation with the ForestForesight algorithm. It processes spatial data (rasters and vectors) for specified areas (countries or custom shapes) and dates. It handles loading raster data, filtering features, sampling data, adding date-related features, and preparing ground truth data for model training and validation.
Note that, for every requested month and every feature, ff_prep uses the most recent data available up to that date.
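The date selection described above can be sketched in base R. This is only an illustration and not the package's own code: the dates are made up, the helper latest_available is hypothetical, and it assumes that "last previous date" means the latest available date that is not later than the requested one.

```r
# Hypothetical dates on which one feature has data (not taken from the package).
available_dates <- as.Date(c("2021-01-01", "2022-01-01", "2022-07-01"))

# Months for which data is being prepared.
requested_dates <- as.Date(c("2022-03-01", "2022-07-01", "2022-12-01"))

# Hypothetical helper: latest available date that is not after the requested one.
# Assumption: the requested date itself counts as a valid match.
latest_available <- function(requested, available) {
  candidates <- available[available <= requested]
  if (length(candidates) == 0) {
    return(as.Date(NA))
  }
  max(candidates)
}

# Apply the helper to every requested month and combine the results.
selected_dates <- do.call(
  c,
  lapply(requested_dates, latest_available, available = available_dates)
)

print(data.frame(requested = requested_dates, used = selected_dates))
```

In this toy case a feature that was last updated in July 2022 is reused for December 2022, which is the behaviour to keep in mind for features that are updated infrequently.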
Exercises
Basic Usage:
Create a simple call to ff_prep for a specific country and date range.
library(ForestForesight)
prepared_data <- ff_prep(
  datafolder = "/path/to/your/data",
  country = "BRA",
  dates = ForestForesight::daterange("2022-01-01", "2022-12-31"),
  sample_size = 0.2
)
print(names(prepared_data))
print(head(prepared_data$data_matrix$features))
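To get a feel for the dates argument in the call above, a monthly date sequence can be built in base R. This sketch does not call ForestForesight, and it assumes (the text above does not confirm it) that a date range for ff_prep contains one entry per month, on the first day of the month.

```r
# Base-R sketch of a monthly date sequence; this does not call ForestForesight.
# Assumption: one entry per month, on the first day of the month.
monthly_dates <- seq(as.Date("2022-01-01"), as.Date("2022-12-31"), by = "month")

# Convert to character strings in the same format used in the exercises.
monthly_dates <- format(monthly_dates, "%Y-%m-%d")

print(monthly_dates)
print(length(monthly_dates))
```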
Custom Shape:
Use a custom shape instead of a country code to prepare data.
library(terra)
# Create a simple polygon
custom_shape <- vect("POLYGON((-60 -10, -55 -10, -55 -5, -60 -5, -60 -10))", crs = "EPSG:4326")
prepared_data <- ff_prep(
  datafolder = "/path/to/your/data",
  shape = custom_shape,
  dates = "2023-01-01",
  fltr_features = "initialforestcover",
  fltr_condition = ">0"
)
print(prepared_data$features)
Feature Selection:
Prepare data with specific included and excluded features.
prepared_data <- ff_prep(
  datafolder = "/path/to/your/data",
  country = "PER",
  dates = c("2022-06-01", "2022-12-01"),
  inc_features = c("slope", "elevation", "precipitation"),
  exc_features = "temperature"
)
print(prepared_data$features)
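The effect of combining an include list with an exclude list can be illustrated on a plain character vector. This is a sketch under assumptions: the list of available features is invented, select_features is a hypothetical helper, and the order of operations (include first, then exclude) is assumed rather than taken from the package.

```r
# Hypothetical feature names available in a data folder.
available_features <- c(
  "slope", "elevation", "precipitation", "temperature", "initialforestcover"
)

inc_features <- c("slope", "elevation", "precipitation")
exc_features <- "temperature"

# Hypothetical helper. Assumed logic: keep only the included features
# (or all of them if no include list is given), then drop the excluded ones.
select_features <- function(available, include = NULL, exclude = NULL) {
  selected <- if (is.null(include)) available else intersect(available, include)
  setdiff(selected, exclude)
}

selected <- select_features(available_features, inc_features, exc_features)
print(selected)
```

Under this assumed logic, excluding "temperature" changes nothing once an include list without it is given; the exclude list matters mainly when no include list is supplied.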
Validation Split:
Prepare data with a validation split for model evaluation.
prepared_data <- ff_prep(
  datafolder = "/path/to/your/data",
  country = "COL",
  dates = ForestForesight::daterange("2023-01-01", "2023-12-31"),
  sample_size = 0.5,
  validation_sample = 0.2
)
print(dim(prepared_data$data_matrix$features))
print(dim(prepared_data$validation_matrix$features))
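What a validation split does to the row counts can be shown with a generic hold-out on random data. This is a minimal sketch, not the way ff_prep performs the split internally (the text does not describe that): the matrix is synthetic, and the assumption is that validation_sample is the fraction of rows set aside for validation.

```r
set.seed(42)  # make the random split reproducible

# Synthetic feature matrix standing in for prepared data.
n_rows <- 1000
features <- matrix(
  rnorm(n_rows * 3),
  ncol = 3,
  dimnames = list(NULL, c("slope", "elevation", "precipitation"))
)

# Assumption: this fraction of the rows is held out for validation.
validation_sample <- 0.2
validation_idx <- sample(n_rows, size = round(validation_sample * n_rows))

training_features <- features[-validation_idx, , drop = FALSE]
validation_features <- features[validation_idx, , drop = FALSE]

print(dim(training_features))
print(dim(validation_features))
```

Comparing the two dim() outputs in the exercise above in the same way is a quick check that the split has the proportions you asked for.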
Custom Filtering:
Apply custom filtering conditions to the prepared data.
prepared_data <- ff_prep(
  datafolder = "/path/to/your/data",
  country = "IDN",
  dates = "2023-01-01",
  fltr_features = c("elevation", "slope"),
  fltr_condition = c(">100", "<30")
)
print(nrow(prepared_data$data_matrix$features))
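To reason about how many rows should survive a filter, the conditions can be tried on a small data frame first. The sketch below is an assumption-laden illustration: the pixel values are invented, parse_condition is a hypothetical helper, and it assumes that each condition is paired with the feature at the same position and that all conditions must hold at once.

```r
# Toy pixel table with invented values.
pixels <- data.frame(
  elevation = c(50, 150, 300, 800),
  slope = c(5, 10, 45, 20)
)

fltr_features <- c("elevation", "slope")
fltr_condition <- c(">100", "<30")

# Hypothetical helper: split a condition such as ">100" into operator and value.
parse_condition <- function(condition) {
  pattern <- "^(>=|<=|==|!=|>|<)[[:space:]]*(-?[0-9.]+)$"
  parts <- regmatches(condition, regexec(pattern, condition))[[1]]
  if (length(parts) != 3) {
    stop("Unsupported condition: ", condition)
  }
  list(op = match.fun(parts[2]), value = as.numeric(parts[3]))
}

# Assumption: conditions are paired with features by position and combined with AND.
keep <- rep(TRUE, nrow(pixels))
for (i in seq_along(fltr_features)) {
  cond <- parse_condition(fltr_condition[i])
  keep <- keep & cond$op(pixels[[fltr_features[i]]], cond$value)
}

print(pixels[keep, ])
```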
These exercises show how to use the main parameters of the ff_prep function to prepare data for different deforestation prediction scenarios. Remember to replace "/path/to/your/data" with the actual path to your ForestForesight data folder.