ff_dqc: Input data quality control

The ff_dqc (ForestForesight Data Quality Control) function analyzes a folder of TIF files and summarizes the quality of the raster data. It checks spatial properties, temporal coverage, and data values for consistency across the files, which helps you find integrity problems in a ForestForesight dataset.

Exercises

  1. Basic Usage:
    Run a data quality check on a folder of TIF files.

library(ForestForesight)
folder_path <- "path/to/your/tif/folder"
dqc_results <- ff_dqc(folder_path)
print(names(dqc_results))
print(head(dqc_results$byfeature))

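Before running the check, it can help to confirm that the folder exists and actually contains TIF files. The following is a minimal base-R sketch. `count_tifs` is a hypothetical helper, not part of ForestForesight, and it assumes the files sit directly in the folder with a `.tif` or `.tiff` extension.

```r
# Hypothetical helper (not part of ForestForesight): count the TIF files
# directly inside a folder, failing early if the folder is missing.
count_tifs <- function(folder_path) {
  if (!dir.exists(folder_path)) {
    stop("Folder does not exist: ", folder_path)
  }
  tif_files <- list.files(folder_path, pattern = "\\.tiff?$", ignore.case = TRUE)
  length(tif_files)
}

# Demonstration on a fresh temporary folder with empty placeholder files.
demo_folder <- tempfile("dqc_demo")
dir.create(demo_folder)
file.create(file.path(demo_folder, c("a.tif", "b.tif", "notes.txt")))
n_tifs <- count_tifs(demo_folder)
print(n_tifs)
```

A count of zero is a hint that the path is wrong or that the files live in a subfolder.
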
  1. Analyzing Spatial Consistency:
    Check whether all rasters in the folder share the same spatial extent, and inspect the minimum extent.

dqc_results <- ff_dqc("path/to/tif/folder")
print(paste("Equal extent across all files:", dqc_results$equalextent))
print("Minimum extent:")
print(dqc_results$minextent)

  1. Temporal Coverage Analysis:
    Examine the date range of each dynamic feature, along with its gaps and doubles columns.

dqc_results <- ff_dqc("path/to/tif/folder")
dynamic_features <- dqc_results$byfeature[dqc_results$byfeature$type == "dynamic", ]
print(dynamic_features[, c("feature", "mindate", "maxdate", "gaps", "doubles")])

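To judge temporal coverage, it can be useful to know how many time steps lie between mindate and maxdate. The sketch below works on a mock data frame rather than real ff_dqc output. It assumes that the dates are "YYYY-MM-DD" strings and that the data is monthly. The feature names, the dates, and the `months_covered` helper are all hypothetical.

```r
# Mock stand-in for the dynamic rows of dqc_results$byfeature.
# Feature names and dates are illustrative; the "YYYY-MM-DD" format is assumed.
dynamic_features <- data.frame(
  feature = c("feature_a", "feature_b"),
  mindate = c("2021-01-01", "2021-06-01"),
  maxdate = c("2021-12-01", "2022-05-01"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number of monthly steps from mindate to maxdate, inclusive.
months_covered <- function(mindate, maxdate) {
  start <- as.POSIXlt(as.Date(mindate))
  end <- as.POSIXlt(as.Date(maxdate))
  (end$year - start$year) * 12 + (end$mon - start$mon) + 1
}

dynamic_features$months <- months_covered(
  dynamic_features$mindate,
  dynamic_features$maxdate
)
print(dynamic_features)
```

Comparing this expected count with the number of files actually present for a feature is one way to cross-check the gaps column.
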
  1. Identifying Problematic Features:
    Find features whose pixel count, resolution, or CRS name differs across files (marked "diff" in the summary).

dqc_results <- ff_dqc("path/to/tif/folder")
problematic_features <- dqc_results$byfeature[
  dqc_results$byfeature$npixel == "diff" |
    dqc_results$byfeature$resolution == "diff" |
    dqc_results$byfeature$crsname == "diff",
]
print(problematic_features)

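The filter above shows which features are problematic, but not which property is the cause. The following sketch lists the offending columns per feature. It runs on a mock data frame with illustrative values, not on real ff_dqc output, and relies only on the "diff" marker used in the exercise.

```r
# Mock stand-in for dqc_results$byfeature; all values are illustrative.
# As in the exercise, "diff" marks a property that varies across files.
byfeature <- data.frame(
  feature = c("feature_a", "feature_b", "feature_c"),
  npixel = c("1000", "diff", "1000"),
  resolution = c("0.01", "0.01", "diff"),
  crsname = c("WGS 84", "WGS 84", "diff"),
  stringsAsFactors = FALSE
)

checked_columns <- c("npixel", "resolution", "crsname")

# Logical matrix: TRUE where a property is marked "diff".
is_diff <- byfeature[, checked_columns] == "diff"

# For each feature, join the names of the columns marked "diff".
inconsistent <- apply(is_diff, 1, function(row) {
  paste(checked_columns[row], collapse = ", ")
})
names(inconsistent) <- byfeature$feature

# Keep only features with at least one inconsistent property.
inconsistent <- inconsistent[inconsistent != ""]
print(inconsistent)
```
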
  1. Checking Date Format Consistency:
    Count the files whose filenames contain an incorrectly formatted date.

dqc_results <- ff_dqc("path/to/tif/folder")
print(paste("Number of files with incorrect date formats:", dqc_results$incorrect_dateformats))

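The top-level fields can also be combined into a single pre-flight check. This is a minimal sketch, assuming that equalextent is a logical value and incorrect_dateformats is a count, as the exercises above suggest. `dqc_issues` is a hypothetical helper, not part of ForestForesight, and the result it is run on is a mock list.

```r
# Hypothetical helper (not part of ForestForesight): collect blocking issues
# from the top-level fields of an ff_dqc-style result.
dqc_issues <- function(dqc_results) {
  issues <- character(0)
  # Assumption: equalextent is TRUE when all rasters share one extent.
  if (!isTRUE(dqc_results$equalextent)) {
    issues <- c(issues, "rasters do not share one extent")
  }
  # Assumption: incorrect_dateformats is the number of offending files.
  if (dqc_results$incorrect_dateformats > 0) {
    issues <- c(
      issues,
      paste(dqc_results$incorrect_dateformats, "file(s) with incorrect date formats")
    )
  }
  issues
}

# Mock result with illustrative values only.
mock_results <- list(equalextent = TRUE, incorrect_dateformats = 2)
issues <- dqc_issues(mock_results)
print(issues)
```

An empty result means neither check found a problem. A script could stop when `length(issues) > 0`.
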
  1. Summarizing Data Values:
    Examine the mean, the maximum, and the presence of NA values for each feature.

dqc_results <- ff_dqc("path/to/tif/folder")
value_summary <- dqc_results$byfeature[, c("feature", "mean", "max", "hasNA")]
print(value_summary)

These exercises show how to use ff_dqc to run quality checks on ForestForesight raster datasets. Catching inconsistencies in spatial properties, temporal coverage, and data values early matters because the reliability of your deforestation prediction models depends on the input data.