statscourse

Teaching materials for the FIU graduate statistics course (Spring 2026), covering data wrangling, exploration, and species distribution modelling with R.

Lectures

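| #  | Topic                    | Slides      | Script                       |
|----|--------------------------|-------------|------------------------------|
| 01 | Tidy Data                |             | R/01_tidy-data.R             |
| 02 | Data Transformation      |             | R/02_transform.R             |
| 08 | TidyModels, BRTs & SDMs  | HTML slides | R/tidymodels_sdm_workflow.R  |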

Lecture 08: TidyModels for Species Distribution Modelling

A complete workflow for building Boosted Regression Tree (BRT/xgboost) species distribution models using the tidymodels framework. Covers:

  • Data splitting with rsample and spatialsample (spatial block CV)
  • Preprocessing with recipes (imputation, normalisation, VIF)
  • Model specification with parsnip (boost_tree/xgboost)
  • Hyperparameter tuning with dials and tune
  • Evaluation with yardstick (MCC, TSS/j_index, AUC, SEDI, and 12 other metrics)
  • Variable importance with vip and partial dependence with DALEX
  • Class imbalance handling with themis (SMOTE, class weights)
  • Spatial packages: terra, sf, tidyterra, tidysdm
  • Prediction to raster grids
  • SEDI metric: custom yardstick implementation for low-prevalence species (< 2.5%)
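
The steps listed above can be sketched end to end as follows. This is a minimal illustration rather than the contents of R/tidymodels_sdm_workflow.R: the data are simulated, the column names (depth, temperature, salinity, presence) are hypothetical, and ordinary stratified folds stand in for spatial block CV, which would need an sf object and spatialsample. It assumes R 4.1 or later with the tidymodels and xgboost packages installed.

```r
library(tidymodels)  # attaches rsample, recipes, parsnip, workflows, tune, dials, yardstick

set.seed(42)

# Simulated stand-in data (hypothetical columns, NOT the course's samples.rds)
n <- 500
dat <- tibble(
  depth       = runif(n, 10, 150),
  temperature = rnorm(n, 11, 2),
  salinity    = rnorm(n, 34, 0.5)
) |>
  mutate(
    presence = factor(
      ifelse(runif(n) < plogis(-1 + 0.03 * (depth - 80)), "present", "absent"),
      levels = c("present", "absent")  # first level is treated as the event
    )
  )

# Train/test split and resampling folds (random, stratified by the response)
split <- initial_split(dat, prop = 0.8, strata = presence)
train <- training(split)
folds <- vfold_cv(train, v = 5, strata = presence)

# Preprocessing: impute missing values, then centre and scale predictors
rec <- recipe(presence ~ ., data = train) |>
  step_impute_median(all_numeric_predictors()) |>
  step_normalize(all_numeric_predictors())

# Boosted tree specification with two hyperparameters left for tuning
spec <- boost_tree(trees = 200, tree_depth = tune(), learn_rate = tune()) |>
  set_engine("xgboost") |>
  set_mode("classification")

wf <- workflow() |>
  add_recipe(rec) |>
  add_model(spec)

# MCC, TSS (j_index) and AUC evaluated together
metrics <- metric_set(mcc, j_index, roc_auc)

# Tune over a small automatically generated grid, then select on MCC
tuned <- tune_grid(wf, resamples = folds, grid = 5, metrics = metrics)
best  <- select_best(tuned, metric = "mcc")

# Refit on the full training set and score once on the held-out test set
final_fit <- last_fit(finalize_workflow(wf, best), split, metrics = metrics)
res <- collect_metrics(final_fit)
print(res)
```

Selecting on MCC while still reporting AUC and TSS mirrors the metric choices described in this README; the grid size and number of trees here are arbitrary values chosen to keep the sketch quick to run.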

Key metric choices

  • Model selection: MCC (Matthews correlation coefficient), which uses all four confusion matrix quadrants
  • Low prevalence (< 2.5%): switch to SEDI (Symmetric Extremal Dependence Index; Wunderlich et al. 2019), which is prevalence-independent because it is computed from the logarithms of the hit rate and false-alarm rate
  • Reporting: AUC + TSS + MCC as the standard set; add SEDI for rare species
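
To make the three confusion-matrix metrics concrete, here is a small base-R sketch that computes MCC, TSS and SEDI from raw counts using their standard definitions. It is not the course's custom yardstick implementation; the function name sdm_metrics, its argument names, and the example counts are all made up for illustration.

```r
# Confusion-matrix metrics from raw counts (base R only).
# sdm_metrics() and its arguments are hypothetical names for this sketch.
sdm_metrics <- function(tp, fn, fp, tn) {
  h <- tp / (tp + fn)  # hit rate (sensitivity)
  f <- fp / (fp + tn)  # false-alarm rate (1 - specificity)

  # Matthews correlation coefficient: uses all four quadrants
  mcc <- (tp * tn - fp * fn) /
    sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

  # True skill statistic: sensitivity + specificity - 1
  tss <- h - f

  # Symmetric extremal dependence index: built from logs of h and f
  sedi <- (log(f) - log(h) - log(1 - f) + log(1 - h)) /
    (log(f) + log(h) + log(1 - f) + log(1 - h))

  c(mcc = mcc, tss = tss, sedi = sedi)
}

# Made-up counts for a species present in 5% of 1,000 samples
m <- sdm_metrics(tp = 40, fn = 10, fp = 20, tn = 930)
print(round(m, 3))
```

SEDI is undefined when the hit rate or false-alarm rate is exactly 0 or 1, because the logarithms become infinite, so a practical implementation needs to handle those cases explicitly (for example by substituting a very small value for zero rates).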

Data

Example datasets use Irish Sea survey trawl data:

  • samples.rds (2,244 records, training data) and grids.rds (378,570 cells, prediction surface) are required for Lecture 08 but are not included in the repo because of their size. They are available from the course instructor.
  • sharkdata.rda and associated files are used in Lectures 01-02.

Installation

# install.packages("pak")
pak::pak("SimonDedman/statscourse")

About

2026-01-28 Al Harbourne FIU stats/R course
