Comparing multivariate regression techniques for hyperspectral plant trait prediction
The Spectral Soft Sensor Project (LUT, 2025) addresses Theme A3 (Advanced, 30p). The dataset combines hyperspectral measurements (450–2500 nm, 1 nm resolution) with plant functional traits such as leaf pigments, equivalent water thickness, and leaf area index.
Our task is to predict vegetation traits from spectra using several regression approaches and to evaluate their strengths and weaknesses in terms of accuracy, interpretability, and computation time. We compare:
- MLR (Multiple Linear Regression)
- PCR (Principal Component Regression)
- PLS (Partial Least Squares)
- k-PLS (Kernel Partial Least Squares)
For full details, see the official Theme A3 project description (PDF).
- Nada Rahali
- Umme Tanjuma Haque
- Chamath Wijerathne
- Dataset origin: Multi-sensor vegetation study spanning different continents, climates, and vegetation types.
- Inputs (X): Hyperspectral reflectance (450–2500 nm), preprocessed to:
  - remove water absorption bands: 1351–1430, 1801–2023, and 2451–2501 nm
  - apply Savitzky–Golay smoothing
  - interpolate bands to a uniform 1 nm grid
- Responses (Y): Leaf and canopy traits (e.g., pigments, leaf area index, equivalent water thickness).
- Goal: Predict traits using MLR/PCR/PLS/k-PLS, then compare methods by:
  - test-partition performance,
  - model interpretability,
  - training and inference time.
The dataset originates from a multi-sensor study that combines spectral data and vegetation properties from 42 datasets collected across various continents, climates, and vegetation types.
- Hyperspectral data (input variables):
  - Wavelengths: 450–2500 nm, in 1 nm increments
  - Preprocessing steps:
    - Removal of water absorption bands (1351–1430, 1801–2023, and 2451–2501 nm)
    - Smoothing with a Savitzky–Golay filter
    - Band interpolation to ensure the same resolution
- Leaf and canopy traits (response variables):
  - Leaf pigments
  - Leaf Area Index (LAI)
  - Equivalent water thickness
  - Other canopy structural and physiological traits (see table in dataset description)
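The preprocessing steps listed above (water-band removal, Savitzky–Golay smoothing, band interpolation) could look roughly like the sketch below. The dataset is described as already preprocessed, so this only illustrates what those steps do. The order of operations (interpolate, smooth, then drop bands), the filter settings (`window_length=11`, `polyorder=2`), the inclusive band limits, and the helper names are assumptions, not details taken from the dataset description.

```python
import numpy as np
from scipy.signal import savgol_filter

# Water absorption bands from the dataset description, in nm (assumed inclusive)
WATER_BANDS_NM = [(1351, 1430), (1801, 2023), (2451, 2501)]


def resample_to_1nm(wavelengths, reflectance, start=450, stop=2500):
    """Linearly interpolate each spectrum (one per row) onto a 1 nm grid."""
    grid = np.arange(start, stop + 1, dtype=float)
    resampled = np.vstack(
        [np.interp(grid, wavelengths, row) for row in reflectance]
    )
    return grid, resampled


def smooth(reflectance, window_length=11, polyorder=2):
    """Savitzky-Golay smoothing along the wavelength axis."""
    return savgol_filter(
        reflectance, window_length=window_length, polyorder=polyorder, axis=1
    )


def drop_water_bands(grid, reflectance, bands=WATER_BANDS_NM):
    """Remove the columns whose wavelength falls inside any listed band."""
    keep = np.ones(grid.shape, dtype=bool)
    for lo, hi in bands:
        keep &= ~((grid >= lo) & (grid <= hi))
    return grid[keep], reflectance[:, keep]


# Toy input: 4 synthetic spectra sampled every 5 nm (not real measurements)
rng = np.random.default_rng(0)
raw_wl = np.arange(450.0, 2501.0, 5.0)
raw_refl = (0.3 + 0.1 * np.sin(raw_wl / 200.0)
            + 0.01 * rng.normal(size=(4, raw_wl.size)))

grid, refl = resample_to_1nm(raw_wl, raw_refl)
refl = smooth(refl)
kept_wl, X = drop_water_bands(grid, refl)
print(X.shape)  # (4, 1698)
```

Smoothing before dropping the bands keeps the filter window from spanning the gaps that band removal leaves behind; if the original study used a different order, the sketch should be adapted accordingly.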
- Language: Python
- Libraries: NumPy, pandas, matplotlib, seaborn, scipy, scikit-learn
- Analysis: EDA, PCA, model calibration/validation
- Modeling: MLR · PCR · PLS · k-PLS
- Week 1 (W36): Team formation & topic selection
- Week 2 (W37): Data summary & visualization (PDF #1 + code)
- Week 3–4 (W38–W39): Data pretreatment and modelling plan (PDF #2 + code)
- Week 5–6 (W40–W41): Initial model results (PDF #3 + code)
- Week 7–8 (W42–W43): Final model (PDF #4 + code)
- Turn hyperspectral reflectance into trait-level insights
- Benchmark MLR/PCR/PLS/k-PLS for trait prediction
- Balance accuracy, interpretability, and computational efficiency
- Practice collaborative workflow with Teams/WhatsApp + GitHub
- To be added later on.
