M3LSNet for Mars Landslide Segmentation Challenge

This repository contains the solution for the 1st Mars Landslide Segmentation Challenge. It implements M3LSNet (Mamba-based Multimodal Martian Landslide Segmentation Network), optimized with domain-specific preprocessing and advanced training strategies to handle the unique challenges of Martian terrain segmentation (e.g., Domain Shift, Class Imbalance).

🚀 Key Features

1. Architecture: M3LSNet

  • Mamba-based Backbone: Incorporates Visual State Space Duality (VSSD) blocks for efficient long-range dependency modeling.
  • Hardware-Aware: Automatically detects whether the mamba_ssm CUDA kernels are available and falls back to a Gated Conv Simulator if the hardware requirements aren't met.
  • Multimodal Fusion: Dedicated modules to fuse 7-channel inputs (Thermal, Slope, DEM, Grayscale, RGB).
  • Optimization: Replaces BatchNorm, which conflicts with per-image normalization, with GroupNorm.
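The hardware-aware fallback can be sketched as follows. This is an illustrative assumption, not the repository's exact code: the class name `GatedConvSimulator` and the factory `make_sequence_block` are hypothetical, and only the try/except pattern around `mamba_ssm` is implied by the README.

```python
import torch
import torch.nn as nn

# Detect the optional CUDA-accelerated Mamba kernels at import time.
try:
    from mamba_ssm import Mamba
    HAS_MAMBA = True
except ImportError:
    HAS_MAMBA = False

class GatedConvSimulator(nn.Module):
    """Depthwise conv + sigmoid gate as a CPU-friendly stand-in for a Mamba block."""
    def __init__(self, dim: int):
        super().__init__()
        self.conv = nn.Conv1d(dim, dim, kernel_size=3, padding=1, groups=dim)
        self.gate = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim) -> same shape out
        y = self.conv(x.transpose(1, 2)).transpose(1, 2)
        return y * torch.sigmoid(self.gate(x))

def make_sequence_block(dim: int) -> nn.Module:
    """Return the fast Mamba block when available, else the simulator."""
    if HAS_MAMBA:
        return Mamba(d_model=dim)
    return GatedConvSimulator(dim)
```

Either branch preserves the `(batch, seq_len, dim)` interface, so the rest of the network is agnostic to which backend was selected.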

2. Advanced Data Strategy

  • Domain Shift Mitigation:
    • DEM (Elevation): Uses Mean-Shift Normalization (remove offset, preserve scale) to handle absolute elevation differences between regions.
    • Visuals (RGB/Gray): Uses Damped Instance Normalization to correct lighting/contrast shifts without amplifying noise in shadow regions.
  • Robust Augmentation (augmentations.py):
    • Random Horizontal/Vertical Flips.
    • Random 90/180/270° Rotations to force learning of morphological features rather than orientation textures.
    • Gaussian Noise injection for channel robustness.
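The two normalization schemes above can be sketched in a few lines. This is a minimal illustration of the ideas, not the repository's `dataset.py` implementation; the function names and the linear-blend form of the damping are assumptions.

```python
import torch

def mean_shift_normalize(dem: torch.Tensor) -> torch.Tensor:
    """Remove the per-tile elevation offset but keep the physical scale,
    so relative relief (slopes, scarps) stays comparable across regions."""
    return dem - dem.mean()

def damped_instance_normalize(img: torch.Tensor, damping: float = 0.5,
                              eps: float = 1e-6) -> torch.Tensor:
    """Blend each channel toward zero-mean/unit-std by `damping` in [0, 1].
    damping=1.0 is full instance normalization; 0.0 leaves the image
    untouched, so noise in dark/shadow regions is not fully amplified.
    The linear blend here is an illustrative assumption."""
    mean = img.mean(dim=(-2, -1), keepdim=True)
    std = img.std(dim=(-2, -1), keepdim=True)
    normalized = (img - mean) / (std + eps)
    return damping * normalized + (1.0 - damping) * img
```

Mean-shift keeps absolute gradients intact (a 100 m scarp stays 100 m), while the damped variant only partially equalizes contrast between bright and shadowed tiles.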

3. Training Stability

  • Loss Function: CombinedLoss (50% CrossEntropy + 50% DiceLoss) to handle extreme foreground/background imbalance.
  • Gradient Clipping: Stabilizes training on small/noisy batches.
  • Monitoring: Full integration with Tensorboard, logging mIoU, F1, Precision, Recall, and IoU per class.
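A minimal sketch of the 50/50 loss blend, assuming a soft multi-class Dice term; the weighting matches the README, but the internals (softmax probabilities, one-hot targets, smoothing `eps`) are illustrative rather than the repo's exact `utils_loss.py`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CombinedLoss(nn.Module):
    """0.5 * CrossEntropy + 0.5 * soft Dice over all classes."""
    def __init__(self, ce_weight: float = 0.5, eps: float = 1e-6):
        super().__init__()
        self.ce_weight = ce_weight
        self.eps = eps
        self.ce = nn.CrossEntropyLoss()

    def forward(self, logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # logits: (B, C, H, W); target: (B, H, W) integer class map
        ce = self.ce(logits, target)
        probs = F.softmax(logits, dim=1)
        onehot = F.one_hot(target, logits.shape[1]).permute(0, 3, 1, 2).float()
        inter = (probs * onehot).sum(dim=(0, 2, 3))
        denom = probs.sum(dim=(0, 2, 3)) + onehot.sum(dim=(0, 2, 3))
        dice = 1.0 - ((2 * inter + self.eps) / (denom + self.eps)).mean()
        return self.ce_weight * ce + (1.0 - self.ce_weight) * dice
```

Gradient clipping in the training loop is the standard one-liner, e.g. `torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)` after `loss.backward()`.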

📂 Repository Structure

M3LSNet/
├── network.py             # M3LSNet Model Definition (VSSD, SSMCore, MFF)
├── dataset.py             # 7-Channel Tiff Loader with Smart Normalization
├── train.py               # Main Training Loop (A100 Optimized)
├── generate_submission.py # Inference & Submission Zipping
├── augmentations.py       # Online Data Augmentation
├── utils_loss.py          # Combined Loss Implementation
└── outputs/               # Structured Logs & Checkpoints
    └── YYYYMMDD_HHMMSS/
        ├── checkpoints/   # best.pth, last.pth
        ├── logs/          # Training logs
        └── tensorboard/   # Visualization

🛠️ Usage

1. Requirements

Ensure you have the following installed:

  • PyTorch 2.x (CUDA recommended)
  • tifffile (for 7-channel images)
  • mamba_ssm (Optional, for acceleration)
  • tensorboard, tqdm

2. Training

To start training with the optimized settings (Batch Size 32, Combined Loss):

python train.py

Outputs will be saved to a timestamped run directory, outputs/YYYYMMDD_HHMMSS/.

3. Monitoring

Track training progress:

tensorboard --logdir outputs/

4. Generating Submission

To generate the submission.zip for the challenge leaderboard:

# Uses the 'best.pth' from the specified run
python generate_submission.py --checkpoint outputs/YOUR_RUN_DIR/checkpoints/best.pth

or use the helper script:

sh generate_submission.sh

📊 Performance Notes

  • Validation mIoU: May reach high values (e.g., >0.90) due to spatial correlation in the dataset.
  • Test Generalization: The Per-Image Normalization and Augmentation strategies are specifically designed to maximize performance on the hidden Test Set by forcing the model to learn invariant physical features.