dinhdat07/mabe-challenge

MABe Mouse Behavior Detection


Overview

This repository presents a Bronze Medal (Top 9%) solution for the Kaggle MABe Mouse Behavior Detection Challenge.

The objective is to classify 30+ social and non-social behaviors from multi-agent pose estimation data, requiring robust modeling of temporal dynamics, agent interactions, and cross-domain generalization across different experimental settings.


Problem Characteristics

  • Multi-agent interaction modeling (pairwise + group behaviors)
  • Strong temporal dependencies across sequences
  • Distribution shift across different labs and environments
  • Highly imbalanced behavior classes

Approach

Pipeline

Feature Engineering

  • Extracted spatial features from keypoints
  • Built pairwise interaction features between agents
  • Designed temporal features (velocity, motion patterns, windows)
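The three feature families above can be illustrated with a minimal, dependency-free sketch. The function names, the frame rate, and the toy nose tracks are hypothetical and are not taken from the repository's `features/` code, which likely uses NumPy/Pandas on full keypoint sets.

```python
import math

def pairwise_distance(a, b):
    """Euclidean distance between two (x, y) keypoints."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def velocity(track, fps=30.0):
    """Per-frame speed of one keypoint track [(x, y), ...]; first frame is 0.0."""
    speeds = [0.0]
    for prev, cur in zip(track, track[1:]):
        speeds.append(pairwise_distance(prev, cur) * fps)
    return speeds

def rolling_mean(values, window):
    """Trailing mean over the last `window` values (uses no future frames)."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Hypothetical nose tracks for two mice over four frames.
mouse_a = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (6.0, 8.0)]
mouse_b = [(6.0, 8.0), (6.0, 8.0), (6.0, 8.0), (6.0, 8.0)]

# Pairwise interaction feature: nose-to-nose distance per frame.
dist_ab = [pairwise_distance(p, q) for p, q in zip(mouse_a, mouse_b)]
# Temporal features: speed of mouse A, then a windowed average of it.
speed_a = velocity(mouse_a, fps=1.0)
smooth_speed_a = rolling_mean(speed_a, window=2)
```

Here `dist_ab` shrinks as mouse A approaches mouse B, which is the kind of signal a per-behavior classifier can use for social behaviors.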

Modeling

  • XGBoost baseline with per-behavior binary classification
  • ResNet-1D CNN on windowed sequences
  • Causal Temporal Convolutional Network (TCN) for sequence modeling
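To make "causal" concrete: in a causal convolution, the output at frame t depends only on frames at or before t. The sketch below is a plain-Python illustration of a single dilated causal filter, assuming zero padding on the left; it is not the repository's TCN, which would normally be built from PyTorch `Conv1d` layers with left padding.

```python
def causal_conv1d(x, kernel, dilation=1):
    """Causal 1-D convolution: out[t] uses x[t], x[t - d], x[t - 2d], ...

    Positions before the start of the sequence are treated as zeros.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # skip taps that would reach before frame 0
                acc += w * x[idx]
        out.append(acc)
    return out

signal = [1.0, 2.0, 3.0, 4.0, 5.0]
y = causal_conv1d(signal, kernel=[0.5, 0.5], dilation=2)

# Changing future frames must not change earlier outputs.
perturbed = signal[:3] + [100.0, 100.0]
y_perturbed = causal_conv1d(perturbed, kernel=[0.5, 0.5], dilation=2)
```

Stacking such layers with growing dilation widens the receptive field without looking ahead in time.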

Ensemble Strategy

  • Combined predictions across three XGBoost models with different hyperparameters and seeds
  • Optimized weights and thresholds using Optuna
  • Improved robustness across sections and modes (single / pair)
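The blending and thresholding step can be sketched as follows for a single behavior. This is a simplified illustration: a small grid search stands in for the Optuna study, the weights are fixed rather than tuned, and all names and numbers are hypothetical.

```python
def blend(prob_lists, weights):
    """Weighted average of per-model probabilities, frame by frame."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, frame_probs)) / total
        for frame_probs in zip(*prob_lists)
    ]

def f1_score(y_true, y_pred):
    """Binary F1; returns 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def best_threshold(y_true, probs, candidates):
    """Return the candidate threshold with the highest F1 (first wins ties)."""
    return max(
        candidates,
        key=lambda thr: f1_score(y_true, [int(p >= thr) for p in probs]),
    )

# Hypothetical out-of-fold probabilities from two models for one behavior.
y_true = [0, 0, 1, 1, 0, 1]
model_a = [0.1, 0.4, 0.6, 0.9, 0.3, 0.2]
model_b = [0.2, 0.2, 0.8, 0.7, 0.5, 0.6]

blended = blend([model_a, model_b], weights=[0.5, 0.5])
threshold = best_threshold(y_true, blended, [0.25, 0.35, 0.45, 0.55, 0.65])
```

In an Optuna setup, the weights and the threshold would be trial parameters and the F1 value the objective to maximize.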

Results

  • Bronze Medal on Kaggle leaderboard
  • Top 9% overall ranking
  • Built a robust pipeline for multi-agent behavior recognition

Repository Structure

data/        # generators, labels
features/    # feature engineering (single / pair)
training/    # model trainers (XGB, CNN, TCN, ensemble)
inference/   # inference pipelines and submission builders
notebooks/   # EDA and modeling notebooks
scripts/     # training & inference bash scripts

Key Components

  • training/train_runner.py: orchestrates feature building, label preparation, and model training

  • training/ensemble_trainer.py: performs Optuna-based optimization of ensemble weights and thresholds

  • inference/runner.py: runs inference loops and assembles the final submission outputs


Tech Stack

Python, PyTorch, XGBoost, Optuna, Pandas, NumPy


Requirements

  • Python 3.10+
  • PyTorch, scikit-learn, optuna, joblib
  • pandas, numpy, xgboost

Environment variables:

TRAIN_CSV
TEST_CSV
TRAIN_ANNO
TRAIN_TRACK
TEST_TRACK
MODEL_DIR
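A quick pre-flight check can catch an unset variable before a long training run starts. The sketch below assumes all six variables are required, which the scripts may not enforce; the helper name is hypothetical.

```python
import os

# Variable names as listed above; treating all of them as required is an assumption.
REQUIRED_VARS = [
    "TRAIN_CSV",
    "TEST_CSV",
    "TRAIN_ANNO",
    "TRAIN_TRACK",
    "TEST_TRACK",
    "MODEL_DIR",
]

def missing_env_vars(environ=None):
    """Return the names of required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]

if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("All required environment variables are set.")
```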

Quick Run

# XGBoost
bash scripts/run_xgb.sh

# CNN
export MODEL_DIR=models/cnn
bash scripts/run_cnn.sh

# TCN
export MODEL_DIR=models/tcn
bash scripts/run_tcn.sh

# Ensemble
export THR_JSON=models/ensemble/thresholds.json
export WEIGHT_JSON=models/ensemble/weights.json
export MODEL_ROOTS_JSON='{"xgb":"models/xgb","cnn":"models/cnn","tcn":"models/tcn"}'
bash scripts/run_ensemble.sh
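`MODEL_ROOTS_JSON` is a JSON object mapping a model name to its directory. How the ensemble script consumes it is not shown here, so the following is only a sketch of parsing and validating such a value; `parse_model_roots` is a hypothetical helper.

```python
import json

def parse_model_roots(raw):
    """Parse a JSON object of {model_name: model_directory} strings."""
    roots = json.loads(raw)
    if not isinstance(roots, dict):
        raise ValueError("MODEL_ROOTS_JSON must be a JSON object")
    for name, path in roots.items():
        if not isinstance(path, str) or not path:
            raise ValueError(f"Model root for {name!r} must be a non-empty string")
    return roots

# Same value as exported in the Quick Run commands.
example = '{"xgb":"models/xgb","cnn":"models/cnn","tcn":"models/tcn"}'
roots = parse_model_roots(example)
```

Single quotes around the exported value keep the shell from interpreting the double quotes inside the JSON.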

Notes

  • Ensemble requires precomputed OOF predictions
  • CNN/TCN training is significantly faster on GPU
  • Thresholds and weights must be tuned before final submission
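Out-of-fold (OOF) predictions are produced by predicting each sample with a model that never saw it during training. The sketch below assumes folds are split by video so that frames from one recording never appear in both train and validation sets; that grouping choice, the helper names, and the base-rate toy model are all illustrative assumptions rather than the repository's procedure.

```python
def group_folds(groups, n_folds):
    """Assign each distinct group id to a fold, round-robin in first-seen order."""
    fold_of = {}
    for g in groups:
        if g not in fold_of:
            fold_of[g] = len(fold_of) % n_folds
    return [fold_of[g] for g in groups]

def oof_predictions(features, labels, groups, n_folds, fit, predict):
    """Predict every sample with a model trained on the other folds only."""
    folds = group_folds(groups, n_folds)
    oof = [None] * len(labels)
    for k in range(n_folds):
        train_idx = [i for i, f in enumerate(folds) if f != k]
        valid_idx = [i for i, f in enumerate(folds) if f == k]
        model = fit([features[i] for i in train_idx], [labels[i] for i in train_idx])
        preds = predict(model, [features[i] for i in valid_idx])
        for i, p in zip(valid_idx, preds):
            oof[i] = p
    return oof

# Toy stand-in model: always predicts the positive rate of its training labels.
def fit_base_rate(X, y):
    return sum(y) / len(y)

def predict_base_rate(model, X):
    return [model] * len(X)

features = [[0.0]] * 6  # placeholder feature rows
labels = [1, 1, 0, 0, 1, 0]
groups = ["v1", "v1", "v2", "v2", "v3", "v3"]  # hypothetical video ids

oof = oof_predictions(features, labels, groups, 3, fit_base_rate, predict_base_rate)
```

Because each prediction comes from a model that excluded that video, weights and thresholds tuned on `oof` are less likely to be overfit than ones tuned on in-sample predictions.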

Summary

This project demonstrates a complete pipeline for:

  • Multi-agent sequence modeling
  • Feature engineering for structured time-series data
  • Ensemble optimization for competitive ML performance
  • Robust handling of distribution shift in real-world datasets
