
CursorTracker

Unsupervised mouse cursor detection and tracking in instructional videos using tracking-by-detection.

Quick Start

# Install dependencies
poetry install && poetry shell

# From YouTube URL - single command for everything
python cursor_tracker.py \
  --url https://youtube.com/watch?v=VIDEO_ID \
  --output-dir ./data/my_video

# View results
open data/my_video/our_results_1/tracked_video_our_results_1.mp4

Features

  • Fully Unsupervised: Automatically discovers cursor templates, no manual annotation needed
  • End-to-End Pipeline: YouTube URL → Download → Extract → Track → Visualize
  • Robust Tracking: Handles fast motion (over 200px per frame) and instant appearance changes
  • Visual Output: Generates annotated videos with bounding boxes around detected cursors

How It Works

  1. Unsupervised Template Discovery: Uses background subtraction + blob detection to identify cursor templates
  2. Multi-Scale Template Matching: Generates cursor proposals for each frame
  3. Spatiotemporal Path Optimization: Finds optimal tracking trajectory through entire video
  4. Visualization: Draws bounding boxes on frames and creates annotated video

Installation

# Install Poetry
curl -sSL https://install.python-poetry.org | python3 -

# Install dependencies
git clone https://github.com/yourusername/CursorTracker.git
cd CursorTracker
poetry install
poetry shell

# Create directories
mkdir -p data templates saved_models

Usage

YouTube Videos (Recommended)

Basic usage:

python cursor_tracker.py \
  --url "https://youtube.com/watch?v=VIDEO_ID" \
  --output-dir ./data/my_video

Options:

# Custom quality
--quality 1080p  # Options: 144p, 360p, 480p, 720p, 1080p, 1440p, 2160p

# Process specific frames
--start-frame 100 --end-frame 500

# Skip tracking (preprocessing only)
--skip-tracking

# Custom configuration
--config my_config.yaml

Local Video Files

# Step 1: Preprocess video
python preprocess_video.py \
  --video_path /path/to/video.mp4 \
  --output_dir ./data/my_video \
  --extract_templates

# Step 2: Track cursor
python cursor_tracker_dp.py \
  --video_name my_video \
  --base_dir ./data

# Step 3: Visualize (optional - automatic with YouTube pipeline)
python visualize_results.py \
  --video_name my_video \
  --base_dir ./data

Output Structure

data/my_video/
├── original_video.mp4              # Downloaded video
├── images/                         # Extracted frames
├── background/                     # Background masks
├── estimated_templates/            # Auto-discovered cursor templates
└── our_results_1/
    ├── our_results.txt             # Tracking results (CSV)
    ├── visualizations/             # Annotated frames
    └── tracked_video_our_results_1.mp4  # Annotated video
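
The results file can be loaded with Python's standard csv module. A minimal sketch, assuming each row holds a frame index followed by bounding-box coordinates; check the header of our_results.txt for the actual column order and delimiter:

import csv

def load_results(path):
    # Column names here are assumptions for illustration, not the repo's schema
    rows = []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if not row or row[0].startswith("#"):
                continue  # skip blank and comment lines
            frame, x, y, w, h = (float(v) for v in row[:5])
            rows.append({"frame": int(frame), "x": x, "y": y, "w": w, "h": h})
    return rows

results = load_results("data/my_video/our_results_1/our_results.txt")
print(f"{len(results)} tracked frames")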

Visualization

Automatic (YouTube Pipeline)

Visualizations are generated automatically when using cursor_tracker.py.

Manual (Standalone)

# --bbox_color is in BGR order; "0,255,0" draws green boxes
python visualize_results.py \
  --video_name my_video \
  --base_dir ./data \
  --bbox_color "0,255,0" \
  --bbox_thickness 2 \
  --fps 30 \
  --quality 9

Configuration

Edit config/config.yaml to customize:

template_matching:
  score_threshold: 0.5          # Min template match score
  use_laplacian: true           # Edge detection
  template_vicinity: 300        # Temporal window for templates
  max_scale: 2                  # Max template scale factor
  nms_overlap_threshold: 0.3    # IoU threshold for NMS

tracking:
  enabled: true                 # Enable path optimization
  dist_threshold: 150           # Max pixel distance between frames
  scale_threshold: 1.3          # Max scale change ratio
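
PyYAML is a core dependency, so the configuration can be read as a nested dictionary. A minimal sketch (the repo's own loading code may differ):

import yaml

with open("config/config.yaml") as f:
    cfg = yaml.safe_load(f)

# Nested keys mirror the YAML structure shown above
score_threshold = cfg["template_matching"]["score_threshold"]   # e.g. 0.5
dist_threshold = cfg["tracking"]["dist_threshold"]              # e.g. 150
print(score_threshold, dist_threshold)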

Performance

Tested on 8 Adobe Photoshop instructional videos (3595 frames):

Method                      VIOU    Success Rate
CursorTracker (Ours)        0.365   ~87%
Faster-RCNN                 0.05    ~25%
Online Trackers (TLD/MIL)   0.03    ~15%

  • Speed: ~0.5 seconds/frame (1280×720)
  • Robustness: Handles 200+ pixel movements and instant appearance changes
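
VIOU is presumably the per-frame intersection-over-union between tracked and ground-truth boxes averaged over a video; a sketch of that computation under this assumption (not code from the repo):

def iou(a, b):
    # Boxes are (x, y, w, h)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2 = min(a[0] + a[2], b[0] + b[2])
    iy2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def video_iou(pred_boxes, gt_boxes):
    # Assumed definition: mean per-frame IoU across the whole video
    return sum(iou(p, g) for p, g in zip(pred_boxes, gt_boxes)) / len(gt_boxes)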

Key Scripts

Script                 Purpose
cursor_tracker.py      Main pipeline: YouTube → Track → Visualize
preprocess_video.py    Extract frames + background masks from local video
extract_templates.py   Discover cursor templates from preprocessed data
cursor_tracker_dp.py   Run cursor tracking with DP path optimization
visualize_results.py   Generate annotated frames and video

Dependencies

Core:

  • Python >=3.10,<3.13
  • OpenCV, NumPy, scikit-image
  • PyYAML, tqdm, imageio
  • youtube-downloader (git dependency)

Optional (install with poetry install --with ml):

  • TensorFlow, Keras (for CNN filtering)

Citation

@inproceedings{cursortracker2020,
  title={Mouse Cursor Detection and Tracking in Instructional Videos},
  booktitle={IEEE Winter Conference on Applications of Computer Vision (WACV)},
  year={2020}
}

Troubleshooting

Few templates discovered?

  • Check background subtraction quality in background/ folder
  • Adjust --consecutive_frames parameter in template extraction

Poor tracking results?

  • Tune dist_threshold and scale_threshold in config
  • Try adjusting score_threshold (lower = more proposals)

Out of memory?

  • Reduce template_vicinity parameter
  • Process in segments with --start-frame / --end-frame

Algorithm Overview

Phase 1: Unsupervised Template Discovery

  • Apply MOG background subtraction
  • Detect blobs (moving objects) in difference images
  • Track sequences where exactly 1 blob appears for N consecutive frames
  • Extract and save cursor templates from these sequences
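
A minimal sketch of this phase with OpenCV; MOG2 stands in for the MOG variant used by the repo, and the run length and area threshold are illustrative:

import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(history=200, detectShadows=False)
cap = cv2.VideoCapture("data/my_video/original_video.mp4")
templates, run = [], 0
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # clean up speckle noise
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    blobs = [c for c in contours if cv2.contourArea(c) > 20]  # ignore tiny blobs
    if len(blobs) == 1:            # exactly one moving blob: likely the cursor
        run += 1
        if run >= 5:               # N consecutive single-blob frames
            x, y, w, h = cv2.boundingRect(blobs[0])
            templates.append(frame[y:y + h, x:x + w].copy())
    else:
        run = 0
cap.release()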

Phase 2: Multi-Scale Template Matching

  • Select templates from temporal vicinity of current frame
  • Generate multi-scale template versions
  • Perform normalized cross-correlation matching
  • Apply non-maximum suppression to proposals
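
A sketch of the matching step, assuming normalized cross-correlation via cv2.matchTemplate; the scale set and threshold mirror the config above but the function itself is illustrative:

import cv2
import numpy as np

def propose(frame_gray, template_gray, scales=(0.5, 1.0, 1.5, 2.0), score_threshold=0.5):
    # Return (x, y, w, h, score) proposals from one template at several scales
    proposals = []
    for s in scales:
        t = cv2.resize(template_gray, None, fx=s, fy=s)
        if t.shape[0] >= frame_gray.shape[0] or t.shape[1] >= frame_gray.shape[1]:
            continue  # template must be smaller than the frame
        res = cv2.matchTemplate(frame_gray, t, cv2.TM_CCOEFF_NORMED)
        ys, xs = np.where(res >= score_threshold)
        for x, y in zip(xs, ys):
            proposals.append((int(x), int(y), t.shape[1], t.shape[0], float(res[y, x])))
    return proposals  # non-maximum suppression (IoU threshold) would follow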

Phase 3: Optimal Path Search

  • Model as graph optimization problem
  • Find highest-scoring spatiotemporal path through video
  • Enforce distance and scale constraints between consecutive frames
  • Output optimal cursor trajectory
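
A Viterbi-style dynamic-programming sketch of the path search over per-frame proposal lists. It treats the distance and scale limits from config.yaml as penalties rather than hard constraints and assumes every frame has at least one proposal; it illustrates the idea rather than reproducing cursor_tracker_dp.py:

import math

def best_path(proposals_per_frame, dist_threshold=150, scale_threshold=1.3, penalty=1e3):
    # proposals_per_frame: one list per frame of (x, y, w, h, score) tuples
    best = [p[4] for p in proposals_per_frame[0]]   # cumulative score per proposal, frame 0
    back = [[-1] * len(proposals_per_frame[0])]     # backpointers per frame
    for t in range(1, len(proposals_per_frame)):
        cur, links = [], []
        for x, y, w, h, s in proposals_per_frame[t]:
            top, arg = -math.inf, 0
            for j, (px, py, pw, ph, _) in enumerate(proposals_per_frame[t - 1]):
                cost = 0.0
                if math.hypot(x - px, y - py) > dist_threshold:
                    cost += penalty                 # discourage implausible jumps
                if max(w / pw, pw / w) > scale_threshold:
                    cost += penalty                 # discourage sudden scale changes
                cand = best[j] + s - cost
                if cand > top:
                    top, arg = cand, j
            cur.append(top)
            links.append(arg)
        best = cur
        back.append(links)
    i = max(range(len(best)), key=best.__getitem__)  # best proposal in the last frame
    path = []
    for t in range(len(proposals_per_frame) - 1, -1, -1):
        path.append(proposals_per_frame[t][i])
        i = back[t][i]
    return list(reversed(path))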

Key Insight: Cursors in screencasts exhibit unique motion signatures (movement while background stays static), enabling unsupervised discovery without labeled training data.

License

MIT