MarkoMilenovic01/OpticalFlowFlownetPytorch

Optical Flow – FlowNet & Farnebäck (PyTorch)

This repository contains a complete optical flow pipeline implemented in PyTorch, covering:

  • 📦 Dataset loading & preprocessing (FlyingChairs, MPI-Sintel)
  • 🧠 Two FlowNet-style CNN architectures (single-scale & multi-scale)
  • 🏋️ Training with AMP (mixed precision)
  • 📊 Quantitative evaluation using Endpoint Error (EPE)
  • 🎨 Extensive visualizations (HSV flow maps, training curves, comparisons)
  • 🎥 Optical flow estimation on real videos (GIF & frame outputs)
  • 🔍 Classical baseline using OpenCV Farnebäck for comparison

The project is structured as a research-style pipeline, suitable for coursework, experiments, and extensions.


Repository Structure

.
├── src/                        # Core source code
│   ├── dataset_preparation.py  # FlyingChairs + Sintel loaders, .flo parsing, HSV visualization
│   ├── model.py                # FlowNetSingleOutput & FlowNetMultiOutput
│   ├── train.py                # Training loop (AMP, checkpoints, curves)
│   ├── eval_single.py          # Sintel evaluation – single-output model
│   ├── eval_multi.py           # Sintel evaluation – multi-output model
│   ├── farneback_eval.py       # Farnebäck multi-configuration evaluation
│   └── video_flow.py           # Optical flow on real videos + GIF generation
│
├── eval_single_clean/           # Single-output Sintel visualizations & stats
├── eval_multi_clean/            # Multi-output Sintel visualizations & stats
├── farneback_multi_test/        # Farnebäck visual results (multiple configs)
├── vis_output/                  # FlyingChairs flow visualizations
├── video_flow_output/           # Video optical flow frames & GIFs
│
├── curve_single.png             # Training curve (single-output)
├── curve_multi.png              # Training curve (multi-output)
├── video.mp4                    # Example input video
└── README.md

Datasets

FlyingChairs

Used for training FlowNet models.

Expected structure:

data/FlyingChairs/FlyingChairs_release/data/
├── 00001_img1.ppm
├── 00001_img2.ppm
├── 00001_flow.flo
├── ...
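
The `*_flow.flo` files above use the Middlebury flow format: a float32 magic number (202021.25), the width and height as int32, then interleaved `(u, v)` float32 values in row-major order. The reader below is a minimal sketch of how such a file can be parsed; the function name `read_flo` is illustrative and may not match the loader in `src/dataset_preparation.py`.

```python
import struct

import numpy as np

FLO_MAGIC = 202021.25  # sanity-check value stored at the start of every .flo file


def read_flo(path):
    """Read a Middlebury .flo file into a float32 array of shape (H, W, 2)."""
    with open(path, "rb") as f:
        # Header: little-endian float32 magic, int32 width, int32 height.
        magic, width, height = struct.unpack("<fii", f.read(12))
        if magic != FLO_MAGIC:
            raise ValueError(f"{path}: unexpected magic number {magic}")
        data = np.frombuffer(f.read(width * height * 2 * 4), dtype="<f4")
    if data.size != width * height * 2:
        raise ValueError(f"{path}: file is truncated")
    # Channel 0 is horizontal flow (u), channel 1 is vertical flow (v).
    return data.reshape(height, width, 2).copy()
```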

Features:

  • Optional resizing with correct flow vector scaling
  • PyTorch Dataset + DataLoader
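
Resizing a flow field changes the pixel grid, so the vectors themselves have to be multiplied by the resize factors: `u` by the width ratio and `v` by the height ratio. The sketch below illustrates this with nearest-neighbour sampling in NumPy to stay dependency-free; `resize_flow` is a hypothetical helper, and the repository may well use a different interpolation method.

```python
import numpy as np


def resize_flow(flow, new_h, new_w):
    """Resize an (H, W, 2) flow field and rescale its vectors to the new grid."""
    h, w = flow.shape[:2]
    # Nearest-neighbour source indices for every target row and column.
    rows = np.minimum((np.arange(new_h) * h) // new_h, h - 1)
    cols = np.minimum((np.arange(new_w) * w) // new_w, w - 1)
    resized = flow[rows[:, None], cols[None, :]].astype(np.float32)
    # A displacement of 1 px in the old image spans new/old px in the new one.
    resized[..., 0] *= new_w / w  # horizontal component u
    resized[..., 1] *= new_h / h  # vertical component v
    return resized
```

Halving both dimensions, for example, halves every flow vector as well.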

MPI-Sintel (Clean / Final)

Used for evaluation.

Expected structure:

data/MPI-Sintel-complete/training/
├── clean/
│   ├── alley_1/
│   └── ...
├── flow/
│   ├── alley_1/
│   └── ...
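
In the standard MPI-Sintel release, each scene folder holds frames named `frame_XXXX.png`, and `flow/<scene>/frame_XXXX.flo` stores the flow from that frame to the next one. Under that naming assumption, evaluation samples can be collected as in the sketch below; `list_sintel_samples` is an illustrative name, not necessarily the function used in this repository.

```python
from pathlib import Path


def list_sintel_samples(root, render_pass="clean"):
    """Return (frame_t, frame_t_plus_1, flow) path triples for every scene."""
    root = Path(root)
    samples = []
    for scene_dir in sorted((root / render_pass).iterdir()):
        if not scene_dir.is_dir():
            continue
        frames = sorted(scene_dir.glob("frame_*.png"))
        # The last frame of a scene has no successor, hence no flow file.
        for first, second in zip(frames, frames[1:]):
            flow = root / "flow" / scene_dir.name / (first.stem + ".flo")
            if flow.exists():
                samples.append((first, second, flow))
    return samples
```

A call such as `list_sintel_samples("data/MPI-Sintel-complete/training")` would then yield one triple per consecutive frame pair.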

Models

FlowNetSingleOutput

  • Encoder–decoder (U-Net style)
  • Predicts one full-resolution flow map
  • Loss: EPE (Endpoint Error)

Output:

[B, 2, H, W]

FlowNetMultiOutput

  • Multi-scale FlowNet variant
  • Predicts flow at 4 resolutions: H, H/2, H/4, H/8
  • Training with multi-scale supervision

Outputs:

(flow0, flow1, flow2, flow3)
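
Multi-scale supervision can be sketched as a weighted sum of EPE terms, one per predicted resolution, with the ground truth downsampled (and its vectors rescaled) to match each prediction. The per-scale weights below are hypothetical placeholders, and `multiscale_epe` is an illustrative name; the actual loss in `src/train.py` may differ.

```python
import torch
import torch.nn.functional as F


def epe_loss(pred, target):
    """Mean endpoint error between two [B, 2, H, W] flow tensors."""
    return torch.linalg.vector_norm(pred - target, ord=2, dim=1).mean()


def multiscale_epe(preds, target, weights=(1.0, 0.5, 0.25, 0.125)):
    """Weighted EPE over (flow0, flow1, flow2, flow3); weights are assumptions."""
    full_h, full_w = target.shape[-2:]
    total = target.new_zeros(())
    for pred, weight in zip(preds, weights):
        h, w = pred.shape[-2:]
        # Downsample the ground truth to this prediction's resolution ...
        scaled = F.interpolate(target, size=(h, w), mode="bilinear",
                               align_corners=False)
        # ... and rescale u by the width ratio and v by the height ratio.
        factors = target.new_tensor([w / full_w, h / full_h]).view(1, 2, 1, 1)
        total = total + weight * epe_loss(pred, scaled * factors)
    return total
```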

Training

Training uses:

  • ✅ Adam optimizer
  • ✅ AMP (Automatic Mixed Precision)
  • ✅ Multi-scale loss (for multi-output model)
  • ✅ Learning rate decay
  • ✅ Periodic validation + checkpointing
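
The combination of Adam, AMP, and learning rate decay listed above roughly corresponds to the training step sketched below. The one-layer stand-in model, the learning rate, and the `StepLR` settings are all assumptions chosen for illustration; they are not taken from `src/train.py`. On a machine without CUDA the sketch falls back to full precision.

```python
import torch
from torch import nn


def train_step(model, optimizer, scaler, img_pair, target, device_type):
    """Run one optimisation step with optional mixed precision; returns the loss."""
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type=device_type, enabled=scaler.is_enabled()):
        pred = model(img_pair)
        # EPE loss, computed in float32 for numerical stability.
        loss = torch.linalg.vector_norm(pred.float() - target, dim=1).mean()
    scaler.scale(loss).backward()  # scale the loss to avoid fp16 underflow
    scaler.step(optimizer)         # unscales gradients, then optimizer.step()
    scaler.update()
    return loss.item()


device = "cuda" if torch.cuda.is_available() else "cpu"
# Stand-in for a FlowNet model: two stacked RGB frames in, 2-channel flow out.
model = nn.Conv2d(6, 2, kernel_size=3, padding=1).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
# torch.cuda.amp.GradScaler is the PyTorch 2.0 spelling; it is a no-op when disabled.
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))
# Hypothetical decay schedule: halve the learning rate every 10 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)
```

In a full loop, `train_step` would run once per batch and `scheduler.step()` once per epoch.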

Example:

python src/train.py

Outputs:

  • checkpoints/flownet_*_best.pt
  • checkpoints/flownet_*_last.pt
  • Training curves saved as PNG

Training Curves

These curves are generated automatically at the end of training and are already present in the repository root:

  • curve_single.png: FlowNetSingleOutput training & validation EPE
  • curve_multi.png: FlowNetMultiOutput training & validation EPE (multi-scale supervision)

Flow Visualization (HSV)

Optical flow is visualized using HSV mapping:

  • Hue → direction
  • Saturation → magnitude
  • Value → constant
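
The mapping above can be sketched as follows: the flow angle becomes the hue, the magnitude (normalised by a maximum) becomes the saturation, and the value channel is fixed at 1. The function `flow_to_rgb` and its `max_mag` parameter are illustrative, and the per-pixel `colorsys` conversion is chosen for clarity rather than speed; the repository's own visualization code may be organised differently.

```python
import colorsys

import numpy as np


def flow_to_rgb(flow, max_mag=None):
    """Convert an (H, W, 2) flow field to an (H, W, 3) uint8 RGB image."""
    flow = np.asarray(flow, dtype=np.float64)
    u, v = flow[..., 0], flow[..., 1]
    mag = np.hypot(u, v)
    ang = np.arctan2(v, u)                     # direction in [-pi, pi]
    hue = (ang % (2 * np.pi)) / (2 * np.pi)    # direction mapped to [0, 1]
    if max_mag is None:
        max_mag = mag.max()
    sat = np.clip(mag / max(max_mag, 1e-8), 0.0, 1.0)  # magnitude
    val = np.ones_like(mag)                             # constant brightness
    # Slow but dependency-free HSV -> RGB conversion, one pixel at a time.
    r, g, b = np.vectorize(colorsys.hsv_to_rgb)(hue, sat, val)
    return (np.stack([r, g, b], axis=-1) * 255).astype(np.uint8)
```

With this encoding, static pixels appear white and fast-moving pixels appear in saturated colours.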

FlyingChairs Examples (Ground Truth)

The following visualizations are generated directly from the FlyingChairs dataset using HSV flow encoding and are saved in the repository.

Each figure shows:

  1. Image 1
  2. Image 2
  3. Ground-truth optical flow (HSV)

Evaluation – Single Output (MPI-Sintel)

Each visualization (already stored in the repository) contains:

  • Input image
  • Ground-truth flow (HSV)
  • Predicted flow (HSV)
  • Per-sample Endpoint Error (EPE) shown in the title

Summary statistics are stored in:

eval_single_clean/clean_summary.txt

Evaluation – Multi Output (MPI-Sintel, Multi-Scale)

Each figure visualizes four resolution levels:

  • Full resolution (H)
  • H / 2
  • H / 4
  • H / 8

For every scale, ground truth vs prediction is shown using HSV encoding.

Summary statistics are stored in:

eval_multi_clean/clean_summary.txt

Classical Baseline – Farnebäck (OpenCV)

The repository includes real Farnebäck results generated with multiple parameter configurations.

Each image shows:

  • Input frame
  • Ground-truth flow
  • Farnebäck-predicted flow

All configurations and quantitative results are summarized in:

farneback_multi_test/summary_all_configs.txt

Video Optical Flow (Real Video Input)

The following results are generated from the provided video.mp4 and are already stored in the repository.

GIF Animations

  • FlowNet GIF (LEFT): deep-learning-based optical flow
  • Farnebäck GIF (RIGHT): classical OpenCV baseline

Metrics

Endpoint Error (EPE)

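$$ \mathrm{EPE} = \lVert \hat{\mathbf{u}} - \mathbf{u} \rVert_2 $$

Here $\hat{\mathbf{u}}$ is the predicted flow vector and $\mathbf{u}$ the ground-truth flow vector at a pixel. The reported value is the mean of this per-pixel error over all pixels.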

Used for:

  • Training loss
  • Validation
  • Sintel benchmarking
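
As a small worked example of the metric, the NumPy sketch below averages the per-pixel Euclidean distance between predicted and ground-truth flow. The function name `endpoint_error` is illustrative.

```python
import numpy as np


def endpoint_error(pred, gt):
    """Mean endpoint error between two (H, W, 2) flow fields."""
    diff = np.asarray(pred, dtype=np.float64) - np.asarray(gt, dtype=np.float64)
    # Per-pixel L2 norm over the (u, v) axis, then the mean over all pixels.
    return float(np.sqrt((diff ** 2).sum(axis=-1)).mean())


# Every pixel is off by (3, 4), so the per-pixel error is sqrt(9 + 16) = 5.
pred = np.zeros((4, 4, 2))
gt = np.tile(np.array([3.0, 4.0]), (4, 4, 1))
print(endpoint_error(pred, gt))  # 5.0
```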

Requirements

  • Python ≥ 3.9
  • PyTorch ≥ 2.0
  • CUDA (optional, recommended)
  • OpenCV
  • NumPy
  • Matplotlib
  • ImageIO

Notes

  • Whenever images are resized, the flow vectors are rescaled to match the new resolution
  • Models expect spatial dimensions divisible by 8
  • Code is written to be readable and modular for experimentation
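
Sintel frames are 1024 × 436, and 436 is not divisible by 8, so inputs need to be resized or padded before inference. One possible approach, sketched here under the assumption of height-width-channel arrays, is to pad on the bottom and right and crop the prediction afterwards; `pad_to_multiple` and `crop_to` are hypothetical helpers, and the repository may resize instead.

```python
import numpy as np


def pad_to_multiple(img, multiple=8):
    """Edge-pad an (H, W, ...) array so H and W are divisible by `multiple`."""
    h, w = img.shape[:2]
    pad_h = (-h) % multiple  # rows to add at the bottom
    pad_w = (-w) % multiple  # columns to add on the right
    pad = [(0, pad_h), (0, pad_w)] + [(0, 0)] * (img.ndim - 2)
    return np.pad(img, pad, mode="edge"), (h, w)


def crop_to(arr, size):
    """Undo pad_to_multiple by cropping back to the original (H, W)."""
    h, w = size
    return arr[:h, :w]
```

A 436-row frame is padded to 440 rows, and the predicted flow is then cropped back to 436 rows before computing EPE.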

Author

Developed as an academic / research project focused on optical flow estimation, combining:

  • Deep learning (FlowNet-style CNNs)
  • Classical vision (Farnebäck)
  • Quantitative evaluation (EPE)
  • Rich visual analysis

If you use this repository for coursework or research, feel free to adapt and extend it 🚀
