
PointNet for S3DIS Scene Semantic Segmentation


A complete PyTorch implementation of PointNet for 3D indoor scene semantic segmentation using the Stanford 3D Indoor Scene Dataset (S3DIS). This project implements the architecture from scratch based on the original research paper by Qi et al.

🎯 Overview

This implementation focuses on scene semantic segmentation, classifying every point in room-scale 3D point clouds into semantic categories. The model processes entire indoor scenes and assigns semantic labels to each point, enabling detailed understanding of 3D indoor environments.

πŸ—οΈ Architecture

Core Components

  • STN3d: 3D Spatial Transformer Network for input transformation
  • STNkd: k-dimensional Spatial Transformer Network for feature alignment
  • PointNetFeatureExtractor: Main feature extraction backbone
  • PointNetSegmentation: Complete segmentation model with classification head
(Figure: PointNet segmentation architecture diagram)
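
For orientation, here is a minimal sketch of the input transform (STN3d): a small shared-MLP network that regresses a 3×3 matrix applied to the raw coordinates, biased toward the identity at initialization. This is a condensed illustration, not the repo's exact code; the full implementations live in src/models/transforms.py.

```python
import torch
import torch.nn as nn

class STN3d(nn.Module):
    """Condensed input-transform sketch: regress one 3x3 matrix per cloud,
    biased toward the identity so early training leaves points unchanged."""
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(                       # shared per-point MLP
            nn.Conv1d(3, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.BatchNorm1d(1024), nn.ReLU())
        self.fc = nn.Sequential(
            nn.Linear(1024, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, 9))                          # 9 = flattened 3x3

    def forward(self, x):                               # x: (B, 3, N)
        feat = self.mlp(x).max(dim=2).values            # (B, 1024) global feature
        mat = self.fc(feat).view(-1, 3, 3)
        eye = torch.eye(3, device=x.device).unsqueeze(0)
        return mat + eye                                # applied upstream via torch.bmm
```

STNkd follows the same pattern with a k×k output (k = 64 for the feature transform).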

✨ Key Features

  • Input transformation networks for rotation invariance
  • Optional feature transformation for better alignment
  • Point-wise classification for semantic segmentation
  • Regularization loss for transformation matrices (see the sketch below)
  • Support for 13 semantic classes from S3DIS
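
The regularization term is the orthogonality penalty from the paper, ‖I − AAᵀ‖_F on the predicted feature transform A. A minimal sketch follows; the helper's actual name and placement in this repo may differ.

```python
import torch

def feature_transform_regularizer(trans: torch.Tensor) -> torch.Tensor:
    """Encourage the predicted feature transform to stay near orthogonal.

    trans: (B, k, k) batch of matrices predicted by STNkd.
    Returns the mean Frobenius norm of (I - A @ A^T) over the batch.
    """
    k = trans.size(1)
    identity = torch.eye(k, device=trans.device).unsqueeze(0)   # (1, k, k)
    diff = identity - torch.bmm(trans, trans.transpose(1, 2))
    return torch.mean(torch.norm(diff, dim=(1, 2)))
```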

📊 Dataset

S3DIS (Stanford 3D Indoor Scene Dataset)

  • 6 indoor areas with 271 rooms
  • 13 semantic classes: ceiling, floor, wall, beam, column, window, door, chair, table, bookcase, sofa, board, clutter
  • Point clouds with RGB information
  • Instance and semantic annotations
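
To make the data layout concrete, here is an illustrative loader assuming preprocessed rooms stored as NumPy arrays with XYZ, RGB, and a label column; the file format is an assumption, and src/data/dataset.py is the project's actual loader.

```python
import numpy as np
import torch
from torch.utils.data import Dataset

class S3DISRooms(Dataset):
    """Hypothetical loader: one .npy file per room, columns = XYZ, RGB, label."""
    def __init__(self, room_files, num_points=4096):
        self.room_files = room_files
        self.num_points = num_points

    def __len__(self):
        return len(self.room_files)

    def __getitem__(self, idx):
        data = np.load(self.room_files[idx])                  # (N, 7)
        choice = np.random.choice(len(data), self.num_points, replace=True)
        sample = data[choice]                                 # fixed-size point sample
        points = torch.from_numpy(sample[:, :3]).float()      # XYZ (RGB columns could be appended)
        labels = torch.from_numpy(sample[:, 6]).long()        # semantic class ids, 0..12
        return points, labels
```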

📁 Project Structure

pointnet-s3dis/
├── src/
│   ├── models/
│   │   ├── __init__.py
│   │   ├── pointnet.py          # Core PointNet architecture
│   │   └── transforms.py        # Spatial transformer networks
│   ├── data/
│   │   ├── __init__.py
│   │   ├── dataset.py           # S3DIS dataset loader
│   │   └── preprocessing.py     # Data preprocessing utilities
│   ├── utils/
│   │   ├── __init__.py
│   │   ├── metrics.py           # Evaluation metrics
│   │   ├── visualization.py     # Visualization utilities
│   │   └── training.py          # Training utilities
│   └── train.py                 # Main training script
├── notebooks/
│   └── pointnet_implementation.ipynb
├── configs/
│   └── config.yaml
├── requirements.txt
├── README.md
└── .gitignore

🚀 Quick Start

1️⃣ Installation

git clone https://github.com/yourusername/pointnet-s3dis.git
cd pointnet-s3dis
pip install -r requirements.txt

2️⃣ Data Preparation

python src/data/preprocessing.py

3️⃣ Training

# Default training
python src/train.py

# Custom parameters
python src/train.py --batch_size 16 --num_points 4096 --epochs 100 --test_area 5
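
These flags map onto the defaults listed in the hyperparameter section below. As a reference, a hypothetical argparse stub that mirrors them (src/train.py is the authoritative list):

```python
import argparse

def parse_args():
    # Hypothetical stub mirroring the README examples and config defaults.
    parser = argparse.ArgumentParser(description="Train PointNet on S3DIS")
    parser.add_argument("--batch_size", type=int, default=16)
    parser.add_argument("--num_points", type=int, default=4096,
                        help="points sampled per room")
    parser.add_argument("--epochs", type=int, default=100)
    parser.add_argument("--test_area", type=int, default=5,
                        help="S3DIS area held out for testing")
    return parser.parse_args()
```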

4️⃣ Evaluation

python src/evaluate.py --model_path checkpoints/best_model.pth --test_area 5

5️⃣ Visualization

python src/visualize.py --model_path checkpoints/best_model.pth --num_samples 5

📈 Results

Performance Metrics

| Metric                    | Value  | Status        |
|---------------------------|--------|---------------|
| Final Validation Accuracy | 67.45% | ✅ Good       |
| Best Mean IoU             | 36.42% | ✅ Solid      |
| Final Mean IoU            | 31.41% | ✅ Reasonable |
| Training Epochs           | 100    | ⏱️ Complete   |

Per-Class IoU Results

| Class    | IoU    | Performance | Analysis                                     |
|----------|--------|-------------|----------------------------------------------|
| Floor    | 89.03% | Excellent   | Best performing - large planar surfaces      |
| Ceiling  | 83.43% | Excellent   | Strong geometric consistency                 |
| Wall     | 54.12% | Good        | Solid performance with room for improvement  |
| Bookcase | 41.17% | Moderate    | Complex furniture structure                  |
| Table    | 35.24% | Moderate    | Shape variation challenges                   |
| Chair    | 30.61% | Moderate    | High variability and occlusion               |
| Door     | 26.53% | Moderate    | Confusion with walls                         |
| Window   | 23.24% | Moderate    | Embedded in walls                            |
| Clutter  | 16.51% | Poor        | Highly variable category                     |
| Board    | 5.97%  | Very Poor   | Small objects, scale issues                  |
| Column   | 2.47%  | Very Poor   | Thin structures, limited examples            |
| Beam     | 0.00%  | Failed      | Extremely sparse in dataset                  |
| Sofa     | 0.00%  | Failed      | High variation, dataset imbalance            |
(Figure: qualitative segmentation results)
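
Per-class IoU here is the standard intersection-over-union derived from the confusion matrix. A minimal sketch, independent of this repo's metrics.py:

```python
import numpy as np

def per_class_iou(conf: np.ndarray) -> np.ndarray:
    """IoU per class from a (C, C) confusion matrix (rows = ground truth).

    IoU_c = TP_c / (TP_c + FP_c + FN_c); classes with an empty denominator
    yield NaN and can be skipped when averaging.
    """
    tp = np.diag(conf).astype(float)
    fp = conf.sum(axis=0) - tp   # predicted as c but actually another class
    fn = conf.sum(axis=1) - tp   # actually c but predicted as another class
    denom = tp + fp + fn
    return np.where(denom > 0, tp / denom, np.nan)

# Mean IoU over valid classes:
# miou = np.nanmean(per_class_iou(conf))
```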

🔧 Implementation Details

πŸ—οΈ Model Architecture

  • Input: Point clouds with XYZ coordinates (N × 3)
  • Feature Extraction: Shared MLPs with batch normalization
  • Spatial Invariance: Transformer networks for geometric robustness
  • Permutation Invariance: Global max pooling
  • Output: Point-wise classification head
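
The segmentation variant concatenates each point's local feature with the max-pooled global feature before the point-wise classifier. A condensed sketch of that data flow, with the T-Nets omitted for brevity (feature widths follow the paper; the repo's pointnet.py is the reference):

```python
import torch
import torch.nn as nn

class PointNetSegSketch(nn.Module):
    """Condensed PointNet segmentation flow: shared MLPs -> max pool ->
    concat(per-point, global) -> point-wise scores. T-Nets omitted."""
    def __init__(self, num_classes=13, in_dim=3):
        super().__init__()
        self.local = nn.Sequential(       # shared per-point MLP: (B, in, N) -> (B, 64, N)
            nn.Conv1d(in_dim, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, 64, 1), nn.BatchNorm1d(64), nn.ReLU())
        self.global_mlp = nn.Sequential(  # (B, 64, N) -> (B, 1024, N)
            nn.Conv1d(64, 128, 1), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.BatchNorm1d(1024), nn.ReLU())
        self.head = nn.Sequential(        # (B, 64+1024, N) -> (B, num_classes, N)
            nn.Conv1d(1088, 512, 1), nn.BatchNorm1d(512), nn.ReLU(),
            nn.Conv1d(512, 256, 1), nn.BatchNorm1d(256), nn.ReLU(),
            nn.Conv1d(256, num_classes, 1))

    def forward(self, x):                                    # x: (B, in_dim, N)
        local = self.local(x)                                # per-point features
        feat = self.global_mlp(local)                        # (B, 1024, N)
        g = feat.max(dim=2, keepdim=True).values             # permutation-invariant pool
        g = g.expand(-1, -1, x.size(2))                      # broadcast to every point
        return self.head(torch.cat([local, g], dim=1))       # point-wise logits
```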

🎯 Training Strategy

  • Loss: Cross-entropy with feature transformation regularization (training step sketched below)
  • Optimizer: Adam with learning rate scheduling
  • Split: Area-based (Area 5 for testing)
  • Augmentation: Point sampling and normalization
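
Combining these choices, a single epoch might look like the sketch below. It assumes the model returns per-point logits plus the 64×64 feature transform, and reuses feature_transform_regularizer from the earlier sketch; the paper weights the penalty by 0.001.

```python
import torch
import torch.nn.functional as F

def train_one_epoch(model, loader, optimizer, scheduler,
                    reg_weight=1e-3, device="cuda"):
    """Illustrative loop: cross-entropy on per-point logits plus the
    orthogonality penalty on the predicted feature transform."""
    model.train()
    for points, labels in loader:                 # points: (B, N, C), labels: (B, N)
        points = points.transpose(1, 2).to(device)   # -> (B, C, N) for Conv1d
        labels = labels.to(device)
        logits, feat_trans = model(points)        # assumed return: (B, K, N), (B, 64, 64)
        loss = F.cross_entropy(logits, labels)    # class dim is dim=1
        if feat_trans is not None:
            loss = loss + reg_weight * feature_transform_regularizer(feat_trans)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()                              # learning-rate scheduling per epoch
```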

📊 Hyperparameters

Training:
  batch_size: 16
  num_points: 4096
  epochs: 100
  learning_rate: 0.001
  weight_decay: 1e-4

Model:
  num_classes: 13
  feature_transform: true

Data:
  test_area: 5
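
These values mirror configs/config.yaml. One minimal way to load them with PyYAML, assuming the key layout above:

```python
import yaml

with open("configs/config.yaml") as f:
    cfg = yaml.safe_load(f)

batch_size = cfg["Training"]["batch_size"]              # 16
num_points = cfg["Training"]["num_points"]              # 4096
weight_decay = float(cfg["Training"]["weight_decay"])   # PyYAML reads "1e-4" as a string
use_feature_transform = cfg["Model"]["feature_transform"]  # True
test_area = cfg["Data"]["test_area"]                    # 5
```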

📊 Visualization Tools

The project includes comprehensive visualization capabilities:

  • RGB point cloud visualization
  • Semantic segmentation results
  • Confusion matrices
  • Training curve plots
  • Per-class performance analysis
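
As an example of the plotting side, a minimal confusion-matrix figure with matplotlib; the repo's visualization.py is the full version, and the class names come from the dataset section above.

```python
import numpy as np
import matplotlib.pyplot as plt

CLASSES = ["ceiling", "floor", "wall", "beam", "column", "window", "door",
           "chair", "table", "bookcase", "sofa", "board", "clutter"]

def plot_confusion_matrix(conf: np.ndarray, path="confusion_matrix.png"):
    """Row-normalized confusion-matrix heatmap for the 13 S3DIS classes."""
    norm = conf / conf.sum(axis=1, keepdims=True).clip(min=1)  # avoid divide-by-zero
    fig, ax = plt.subplots(figsize=(8, 8))
    ax.imshow(norm, cmap="Blues")
    ax.set_xticks(range(len(CLASSES)))
    ax.set_xticklabels(CLASSES, rotation=90)
    ax.set_yticks(range(len(CLASSES)))
    ax.set_yticklabels(CLASSES)
    ax.set_xlabel("Predicted")
    ax.set_ylabel("Ground truth")
    fig.tight_layout()
    fig.savefig(path, dpi=150)
```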

📚 References

Citation

If you use this implementation in your research, please cite:

@article{qi2017pointnet,
  title={PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation},
  author={Qi, Charles R and Su, Hao and Mo, Kaichun and Guibas, Leonidas J},
  journal={arXiv preprint arXiv:1612.00593},
  year={2017}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Original PointNet authors for the groundbreaking architecture
  • Stanford University for the S3DIS dataset
  • PyTorch team for the deep learning framework

⭐ Star this repo if you find it useful! ⭐
