DSPRO1 - CarMa

Car Market Analysis — Swiss used car price prediction using machine learning.

HSLU Data Science Project 1 (HS25) | Documentation | W&B Dashboard | References

Results

| Model | R² | MAE (CHF) | RMSE (CHF) | Status |
|---|---|---|---|---|
| Random Forest | 0.68 | 6,490 | 9,850 | Baseline |
| Neural Network | 0.96 | 3,059 | 4,200 | Best |

The best neural network result was obtained from a W&B hyperparameter sweep (s5dzwgec).
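For readers unfamiliar with the three metrics in the table, the following is a minimal, standalone sketch of how R², MAE, and RMSE can be computed from true and predicted prices. The function name `regression_metrics` and the sample prices are illustrative assumptions; they are not taken from the project code or its data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R², MAE, and RMSE for paired lists of prices (CHF).

    Hypothetical helper for illustration; not the project's evaluation code.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # MAE: average absolute deviation, in the same unit as the target (CHF).
    mae = sum(abs(e) for e in errors) / n

    # RMSE: square root of the mean squared error; penalises large misses more.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # R²: share of target variance explained, relative to predicting the mean.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"r2": r2, "mae": mae, "rmse": rmse}


# Illustrative prices only, not project data.
actual = [10000.0, 20000.0, 30000.0, 40000.0]
predicted = [12000.0, 18000.0, 33000.0, 39000.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

Because RMSE squares each error before averaging, it is always at least as large as MAE, which matches the pattern in the table.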

Architecture

flowchart LR
    subgraph Data
        A[Raw Data] --> B[Cleaned]
        B --> C[Train/Val/Test Split]
    end

    subgraph Pipelines
        C --> D[RF Pipeline]
        C --> E[NN Pipeline]
    end

    subgraph Training
        D --> F[RandomForest]
        E --> G[MLP Neural Net]
    end

    subgraph Evaluation
        F --> H[Metrics & Viz]
        G --> H
        H --> I[W&B Logging]
    end
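The Train/Val/Test Split step in the diagram could look roughly like the sketch below. The split fractions (70/15/15), the seed, and the function name `train_val_test_split` are assumptions made for illustration; the project's actual proportions and implementation are not stated here.

```python
import random


def train_val_test_split(rows, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle rows with a fixed seed and cut them into three disjoint parts.

    Hypothetical helper; fractions and seed are assumed, not project values.
    """
    if not 0 < val_frac + test_frac < 1:
        raise ValueError("val_frac + test_frac must be between 0 and 1")
    shuffled = list(rows)
    # A dedicated Random instance keeps the split reproducible and
    # independent of the global random state.
    random.Random(seed).shuffle(shuffled)

    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)

    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


listings = list(range(100))  # stand-in for cleaned listing rows
train, val, test = train_val_test_split(listings)
print(len(train), len(val), len(test))
```

Splitting once, before either pipeline runs, means both models are evaluated on the same held-out rows, which keeps the comparison between them fair.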

Factory Pattern: All components are instantiated via factories for reproducibility.

from src.pipelines import PipelineFactory

pipeline = PipelineFactory.create("NeuralNetworkPipeline")
result = pipeline.run()  # Returns metrics, model path, artifacts

See Architecture Documentation for detailed diagrams.
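To illustrate the idea behind the factory call shown above, here is a minimal, standalone sketch of a registry-based factory. It is not the project's `src.pipelines` implementation: the `register` decorator, the `BasePipeline` class, the `seed` parameter, and the returned dictionary are all hypothetical and only mirror the `PipelineFactory.create(...)` and `pipeline.run()` calls from the snippet.

```python
from typing import Callable, Dict, Type


class BasePipeline:
    """Hypothetical base class; the project's real interface may differ."""

    def run(self) -> dict:
        raise NotImplementedError


class PipelineFactory:
    """Minimal registry-based factory (illustrative, not the project's code)."""

    _registry: Dict[str, Type[BasePipeline]] = {}

    @classmethod
    def register(cls, name: str) -> Callable[[Type[BasePipeline]], Type[BasePipeline]]:
        # Decorator that maps a string name to a pipeline class.
        def decorator(pipeline_cls: Type[BasePipeline]) -> Type[BasePipeline]:
            cls._registry[name] = pipeline_cls
            return pipeline_cls

        return decorator

    @classmethod
    def create(cls, name: str, **kwargs) -> BasePipeline:
        # Look the class up by name and fail loudly on unknown names.
        try:
            pipeline_cls = cls._registry[name]
        except KeyError:
            known = ", ".join(sorted(cls._registry)) or "none"
            raise ValueError(f"Unknown pipeline '{name}'. Registered: {known}") from None
        return pipeline_cls(**kwargs)


@PipelineFactory.register("NeuralNetworkPipeline")
class NeuralNetworkPipeline(BasePipeline):
    def __init__(self, seed: int = 42):
        self.seed = seed

    def run(self) -> dict:
        # Placeholder result; a real pipeline would train and evaluate here.
        return {"pipeline": "NeuralNetworkPipeline", "seed": self.seed}


pipeline = PipelineFactory.create("NeuralNetworkPipeline", seed=7)
result = pipeline.run()
print(result)
```

Creating components by name lets a configuration file decide which pipeline runs, so a given run can be reproduced from its config alone.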

Quick Start

# Clone and setup
git clone https://github.com/S4sch/DSPRO1.git && cd DSPRO1
./setup.sh  # Creates venv, installs deps, configures pre-commit

# Run models
python scripts/step05_random_forest.py   # RF baseline
python scripts/step06_neural_network.py  # NN model

Manual Installation

git lfs install && git lfs pull
python3 -m venv venv && source venv/bin/activate
pip install -e ".[dev]"
pre-commit install

Project Structure

DSPRO1/
├── configs/           # YAML configurations (orchestration, model, training)
├── data/              # Raw and processed data (Git LFS)
├── docs/              # Documentation and diagrams
├── notebooks/         # Jupyter notebooks (EDA, baselines, HPO)
├── reports/           # Generated reports and figures
├── scripts/           # Pipeline execution scripts
├── src/               # Source code (factory-based architecture)
│   ├── datasets/      # Dataset classes (RF, NN)
│   ├── models/        # Model architectures (MLP)
│   ├── pipelines/     # ML pipelines with stages
│   ├── trainers/      # Training logic (RF, NN)
│   └── transforms/    # Data transformations
└── tests/             # Unit and integration tests

Key Notebooks

| Notebook | Purpose |
|---|---|
| `baseline_rf.ipynb` | Random Forest baseline analysis |
| `baseline_nn.ipynb` | Neural Network baseline analysis |
| `hpo_nn.ipynb` | NN hyperparameter optimization |
| `experiments.ipynb` | Central experiment dashboard |
| `EDA01_*.ipynb` | Exploratory data analysis |

Documentation

| Document | Description |
|---|---|
| Documentation Index | Complete documentation hub |
| Architecture | System design and diagrams |
| Reproducibility Guide | How to reproduce results |
| Technical Report | LaTeX academic report |

Links & Resources

| Resource | Link |
|---|---|
| W&B Dashboard | wandb.ai/hs25_dspro1/dspro1_carma |
| Best NN Sweep | s5dzwgec (R²=0.9577) |
| RF Sweep | 4diy2psz (R²=0.679) |
| GitHub Repository | github.com/S4sch/DSPRO1 |

Development

make lint      # Run Ruff linting
make test      # Run pytest
make ci-local  # Full CI check locally

See CONTRIBUTING.md for development workflow and guidelines.

Authors

Javier Izquierdo — @javihslu
Sascha Lüscher — @S4sch

Acknowledgments

Supervisors: Dr. Seraina Glaus, Dr. Umberto Michelucci, Dr. Jan Svoboda (HSLU)

AI Assistance: We used AI-assisted tools (ChatGPT, GitHub Copilot, and similar LLMs) to support development and documentation. All AI-generated content was reviewed and verified by the authors, who take full responsibility for the final work.

License

This project is licensed under the MIT License - see LICENSE for details.
