feat(data-processing): add isolation forest anomaly detection#468

Open
David-patrick-chuks wants to merge 3 commits into Pulsefy:main from David-patrick-chuks:issue-453-isolation-forest

@David-patrick-chuks

Summary

Replaced the existing Z-score-only anomaly detector with an ML-based approach using scikit-learn's Isolation Forest in apps/data-processing/src/anomaly_detector.py. The update keeps the legacy Z-score logic available for comparison, adds configurable contamination support, and updates the pipeline flow so new samples are evaluated against the existing rolling baseline before being added.
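As a rough illustration of the "evaluate against the existing rolling baseline before being added" flow, here is a minimal standard-library sketch using the legacy Z-score check. The class name `RollingZScoreBaseline`, the method `evaluate_then_add`, and the default window size, threshold, and minimum sample count are all hypothetical and not taken from `anomaly_detector.py`. The sketch also assumes that a flagged sample still joins the baseline afterwards; the actual code may handle that differently.

```python
from collections import deque
from statistics import mean, pstdev


class RollingZScoreBaseline:
    """Hypothetical rolling window that scores a sample before storing it."""

    def __init__(self, window_size=50, threshold=3.0, min_samples=5):
        self.window = deque(maxlen=window_size)  # oldest samples fall off
        self.threshold = threshold
        self.min_samples = min_samples

    def evaluate_then_add(self, value):
        """Return True if value is anomalous relative to the prior window."""
        is_anomaly = False
        if len(self.window) >= self.min_samples:
            mu = mean(self.window)
            sigma = pstdev(self.window)
            # A zero-variance window cannot produce a meaningful Z-score.
            if sigma > 0:
                is_anomaly = abs(value - mu) / sigma > self.threshold
        # Only after scoring does the sample join the baseline, so a spike
        # cannot inflate the statistics it is judged against.
        self.window.append(value)
        return is_anomaly


baseline = RollingZScoreBaseline()
readings = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 42.0]
flags = [baseline.evaluate_then_add(r) for r in readings]
```

Scoring first matters because adding the spike to the window before computing the mean and standard deviation would pull both toward the spike and could hide it.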

To preserve prior spike-detection behavior while meeting the new ML requirement, the detector now uses Isolation Forest as the primary multi-feature model and still surfaces strong legacy Z-score anomalies as part of the final decision. This allows comparison against the old logic while avoiding regressions on obvious spike cases.
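The combined decision described above could look roughly like the following sketch. It is not the code in this PR: the function name `detect`, the returned dictionary keys, and the default `contamination` and `z_threshold` values are assumptions for illustration. Only `IsolationForest`, its `contamination` and `random_state` parameters, and `fit`/`predict` are real scikit-learn API.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def detect(baseline, sample, contamination=0.05, z_threshold=3.0):
    """Flag `sample` if Isolation Forest or a strong Z-score fires.

    baseline: 2-D array-like of past samples (rows) by features (columns).
    sample:   1-D array-like with one value per feature.
    """
    baseline = np.asarray(baseline, dtype=float)
    sample = np.asarray(sample, dtype=float).reshape(1, -1)

    # Primary multi-feature model, fitted on the existing baseline only.
    model = IsolationForest(contamination=contamination, random_state=0)
    model.fit(baseline)
    forest_flag = bool(model.predict(sample)[0] == -1)  # -1 means outlier

    # Legacy per-feature Z-score, kept for comparison and as a safety net.
    mu = baseline.mean(axis=0)
    sigma = baseline.std(axis=0)
    sigma[sigma == 0] = np.inf  # zero-variance features never trigger
    z_scores = np.abs((sample[0] - mu) / sigma)
    z_flag = bool((z_scores > z_threshold).any())

    return {
        "isolation_forest": forest_flag,
        "z_score": z_flag,
        "is_anomaly": forest_flag or z_flag,
    }
```

Returning both flags alongside the final decision keeps the old and new logic comparable on the same sample, which is the comparison the summary refers to.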

Linked Issue

Closes #453

Type of Change

  • feat
  • fix
  • docs
  • refactor
  • test
  • chore

Validation

  • Lint passed for affected area(s)
  • Tests passed for affected area(s)
  • Manual verification completed (if applicable)

Commands Run

python3 -m compileall apps/data-processing/src/anomaly_detector.py apps/data-processing/src/main.py apps/data-processing/src/scheduler.py

cd apps/data-processing
source .venv/bin/activate
python -m pytest --version
python -m pytest tests/test_anomaly_detector.py

Results

  • python3 -m compileall apps/data-processing/src/anomaly_detector.py apps/data-processing/src/main.py apps/data-processing/src/scheduler.py completed successfully.
  • A local virtual environment was created and the missing Python test dependencies were installed.
  • python -m pytest --version returned pytest 9.0.2.
  • python -m pytest tests/test_anomaly_detector.py was executed locally.

Screenshots / Test Evidence

image

Attach terminal screenshots here.

  • Compile output screenshot: [attach here]
  • Pytest version screenshot: [attach here]
  • Pytest run screenshot: [attach here]

Documentation

  • Documentation updated (or N/A with explanation)
  • Screenshots/videos attached for UI changes

Documentation note: N/A for separate docs files; this is a backend/data-processing change.
Screenshots/videos note: N/A for UI, but terminal validation screenshots are attached in the Validation section.

Checklist

  • Branch name uses feat/, fix/, or docs/
  • Commit messages follow Conventional Commits
  • PR scope matches linked issue acceptance criteria

@Cedarich
Contributor

Please resolve conflicts

…-forest

# Conflicts:
#	apps/data-processing/src/main.py
@David-patrick-chuks force-pushed the issue-453-isolation-forest branch from bac686d to 296973e on March 25, 2026 16:00
@David-patrick-chuks
Author

@Cedarich Done

@Cedarich
Contributor

Kindly fix workflow

@Cedarich
Contributor

Kindly fix failing workflow

Development

Successfully merging this pull request may close these issues.

ML-Based Anomaly Detection (Isolation Forest)

2 participants