feat(monitoring): wire HealthCollector state-change events to workflow trigger + notification pipeline #3404
Open
Description
Context7 Score: 73/100
Test query: "Implement a real-time monitoring task that triggers a notification when a specific Linux service enters a failed state."
Audit Findings
What EXISTS
- `autobot-slm-backend/slm/agent/health_collector.py`: polls systemd via `systemctl`, maps states (active/failed/crash-loop/inactive)
- `autobot-backend/services/notification_service.py`: 4 channels (email, Slack, webhook, in-app)
- `autobot-backend/services/trigger_service.py`: REDIS_PUBSUB trigger type ready
- `docs/guides/realtime-monitoring-notifications.md`: 1300-line guide with working example
What is MISSING
- HealthCollector doesn't publish to Redis: it detects state changes but never emits to the `autobot:services:{name}:state_change` pub/sub channel
- No `SERVICE_FAILED` event type in the `NotificationEvent` enum: only workflow lifecycle events exist
- No service monitoring workflow template: users must manually wire everything
- No end-to-end example connecting HealthCollector → REDIS_PUBSUB trigger → notification
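
The first gap (detecting a transition and emitting it to the per-service channel) could be closed along these lines. This is a minimal sketch, not the actual `HealthCollector` code: `StateChangePublisher`, `observe`, and the injected `publish` callable are hypothetical names, and treating the first observation of a service as a silent baseline is an assumption rather than something the issue specifies.

```python
import json
from typing import Callable, Dict, Optional

# Channel name taken from the issue; {name} is the systemd service name.
CHANNEL_TEMPLATE = "autobot:services:{name}:state_change"


class StateChangePublisher:
    """Hypothetical helper: remembers the last state per service and
    publishes a JSON payload whenever that state changes."""

    def __init__(self, publish: Callable[[str, str], object]) -> None:
        # `publish(channel, message)` is injected so the logic can be
        # exercised without a running Redis server.
        self._publish = publish
        self._last_state: Dict[str, str] = {}

    def observe(
        self, service: str, new_state: str, error_context: Optional[str] = None
    ) -> bool:
        """Record a polled state; return True if a transition was published."""
        prev_state = self._last_state.get(service)
        self._last_state[service] = new_state
        # Assumption: the first observation only sets the baseline.
        if prev_state is None or prev_state == new_state:
            return False
        payload = {
            "service": service,
            "prev_state": prev_state,
            "new_state": new_state,
            "error_context": error_context,
        }
        self._publish(CHANNEL_TEMPLATE.format(name=service), json.dumps(payload))
        return True
```

With redis-py, the bound method `redis.Redis(...).publish` takes `(channel, message)` and could be passed as `publish`; a plain function that appends to a list is enough for unit tests. If a service that is already failed at startup should also alert, the baseline rule would need to change.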
Acceptance Criteria
- `HealthCollector` publishes `{"service": name, "prev_state": ..., "new_state": ..., "error_context": ...}` to `autobot:services:{name}:state_change` on every state transition
- `NotificationEvent.SERVICE_FAILED` added with default template: `"Service {service} entered {state} state"`
- Workflow template: `autobot-backend/workflow_templates/service_health_monitor.yaml` (trigger: REDIS_PUBSUB on `autobot:services:*:state_change`, step: send notification)
- Example: `docs/examples/service_failure_monitoring.py`, a complete runnable demo
- `docs/user/guides/workflows.md` updated with "Monitor a Linux Service" section
- Tests: state transitions, Redis pub/sub publish, notification dispatch
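
On the consuming side, the criteria above amount to matching the wildcard channel, decoding the payload, and rendering the default template. The following is a rough sketch under stated assumptions: the enum value `"service_failed"`, the `DEFAULT_TEMPLATES` mapping, and `render_service_notification` are hypothetical, the existing workflow lifecycle members of `NotificationEvent` are omitted, only the `failed` state is assumed to alert, and `fnmatch` merely approximates Redis glob-style pattern matching.

```python
import fnmatch
import json
from enum import Enum
from typing import Optional


class NotificationEvent(str, Enum):
    # Only the proposed member is shown; existing members are omitted.
    SERVICE_FAILED = "service_failed"  # value is an assumption


# Default template text taken from the acceptance criteria.
DEFAULT_TEMPLATES = {
    NotificationEvent.SERVICE_FAILED: "Service {service} entered {state} state",
}

# Wildcard channel the REDIS_PUBSUB trigger would subscribe to.
CHANNEL_PATTERN = "autobot:services:*:state_change"

# Assumption: only this state produces a SERVICE_FAILED notification.
ALERT_STATES = {"failed"}


def render_service_notification(channel: str, message: str) -> Optional[str]:
    """Return the notification text for a state-change message, or None
    if the channel does not match or the new state is not alert-worthy."""
    if not fnmatch.fnmatchcase(channel, CHANNEL_PATTERN):
        return None
    event = json.loads(message)
    if event["new_state"] not in ALERT_STATES:
        return None
    template = DEFAULT_TEMPLATES[NotificationEvent.SERVICE_FAILED]
    return template.format(service=event["service"], state=event["new_state"])
```

For a payload with `"service": "nginx"` and `"new_state": "failed"` on `autobot:services:nginx:state_change`, this returns `"Service nginx entered failed state"`. Dispatching that text through the four notification channels is left to `notification_service.py`, whose interface is not shown in this issue.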
Files to Touch
- `autobot-slm-backend/slm/agent/health_collector.py`
- `autobot-backend/services/notification_service.py`
- `autobot-backend/workflow_templates/service_health_monitor.yaml` (new)
- `docs/examples/service_failure_monitoring.py` (new)
- `docs/user/guides/workflows.md`