Conversation
Implements Nagios XI integration for Keep, allowing users to: - Pull host and service alerts from Nagios XI API - Receive webhook notifications from Nagios - Map Nagios states to Keep alert statuses Closes keephq#3960
|
Someone is attempting to deploy a commit to the KeepHQ Team on Vercel. A member of the Team first needs to authorize it. |
|
Related Documentation No published documentation to review for changes on this repository. |
|
Jean Bot seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
|
Target branch is not in the allowed branches list. |
There was a problem hiding this comment.
Pull request overview
Implements a new Nagios XI provider to ingest alerts into Keep via both polling the Nagios XI REST API and formatting Nagios webhooks.
Changes:
- Adds
NagiosProviderAuthConfigandNagiosProviderwith API-based fetching of host and service problems from Nagios XI v1 endpoints. - Maps Nagios host/service states and severities into
AlertStatus/AlertSeverityand converts Nagios API responses intoAlertDtoinstances. - Introduces a
_format_alertwebhook handler to normalize incoming Nagios notifications into Keep’s alert model.
Reviewed changes
Copilot reviewed 1 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
keep/providers/nagios_provider/nagios_provider.py |
Defines the Nagios provider, including auth config, REST polling, Nagios→Keep state/severity mapping, and webhook alert formatting. |
keep/providers/nagios_provider/__init__.py |
Declares the nagios_provider package so the factory can import the provider module. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| return AlertDto( | ||
| id=f"nagios-host-{host.get('host_id', '')}", | ||
| name=f"Host: {host.get('name', 'Unknown')}", | ||
| description=host.get("status_text", host.get("plugin_output", "")), | ||
| status=self.HOST_STATUS_MAP.get(current_state, AlertStatus.FIRING), | ||
| severity=self.SEVERITY_MAP.get(current_state, AlertSeverity.WARNING), | ||
| source=["nagios"], | ||
| lastReceived=datetime.datetime.fromisoformat( | ||
| host.get("last_check", datetime.datetime.now().isoformat()) | ||
| ) if host.get("last_check") else datetime.datetime.now(), |
There was a problem hiding this comment.
AlertDto.lastReceived is being passed a datetime object here, but AlertDto's lastReceived validator (see keep/api/models/alert.py:170-208) expects a string/timestamp and will fail when given a datetime, causing _host_to_alert to raise and return None so host alerts are silently dropped. To avoid this, either pass a string (e.g. using .isoformat() or the raw API value) or let AlertDto handle the original last_check value directly and remove the datetime.fromisoformat call here.
| return AlertDto( | ||
| id=f"nagios-service-{service.get('servicestatus_id', '')}", | ||
| name=f"Service: {host_name}/{service_name}", | ||
| description=service.get("status_text", service.get("plugin_output", "")), | ||
| status=self.SERVICE_STATUS_MAP.get(current_state, AlertStatus.FIRING), | ||
| severity=self.SEVERITY_MAP.get(current_state, AlertSeverity.WARNING), | ||
| source=["nagios"], | ||
| lastReceived=datetime.datetime.fromisoformat( | ||
| service.get("last_check", datetime.datetime.now().isoformat()) | ||
| ) if service.get("last_check") else datetime.datetime.now(), |
There was a problem hiding this comment.
Same as for hosts, lastReceived is set to a datetime object here, which conflicts with AlertDto's lastReceived validator expecting a string and will cause _service_to_alert to fail and drop service alerts. Align this with other providers by passing a string value (e.g. ISO-formatted or the raw last_check from the API) instead of a datetime instance.
| # Service alert | ||
| state = int(event.get("servicestate", event.get("state", 0))) | ||
| status = NagiosProvider.SERVICE_STATUS_MAP.get(state, AlertStatus.FIRING) | ||
| severity = NagiosProvider.SEVERITY_MAP.get(state, AlertSeverity.WARNING) | ||
|
|
||
| host_name = event.get("hostname", event.get("host", "Unknown")) | ||
| service_name = event.get("servicedesc", event.get("service", "Unknown")) | ||
|
|
||
| return AlertDto( | ||
| id=event.get("id", f"nagios-{host_name}-{service_name}"), | ||
| name=f"Service: {host_name}/{service_name}", | ||
| description=event.get("output", event.get("message", "")), | ||
| status=status, | ||
| severity=severity, | ||
| source=["nagios"], | ||
| fingerprint=f"nagios-service-{host_name}-{service_name}", | ||
| labels={ | ||
| "host_name": host_name, | ||
| "service_name": service_name, | ||
| "notification_type": event.get("notificationtype", ""), | ||
| }, | ||
| ) | ||
| else: | ||
| # Host alert | ||
| state = int(event.get("hoststate", event.get("state", 0))) | ||
| status = NagiosProvider.HOST_STATUS_MAP.get(state, AlertStatus.FIRING) | ||
| severity = NagiosProvider.SEVERITY_MAP.get(state, AlertSeverity.WARNING) |
There was a problem hiding this comment.
The webhook formatter assumes servicestate/hoststate (or state) are numeric and calls int(...), but Nagios notification macros like SERVICESTATE/HOSTSTATE are typically strings such as "OK", "WARNING", "CRITICAL", "UP", "DOWN"; trying to cast those to int will raise and break webhook handling, contradicting the documented state mapping behavior. Consider supporting both textual and numeric states (e.g. by normalizing strings through a separate mapping or by trying int only when the value is digit-like) so that common Nagios webhook payloads are accepted.
Summary
Implements Nagios XI integration for Keep, allowing users to pull alerts from Nagios monitoring systems.
Features
Implementation Details
Testing
Closes #3960
This PR was created to claim the Nagios Provider bounty.