feat: add maintenance window MCP tools for proactive deploy suppression by caballeto · Pull Request #21 · devhelmhq/mcp-server

caballeto · 2026-05-06T10:17:54Z

Summary

Adds five maintenance-window MCP tools so AI coding assistants (Cursor, Claude Desktop, Windsurf, etc.) can proactively suppress alerts during risky operations and clear them once the work succeeds.

This closes one of the highest-value gaps in the launch-story tool surface: an agent that runs a deploy script today either pages on-call when monitors briefly fail mid-rollout (noisy) or leaves the user to remember to silence things by hand (error-prone). With these tools the agent can do it itself.

The tools

| Tool | Purpose |
| --- | --- |
| `list_maintenance_windows` | Inspect active or upcoming windows; supports `monitor_id` and `status` (`active` / `upcoming`) filters |
| `get_maintenance_window` | Fetch a single window by ID with full details |
| `create_maintenance_window` | Schedule a window; call BEFORE running a deploy / migration / scheduled task |
| `update_maintenance_window` | Push the `endsAt` back when a deploy runs long (full PUT) |
| `cancel_maintenance_window` | Clear the window so alerts resume; call AFTER a deploy succeeds |

Time fields use ISO 8601 / RFC 3339 with explicit timezone (UTC preferred, e.g. 2026-05-15T14:00:00Z). Setting monitorId=null on create produces an org-wide window that suppresses every monitor.
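
As a minimal sketch of the timestamp convention above, the snippet below formats timezone-aware datetimes as RFC 3339 UTC strings and assembles a create body using the camelCase keys named in this PR (`startsAt`, `endsAt`, `monitorId`). The helper names `to_rfc3339_utc` and `build_create_body` are hypothetical, and the `reason` key is assumed from the sample workflow rather than taken from the generated models.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional


def to_rfc3339_utc(moment: datetime) -> str:
    """Format an aware datetime as RFC 3339 in UTC with a trailing 'Z'."""
    if moment.tzinfo is None:
        # A naive datetime has no explicit timezone, which the tools require.
        raise ValueError("naive datetime: attach a timezone before formatting")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_create_body(
    starts_at: datetime,
    duration: timedelta,
    reason: str,
    monitor_id: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a request body; monitor_id=None means an org-wide window."""
    return {
        "startsAt": to_rfc3339_utc(starts_at),
        "endsAt": to_rfc3339_utc(starts_at + duration),
        "monitorId": monitor_id,  # None serializes to JSON null
        "reason": reason,  # assumed key name, not confirmed by the source
    }
```

Rejecting naive datetimes up front is a design choice of this sketch: it fails loudly instead of guessing a timezone.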

Sample agent workflow

The docstrings spell this out for the LLM, but the canonical flow is:

1. Agent receives a "deploy v0.7.3 to prod" instruction.
2. Agent calls create_maintenance_window with startsAt = now, endsAt = now + 30min, reason = "v0.7.3 deploy".
3. Agent runs the deploy script. Monitors briefly fail. No pages.
4. On success, agent calls cancel_maintenance_window(window_id) so any post-deploy regression pages on-call immediately.
5. If the deploy runs long, agent calls update_maintenance_window with a later endsAt rather than letting the window expire mid-deploy and flood the channel with pages.
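
The flow above can be sketched as a small driver. This is an illustration under stated assumptions, not the server's code: the `tools` object is a hypothetical stand-in exposing the three tool names as methods, the parameter names and the five-minute `margin` are invented for the example, and the clock is injected so the logic can be exercised without waiting.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Callable, Iterable


def _iso(moment: datetime) -> str:
    """RFC 3339 UTC string, e.g. 2026-05-15T14:00:00Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def deploy_with_suppression(
    tools: Any,  # hypothetical object exposing the three tool calls
    steps: Iterable[Callable[[], None]],
    reason: str,
    now: Callable[[], datetime] = lambda: datetime.now(timezone.utc),
    window: timedelta = timedelta(minutes=30),
    margin: timedelta = timedelta(minutes=5),  # assumed safety margin
) -> str:
    starts_at = now()
    ends_at = starts_at + window
    # Create the window BEFORE any deploy step runs.
    window_id = tools.create_maintenance_window(_iso(starts_at), _iso(ends_at), reason)
    for step in steps:
        current = now()
        if ends_at - current < margin:
            # The deploy is running long: push endsAt back. The update is a
            # full PUT, so every field is sent again.
            ends_at = current + window
            tools.update_maintenance_window(
                window_id, _iso(starts_at), _iso(ends_at), reason
            )
        step()
    # Cancel only after every step succeeded, so regressions page right away.
    tools.cancel_maintenance_window(window_id)
    return window_id
```

If a step raises, this sketch lets the exception propagate and leaves the window to expire on its own; the PR does not prescribe a failure policy, so that behavior is a choice made for the example.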

Implementation notes

SDK coupling: a parallel SDK PR adds client.maintenance_windows, but it has not shipped yet. Until it does, the tools call /api/v1/maintenance-windows through the SDK's existing low-level helpers (api_get / api_post / api_put / api_delete) plus the generated CreateMaintenanceWindowRequest / MaintenanceWindowDto / UpdateMaintenanceWindowRequest Pydantic models. Once the SDK ships the resource, every tool body collapses to a one-liner; the public tool surface stays unchanged.
SDK lock bump: bumps the locked devhelm to 0.6.3 (latest on PyPI) to pick up the generated maintenance-window models. The pyproject pin (devhelm>=0.6.0) is unchanged — this is a lock-only bump.
No managedBy on this resource: unlike Monitor, the maintenance-window API doesn't carry a managedBy column. Surface attribution still happens via X-DevHelm-Surface: mcp on every request, so dashboard filters by surface continue to work; we just don't have a per-row attribution channel here. The schema-hygiene test pins the absence so a future SDK regen that bolts the field on can't silently expose it to the LLM.
api_token hidden from the LLM: every tool keeps the api_token kwarg for path-style /{api_key}/mcp clients but the field is stripped from the inputSchema by the existing _strip_internal_schema_fields lifespan hook (P2.Bug7). The tests pin this for all five new tools.
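
To illustrate the schema-stripping idea in the last note, here is a minimal sketch of removing an internal field from a JSON Schema dict. It is not the actual `_strip_internal_schema_fields` hook, whose implementation is not shown in this PR; `strip_internal_fields` and `INTERNAL_FIELDS` are hypothetical names.

```python
from copy import deepcopy
from typing import Any

# Fields the server accepts but the LLM should never see (assumed set).
INTERNAL_FIELDS = frozenset({"api_token"})


def strip_internal_fields(input_schema: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a JSON Schema object without the internal fields."""
    cleaned = deepcopy(input_schema)  # leave the caller's schema untouched
    properties = cleaned.get("properties", {})
    for name in INTERNAL_FIELDS:
        properties.pop(name, None)
    if "required" in cleaned:
        # Keep 'required' consistent with the remaining properties.
        cleaned["required"] = [
            name for name in cleaned["required"] if name not in INTERNAL_FIELDS
        ]
    return cleaned
```

A regression test in the spirit of the schema-hygiene checks would assert that `"api_token"` is absent from the stripped schema's `properties` for each tool.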

Test plan

uv sync — pulls devhelm 0.6.3 from PyPI
uv run ruff check src/ tests/ — clean
uv run ruff format --check src/ tests/ — clean
uv run mypy src/ — clean (Python 3.11 and 3.13)
uv run pytest tests/ -x — 135 passed (110 baseline + 25 new) on Python 3.11 and 3.13
Tool registration: every tool surfaces with non-empty description
HTTP wire contract: every tool builds the right path / method / body, including camelCase aliases (startsAt, endsAt, monitorId) and filter / monitorId query keys
Schema hygiene: api_token not in any inputSchema; managedBy not on create / update body schemas
Error surfacing: upstream DevhelmApiError propagates as isError=True with the formatted ApiError envelope (P1.Bug3)
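
The HTTP wire-contract checks can be pictured with a small mock-based test. This is a hedged sketch only: `create_window_tool` is a hypothetical stand-in for a tool body, the `api_post(path, body)` call shape and the `{"id": ...}` response are assumptions about the SDK helper, and the real tests patch the helpers rather than passing a client in.

```python
from typing import Any, Optional
from unittest.mock import MagicMock


def create_window_tool(
    client: Any, starts_at: str, ends_at: str, monitor_id: Optional[str] = None
) -> Any:
    """Hypothetical tool body: POST a camelCase payload to the windows path."""
    body = {"startsAt": starts_at, "endsAt": ends_at, "monitorId": monitor_id}
    return client.api_post("/api/v1/maintenance-windows", body)


def test_create_uses_camel_case_on_the_wire() -> None:
    client = MagicMock()
    client.api_post.return_value = {"id": "w1"}  # assumed response shape

    result = create_window_tool(
        client, "2026-05-15T14:00:00Z", "2026-05-15T14:30:00Z"
    )

    # Exactly one POST, to the expected path, with camelCase keys.
    client.api_post.assert_called_once_with(
        "/api/v1/maintenance-windows",
        {
            "startsAt": "2026-05-15T14:00:00Z",
            "endsAt": "2026-05-15T14:30:00Z",
            "monitorId": None,
        },
    )
    assert result == {"id": "w1"}
```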

Out of scope

pyproject.toml / serverInfo.version bumps — release engineering owns those.
Switching to client.maintenance_windows.create(...) — that's a follow-up for v0.7.3 polish once the SDK PR merges and a new devhelm release ships.

Made with Cursor

Five new tools so AI coding assistants can suppress alerts around risky operations:

- list_maintenance_windows: inspect active / upcoming windows
- get_maintenance_window: fetch one window with full details
- create_maintenance_window: schedule downtime BEFORE a deploy
- update_maintenance_window: extend ``endsAt`` when a deploy runs long
- cancel_maintenance_window: clear suppression after success

The tools wrap ``/api/v1/maintenance-windows`` directly through the SDK's low-level helpers because the parallel ``client.maintenance_windows`` SDK PR hasn't shipped yet. Once it does, every tool body collapses to a one-liner against the resource without changing the public tool surface.

Bumps the locked ``devhelm`` SDK to 0.6.3 to pick up the generated ``CreateMaintenanceWindowRequest`` / ``MaintenanceWindowDto`` / ``UpdateMaintenanceWindowRequest`` Pydantic models. The pyproject pin (``devhelm>=0.6.0``) is unchanged; this is a lock-only bump.

Co-authored-by: Cursor <cursoragent@cursor.com>

Adds 25 tests across five concerns:

- registration: every new tool surfaces in ``mcp.list_tools`` with a non-empty description (LLM docs)
- HTTP contract: each tool builds the right path / method / body via patched ``api_get`` / ``api_post`` / ``api_put`` / ``api_delete``, including the camelCase aliases on the wire (``startsAt``, ``endsAt``, ``monitorId``) and the ``filter`` / ``monitorId`` query keys for the list endpoint
- schema hygiene: regression that ``api_token`` never leaks into the LLM-facing ``inputSchema`` (P2.Bug7 contract) and that the ``CreateMaintenanceWindowRequest`` / ``UpdateMaintenanceWindowRequest`` body schemas don't expose ``managedBy``; pinning now means a future SDK regen that bolts the field on can't silently surface it
- error surfacing: upstream ``DevhelmApiError`` propagates to the client as ``isError=True`` with the formatted ApiError envelope (P1.Bug3 contract)
- expected-tools list: keeps ``test_tools.py`` in sync so the count assertion reflects the new five tools

Total suite is now 135 tests (110 baseline + 25 new), all green on Python 3.11 and 3.13.

Co-authored-by: Cursor <cursoragent@cursor.com>

caballeto and others added 2 commits May 6, 2026 12:16

caballeto merged commit b7624bf into main May 6, 2026
4 checks passed

caballeto deleted the feat/maintenance-window-tools branch May 6, 2026 10:44
