
MCP run_tests tool — on-demand test execution for agents #19

@hellno


Summary

Add a run_tests MCP tool that lets AI agents trigger a project's test suite on demand and receive structured pass/fail results. Today agents can deploy and check logs, but they cannot run tests and iterate on the results, which is the tightest feedback loop for producing correct code.

Motivation

Agents produce significantly better work when given immediate, deterministic feedback. Currently the only "backpressure" Jack provides is deploy success/failure and log output. A structured test runner would let agents:

  • Write code → run tests → fix failures → repeat, without human intervention
  • Validate changes before deploying (shift-left)
  • Get machine-readable results (not just log text) to reason about failures precisely

Proposed behavior

MCP tool: run_tests

Input:
  - project_id (optional, defaults to current project)
  - test_command (optional, auto-detected from package.json scripts when omitted)
  - filter (optional, run specific test files/patterns)

Output:
  - success: boolean
  - summary: { total, passed, failed, skipped }
  - failures: [{ test_name, file, error_message, diff? }]
  - duration_ms: number
  - raw_output: string (truncated)
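
For illustration, the input and output shapes above could be written as TypeScript types. This is only a sketch: the type names, the buildResult and truncateOutput helpers, and the 10000-character cap are assumptions, not part of the proposal. Only the field names come from the lists above.

```typescript
// Hypothetical type names; field names follow the proposal.
interface RunTestsInput {
  project_id?: string;   // defaults to the current project
  test_command?: string; // auto-detected from package.json scripts when omitted
  filter?: string;       // specific test files or patterns
}

interface TestFailure {
  test_name: string;
  file: string;
  error_message: string;
  diff?: string;
}

interface TestSummary {
  total: number;
  passed: number;
  failed: number;
  skipped: number;
}

interface RunTestsResult {
  success: boolean;
  summary: TestSummary;
  failures: TestFailure[];
  duration_ms: number;
  raw_output: string; // truncated
}

// Assumed cap; the proposal only says "truncated".
const MAX_RAW_OUTPUT_CHARS = 10000;

// Keeps the tail of the output, on the assumption that runners print
// failures and the final summary last.
function truncateOutput(raw: string, max: number = MAX_RAW_OUTPUT_CHARS): string {
  if (raw.length <= max) return raw;
  return "[truncated]\n" + raw.slice(raw.length - max);
}

// Assembles a result; success means no failed tests were reported.
function buildResult(
  summary: TestSummary,
  failures: TestFailure[],
  durationMs: number,
  rawOutput: string
): RunTestsResult {
  return {
    success: summary.failed === 0 && failures.length === 0,
    summary,
    failures,
    duration_ms: durationMs,
    raw_output: truncateOutput(rawOutput),
  };
}

const exampleInput: RunTestsInput = { filter: "math" };
const exampleResult = buildResult(
  { total: 3, passed: 2, failed: 1, skipped: 0 },
  [{ test_name: "adds numbers", file: "src/math.test.ts", error_message: "expected 3, received 4" }],
  1250,
  "1 failed, 2 passed"
);
console.log(exampleInput, JSON.stringify(exampleResult, null, 2));
```

Keeping the tail rather than the head of raw_output is a design choice made for this sketch; the issue does not say which end should be kept.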

Detection logic

  • Check package.json for test, test:unit, test:integration scripts
  • Support common runners: vitest, jest, bun test, playwright
  • Parse structured output (JSON reporters) when available; otherwise fall back to stdout parsing
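
The detection steps above might be sketched roughly as follows. The function names (detectTestScript, detectRunner, parseJestJson) are hypothetical, the runner match is a naive check on the script body, and the parser assumes a Jest-style JSON report (the shape produced by jest --json); other runners' reports would need their own parsers.

```typescript
type Runner = "vitest" | "jest" | "bun test" | "playwright" | "unknown";

interface PackageJson {
  scripts?: Record<string, string>;
}

// Script names from the proposal, checked in this (assumed) priority order.
const CANDIDATE_SCRIPTS = ["test", "test:unit", "test:integration"];

function detectTestScript(pkg: PackageJson): { script: string; command: string } | null {
  for (const script of CANDIDATE_SCRIPTS) {
    const command = pkg.scripts?.[script];
    if (command && command.trim() !== "") return { script, command };
  }
  return null;
}

// Naive match on the script body; scripts that wrap the runner are not handled.
function detectRunner(command: string): Runner {
  if (/\bvitest\b/.test(command)) return "vitest";
  if (/\bjest\b/.test(command)) return "jest";
  if (/\bbun\s+test\b/.test(command)) return "bun test";
  if (/\bplaywright\b/.test(command)) return "playwright";
  return "unknown";
}

// Subset of the Jest JSON report fields used here.
interface JestJsonReport {
  numTotalTests: number;
  numPassedTests: number;
  numFailedTests: number;
  numPendingTests: number;
  testResults: {
    name: string; // test file path
    assertionResults: { fullName: string; status: string; failureMessages: string[] }[];
  }[];
}

interface ParsedReport {
  summary: { total: number; passed: number; failed: number; skipped: number };
  failures: { test_name: string; file: string; error_message: string }[];
}

// Returns null when stdout is not a Jest-style report, so the caller can
// fall back to plain stdout parsing.
function parseJestJson(stdout: string): ParsedReport | null {
  let report: JestJsonReport;
  try {
    report = JSON.parse(stdout);
  } catch {
    return null;
  }
  if (typeof report?.numTotalTests !== "number" || !Array.isArray(report.testResults)) return null;
  const failures: ParsedReport["failures"] = [];
  for (const file of report.testResults) {
    for (const assertion of file.assertionResults ?? []) {
      if (assertion.status === "failed") {
        failures.push({
          test_name: assertion.fullName,
          file: file.name,
          error_message: (assertion.failureMessages ?? []).join("\n"),
        });
      }
    }
  }
  return {
    summary: {
      total: report.numTotalTests,
      passed: report.numPassedTests,
      failed: report.numFailedTests,
      skipped: report.numPendingTests,
    },
    failures,
  };
}

const detected = detectTestScript({ scripts: { build: "tsc", "test:unit": "vitest run" } });
console.log(detected, detected ? detectRunner(detected.command) : "unknown");
```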

Execution

  • BYO mode: Run locally via shell
  • Managed mode: Run in Jack Cloud sandbox (requires compute allocation — could be a follow-up)

Acceptance criteria

  • run_tests MCP tool callable from Claude Code / Claude Desktop
  • Auto-detects test command from project config
  • Returns structured results (not just raw text)
  • Handles timeouts gracefully (default 60s, configurable)
  • Works in BYO mode; managed mode can return "not yet supported" initially
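
As a rough sketch of the timeout and managed-mode criteria, the helpers below resolve the configured timeout, race a test run against it, and produce a "not yet supported" response. All names here (resolveTimeoutMs, withTimeout, managedModeNotSupported) and the error text are assumptions for illustration. The sketch also does not spawn a real process, so the caller would need to kill the test process in onTimeout.

```typescript
// Default from the acceptance criteria: 60 seconds.
const DEFAULT_TIMEOUT_MS = 60000;

// Falls back to the default when no positive timeout is configured.
function resolveTimeoutMs(requested?: number): number {
  return typeof requested === "number" && requested > 0 ? requested : DEFAULT_TIMEOUT_MS;
}

type Timed<T> = { timedOut: false; value: T } | { timedOut: true };

// Resolves with the work's value, or with { timedOut: true } once timeoutMs
// elapses. onTimeout is where a caller would terminate the test process.
function withTimeout<T>(
  work: Promise<T>,
  timeoutMs: number,
  onTimeout: () => void = () => {}
): Promise<Timed<T>> {
  return new Promise<Timed<T>>((resolve, reject) => {
    const timer = setTimeout(() => {
      onTimeout();
      resolve({ timedOut: true });
    }, timeoutMs);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve({ timedOut: false, value });
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      }
    );
  });
}

// Hypothetical response shape for managed mode until it is implemented.
function managedModeNotSupported(): { success: false; error: string } {
  return { success: false, error: "run_tests is not yet supported in managed mode" };
}

// Usage: a stand-in "test run" that finishes after 10 ms, with a 50 ms limit.
const fakeRun = new Promise<string>((resolve) => setTimeout(() => resolve("done"), 10));
withTimeout(fakeRun, resolveTimeoutMs(50)).then((outcome) => console.log(outcome));
```

Returning a tagged value on timeout instead of rejecting lets the tool still send a structured result (for example, with the partial raw output) rather than surfacing an exception to the agent.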


Labels: enhancement
