[Feature]: Agent-based hook evaluation (HookType.AGENT) #2864

@burakkeless

Is there an existing feature request for this?

  • I have searched existing issues and feature requests, and this is not a duplicate.

Problem or Use Case

Some hook decisions can't be made from the event payload alone — they require the evaluator to go look at something first. Prompt-based hooks are limited to what's in the payload; agent-based hooks let the evaluator actively investigate before deciding.

Issue #2755 identifies that HookType.PROMPT is declared in openhands-sdk/openhands/sdk/hooks/config.py but unimplemented, and proposes filling that gap. A complementary gap exists but hasn't been flagged yet: there is no agent-based hook type at all — not even a stub in the enum. The current HookType only contains COMMAND and PROMPT.

Use cases:

  • Stop hooks: "Did the agent actually complete everything in TASKS.json?" → subagent reads the file, compares it against session activity, then decides.
  • Pre-tool-use hooks: "Does this rm -rf target have uncommitted changes?" → subagent checks git state before allowing the command.
  • Post-tool-use hooks: "Do tests still pass after this file edit?" → subagent runs the relevant test suite and blocks continuation on failure.

Proposed Solution

Enum change

HookType in openhands-sdk/openhands/sdk/hooks/config.py currently has only COMMAND and PROMPT. This proposal adds a third value:

from enum import Enum


class HookType(str, Enum):
    """Types of hooks that can be executed."""
    COMMAND = "command"  # Shell command executed via subprocess
    PROMPT = "prompt"    # LLM-based evaluation (implementation proposed in #2755)
    AGENT = "agent"      # Agent-based evaluation with tool access

Add a new hook flavor, configured like prompt hooks but with tool access via a short-lived Conversation.

Config shape

{
  "stop": [
    {
      "matcher": "*",
      "hooks": [
        {
          "type": "agent",
          "prompt": "Read TASKS.json and verify every pending item has been addressed. Respond with ALLOW or DENY plus any remaining items.",
          "model": "claude-sonnet-4-6",
          "tools": ["file_editor", "terminal"],
          "timeout": 60
        }
      ]
    }
  ]
}
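
To make the config shape concrete, here is a minimal sketch of how agent hook entries could be read out of such a config. `AgentHookDefinition` and `parse_agent_hooks` are hypothetical names for illustration, not the SDK's actual `HookDefinition` or loader; the 60-second fallback follows the default proposed in this issue.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


# Hypothetical stand-in for the SDK's HookDefinition; field names follow the
# config shape in this issue, not the real OpenHands class.
@dataclass
class AgentHookDefinition:
    prompt: str
    model: Optional[str] = None
    tools: List[str] = field(default_factory=list)  # allowlist; empty means no tools
    timeout: int = 60  # proposed default for agent hooks


def parse_agent_hooks(raw: str, event: str) -> List[AgentHookDefinition]:
    """Collect every hook of type "agent" configured for the given event."""
    config = json.loads(raw)
    result = []
    for matcher_block in config.get(event, []):
        for hook in matcher_block.get("hooks", []):
            if hook.get("type") != "agent":
                continue  # command and prompt hooks are handled elsewhere
            if "prompt" not in hook:
                raise ValueError("agent hook requires a 'prompt' field")
            result.append(
                AgentHookDefinition(
                    prompt=hook["prompt"],
                    model=hook.get("model"),
                    tools=list(hook.get("tools", [])),
                    timeout=int(hook.get("timeout", 60)),
                )
            )
    return result
```

Matcher evaluation is deliberately left out of this sketch; it only shows which fields an agent hook would carry.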

Execution flow

  1. The executor instantiates a short-lived Conversation with the configured LLM and a restricted toolset (the tools allowlist).
  2. The hook event payload is passed as JSON context; the user's prompt becomes the task.
  3. The subagent runs until it emits a structured decision or hits the timeout.
  4. The response is parsed as {"decision": "allow" | "deny", "reason": "..."} and returned as a HookResult.
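
Step 4 could be sketched roughly as follows. `HookDecision` is a hypothetical stand-in for `HookResult`, and denying when the response is missing or malformed is an assumption of this sketch; the issue does not specify fail-open versus fail-closed behavior.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class HookDecision:  # hypothetical stand-in for the SDK's HookResult
    allowed: bool
    reason: str


def parse_decision(response_text: str) -> HookDecision:
    """Extract {"decision": ..., "reason": ...} from the subagent's final message."""
    # Take the span from the first "{" to the last "}" so surrounding prose is ignored.
    match = re.search(r"\{.*\}", response_text, re.DOTALL)
    if match is None:
        return HookDecision(False, "no JSON decision found in response")
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return HookDecision(False, "decision was not valid JSON")
    decision = str(data.get("decision", "")).lower()
    if decision not in ("allow", "deny"):
        # Assumption: anything other than an explicit allow/deny is treated as deny.
        return HookDecision(False, f"unrecognized decision: {decision!r}")
    return HookDecision(decision == "allow", str(data.get("reason", "")))
```

A timeout in step 3 would presumably be reported through the same result type, so downstream code never needs to distinguish the failure modes.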

Changes required (on top of prompt-hooks foundation from #2755)

  • New HookType.AGENT enum value in openhands-sdk/openhands/sdk/hooks/config.py
  • tools field on HookDefinition (allowlist, not denylist)
  • New branch in HookExecutor.execute() that spawns a Conversation instead of a single LLM call
  • Higher default timeout than prompt hooks (60s vs 30s) to reflect multi-turn evaluation
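
The new executor branch and the per-type timeout defaults might look something like this. It is a sketch only: `dispatch`, `resolve_timeout`, and the `runners` mapping are hypothetical and do not reflect the actual `HookExecutor.execute()` signature; the 30s and 60s values are the defaults named above.

```python
from enum import Enum
from typing import Callable, Dict, Optional


class HookType(str, Enum):
    COMMAND = "command"
    PROMPT = "prompt"
    AGENT = "agent"


# Defaults named in this proposal; no command-hook default is assumed here.
DEFAULT_TIMEOUTS = {HookType.PROMPT: 30, HookType.AGENT: 60}


def resolve_timeout(hook_type: HookType, configured: Optional[int]) -> Optional[int]:
    """Prefer the configured timeout, else fall back to the per-type default."""
    if configured is not None:
        return configured
    return DEFAULT_TIMEOUTS.get(hook_type)


def dispatch(
    hook: dict,
    runners: Dict[HookType, Callable[[dict, Optional[int]], str]],
) -> str:
    """Route a hook definition to the runner registered for its type."""
    hook_type = HookType(hook["type"])
    runner = runners.get(hook_type)
    if runner is None:
        raise NotImplementedError(f"no runner registered for {hook_type.value!r} hooks")
    return runner(hook, resolve_timeout(hook_type, hook.get("timeout")))
```

In this arrangement the agent runner would be the piece that spawns the Conversation, while the prompt runner keeps making a single LLM call.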

Design defaults

| Concern | Default |
| --- | --- |
| Tool scoping | Read-only by default; write access requires explicit opt-in |
| Context passing | HookEvent payload always included; additional context (TASKS.json, history) opt-in via a context field |
| Recursion guard | Hooks do not fire inside hook-spawned subagents; this prevents infinite loops when the evaluator itself has hooks configured |
| Response format | Same structured JSON as prompt hooks; both flavors produce identical HookResult downstream |
| Cost controls | Log token usage per invocation; default to a fast/cheap model; gate PreToolUse agent hooks behind explicit opt-in to avoid runaway costs on tool-heavy sessions |
| Cancellation | Subagent Conversation must be cancellable from the executor so parent-conversation interrupts terminate in-flight hooks cleanly |
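
Two of these defaults, the recursion guard and read-only tool scoping, can be illustrated with a small sketch. All names are hypothetical, and treating `file_editor` and `terminal` as write-capable is an assumption made for the example rather than something the SDK defines.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Iterable, Iterator, List

# Assumption: these tool names are classed as write-capable for illustration only.
WRITE_CAPABLE_TOOLS = frozenset({"file_editor", "terminal"})

# True while code is running inside a hook-spawned subagent.
_inside_hook_subagent = ContextVar("inside_hook_subagent", default=False)


@contextmanager
def hook_subagent_scope() -> Iterator[None]:
    """Mark the enclosed code as running inside a hook-spawned subagent."""
    token = _inside_hook_subagent.set(True)
    try:
        yield
    finally:
        _inside_hook_subagent.reset(token)


def should_fire_hooks() -> bool:
    """Recursion guard: hooks are skipped inside a hook-spawned subagent."""
    return not _inside_hook_subagent.get()


def resolve_tools(requested: Iterable[str], allow_write: bool = False) -> List[str]:
    """Reject write-capable tools unless the hook explicitly opted in."""
    tools = list(requested)
    blocked = [name for name in tools if name in WRITE_CAPABLE_TOOLS]
    if blocked and not allow_write:
        raise PermissionError(f"write-capable tools need explicit opt-in: {blocked}")
    return tools
```

A context variable is used here so the guard follows the subagent's execution context rather than relying on a process-wide flag.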

This design mirrors Claude Code's agent-based hooks, adapted to OpenHands' existing Conversation and HookExecutor infrastructure.

Alternatives Considered

No response

Priority / Severity

Medium - Would improve experience

Estimated Scope

Unknown - Not sure about the technical complexity

Feature Area

  • Agent API / Core functionality
  • Tools / Tool system
  • Skills / Plugins
  • Agent Server
  • Workspace management
  • Configuration / Settings
  • Examples / Templates
  • Documentation
  • Testing / Development tools
  • Performance / Optimization
  • Integrations (GitHub, APIs, etc.)
  • Other

Technical Implementation Ideas (Optional)

No response

Additional Context

No response

Metadata

Assignees

No one assigned

    Labels

    enhancement: New feature or request
