feat: The Governed Agent Protocol — hard safety layer for destructive commands #505

Open
Mnehmos wants to merge 1 commit into Kilo-Org:dev from Mnehmos:feat/governed-agent-protocol

Mnehmos commented Feb 20, 2026

Kilo League Challenge #3 — DeveloperWeek 2026 Hackathon Entry

The Governed Agent Protocol

A hard safety layer that blocks destructive commands before execution, regardless of permission settings. ON by default — no opt-in required.

The Problem

Kilo's safety model is almost entirely soft. System prompts tell the LLM "don't run git reset --hard" but nothing in code enforces it. If a user grants bash:* permission (or the LLM social-engineers approval), any command executes — rm -rf /, DROP DATABASE, fork bombs, anything.

The Solution

A pure-function detection engine that intercepts tool calls before permission checks. Governance overrides permissions.

Three Severity Levels

Level     Behavior                 Examples
CRITICAL  Always blocked           rm -rf /, fork bomb, dd of=/dev/sda, DROP DATABASE, mkfs, kill 1
HIGH      Blocked with rejection   git push --force, git reset --hard, git clean -f, DROP TABLE, chmod -R 777
MEDIUM    Warns, allows execution  rm -rf node_modules, curl | bash, DELETE FROM without WHERE
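
A minimal sketch of how a first-match, severity-ordered detector along these lines might look. The function name checkBashCommand comes from the PR description, but the body, the four sample patterns, the CheckResult shape, and the whitespace tokenizer are all assumptions made for illustration (the PR says the real hook runs after tree-sitter parsing).

```typescript
// Illustrative sketch only: pattern list, result shape, and tokenizer are assumptions.
type Severity = "CRITICAL" | "HIGH" | "MEDIUM"
type Verdict = "block" | "warn" | "allow"

interface Pattern {
  severity: Severity
  description: string
  test: (cmd: string, tokens: string[]) => boolean
}

interface CheckResult {
  verdict: Verdict
  severity?: Severity
  description?: string
}

// True when any token is a short-flag cluster containing "r" or "R" (e.g. -rf, -R).
const hasRecursiveFlag = (tokens: string[]): boolean =>
  tokens.some((t) => /^-[a-zA-Z]*[rR][a-zA-Z]*$/.test(t))

// Ordered most severe first, so the first match is also the most severe one.
const PATTERNS: Pattern[] = [
  {
    severity: "CRITICAL",
    description: "recursive delete of the filesystem root",
    test: (_cmd, tokens) =>
      tokens[0] === "rm" && hasRecursiveFlag(tokens) && tokens.includes("/"),
  },
  {
    severity: "HIGH",
    description: "force push rewrites remote history",
    test: (_cmd, tokens) =>
      tokens[0] === "git" &&
      tokens[1] === "push" &&
      (tokens.includes("--force") || tokens.includes("-f")),
  },
  {
    severity: "HIGH",
    description: "hard reset discards uncommitted work",
    test: (_cmd, tokens) =>
      tokens[0] === "git" && tokens[1] === "reset" && tokens.includes("--hard"),
  },
  {
    severity: "MEDIUM",
    description: "recursive delete",
    test: (_cmd, tokens) => tokens[0] === "rm" && hasRecursiveFlag(tokens),
  },
]

// Pure and synchronous: the same input always yields the same result.
function checkBashCommand(cmd: string): CheckResult {
  const tokens = cmd.trim().split(/\s+/)
  const hit = PATTERNS.find((p) => p.test(cmd, tokens))
  if (!hit) return { verdict: "allow" }
  return {
    // CRITICAL and HIGH block; MEDIUM only warns.
    verdict: hit.severity === "MEDIUM" ? "warn" : "block",
    severity: hit.severity,
    description: hit.description,
  }
}
```

Because the table is ordered by severity, rm -rf / hits the CRITICAL entry before the generic MEDIUM recursive-delete entry is ever consulted, while rm -rf node_modules falls through to the warning.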

Path Guard (Write/Edit Protection)

Category          Patterns
System paths      /etc/, /usr/, /boot/, C:\Windows\
Credential files  .env, id_rsa, credentials.json, secrets.*, private.pem
Credential dirs   .ssh/, .aws/, .gnupg/, .kube/, .docker/
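
The three categories above could be checked roughly as follows. This is a sketch under stated assumptions: checkSensitivePath and PathCheck are hypothetical names, the lists are shortened copies of the table, and paths are assumed to be already resolved to absolute form before the check runs.

```typescript
// Hypothetical sketch of a path guard; not the PR's actual path-guard.ts.
interface PathCheck {
  blocked: boolean
  reason?: "system path" | "credential file" | "credential directory"
}

const SYSTEM_PREFIXES = ["/etc/", "/usr/", "/boot/", "c:/windows/"]
const CREDENTIAL_FILES = new Set([".env", "id_rsa", "credentials.json", "private.pem"])
const CREDENTIAL_DIRS = [".ssh", ".aws", ".gnupg", ".kube", ".docker"]

function checkSensitivePath(filepath: string): PathCheck {
  // Treat Windows and POSIX separators alike.
  const normalized = filepath.replace(/\\/g, "/")

  // System prefixes are compared case-insensitively so C:\Windows matches c:/windows/.
  const lower = normalized.toLowerCase()
  if (SYSTEM_PREFIXES.some((prefix) => lower.startsWith(prefix))) {
    return { blocked: true, reason: "system path" }
  }

  const segments = normalized.split("/").filter((s) => s.length > 0)
  const basename = segments[segments.length - 1] ?? ""

  // Exact credential file names, plus the secrets.* wildcard from the table.
  if (CREDENTIAL_FILES.has(basename) || /^secrets\./.test(basename)) {
    return { blocked: true, reason: "credential file" }
  }

  // Any parent directory named like a credential store blocks the write.
  if (segments.slice(0, -1).some((s) => CREDENTIAL_DIRS.includes(s))) {
    return { blocked: true, reason: "credential directory" }
  }

  return { blocked: false }
}
```

Matching on path segments rather than substrings keeps a file such as my.ssh-notes.txt from being mistaken for something inside .ssh/.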

Architecture

governance/
  types.ts          — Zod schemas (Severity, Category, Verdict, CheckResult)
  patterns.ts       — ~30 destructive patterns across 3 severity levels
  detector.ts       — Pure function engine: checkBashCommand(), formatRejection()
  path-guard.ts     — Sensitive path protection for write/edit tools
  integration.ts    — Bridge module (lazy import, zero overhead when disabled)
  index.ts          — Barrel export

Integration Points

  • bash.ts — Intercepts after tree-sitter AST parsing, before ctx.ask() permission checks
  • write.ts — Intercepts after filepath resolution, before assertExternalDirectory()
  • edit.ts — Same as write.ts
  • flag.ts — KILO_DISABLE_GOVERNANCE flag added

Key Design Decisions

  • ON by default — Unlike --warm, governance doesn't need a flag to activate
  • Pure functions — Detector is stateless, fully synchronous, trivially testable
  • Governance overrides permissions — Check happens BEFORE ctx.ask(), so even bash:* allowall can't bypass it
  • Lazy import — await import("../governance/integration") keeps zero overhead for disabled state
  • Rejection messages include suggestions — "Refusal is honest engagement, not withdrawal"
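
The ordering these decisions describe can be sketched as below. This is a simplified, synchronous model: runBashTool, ToolDeps, and Outcome are hypothetical names, loadGovernance stands in for the lazy await import(...), and askPermission stands in for ctx.ask(). The real tool code is asynchronous.

```typescript
// Simplified model of "governance before permissions"; all names are illustrative.
interface Governance {
  check(command: string): { blocked: boolean; message?: string }
}

interface ToolDeps {
  governanceEnabled: boolean
  loadGovernance: () => Governance // stand-in for the lazy dynamic import
  askPermission: (command: string) => boolean // stand-in for ctx.ask()
}

type Outcome =
  | { status: "blocked"; message: string }
  | { status: "denied" }
  | { status: "executed" }

function runBashTool(command: string, deps: ToolDeps): Outcome {
  // 1. Governance runs first. The module is only loaded when the feature is on,
  //    so the disabled path never pays for it.
  if (deps.governanceEnabled) {
    const result = deps.loadGovernance().check(command)
    if (result.blocked) {
      return { status: "blocked", message: result.message ?? "blocked by governance" }
    }
  }

  // 2. Only commands that pass governance reach the permission check,
  //    so a blanket allow rule cannot override a block.
  if (!deps.askPermission(command)) return { status: "denied" }

  return { status: "executed" }
}
```

With an allow-everything askPermission, a blocked command still returns "blocked" without the permission callback ever being invoked, which is the property the PR relies on.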

Test Results

  • 75 governance tests passing across 3 test files
  • Standalone demo exercising all severity levels + path guard
  • Pre-existing typecheck errors in kilo-sessions (unrelated simple-git missing types)

Run the Demo

cd packages/opencode
bun run test/governance/demo.ts

Philosophy

"Refusal is honest engagement, not withdrawal. When an AI system says no to a destructive command, it isn't failing — it's succeeding at the deeper task of responsible operation."

The Governed Agent Protocol treats safety as a first-class outcome, not an afterthought. Blocking rm -rf / isn't an error state — it's the system working exactly as designed.


🤖 Generated with Claude Code

Hard safety layer that blocks destructive commands before execution,
regardless of permission settings. ON by default.

Three severity levels:
- CRITICAL: always blocked (rm -rf /, fork bomb, dd to /dev/sda, DROP DATABASE)
- HIGH: blocked with rejection (git push --force, git reset --hard, DROP TABLE)
- MEDIUM: allowed with warning (rm -rf node_modules, curl | bash)

Path Guard protects write/edit operations against:
- System paths (/etc, /usr, /boot, C:\Windows)
- Credential files (.env, id_rsa, credentials.json)
- Credential directories (.ssh, .aws, .gnupg, .kube)

Integrates into bash, write, and edit tools via lazy import.
Pure function engine, zero external deps, 75 tests passing.

Disable with KILO_DISABLE_GOVERNANCE=1.
@github-actions

Thanks for your contribution!

This PR doesn't have a linked issue. All PRs must reference an existing issue.

Please:

  1. Open an issue describing the bug/feature (if one doesn't exist)
  2. Add Fixes #<number> or Closes #<number> to this PR description

See CONTRIBUTING.md for details.


export namespace GovernanceIntegration {
  export function isEnabled(): boolean {
    return process.env.KILO_DISABLE_GOVERNANCE !== "1"

WARNING: isEnabled() checks !== "1" but Flag.KILO_DISABLE_GOVERNANCE uses truthy() which accepts both "true" and "1".

If a user sets KILO_DISABLE_GOVERNANCE=true, the Flag module would report it as truthy, but isEnabled() would still return true (governance stays enabled). This inconsistency could confuse users.

Consider using the Flag module for consistency:

Suggested change
- return process.env.KILO_DISABLE_GOVERNANCE !== "1"
+ return !Flag.KILO_DISABLE_GOVERNANCE

(with import { Flag } from "../flag/flag" at the top)
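
To make the mismatch concrete, here is a small sketch comparing the two checks. The truthy() helper below is an assumption that models only what this comment states (it accepts "true" and "1"); the real Flag module may differ, and the environment is passed in as a plain object so the example stays self-contained.

```typescript
// Assumed model of the two checks; not the actual Flag module.
type Env = Record<string, string | undefined>

// Assumption: mirrors the review comment, which says truthy() accepts "true" and "1".
function truthy(value: string | undefined): boolean {
  return value === "true" || value === "1"
}

// Behavior of the line under review: only the literal "1" disables governance.
function isEnabledStrict(env: Env): boolean {
  return env["KILO_DISABLE_GOVERNANCE"] !== "1"
}

// Behavior after the suggested change: any value truthy() accepts disables it.
function isEnabledViaFlag(env: Env): boolean {
  return !truthy(env["KILO_DISABLE_GOVERNANCE"])
}

// The two checks disagree only for KILO_DISABLE_GOVERNANCE=true.
const candidateValues: (string | undefined)[] = [undefined, "1", "true", "0"]
const mismatches = candidateValues.filter(
  (v) =>
    isEnabledStrict({ KILO_DISABLE_GOVERNANCE: v }) !==
    isEnabledViaFlag({ KILO_DISABLE_GOVERNANCE: v }),
)
```

Under these assumptions, mismatches contains exactly one entry, "true", which is the case the warning describes.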

{
  test: (_cmd, tokens) =>
    tokens[0] === "kill" &&
    (tokens.includes("1") || tokens.includes("-9") && tokens.includes("1")),

WARNING: Operator precedence issue — && binds tighter than ||, so this is parsed as:

tokens.includes("1") || (tokens.includes("-9") && tokens.includes("1"))

The second branch (-9 && 1) is entirely redundant since the first branch already matches whenever "1" is in tokens. This means kill 1 (without -9) is blocked, which is likely correct, but the || tokens.includes("-9") && tokens.includes("1") part is dead code.

If the intent was to ONLY block kill -9 1 (not kill 1), parentheses are needed:

tokens[0] === "kill" &&
(tokens.includes("-9") && tokens.includes("1"))

If the intent is to block any kill targeting PID 1 (current behavior), simplify to:

tokens[0] === "kill" && tokens.includes("1")
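
The equivalence claimed above can be checked directly. The sketch below copies the original predicate and the simplified one, then compares them over a handful of token lists; the sample inputs are chosen for illustration and are not taken from the PR's test suite.

```typescript
// Original predicate, as written in the diff (&& binds tighter than ||).
const originalKillTest = (tokens: string[]): boolean =>
  tokens[0] === "kill" &&
  (tokens.includes("1") || tokens.includes("-9") && tokens.includes("1"))

// Simplified predicate proposed in the review.
const simplifiedKillTest = (tokens: string[]): boolean =>
  tokens[0] === "kill" && tokens.includes("1")

// Illustrative inputs covering each combination of "-9" and "1".
const killSamples: string[][] = [
  ["kill", "1"],
  ["kill", "-9", "1"],
  ["kill", "-9", "4242"],
  ["kill", "4242"],
  ["echo", "1"],
]

// Collect any input on which the two predicates disagree.
const killDisagreements = killSamples.filter(
  (tokens) => originalKillTest(tokens) !== simplifiedKillTest(tokens),
)
```

The general reason is the absorption law: A || (B && A) is always equal to A, so the -9 branch can never change the result.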


export function checkWritePath(
  filepath: string,
  projectRoot: string,

SUGGESTION: The projectRoot parameter is accepted but never used in the function body. This suggests a missing feature — for example, allowing writes to .env files within the project root, or distinguishing between project-local and system-global credential files.

If this is intentional (planned for future use), consider adding a comment. Otherwise, removing the unused parameter would avoid confusion.

"id_rsa", "id_ed25519", "id_dsa", "id_ecdsa",
"id_rsa.pub", "id_ed25519.pub",
"credentials.json", "service-account.json",
".npmrc", ".pypirc",

SUGGESTION: Blocking writes to .npmrc in the project directory may be overly aggressive. A project-level .npmrc is a common configuration file (e.g., for setting registry URLs, scoped package configs) and typically does not contain credentials. The global ~/.npmrc might contain auth tokens, but that would already be caught by the credential directory check for the home directory.

Similarly, .pypirc is typically only in the home directory. Consider whether project-level instances of these files should be allowed.
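
One way the currently unused projectRoot parameter could support this distinction is sketched below. Everything here is hypothetical: isBlockedCredentialFile, the two sets, and the policy itself are illustrations of the suggestion, not code from the PR, and paths are assumed to be absolute and already resolved (no ".." segments).

```typescript
// Hypothetical policy: some files are always blocked, others only outside the project.
const ALWAYS_BLOCKED = new Set(["id_rsa", "id_ed25519", "credentials.json"])
const BLOCKED_OUTSIDE_PROJECT = new Set([".npmrc", ".pypirc"])

// Unify separators and drop trailing slashes so prefixes compare cleanly.
const normalizePath = (p: string): string => p.replace(/\\/g, "/").replace(/\/+$/, "")

// The trailing "/" keeps /home/me/project-other from counting as inside /home/me/project.
function isInsideProject(filepath: string, projectRoot: string): boolean {
  return normalizePath(filepath).startsWith(normalizePath(projectRoot) + "/")
}

function isBlockedCredentialFile(filepath: string, projectRoot: string): boolean {
  const basename = normalizePath(filepath).split("/").pop() ?? ""
  if (ALWAYS_BLOCKED.has(basename)) return true
  if (BLOCKED_OUTSIDE_PROJECT.has(basename)) {
    // Project-level config is allowed; the same file name elsewhere is not.
    return !isInsideProject(filepath, projectRoot)
  }
  return false
}
```

With a project root of /home/me/project, a write to /home/me/project/.npmrc passes while /home/me/.npmrc is still refused, and private keys remain blocked in both locations.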


kiloconnect bot commented Feb 20, 2026

Code Review Summary

Status: 4 Issues Found | Recommendation: Address before merge

Overview

Severity    Count
CRITICAL    0
WARNING     2
SUGGESTION  2

Issue Details

WARNING

File                                             Line  Issue
packages/opencode/src/governance/integration.ts  6     isEnabled() checks !== "1" but Flag.KILO_DISABLE_GOVERNANCE uses truthy() which also accepts "true" — inconsistent disable behavior
packages/opencode/src/governance/patterns.ts     82    Operator precedence makes the kill -9 1 branch dead code — the `

SUGGESTION

File                                            Line  Issue
packages/opencode/src/governance/path-guard.ts  43    projectRoot parameter is accepted but never used in the function body
packages/opencode/src/governance/path-guard.ts  28    Blocking .npmrc / .pypirc writes in project directories may be overly aggressive for common dev workflows

Files Reviewed (11 files)
  • packages/opencode/src/flag/flag.ts - 0 issues
  • packages/opencode/src/governance/detector.ts - 0 issues
  • packages/opencode/src/governance/index.ts - 0 issues
  • packages/opencode/src/governance/integration.ts - 1 issue
  • packages/opencode/src/governance/path-guard.ts - 2 issues
  • packages/opencode/src/governance/patterns.ts - 1 issue
  • packages/opencode/src/governance/types.ts - 0 issues
  • packages/opencode/src/tool/bash.ts - 0 issues
  • packages/opencode/src/tool/edit.ts - 0 issues
  • packages/opencode/src/tool/write.ts - 0 issues
  • packages/opencode/test/governance/ (3 test files) - 0 issues
