feat: The Governed Agent Protocol — hard safety layer for destructive commands#505
feat: The Governed Agent Protocol — hard safety layer for destructive commands#505Mnehmos wants to merge 1 commit intoKilo-Org:devfrom
Conversation
Hard safety layer that blocks destructive commands before execution, regardless of permission settings. ON by default. Three severity levels: - CRITICAL: always blocked (rm -rf /, fork bomb, dd to /dev/sda, DROP DATABASE) - HIGH: blocked with rejection (git push --force, git reset --hard, DROP TABLE) - MEDIUM: allowed with warning (rm -rf node_modules, curl | bash) Path Guard protects write/edit operations against: - System paths (/etc, /usr, /boot, C:\Windows) - Credential files (.env, id_rsa, credentials.json) - Credential directories (.ssh, .aws, .gnupg, .kube) Integrates into bash, write, and edit tools via lazy import. Pure function engine, zero external deps, 75 tests passing. Disable with KILO_DISABLE_GOVERNANCE=1.
|
Thanks for your contribution! This PR doesn't have a linked issue. All PRs must reference an existing issue. Please:
See CONTRIBUTING.md for details. |
|
|
||
| export namespace GovernanceIntegration { | ||
| export function isEnabled(): boolean { | ||
| return process.env.KILO_DISABLE_GOVERNANCE !== "1" |
There was a problem hiding this comment.
WARNING: isEnabled() checks !== "1" but Flag.KILO_DISABLE_GOVERNANCE uses truthy() which accepts both "true" and "1".
If a user sets KILO_DISABLE_GOVERNANCE=true, the Flag module would report it as truthy, but isEnabled() would still return true (governance stays enabled). This inconsistency could confuse users.
Consider using the Flag module for consistency:
| return process.env.KILO_DISABLE_GOVERNANCE !== "1" | |
| return !Flag.KILO_DISABLE_GOVERNANCE |
(with import { Flag } from "../flag/flag" at the top)
| { | ||
| test: (_cmd, tokens) => | ||
| tokens[0] === "kill" && | ||
| (tokens.includes("1") || tokens.includes("-9") && tokens.includes("1")), |
There was a problem hiding this comment.
WARNING: Operator precedence issue — && binds tighter than ||, so this is parsed as:
tokens.includes("1") || (tokens.includes("-9") && tokens.includes("1"))The second branch (-9 && 1) is entirely redundant since the first branch already matches whenever "1" is in tokens. This means kill 1 (without -9) is blocked, which is likely correct, but the || tokens.includes("-9") && tokens.includes("1") part is dead code.
If the intent was to ONLY block kill -9 1 (not kill 1), parentheses are needed:
tokens[0] === "kill" &&
(tokens.includes("-9") && tokens.includes("1"))If the intent is to block any kill targeting PID 1 (current behavior), simplify to:
tokens[0] === "kill" && tokens.includes("1")|
|
||
| export function checkWritePath( | ||
| filepath: string, | ||
| projectRoot: string, |
There was a problem hiding this comment.
SUGGESTION: The projectRoot parameter is accepted but never used in the function body. This suggests a missing feature — for example, allowing writes to .env files within the project root, or distinguishing between project-local and system-global credential files.
If this is intentional (planned for future use), consider adding a comment. Otherwise, removing the unused parameter would avoid confusion.
| "id_rsa", "id_ed25519", "id_dsa", "id_ecdsa", | ||
| "id_rsa.pub", "id_ed25519.pub", | ||
| "credentials.json", "service-account.json", | ||
| ".npmrc", ".pypirc", |
There was a problem hiding this comment.
SUGGESTION: Blocking writes to .npmrc in the project directory may be overly aggressive. A project-level .npmrc is a common configuration file (e.g., for setting registry URLs, scoped package configs) and typically does not contain credentials. The global ~/.npmrc might contain auth tokens, but that would already be caught by the credential directory check for the home directory.
Similarly, .pypirc is typically only in the home directory. Consider whether project-level instances of these files should be allowed.
Code Review SummaryStatus: 4 Issues Found | Recommendation: Address before merge Overview
Issue Details (click to expand)WARNING
SUGGESTION
Files Reviewed (11 files)
|
The Governed Agent Protocol
A hard safety layer that blocks destructive commands before execution, regardless of permission settings. ON by default — no opt-in required.
The Problem
Kilo's safety model is almost entirely soft. System prompts tell the LLM "don't run
git reset --hard" but nothing in code enforces it. If a user grantsbash:*permission (or the LLM social-engineers approval), any command executes —rm -rf /,DROP DATABASE, fork bombs, anything.The Solution
A pure-function detection engine that intercepts tool calls before permission checks. Governance overrides permissions.
Three Severity Levels
rm -rf /, fork bomb,dd of=/dev/sda,DROP DATABASE,mkfs,kill 1git push --force,git reset --hard,git clean -f,DROP TABLE,chmod -R 777rm -rf node_modules,curl | bash,DELETE FROMwithout WHEREPath Guard (Write/Edit Protection)
/etc/,/usr/,/boot/,C:\Windows\.env,id_rsa,credentials.json,secrets.*,private.pem.ssh/,.aws/,.gnupg/,.kube/,.docker/Architecture
Integration Points
ctx.ask()permission checksassertExternalDirectory()KILO_DISABLE_GOVERNANCEflag addedKey Design Decisions
--warm, governance doesn't need a flag to activatectx.ask(), so evenbash:*allowall can't bypass itawait import("../governance/integration")keeps zero overhead for disabled stateTest Results
kilo-sessions(unrelatedsimple-gitmissing types)Run the Demo
cd packages/opencode bun run test/governance/demo.tsPhilosophy
The Governed Agent Protocol treats safety as a first-class outcome, not an afterthought. Blocking
rm -rf /isn't an error state — it's the system working exactly as designed.🤖 Generated with Claude Code