feat(prompts): allow overriding the base prompt from the config dir (#3638) #3696
Issue Hmbown#3638 asks to make the hard-loaded base prompt repurposable (e.g. for long-form writing) without editing in-tree files or building a custom embedder.

The prompt-override hooks already exist for embedders (set_base_prompt_override + the OnceLock cells), but there was no user-facing source that feeds them. Bridge those hooks to a config-directory file: at startup, if `~/.codewhale/prompts/constitution.md` (under $CODEWHALE_HOME) exists and is non-empty, install it via the existing set_base_prompt_override path; otherwise fall back to the bundled constant. Loaded once before any engine spawns (first-call-wins cells).

Scope is deliberately narrow and safe: only the byte-stable base prompt segment is user-overridable. Mode deltas, approval policy, tool taxonomy, Context Management, and the Compaction Relay stay owned by the runtime assembly (see StaticPromptCtx), so an override cannot strip safety-relevant guidance (sandbox/approvals).

- prompts.rs: pure, unit-tested resolver read_prompt_override_file + load_config_dir_prompt_overrides / load_prompt_overrides_from_config_home.
- main.rs: wire the loader in once after CLI parse, before subcommand dispatch.
- docs/CONFIGURATION.md: document the override file, its scope, and that it cannot remove safety layers.
- 3 unit tests (present/absent/empty-file). Empty/whitespace files are ignored so a stray file can't blank the system prompt.

Refs Hmbown#3638
Signed-off-by: findshan <224246733+findshan@users.noreply.github.com>
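The resolution rule in this commit message (present and non-empty wins, absent or blank falls back) can be sketched roughly as below. This is a minimal illustration, not the code in `prompts.rs`: `read_override` and `resolve_base_prompt` are hypothetical names, and returning the file contents untrimmed is an assumption.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical resolver: yields the file contents only when the file
/// exists and contains something other than whitespace.
fn read_override(path: &Path) -> io::Result<Option<String>> {
    match fs::read_to_string(path) {
        // A blank or whitespace-only file is treated as "no override".
        Ok(text) if text.trim().is_empty() => Ok(None),
        Ok(text) => Ok(Some(text)),
        // A missing file is the normal case, not an error.
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(err) => Err(err),
    }
}

/// Hypothetical wrapper: any failure or absence falls back to the bundled prompt.
fn resolve_base_prompt(path: &Path, bundled: &str) -> String {
    match read_override(path) {
        Ok(Some(text)) => text,
        _ => bundled.to_string(),
    }
}
```

Keeping the resolver free of global state is what lets the present / absent / empty-file cases be tested against plain temporary files.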
Thanks @findshan for taking the time to contribute. This repository is observing a maintainer-managed PR intake gate in dry-run mode, so this pull request is staying open. This note helps maintainers prepare the allowlist before any enforcement is considered. Please read |
Thank you so much for this, @findshan — this is thoughtful and very clearly scoped, and thanks @DracheTek for the broader-use-case request behind #3638. The implementation shape is crisp and the checks are green. I am going to hold this one for explicit maintainer sign-off rather than merge it automatically, because it changes a user-facing prompt trust boundary: config-dir base prompt overrides effectively make the global Constitution replaceable at runtime. Current repo guidance still treats the shipped constitution as the sole base prompt and avoids runtime prompt/tag injection. That does not mean the idea is bad — it may be exactly the right direction for broader writing/review workflows — but I think it needs an intentional product/security call, probably around whether this should be behind an explicit opt-in flag/name/path and how we describe what safety layers remain owned by the runtime. Really appreciate the care you put into making the slice narrow and reversible. |
Addresses maintainer review on Hmbown#3638: replacing the global Constitution is a prompt trust boundary, so the override file alone must not be enough. Gate it behind an explicit CODEWHALE_ALLOW_BASE_PROMPT_OVERRIDE flag; the file is ignored (with a log line pointing to the flag) unless the user has deliberately opted in. Makes replacing the base prompt a two-step, auditable action.

- prompts.rs: BASE_PROMPT_OVERRIDE_OPT_IN_ENV + base_prompt_override_opt_in(); load_config_dir_prompt_overrides now requires it. Added a test that a present file without the flag applies nothing (safe: no global cell mutation).
- docs/CONFIGURATION.md: document the two-step opt-in.

Refs Hmbown#3638
Signed-off-by: findshan <224246733+findshan@users.noreply.github.com>
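A rough sketch of the two-step gate described in this commit, for readers skimming the thread. Only the environment variable name comes from the commit message; `is_opt_in`, `opt_in_from_env`, `gated_override`, the set of accepted truthy values, and the log wording are all assumptions made for illustration.

```rust
/// Env var name taken from the commit message above.
const OPT_IN_ENV: &str = "CODEWHALE_ALLOW_BASE_PROMPT_OVERRIDE";

/// Assumed truthy values; the real parser may accept a different set.
fn is_opt_in(raw: Option<&str>) -> bool {
    match raw {
        Some(value) => matches!(
            value.trim().to_ascii_lowercase().as_str(),
            "1" | "true" | "yes" | "on"
        ),
        None => false,
    }
}

/// Reads the flag from the process environment.
fn opt_in_from_env() -> bool {
    is_opt_in(std::env::var(OPT_IN_ENV).ok().as_deref())
}

/// The override applies only when both the file and the flag are present.
fn gated_override(file_contents: Option<String>, opted_in: bool) -> Option<String> {
    match (file_contents, opted_in) {
        (Some(text), true) => Some(text),
        (Some(_), false) => {
            // Hypothetical log line pointing the user at the flag.
            eprintln!("base prompt override file found but ignored; set {OPT_IN_ENV} to enable it");
            None
        }
        (None, _) => None,
    }
}
```

Separating the pure `is_opt_in` check from the environment read keeps the decision testable without mutating process-wide state.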
Completely agree it's a trust-boundary call that's yours to make — thanks for framing it that way rather than just bouncing it. I pushed one change that I think de-risks the security half regardless of the product decision, and left the product decision to you: Explicit opt-in flag. The override file alone is no longer sufficient. The user must also set On "what safety layers remain owned by the runtime": the override only replaces the byte-stable base segment via the existing Totally fine to hold for your product/security call — just wanted to make the "yes" version as conservative as possible. Open questions I'd defer to you: env flag vs |
Thanks again @findshan — I re-reviewed the follow-up commit and the explicit I am still leaving it unmerged for the moment because this is now a product/security sign-off question rather than a CI/readiness question: do we want user-configurable replacement of the base Constitution at all, even behind an explicit flag, and is |
Approved for merge. Thanks again @findshan — this is the conservative version we wanted: config-directory prompt customization is explicit, documented, and gated behind One product caveat for future readers: we are working toward a broader, first-class customizable prompt/config story. This file/env-flag surface gives advanced users a useful escape hatch now, but we may rename, move, or replace it later when the durable config UX lands. For now, the docs make the opt-in and scope clear, and the runtime still owns the safety layers after the base segment. |
What
Closes the core of #3638: let a user repurpose the TUI for non-software use cases (long-form writing, document review, etc.) by swapping the base/constitutional system prompt from a config-directory file, without editing in-tree files or building a custom embedder.
The override hooks already exist for embedders (`set_base_prompt_override` + the `OnceLock` cells in `prompts.rs`), but there was no user-facing source feeding them. This PR adds that source.
How
- At startup, look for `~/.codewhale/prompts/constitution.md` (under `$CODEWHALE_HOME` when set).
- If the file exists and is non-empty, install it via the existing `set_base_prompt_override` path; otherwise no-op → the bundled constant is used. Fully backward compatible.
Safety / trust-boundary scope (deliberately narrow)
I know the prompt surface is a trust boundary, so the scope is intentionally minimal: only the byte-stable base prompt segment is overridable. Mode deltas, the approval policy, the tool taxonomy, Context Management, and the Compaction Relay stay owned by CodeWhale's runtime assembly (per the existing `StaticPromptCtx` contract), so an override cannot strip safety-relevant guidance (sandbox/approvals); it only swaps the task/voice framing. Byte-stability of the composed prompt when no override is set is unchanged (the existing byte-stable tests still pass).
Happy to gate this behind a feature flag or adjust the path/semantics if you'd prefer; flagging the trust boundary explicitly for sign-off.
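To make the scope claim concrete, here is a simplified model of the assembly: the base segment is the only input an override can touch, and the runtime-owned segments are appended regardless. `RuntimeLayers` and `compose_prompt` are hypothetical stand-ins, not the real `StaticPromptCtx` API, and the real assembly has more segments and its own ordering and separators.

```rust
/// Hypothetical stand-in for segments owned by the runtime assembly.
struct RuntimeLayers<'a> {
    mode_delta: &'a str,
    approval_policy: &'a str,
    tool_taxonomy: &'a str,
}

/// Only the base segment is swappable; the runtime layers are always appended.
fn compose_prompt(
    base_override: Option<&str>,
    bundled_base: &str,
    layers: &RuntimeLayers,
) -> String {
    let base = base_override.unwrap_or(bundled_base);
    [
        base,
        layers.mode_delta,
        layers.approval_policy,
        layers.tool_taxonomy,
    ]
    .join("\n\n")
}
```

In this model an override changes the first segment only, and with no override the output is identical to composing from the bundled base, which mirrors the byte-stability property mentioned above.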
Files
- `crates/tui/src/prompts.rs`: pure, unit-tested resolver `read_prompt_override_file` + `load_config_dir_prompt_overrides` / `load_prompt_overrides_from_config_home`.
- `crates/tui/src/main.rs`: one wiring call.
- `docs/CONFIGURATION.md`: documents the file, its scope, and that it can't remove safety layers.
Tests
3 new unit tests (present / absent / empty-file resolution). The global-install path isn't unit-tested by design: `set_base_prompt_override` writes a process-wide `OnceLock` that would leak across the test binary (same reason the existing `prompt_override_storage_reports_duplicate_sets` uses a local cell). Verified locally:
Follow-up (not in this PR)
#3638 also mentions personality overlays. Those don't have override hooks yet; happy to add `prompts/personalities/<name>.md` overrides as a follow-up if you want the same treatment.
Refs #3638
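On the testing note above: the first-call-wins behavior of `std::sync::OnceLock` is easy to exercise against a local cell, which is why a test can avoid the process-wide one. A minimal sketch, assuming hypothetical helpers `set_override` and `effective_prompt` that take the cell as a parameter:

```rust
use std::sync::OnceLock;

/// Hypothetical helper: the first call wins; a later call gets its
/// rejected value handed back as `Err`.
fn set_override(cell: &OnceLock<String>, value: String) -> Result<(), String> {
    cell.set(value)
}

/// Returns the installed override, or the bundled prompt if none was set.
fn effective_prompt<'a>(cell: &'a OnceLock<String>, bundled: &'a str) -> &'a str {
    cell.get().map(String::as_str).unwrap_or(bundled)
}
```

Passing the cell in explicitly means each test owns its own state, so a duplicate-set check cannot leak an override into other tests in the same binary.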