auto: add topic guard to deflect off-topic questions #20
Open
github-actions[bot] wants to merge 1 commit into
auto: add topic guard to deflect off-topic questions
Why
Trace `d92027fa` (thread `c9d826f2`) shows a real user in an active cooking session asking
"where is the light mode in this website". The agent responded with a detailed guide on
changing dark/light mode settings in ChatGPT and other websites, never redirecting to
cooking. This is a core identity failure: the agent confused itself with ChatGPT and answered
an off-topic UI question as if it were a general assistant.
The base system prompt says "You are a world-class home-cooking assistant" but includes no
explicit instruction to decline non-cooking requests. This gap surfaces in multi-turn
conversations where users sometimes go off-script.
What
- `agent/cooking_agent.py`: Added a `_TOPIC_GUARD_INSTRUCTION` constant and injected it into `self._prompt` when the `auto_topic_guard` flag is on. Updated the multi-turn path to use `self._prompt` whenever `auto_topic_guard` is enabled (regardless of `auto_fix_history_prompt`), so the guard also covers conversations with history, which is exactly where the failure occurred.
- `tests/test_topic_guard.py`: New scenario test that injects the topic guard instruction directly (via a custom adapter) to verify that the instruction works, independent of Flagsmith
flag state.
Flag
`auto_topic_guard` (id: 204647), default off. Enable in the Flagsmith "cooking" project → Development to activate. When on, the agent responds to off-topic questions with a brief
redirect to cooking instead of answering them.
Eval delta
No regressions. The new `test_topic_guard` uses `TopicGuardAdapter`, which bakes the instruction
in directly, so it passes independently of flag state and validates the instruction text itself.
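A minimal sketch of the idea behind such an adapter-based test follows. The class body, the `call` method, the instruction wording, and `stub_model` are all hypothetical. The real test presumably exercises the actual model, whereas this sketch substitutes a stub so that it runs offline.

```python
from typing import Callable

# Hypothetical wording; stands in for the real instruction text.
TOPIC_GUARD_INSTRUCTION = (
    "If the user asks about anything unrelated to cooking, do not answer it. "
    "Reply with a brief redirect back to cooking."
)


class TopicGuardAdapter:
    """Always includes the guard in the system prompt; no feature-flag lookup."""

    def __init__(self, model: Callable[[str, str], str], base_prompt: str) -> None:
        self._model = model
        self._prompt = f"{base_prompt}\n\n{TOPIC_GUARD_INSTRUCTION}"

    def call(self, user_message: str) -> str:
        return self._model(self._prompt, user_message)


def stub_model(system_prompt: str, user_message: str) -> str:
    """Stand-in for a real LLM call: redirects only when the guard is present."""
    if TOPIC_GUARD_INSTRUCTION in system_prompt and "light mode" in user_message:
        return "I can only help with cooking. What would you like to make today?"
    return "Here is how to change the theme on most websites."


def test_topic_guard() -> None:
    adapter = TopicGuardAdapter(stub_model, "You are a world-class home-cooking assistant.")
    reply = adapter.call("where is the light mode in this website")
    assert "cooking" in reply.lower()
    assert "theme" not in reply.lower()
```

Because the adapter builds its own prompt, the test result does not change when the flag is toggled in Flagsmith.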
How to test
Rollback
Flip `auto_topic_guard` off in Flagsmith. No code revert needed.
Follow-ups
- `1)` numbered list rendering: trace `0ee0a635`, where the user said "markdown is not rendering the numbers." The agent uses `1)` style (parenthesis-delimited) for choice menus, which may not render as styled ordered lists in react-markdown.
- `14d418e2`: user said "I expected to have a keto filter." Current chips: Vegan, GF, Nut-Free, Dairy-Free.
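For the list-rendering follow-up, one possible direction (not part of this PR) is to normalize `1)` markers to `1.` before the text reaches the renderer. The helper name below is hypothetical, and whether this resolves the styling issue in the frontend is untested.

```python
import re

# Matches a list marker such as "1)" at the start of a line, with optional indentation.
_PAREN_MARKER = re.compile(r"^([ \t]*\d+)\)[ \t]+", flags=re.MULTILINE)


def normalize_ordered_list_markers(text: str) -> str:
    """Rewrite line-leading 'N)' markers as 'N.' and leave everything else untouched."""
    return _PAREN_MARKER.sub(r"\1. ", text)
```

Only markers at the start of a line are rewritten, so a parenthesis in the middle of a sentence is left alone.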