feat: add SM skill — precision complement to PUA#82
Open
zl190 wants to merge 1 commit into tanweai:main
PUA gives persistence. SM gives direction. Together they prevent the "Seven Patches" problem: more attempts without diagnosis = more damage. 4 rules, ~140 lines, zero overlap with existing PUA skills.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Owner
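You make a good point. This is in fact how current AI behaves. Anthropic's paper also points out that what an AI says sometimes differs from what it is "thinking" internally.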
Author
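"P8 side-channel verification" is a more accurate positioning than my original "precision complement": SM's structured judgments are naturally suited to act as an independent verification layer. I am working on how context degradation affects agent behavior, which is the same line of work as agent consistency. Once you have settled the P7 output schema, I will adapt SM to it.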
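SM: a precision patch for PUA
TL;DR
I ran an A/B experiment on PUA (3 scenarios, with S2 replicated 3 times per condition). It shows that PUA has a reproducible blind spot in the misleading-traceback scenario: the AI analyzes the root cause perfectly but does not modify the code. SM's diagnosis-first rule raised the S2 success rate from 0/3 to 2/3.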
Experiment data
3 debugging scenarios, Claude Sonnet 4.6, PUA alone vs PUA+SM:
First run across all scenarios:
S2 detailed replication (3 runs per condition):
Total cost: ~$3.50.
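What happened in S2
Failure mode of PUA alone:
The AI accurately diagnosed the full causal chain: `serializer.py` lacks a DatetimeEncoder → the TypeError is swallowed by the `except` in `client.py` → an empty body is sent → the server returns a plain-text 400 → `deserialize_response` parses the plain text and raises JSONDecodeError. The analysis was flawless. Yet in 2 of the 3 runs it modified no files at all (stated reason: "after the change, TestOriginalBug will fail"), and in the remaining run it changed the wrong file (it patched `client.py` instead of `serializer.py`).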
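To make the misleading traceback concrete, here is a minimal Python sketch of the causal chain described above. It is a reconstruction for illustration only, not the experiment's code: `serialize_payload`, `fake_server`, `send`, `surfaced_error`, and the body of `DatetimeEncoder` are hypothetical stand-ins, and only the names `DatetimeEncoder` and `deserialize_response` come from the description.

```python
import json
from datetime import datetime


def serialize_payload(payload: dict) -> str:
    # Hypothetical stand-in for serializer.py: no datetime-aware encoder,
    # so json.dumps raises TypeError on datetime values.
    return json.dumps(payload)


def fake_server(body: str) -> tuple[int, str]:
    # Hypothetical server: rejects an empty body with a plain-text 400.
    if not body:
        return 400, "Bad Request: empty body"
    return 200, json.dumps({"ok": True})


def deserialize_response(text: str) -> dict:
    # Assumes every response is JSON, so plain text raises JSONDecodeError.
    return json.loads(text)


def send(payload: dict) -> dict:
    # Hypothetical stand-in for client.py.
    try:
        body = serialize_payload(payload)
    except TypeError:
        body = ""  # the real cause is swallowed here
    _status, text = fake_server(body)
    return deserialize_response(text)


def surfaced_error(payload: dict) -> str:
    # Name of the exception the caller actually sees.
    try:
        send(payload)
    except Exception as exc:
        return type(exc).__name__
    return "no error"


class DatetimeEncoder(json.JSONEncoder):
    # One plausible shape for the missing encoder (an assumption).
    def default(self, o):
        if isinstance(o, datetime):
            return o.isoformat()
        return super().default(o)
```

With a `datetime` in the payload, `surfaced_error` reports `JSONDecodeError` even though the root cause is the `TypeError` raised in the serializer, which is why a fix aimed at the response parser or the client misses the actual defect.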
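Why PUA's existing rules did not kick in:
PUA has the rule "answering the question without solving the problem → you are an engineer, not a search engine". But the AI does not see itself as "answering without solving": it believes that not breaking passing tests is the correct behavior. PUA detects "giving up" and "laziness"; it does not detect "inaction caused by excessive caution".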
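How SM fills the gap
SM's rule 1: before changing any code, output
`[SM·诊断] 问题是___,因为___` ("[SM·Diagnosis] The problem is ___, because ___"). This formatted output creates a commitment to act: after writing "the problem is that serializer.py lacks a DatetimeEncoder", not modifying `serializer.py` becomes cognitively harder.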
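As a rough illustration of how such a diagnosis line could be checked mechanically, the sketch below parses the `[SM·诊断]` line and only allows an edit to a file that the diagnosis names. This gate is an assumption made for illustration and is not part of the SM skill as described: `parse_diagnosis`, `may_edit`, and the regular expression are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed literal format, taken from the rule text:
# "[SM·诊断] 问题是___,因为___"
# ("[SM·Diagnosis] The problem is ___, because ___").
DIAGNOSIS_RE = re.compile(
    r"\[SM·诊断\]\s*问题是(?P<problem>.+?)[,,]\s*因为(?P<cause>.+?)[。.]?\s*$",
    re.MULTILINE,
)


class Diagnosis(NamedTuple):
    problem: str
    cause: str


def parse_diagnosis(output: str) -> Optional[Diagnosis]:
    """Return the filled-in diagnosis, or None if it is missing or blank."""
    match = DIAGNOSIS_RE.search(output)
    if match is None:
        return None
    problem = match.group("problem").strip()
    cause = match.group("cause").strip()
    # Reject a template whose placeholders were left as underscores.
    if not problem.strip("_ ") or not cause.strip("_ "):
        return None
    return Diagnosis(problem, cause)


def may_edit(output: str, path: str) -> bool:
    """Hypothetical gate: allow an edit only to a file the diagnosis names."""
    diagnosis = parse_diagnosis(output)
    if diagnosis is None:
        return False
    return path in diagnosis.problem or path in diagnosis.cause
```

For the S2 diagnosis quoted above, `may_edit(output, "serializer.py")` would be true and `may_edit(output, "client.py")` false, which mirrors the two failure modes seen with PUA alone: no edit at all, or an edit to the wrong file.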
4 rules
Honest disclosure
Files
`skills/sm/SKILL.md`: 169 lines.
Maintainers can choose:
Experiment code and full results: experiment report
🤖 Generated with Claude Code