Problem
When Claude Code attempts to write standard OSS security infrastructure files, the response is blocked by its content filtering policy, even though the intent is entirely defensive (improving the repository's security posture):
● CONTRIBUTING.md and SECURITY.md exist. I need to create CODE_OF_CONDUCT.md,
CodeQL workflow, and expand CODEOWNERS.
⎿ API Error: Output blocked by content filtering policy
This blocks progress on PR 4b (issue #166), which covers:
- Creating CODE_OF_CONDUCT.md (Contributor Covenant)
- Adding .github/workflows/codeql.yml (GitHub CodeQL static analysis)
- Expanding .github/CODEOWNERS to cover all packages, not just .ai/*
- Updating SECURITY.md to list all 12 packages
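Since these files may have to be written by hand, a small check that the expanded CODEOWNERS really covers every package could be useful. The following is only a rough sketch: the function names, the `packages/...` directory layout, and the owner handles are hypothetical, and pattern matching is simplified to directory prefixes rather than full CODEOWNERS glob semantics.

```typescript
// Hypothetical helper: report package directories that no CODEOWNERS rule covers.
// Simplification (an assumption of this sketch): each pattern is treated as a
// directory prefix, so ".ai/*" and "/packages/core/" cover everything beneath
// them. Real CODEOWNERS matching is gitignore-like and more nuanced.

function parsePatterns(codeowners: string): string[] {
  return codeowners
    .split("\n")
    .map((line) => line.trim())
    // Skip blank lines and comments.
    .filter((line) => line !== "" && !line.startsWith("#"))
    // The first whitespace-separated token is the path pattern.
    .map((line) => line.split(/\s+/)[0]);
}

function normalize(pattern: string): string {
  let p = pattern;
  if (p.startsWith("/")) p = p.slice(1); // drop leading "/"
  while (p.endsWith("*")) p = p.slice(0, -1); // drop trailing "*" or "**"
  if (p.endsWith("/")) p = p.slice(0, -1); // drop trailing "/"
  return p;
}

function uncoveredPackages(codeowners: string, packageDirs: string[]): string[] {
  const prefixes = parsePatterns(codeowners).map(normalize);
  return packageDirs.filter(
    (dir) =>
      !prefixes.some(
        // An empty prefix comes from a bare "*" rule, which covers everything.
        (p) => p === "" || dir === p || dir.startsWith(p + "/")
      )
  );
}

// Example with made-up paths and owners:
const sample = [
  "# ownership rules",
  ".ai/* @team-ai",
  "/packages/core/ @alice",
].join("\n");

console.log(uncoveredPackages(sample, ["packages/core", "packages/cli", ".ai/prompts"]));
```

In this made-up example only `packages/cli` is reported, because no rule names it or a parent directory.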
Why this is a false positive
None of these files contain exploit code, attack tooling, or harmful instructions. They are:
- A standard community health file (CODE_OF_CONDUCT.md)
- A static analysis CI workflow (CodeQL scans TypeScript for vulnerabilities — it is a defender tool)
- A file ownership declaration (CODEOWNERS)
- A disclosure policy document (SECURITY.md)
The words "security" and "CodeQL" appear to be triggering the filter, even though the purpose here is entirely to improve security posture, not circumvent it.
Impact
- Files under .github/workflows/ with security-related names will hit the same wall
Workaround
Files need to be written manually or in a separate session with explicit context framing.
Suggested fix
File feedback with Anthropic at https://github.com/anthropics/claude-code/issues referencing this false-positive pattern: writing CODEOWNERS, CodeQL workflows, and CODE_OF_CONDUCT.md for an OSS monorepo.