Problem
Every message to Claude Code loads all configured MCP server schemas into context upfront — regardless of which tools are needed. With many MCP servers configured (Obsidian, Slack, Google, Atlassian, GitLab, PDF reader, etc.), this adds significant hidden token overhead per request.
Data (from Igor Barinov's analysis):
- 20 servers × ~150 tokens = 3,000+ tokens of hidden overhead per message
- Only a fraction of tools are needed per session (e.g., curaitor review only needs Obsidian + Slack + browser)
This is especially relevant for curaitor: during `/cu:triage` or `/cu:discover` (unattended cron), only Obsidian MCP and scripts are needed. During `/cu:review`, add Slack and browser. Most other MCP servers (Jira, Confluence, Google Docs, GitLab) are unused but still loaded.
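The arithmetic behind the figures above can be checked with a short sketch. The 20 servers and ~150 tokens per server come from the analysis quoted above; the session length of 40 messages and the trimmed count of 3 servers are illustrative assumptions, not measurements.

```python
def schema_overhead(servers: int, tokens_per_server: int, messages: int = 1) -> int:
    """Tokens spent on MCP schemas when every message carries every schema."""
    return servers * tokens_per_server * messages


# Figures quoted in the analysis: 20 servers at roughly 150 tokens each.
per_message = schema_overhead(20, 150)

# Assumption: a 40-message session (hypothetical, for illustration only).
per_session = schema_overhead(20, 150, messages=40)

# Assumption: a trimmed config with 3 servers at the same per-server cost.
trimmed_per_message = schema_overhead(3, 150)

print(per_message, per_session, trimmed_per_message)
```

Under these assumptions the per-message cost falls from 3,000 to 450 tokens, and the saving scales linearly with the number of messages in a session.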
Current state
Curaitor sessions load ALL configured MCP servers because Claude Code doesn't support per-session or per-skill tool filtering. The deferred tool loading feature helps somewhat, but schemas are still fetched when referenced.
Alternatives considered
1. Per-skill MCP profiles
Define which MCP servers each skill needs in the skill frontmatter:
```yaml
# In skill.md frontmatter
mcp_servers: [obsidian, slack-mcp]
```
Claude Code would only load those servers for that skill invocation.
Status: Not supported by Claude Code; this would need to be filed as a feature request upstream.
2. Separate Claude Code profiles per workflow
Run curaitor in a separate Claude Code config/profile with only the needed MCP servers:
```bash
# Hypothetical: CLAUDE_CONFIG illustrates the idea and is not an existing Claude Code setting
CLAUDE_CONFIG=~/.claude/profiles/curaitor.json claude -p "/cu:triage"
```
Status: Claude Code doesn't support config profiles yet, but could be approximated with separate settings files.
3. Project-scoped MCP (already partially supported)
Use project-level `.mcp.json` to define only the needed servers for the curaitor-review workspace:
```json
{
  "mcpServers": {
    "obsidian": { ... },
    "slack-mcp": { ... },
    "pdf-reader": { ... }
  }
}
```
Status: This works TODAY — the curaitor-review project could define its own `.mcp.json` that only includes the servers it needs, overriding the global config.
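One way to produce such a file without retyping server definitions is to filter an existing configuration down to an allowlist. The sketch below is a minimal illustration, assuming the global configuration uses the same `mcpServers` shape as the example above; the function name `project_mcp_config` and the placeholder `command` values are hypothetical, and the real server definitions would be copied from the existing setup.

```python
import json


def project_mcp_config(global_config: dict, allowed: list[str]) -> dict:
    """Return a config holding only the allowed servers, in allowlist order."""
    servers = global_config.get("mcpServers", {})
    missing = [name for name in allowed if name not in servers]
    if missing:
        # Fail loudly rather than silently writing an incomplete config.
        raise KeyError(f"servers not found in global config: {missing}")
    return {"mcpServers": {name: servers[name] for name in allowed}}


# Placeholder definitions (hypothetical); substitute the real entries.
global_config = {
    "mcpServers": {
        "obsidian": {"command": "placeholder-obsidian"},
        "slack-mcp": {"command": "placeholder-slack"},
        "pdf-reader": {"command": "placeholder-pdf"},
        "gitlab": {"command": "placeholder-gitlab"},
        "jira": {"command": "placeholder-jira"},
    }
}

project_config = project_mcp_config(
    global_config, ["obsidian", "slack-mcp", "pdf-reader"]
)
print(json.dumps(project_config, indent=2))
```

The printed JSON can be saved as `.mcp.json` in the curaitor-review workspace. Raising on a missing name is a deliberate choice here, so that a typo in the allowlist is caught before an unattended cron run.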
4. Lazy-load via deferred tools (already partially supported)
Claude Code supports deferred tool loading — schemas fetched on first reference, not upfront. Ensure all non-essential MCP tools are deferred.
Status: Partially working — many tools already show as deferred in curaitor sessions.
5. Specialized agent routing (Cezary Dziemian suggestion)
Split functionality into separate specialized agents rather than loading everything centrally. E.g., a "browser agent" that only has cmux, an "obsidian agent" that only has Obsidian MCP.
Status: Possible with Claude Code teams/subagents, but adds orchestration complexity.
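The routing idea can be pictured as a lookup from a required tool to the one agent that carries it. This is a conceptual sketch only: the `Agent` class and `route` function are hypothetical names and do not correspond to any Claude Code API, and the two agents mirror the examples given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """A specialized agent and the only tools it is configured with."""
    name: str
    tools: frozenset


# The two example agents from the suggestion above.
AGENTS = [
    Agent("browser agent", frozenset({"cmux"})),
    Agent("obsidian agent", frozenset({"obsidian"})),
]


def route(required_tool: str) -> Agent:
    """Pick the first agent whose tool set contains the required tool."""
    for agent in AGENTS:
        if required_tool in agent.tools:
            return agent
    raise LookupError(f"no agent provides {required_tool!r}")


print(route("cmux").name)
```

Each agent only ever sees its own tool schemas, which is where the token saving comes from; the orchestration cost noted in the status line is the `route` step plus passing results between agents.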
Recommendation
Short term (now): Create a project-scoped `.mcp.json` in the curaitor-review workspace that only includes: obsidian, slack-mcp, pdf-reader, personal-slack. Remove Jira, Confluence, GitLab, Google from the curaitor context.
Medium term: File upstream feature request for per-skill MCP profiles in Claude Code.
Long term: The pre-synthesis work (issue #1) reduces the need for MCP calls during review — if frontmatter contains the synthesis, the LLM doesn't need to call Obsidian MCP to read the full note.
Related