Skip to content

feat(budget): model-aware startup context budget for profiles#75

Merged
NagyVikt merged 1 commit into
mainfrom
claude/cue-profile-context-limits-1zozjg
Jun 24, 2026
Merged

feat(budget): model-aware startup context budget for profiles#75
NagyVikt merged 1 commit into
mainfrom
claude/cue-profile-context-limits-1zozjg

Conversation

@NagyVikt

Copy link
Copy Markdown
Contributor

What

Adds a model-aware startup context budget to cue profiles. A profile's always-on footprint — skill frontmatter (loaded into the skill router on every message) + MCP tool schemas (one set per connected server) — is paid before the user types anything. This change warns when that load exceeds 50% of the model's context window, leaving the other half as working headroom.

For a 256K model that's a ~128K startup ceiling.

Why

The existing token budget (lib/token-budget.ts + the launch banner) only counted skill frontmatter with flat 🟢🟡🟠🔴 thresholds (5K/10K/15K) — it ignored MCP cost and had no model/context-window awareness. There was no signal that a profile's skills + MCPs were eating into the half of the window you want free for the actual task.

How

cue can't auto-detect the main-session model (it's a per-launch /model choice), so the window is resolved from a declared/configured source:

resolveContextWindow precedence: profile contextWindow → profile model (via MODEL_CONTEXT_WINDOWS) → env CUE_CONTEXT_WINDOW → env CUE_MODELDEFAULT_CONTEXT_WINDOW (256K).

  • src/lib/token-budget.ts — pure, unit-tested math: MODEL_CONTEXT_WINDOWS, resolveContextWindow, estimateMcpTokens (MCP_TOKENS_PER_SERVER default 1500, override CUE_MCP_TOKENS_PER_SERVER), computeContextBudget, formatContextBudgetWarning (silent <80% of budget, 🟡 near, 🔴 over).
  • Profiles — optional model / contextWindow fields (schema.json + _types.ts), resolved leaf-wins through inheritance and composites in profile-loader.ts. core declares contextWindow: 256000 as the fan-out baseline.
  • cue launch — budget warning in the startup banner, using real resolved skill + MCP token data.
  • cue profile suggest — new Context budget section auditing every profile; new flags --model <id>, --context <tokens>, --load-factor <0..1>, --no-budget.
  • docs/agent-context-budget.md — documents the budget + flags.

Example

4. Context budget (model-aware)

  window 40K · 50% startup target → budget ~20K always-on

  1 profile(s) over the 20K startup budget:

    🔴 full  ~24.0K always-on, 16 MCPs — 120% of budget

Tests

  • src/lib/token-budget.test.ts — 20 new tests (window resolution precedence, MCP estimate, budget math at 256K→128K, warning bands).
  • Full relevant suite green: 145 pass / 0 fail across token-budget, profile-loader, launch, and discover.suggest tests.
  • tsc clean on touched files (one pre-existing unrelated error in pair-suggestions.ts); biome lint clean; cue validate core schema-valid with the new field.

Notes

  • In this sandbox the resources/skills submodule isn't checked out, so skill-token measurement reads 0 and the audit counts MCPs only — it degrades gracefully (missing skills count as 0). On a real install with skills present, both are measured.

🤖 Generated with Claude Code

https://claude.ai/code/session_01TwQNpDsSR466euk2d1McHV


Generated by Claude Code

Add a model-aware "startup budget" check so a profile's always-on
footprint (skill frontmatter + MCP tool schemas) is warned about when it
exceeds 50% of the model's context window — a ~128K ceiling on a 256K
model, leaving the other half free for the conversation.

- lib/token-budget.ts: pure, tested math — MODEL_CONTEXT_WINDOWS map,
  resolveContextWindow (profile contextWindow/model → CUE_CONTEXT_WINDOW
  /CUE_MODEL env → 256K default), estimateMcpTokens, computeContextBudget,
  formatContextBudgetWarning (silent <80% budget, 🟡 near, 🔴 over).
- profiles: optional `model` / `contextWindow` fields (schema.json +
  _types.ts), resolved leaf-wins through inheritance in profile-loader.
  core declares contextWindow: 256000 as the fan-out baseline.
- cue launch: budget warning in the startup banner (real resolved
  skill + MCP token data).
- cue profile suggest: new "Context budget" section auditing every
  profile; --model / --context / --load-factor / --no-budget flags.
- docs/agent-context-budget.md: documents the budget + flags.
- token-budget.test.ts: 20 new tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TwQNpDsSR466euk2d1McHV
@NagyVikt NagyVikt marked this pull request as ready for review June 24, 2026 09:02
@NagyVikt NagyVikt merged commit f735b10 into main Jun 24, 2026
0 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants