feat(mcp): token-count telemetry on code_*/scout/invoke #75
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9d4479bdc5
/// and accurate enough for capacity tracking; do NOT use for LLM budget
/// enforcement — pull in `tiktoken-rs` if exact counts are required.
fn estimate_tokens(s: &str) -> usize {
    s.len().div_ceil(4)
Count Unicode chars instead of UTF-8 bytes
estimate_tokens documents a char-based heuristic (~4 chars/token) but uses s.len(), which counts UTF-8 bytes in Rust. For non-ASCII inputs (for example CJK text or emoji in code_execute/scout payloads), this inflates input_tokens/output_tokens and makes the new telemetry systematically inaccurate, which can mislead capacity/cost monitoring and trigger false regressions. Use a character count (or explicitly rename/document this as byte-based) to keep the metric semantics consistent.
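To make the byte-versus-char gap concrete, here is a minimal sketch that puts the two counting strategies side by side. The names `estimate_tokens_bytes` and `estimate_tokens_chars` are hypothetical and chosen for illustration; only the `s.len().div_ceil(4)` expression comes from the diff under review.

```rust
/// Byte-based estimate: the expression currently in the diff.
/// `str::len` returns the UTF-8 byte length, not the character count.
fn estimate_tokens_bytes(s: &str) -> usize {
    s.len().div_ceil(4)
}

/// Char-based estimate (hypothetical alternative): counts Unicode scalar values.
fn estimate_tokens_chars(s: &str) -> usize {
    s.chars().count().div_ceil(4)
}

fn main() {
    // ASCII input agrees; CJK and emoji inputs diverge because each
    // character occupies 3 or 4 bytes in UTF-8.
    for sample in ["fn main() {}", "日本語のテキスト", "🚀🚀🚀🚀"] {
        println!(
            "{sample:?}: bytes -> {}, chars -> {}",
            estimate_tokens_bytes(sample),
            estimate_tokens_chars(sample)
        );
    }
}
```

For the 8-character CJK sample the byte-based estimate is 6 tokens against 2 for the char-based one, a threefold difference. Note that `chars().count()` walks the whole string, so it is O(n) where `len()` is O(1); for telemetry on already-serialized payloads that cost is likely acceptable, but it is a trade-off worth stating in the doc comment.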
Surface input_tokens and output_tokens fields on every dispatch event for the five gateway meta-tools: code_search, code_schema, code_execute, scout, and invoke. The fields complement the existing elapsed_ms and make it possible to size LLM context budgets and spot ballooning responses from log analytics.

Uses a simple chars/4 estimator (chars div_ceil 4), dependency-free and accurate enough for capacity tracking. Pull in tiktoken-rs later if exact counts are required.

Input tokens are computed once at handler entry from the MCP arguments map. Output tokens are computed before the success log emits, against the serialized result. Failure paths only emit input_tokens (no useful output to size).

Refs: timing already existed inline; this commit adds token accounting on the same boundary. New estimator helpers covered by 3 unit tests.
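The accounting boundary described in the commit message can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `dispatch_with_telemetry`, the string-based handler signature, and the `key=value` log format are all hypothetical, and the estimator here is a char-based stand-in. The real handlers presumably work with an MCP arguments map and a structured logger.

```rust
use std::time::Instant;

/// Hypothetical stand-in for the PR's estimator (char-based, ~4 chars/token).
fn estimate_tokens(s: &str) -> usize {
    s.chars().count().div_ceil(4)
}

/// Hypothetical dispatch wrapper. Returns the log line it would emit so the
/// behaviour is easy to inspect without a logging dependency.
fn dispatch_with_telemetry<F>(tool: &str, serialized_args: &str, handler: F) -> String
where
    F: FnOnce(&str) -> Result<String, String>,
{
    // Input tokens are computed once, at handler entry.
    let input_tokens = estimate_tokens(serialized_args);
    let started = Instant::now();
    let outcome = handler(serialized_args);
    let elapsed_ms = started.elapsed().as_millis();

    match outcome {
        // Success: size the serialized result before the log line is built.
        Ok(result) => {
            let output_tokens = estimate_tokens(&result);
            format!(
                "tool={tool} status=ok elapsed_ms={elapsed_ms} \
                 input_tokens={input_tokens} output_tokens={output_tokens}"
            )
        }
        // Failure: there is no useful output to size, so only input_tokens is emitted.
        Err(err) => format!(
            "tool={tool} status=error elapsed_ms={elapsed_ms} \
             input_tokens={input_tokens} error={err}"
        ),
    }
}

fn main() {
    let args = r#"{"query":"x"}"#;
    println!("{}", dispatch_with_telemetry("scout", args, |_| Ok("12345678".to_string())));
    println!("{}", dispatch_with_telemetry("scout", args, |_| Err("boom".to_string())));
}
```

Computing `input_tokens` before the handler runs means the value is available on both the success and failure branches without re-serializing the arguments.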
9d4479b to f8c20ca
Summary
- […] input_tokens and output_tokens on every dispatch event for the five gateway meta-tools: code_search, code_schema, code_execute, scout, and invoke. The existing elapsed_ms field already gave us timing; tokens close the gap so log analytics can spot ballooning payloads and size LLM context budgets.

Test plan
Summary by cubic
Adds token-count telemetry to gateway meta-tools and ships Code Mode v2 with upstream-only execution, JS catalog search, and capped responses. Improves observability and tightens safety.
New Features
- […] code_search, code_execute, scout, and invoke alongside elapsed_ms. Inputs computed once from MCP args; outputs on success; failures log inputs only. Adds estimate_tokens* helpers (chars/4) with 3 tests.
- […] upstream::<server>::<tool> IDs). Adds response budgets (byte/token caps) and maps code_mode_timeout/code_mode_fuel_exhausted to HTTP 504.
- code_search now filters an inlined catalog with JavaScript; code_schema is removed. Optional code_mode_wasm feature using javy and wasmtime with a helper script to fetch the Javy plugin.

Migration
- […] lab::<service>.<action> with tool_execute/invoke; Code Mode accepts upstream tool IDs only.
- […] code_schema; use scout for discovery or code_search for JS catalog filtering.
- […] [code_mode] config.
- […] libclang for Code Mode builds.

Written for commit f8c20ca. Summary will update on new commits.
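As a rough illustration of the response budgets and the 504 mapping listed under New Features, the sketch below shows one way such logic could look. Everything except the two error names and the 504 status is an assumption: `status_for_error_kind`, `cap_response`, the 500 fallback, and the choice to truncate (rather than reject) an oversized response are hypothetical and may not match the PR.

```rust
/// Hypothetical mapping from an error kind to an HTTP status.
/// Only the 504 arm is taken from the summary; the 500 fallback is assumed.
fn status_for_error_kind(kind: &str) -> u16 {
    match kind {
        "code_mode_timeout" | "code_mode_fuel_exhausted" => 504,
        _ => 500,
    }
}

/// Hypothetical byte cap: returns at most `max_bytes` of `body`, backing up
/// to the nearest UTF-8 character boundary so the slice stays valid.
fn cap_response(body: &str, max_bytes: usize) -> &str {
    if body.len() <= max_bytes {
        return body;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !body.is_char_boundary(end) {
        end -= 1;
    }
    &body[..end]
}

fn main() {
    println!("{}", status_for_error_kind("code_mode_timeout"));
    println!("{:?}", cap_response("héllo", 2));
}
```

Slicing a `&str` at a non-boundary index panics in Rust, which is why the cap walks back with `is_char_boundary` instead of slicing at `max_bytes` directly.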