feat(server): plain-text call:<verb>{} tool parsing (Gemma4) by easel · Pull Request #340 · Luce-Org/lucebox-hub

easel · 2026-06-03T21:40:59Z

Summary

Adds plain-text call:<verb>{...} tool-call parsing for Gemma4 emissions to the server's tool_parser. The parser also tolerates the `_call:` tokenizer-artifact prefix and renders Anthropic-style tool_use / tool_result blocks. Changes are localized to tool_parser.{cpp,h}; this PR makes no streaming-path edits.
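For readers unfamiliar with the format, the scan described above can be sketched roughly as follows. This is a minimal illustration, not the code in tool_parser.cpp: `CallSpan`, `is_verb_char`, `match_brace`, and `find_call_verbs` are hypothetical names, the accepted verb character set is an assumption, and the sketch does no word-boundary check or verb normalization.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type for this sketch; not the PR's ToolParseResult.
struct CallSpan {
    std::string verb;   // tool name following "call:"
    std::string args;   // raw argument text, braces included
    std::size_t begin;  // offset of the first character of the span
    std::size_t end;    // offset of the closing brace
};

// Assumed verb alphabet: letters, digits, '_', '-' and '.'.
static bool is_verb_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_' || c == '-' || c == '.';
}

// Return the offset of the brace matching text[open] == '{', ignoring braces
// inside double-quoted strings, or npos if the span is unterminated.
static std::size_t match_brace(const std::string & text, std::size_t open) {
    int depth = 0;
    bool in_string = false;
    for (std::size_t i = open; i < text.size(); ++i) {
        const char c = text[i];
        if (in_string) {
            if (c == '\\') ++i;                    // skip the escaped character
            else if (c == '"') in_string = false;
        } else if (c == '"') {
            in_string = true;
        } else if (c == '{') {
            ++depth;
        } else if (c == '}') {
            if (--depth == 0) return i;
        }
    }
    return std::string::npos;
}

std::vector<CallSpan> find_call_verbs(const std::string & text) {
    static const std::string marker = "call:";
    std::vector<CallSpan> out;
    std::size_t pos = 0;
    while ((pos = text.find(marker, pos)) != std::string::npos) {
        // Tolerate the "_call:" artifact by widening the span one character left.
        const std::size_t begin = (pos > 0 && text[pos - 1] == '_') ? pos - 1 : pos;
        const std::size_t verb_start = pos + marker.size();
        std::size_t i = verb_start;
        while (i < text.size() && is_verb_char(text[i])) ++i;
        const bool has_body = i > verb_start && i < text.size() && text[i] == '{';
        const std::size_t close = has_body ? match_brace(text, i) : std::string::npos;
        if (close == std::string::npos) {          // no verb, no '{', or unterminated
            pos = verb_start;
            continue;
        }
        out.push_back({text.substr(verb_start, i - verb_start),
                       text.substr(i, close - i + 1), begin, close});
        pos = close + 1;
    }
    return out;
}
```

Tracking quoted strings while matching braces keeps a `}` inside an argument value from closing the call early; an unterminated span is simply skipped in this sketch.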

Files

server/src/server/tool_parser.cpp — parser implementation for the call:<verb>{...} form (+192 LOC)
server/src/server/tool_parser.h — small surface additions for the new branch
server/test/test_server_unit.cpp — unit tests covering the new parser branches (+172 LOC)

Single commit: feat(server): plain-text call:<verb>{} tool parsing (Gemma4).

Dependencies

None — this PR is independent of the other split PRs.

Note: while this PR's code is self-contained, the server target itself cross-references symbols introduced/moved by sibling PRs (#336 layer split, #338 pflash drafter, #339 soft-close, #341 thinking-control), so building the server CMake target from this branch in isolation will not produce a complete binary. The parser additions themselves do not depend on any sibling PR.

Test plan

test_server_unit (the new branches added in this PR) passes locally
Smoke-test a Gemma4 trace emitting call:<verb>{...} (and the `_call:` variant) end-to-end once the full server stack is assembled with sibling PRs
Confirm no regressions in existing tool_parser branches (JSON tool_use path, etc.)

Note: server PRs in this split cannot be validated as a runnable binary standalone due to CMake cross-references with the other split PRs; the unit tests added here are the only standalone validation.

## What

Extends server/src/server/tool_parser.{cpp,h} to parse Gemma's plain-text call:<verb>{} emissions (also accepts the `_call:` tokenizer-artifact prefix) and render them as Anthropic tool_use + tool_result blocks. Isolated to tool_parser; the streaming detection hook in sse_emitter ships with Luce-Org#341. Adds 364 lines of C++ unit coverage in test_server_unit.cpp plus the call-verb parser plan and Gemma4-26B parser-fix writeup.

## Why

Gemma4 emits tool calls as plain-text call:<verb>{...} rather than structured JSON, which breaks the existing Anthropic tool_use pipeline on agentic workloads. This parser closes that gap so Gemma4 can drive coding-agent loops end-to-end.

## Dependencies

None - this PR is independent.

…ps schema-4

## What

User-facing thinking-control API across the HTTP server surface:

- chat_template prefills a closed <think> block when thinking is off (Qwen3-gated) so the model skips the reasoning preamble without losing the assistant turn.
- http_server bumps /props schema 2 -> 4, adding build / model.target / model.draft / host blocks for client introspection.
- server_main adds --debug-thinking-logits and --think-soft-close-* flags plus image/host-info loaders for card-driven boot.
- sse_emitter routes Qwen3.6/Laguna think-mode output to the reasoning_content channel so reasoning never leaks into the user-visible content stream (Pattern-B call-verb streaming hook).
- Ships the model-card _schema.json, qwen3.6-27b and laguna-xs.2 cards, the /props OpenAPI doc, updated thinking-budget spec, and the thinking-control protocol/mechanism experiments.
- test_server_unit gets matching coverage (~1100 lines) for prefill, /props schema-4, and reasoning_content routing.

## Why

Gives clients a single, card-driven API to control thinking budgets, soft-close behavior, and reasoning visibility - and an introspectable /props surface to discover what the server supports.

## Dependencies

- Luce-Org#336 (server-layer-split): CMake/build references
- Luce-Org#338 (server-pflash-drafter): check_admission uses pflash_keep_ratio + pflash_on contracts
- Luce-Org#340 (server-call-verb): sse_emitter Pattern-B call-verb streaming hooks rely on tool_parser changes from Luce-Org#340

cubic-dev-ai

1 issue found across 3 files

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="server/src/server/tool_parser.cpp">

<violation number="1" location="server/src/server/tool_parser.cpp:582">
P1: Disallowed `call:<verb>{...}` spans are not shadowed, allowing pattern 6 to emit spurious inner tool calls.</violation>
</file>


cubic-dev-ai · 2026-06-04T05:10:27Z

+            if (colon != std::string::npos) verb = verb.substr(colon + 1);
+            if (verb.empty()) continue;
+
+            add_call(verb, args, call_start, brace_close);


P1: Disallowed call:<verb>{...} spans are not shadowed, allowing pattern 6 to emit spurious inner tool calls.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At server/src/server/tool_parser.cpp, line 582:

<comment>Disallowed `call:<verb>{...}` spans are not shadowed, allowing pattern 6 to emit spurious inner tool calls.</comment>

<file context>
@@ -397,7 +545,45 @@ ToolParseResult parse_tool_calls(const std::string & text, const json & tools) {
+            if (colon != std::string::npos) verb = verb.substr(colon + 1);
+            if (verb.empty()) continue;
+
+            add_call(verb, args, call_start, brace_close);
+        }
+    }
</file context>

Suggested change

-            add_call(verb, args, call_start, brace_close);
+            if (tool_allowed(tools, verb)) {
+                add_call(verb, args, call_start, brace_close);
+            } else {
+                removals.push_back({call_start, brace_close});
+            }
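To illustrate why the suggestion records disallowed spans, here is a small self-contained sketch of span shadowing. It is an assumption-laden model, not the PR's code: `Span`, `Candidate`, `ParseState`, `handle_call_verb`, and `handle_later_pattern` are hypothetical, `tool_allowed` is a stand-in that takes a set of names rather than the JSON tools array, and the real add_call is assumed to record its own span.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

struct Span { std::size_t begin; std::size_t end; };  // inclusive offsets

struct Candidate { std::string verb; Span span; };

struct ParseState {
    std::vector<std::string> calls;  // names of accepted tool calls
    std::vector<Span> removals;      // spans later patterns must not re-scan
};

// Stand-in for the PR's tool_allowed(tools, verb); here tools is a name set.
static bool tool_allowed(const std::set<std::string> & tools, const std::string & verb) {
    return tools.count(verb) > 0;
}

// True if s lies entirely within one of the recorded spans.
static bool inside_any(const std::vector<Span> & spans, const Span & s) {
    for (const Span & r : spans) {
        if (s.begin >= r.begin && s.end <= r.end) return true;
    }
    return false;
}

// call:<verb>{...} branch: the span is recorded whether or not the verb is
// allowed, so a rejected call still hides its body from later patterns.
void handle_call_verb(ParseState & st, const std::set<std::string> & tools,
                      const Candidate & c) {
    if (tool_allowed(tools, c.verb)) st.calls.push_back(c.verb);
    st.removals.push_back(c.span);
}

// A later pattern (such as "pattern 6" in the review) skips shadowed text.
void handle_later_pattern(ParseState & st, const std::set<std::string> & tools,
                          const Candidate & c) {
    if (inside_any(st.removals, c.span)) return;
    if (tool_allowed(tools, c.verb)) st.calls.push_back(c.verb);
}
```

Without the push in the rejected case, an allowed tool name nested inside a disallowed call's argument body would be picked up by the later pattern, which is the spurious inner call the review describes.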


easel force-pushed the feat/server-call-verb-parser branch 2 times, most recently from d4c2de0 to 8b97d57 · June 4, 2026 03:13

easel mentioned this pull request Jun 4, 2026

feat(luce-bench): in-tree bench harness + multi-turn agent_recorded + LLM judge #337

Open

4 tasks

easel force-pushed the feat/server-call-verb-parser branch from 8b97d57 to 89e8505 · June 4, 2026 04:52

easel mentioned this pull request Jun 4, 2026

feat(server): card-driven thinking control + reasoning_content channel + /props schema-4 #341

Open

6 tasks

easel force-pushed the feat/server-call-verb-parser branch from 89e8505 to 4472aa9 · June 4, 2026 05:03

easel marked this pull request as ready for review June 4, 2026 05:03

cubic-dev-ai Bot reviewed Jun 4, 2026


This was referenced Jun 4, 2026

feat(lucebox): docker stack + CLI + bench/profile + harness + luce-bench in-tree #285

Closed

fix(server): support gemma-4's plain-text call:<verb>{} tool-call format #329

Closed

easel pushed a commit to easel/lucebox-hub that referenced this pull request Jun 4, 2026

docs: record PR Luce-Org#340 probe as held

4309491

easel pushed a commit to easel/lucebox-hub that referenced this pull request Jun 4, 2026

docs: refresh auto-integration manifest after PR Luce-Org#340 probe

fb15bcb

easel pushed a commit to easel/lucebox-hub that referenced this pull request Jun 4, 2026

docs: refresh auto-integration manifest after PR Luce-Org#340 probe

a12107a

easel pushed a commit to easel/lucebox-hub that referenced this pull request Jun 4, 2026

docs: refresh auto-integration manifest after PR Luce-Org#340 probe

fb442bb

davide221 merged commit f4eb504 into Luce-Org:main Jun 4, 2026
2 of 3 checks passed

davide221 merged 1 commit into Luce-Org:main from easel:feat/server-call-verb-parser
