feat(llm): route search.ts classify/extract/synthesize through the LLM router by HomenShum · Pull Request #463 · HomenShum/nodebench-ai

HomenShum · 2026-06-02T07:33:24Z

What

Track B of the LLM-router rollout. Wires the 7 hardcoded gemini-3.1-flash-lite-preview call sites in server/routes/search.ts through the shared planner-on-a-pool router (shared/llm/router.ts, merged in #460), so model choice is owned in one place and genuinely hard turns can escalate.

ADDITIVE + behavior-preserving. The floor of every classify/extract/synthesize pool is the same flash-lite model, and signals are derived cheaply + locally (query length -> inputChars, retrieved source count -> sourceCount, comparison branches -> multiEntity). A simple single-entity query routes to the exact same model as before — only long, analytical, multi-entity, or many-source turns escalate to gemini-3-flash-preview.
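A minimal sketch of the floor-plus-escalation idea described above. This is not the code in shared/llm/router.ts: the names `pickModel` and `complexity`, the weights, and the 0.5 threshold are all assumptions made for illustration. Only the two model ids, the three task classes, and the three signal names come from this PR.

```typescript
// Hypothetical sketch. Weights, threshold, and function names are assumptions,
// not the contents of shared/llm/router.ts.
type TaskClass = "classify" | "extract" | "synthesize";

interface RouteSignals {
  inputChars: number;   // derived from query length
  sourceCount: number;  // retrieved source count
  multiEntity: boolean; // true on comparison branches
}

const FLOOR = "gemini-3.1-flash-lite-preview";
const BALANCED = "gemini-3-flash-preview";

// Each pool is ordered cheapest-first; index 0 is the floor.
// classify has a single candidate, so it can never escalate.
const POOLS: Record<TaskClass, readonly string[]> = {
  classify: [FLOOR],
  extract: [FLOOR, BALANCED],
  synthesize: [FLOOR, BALANCED],
};

// Blend the signals into a score in [0, 1]. Each term is capped so that
// no single signal can exceed its (assumed) weight.
function complexity(s: RouteSignals): number {
  const length = Math.min(s.inputChars / 2000, 1) * 0.4;
  const sources = Math.min(s.sourceCount / 10, 1) * 0.3;
  const entities = s.multiEntity ? 0.3 : 0;
  return length + sources + entities;
}

// Return the floor unless the score crosses the threshold AND the pool
// actually has a second candidate to escalate to.
function pickModel(task: TaskClass, s: RouteSignals): string {
  const pool = POOLS[task];
  const escalate = complexity(s) >= 0.5 && pool.length > 1;
  return pool[escalate ? 1 : 0] ?? FLOOR;
}
```

Under these assumed weights, a short single-entity query with two sources stays on the floor for `extract`, a long multi-entity query with eight sources escalates, and `classify` returns the floor for any input because its pool has one entry.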

Call sites wired (`server/routes/search.ts`)

| Site | Task class | Notes |
| --- | --- | --- |
| `classifyQueryWithLLM` query classification | `classify` | single-candidate pool: guaranteed no-op, just centralizes the model id |
| `agent_synthesize` trace (`synthesizeResults`) | `synthesize` | surfaces chosen model + reason in the trace. The wire-level model for `synthesizeResults` itself lives in `server/agentHarness.ts` (out of scope); this routes + labels the search-side decision |
| why-this-team credibility enrichment | `extract` | structured extraction over synthesized result + local context |
| multi-entity comparison extraction | `extract` (`multiEntity: true`) | most likely branch to escalate |
| single-entity extraction | `extract` | source count = `allSnippets.length` |
| founder-direction extraction | `extract` | source count = `genWebSnippets.length` |

Observability

The chosen model lands in each trace step's tool field, and the route reason ("<model> -- floor light (complexity N.NN)" / "<model> -- escalated to balanced (complexity N.NN)") is appended to the step's detail — matching the existing SearchTraceEntry shape exactly. Escalations are visible in "How we got this answer".
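As an illustration of the reason string and trace labelling described above, here is a hedged sketch. The `TraceStep` interface is a stand-in, not the real `SearchTraceEntry` type (only the `tool` and `detail` fields are named in this PR), and `formatRouteReason`, `labelStep`, and the `" | "` separator are hypothetical.

```typescript
// Hypothetical stand-in for SearchTraceEntry; only `tool` and `detail`
// are named in the PR, the `step` field is an assumption.
interface TraceStep {
  step: string;
  tool: string;
  detail: string;
}

// Build a reason in the two formats quoted in the PR description:
//   "<model> -- floor light (complexity N.NN)"
//   "<model> -- escalated to balanced (complexity N.NN)"
function formatRouteReason(model: string, escalated: boolean, complexity: number): string {
  const tier = escalated ? "escalated to balanced" : "floor light";
  return `${model} -- ${tier} (complexity ${complexity.toFixed(2)})`;
}

// Return a copy of the step with the model in `tool` and the reason
// appended to `detail`. The " | " separator is an assumption.
function labelStep(step: TraceStep, model: string, reason: string): TraceStep {
  return { ...step, tool: model, detail: `${step.detail} | ${reason}` };
}

const reason = formatRouteReason("gemini-3-flash-preview", true, 0.84);
const labelled = labelStep(
  { step: "agent_synthesize", tool: "", detail: "synthesized 8 sources" },
  "gemini-3-flash-preview",
  reason,
);
console.log(labelled.detail);
```

Returning a new object rather than mutating the step keeps the helper free of side effects, which fits the replay-safety goal stated under Reliability.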

Reliability (`.claude/rules/agentic_reliability.md`)

DETERMINISTIC — searchRouteSignals is a pure function (no Date/random), so routing is replay-safe. NaN/negative source counts are coerced to 0 (no NaN leak into the complexity score).
The AbortController / Promise.race request-budget gates and the 4-layer grounding pipeline are untouched.
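The purity and coercion guarantees above can be sketched as follows. This is an assumption-laden illustration, not the actual `searchRouteSignals` body: `deriveSignals` and `safeCount` are hypothetical names, and treating `Infinity` and fractional counts the way shown here is a choice made for the sketch.

```typescript
interface RouteSignals {
  inputChars: number;
  sourceCount: number;
  multiEntity: boolean;
}

// Coerce anything that is not a finite, positive number to 0 so that NaN
// cannot leak into a downstream complexity score. Flooring fractional
// counts and mapping Infinity to 0 are assumptions of this sketch.
function safeCount(n: number): number {
  return Number.isFinite(n) && n > 0 ? Math.floor(n) : 0;
}

// Pure function: the output depends only on the arguments. It reads no
// clock, no random source, and no module-level mutable state, so the
// same inputs always produce the same signals.
function deriveSignals(query: string, sourceCount: number, multiEntity = false): RouteSignals {
  return {
    inputChars: query.length,
    sourceCount: safeCount(sourceCount),
    multiEntity,
  };
}

console.log(deriveSignals("acme vs globex", Number.NaN, true));
```

Because the function is pure, replaying a recorded request with the same query and source count yields the same signals, and therefore the same routing decision.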

Tests

server/searchRouteLlmRouting.test.ts — scenario-based (founder bare-name lookup, investor head-to-head comparison, banker diligence teardown), asserting:

- the no-op floor for simple single-entity queries (the additive guarantee),
- escalation for hard multi-entity / analytical / many-source turns,
- classify never escalates regardless of complexity,
- determinism (identical inputs -> identical model).

Verification

npx tsc --noEmit --pretty false — 0 errors
npx vitest run server/searchRouteLlmRouting.test.ts shared/llm/router.test.ts — 21 passed
npx vitest run server/searchRoute.test.ts — 25 passed (no regression)
npm run build — clean

Do not enable auto-merge — for review.

🤖 Generated with Claude Code

…M router

Track B of the LLM-router rollout. The /search route hardcoded gemini-3.1-flash-lite-preview at 7 Gemini call sites. Wire each through the shared planner-on-a-pool router (shared/llm/router.ts) so model choice is owned in one place and long / analytical / multi-entity turns can escalate.

ADDITIVE + behavior-preserving: the floor of every classify/extract/synthesize pool is the same flash-lite model, and signals are derived cheaply + locally (query length, retrieved source count, multiEntity for comparison branches), so a simple single-entity query routes to the exact same model as before.

Call sites wired (server/routes/search.ts):
- classifyQueryWithLLM (query classification) -> routeLLM classify [single-candidate pool, guaranteed no-op]
- agent_synthesize trace (synthesizeResults) -> routeLLM synthesize [surfaces chosen model + reason in trace; wire-level call lives in agentHarness.ts]
- why-this-team credibility enrichment -> routeLLM extract
- multi-entity comparison extraction -> routeLLM extract (multiEntity true)
- single-entity extraction -> routeLLM extract
- founder-direction extraction -> routeLLM extract

Observability: the chosen model lands in each trace step tool field and the route reason is appended to the step detail, matching the existing SearchTraceEntry shape exactly.

Reliability (.claude/rules/agentic_reliability.md): searchRouteSignals is a pure function (no Date/random) so routing is DETERMINISTIC + replay-safe; NaN/negative source counts are coerced to 0. The AbortController/Promise.race budget gates and the grounding pipeline are untouched.

Tests: server/searchRouteLlmRouting.test.ts -- scenario-based (founder lookup, investor comparison, banker diligence), asserting the no-op floor for simple queries, escalation for hard turns, classify-never-escalates, and determinism.

Verification: tsc --noEmit clean; vitest 21 routing + 25 existing search-route tests pass; npm run build clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

vercel · 2026-06-02T07:33:30Z

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| nodebench-ai | Ready | Preview, Comment | Jun 2, 2026 7:34am |

augmentcode · 2026-06-02T07:37:03Z

🤖 Augment PR Summary

Summary: This PR completes “Track B” of the LLM-router rollout by routing the search pipeline’s Gemini calls through the shared deterministic router, centralizing model choice and enabling controlled escalation on hard turns.

Changes:

- Added deterministic signal derivation in server/routes/search.ts via searchRouteSignals() (query length, source count, multi-entity flag, analytical hint).
- Introduced routeSearchModel() to map task class + signals to a router decision (routeLLM) and emit a trace-friendly reason string.
- Rewired search route call sites that previously hardcoded gemini-3.1-flash-lite-preview (classify, extract branches, and search-side synthesize decision) to use the router-selected model.
- Enhanced trace observability by recording the chosen model in the trace step’s tool field and appending route reasoning to the step detail.
- Added scenario-based Vitest coverage (server/searchRouteLlmRouting.test.ts) to assert the behavior-preserving floor, escalation on hard multi-entity/analytical/many-source turns, and determinism.

Technical Notes: Routing remains replay-safe (pure + deterministic), and the classify pool remains a guaranteed no-op (single-candidate floor).


augmentcode

Review completed. 1 suggestion posted.


augmentcode · 2026-06-02T07:37:04Z

+                // flash-lite, escalates only on heavy local context).
+                const { model: credModel } = routeSearchModel(
+                  "extract",
+                  searchRouteSignals(query, 0),


searchRouteSignals(query, 0) here only reflects the raw query, but the actual Gemini prompt includes synthesized.answer plus potentially large localContext, so routing may stay on the floor even when the input is long/complex. Consider deriving signals from the real prompt/context size (or passing a meaningful sourceCount) so routing decisions match the workload.

Severity: medium
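One possible way to act on this suggestion, sketched under assumptions: measure the text that actually goes into the prompt rather than the raw query alone. `signalsFromPrompt` is a hypothetical helper and is not part of this PR. The part names echo the review comment (`synthesized.answer`, `localContext`), but their real types are not shown here, so they are modelled as optional strings.

```typescript
interface RouteSignals {
  inputChars: number;
  sourceCount: number;
  multiEntity: boolean;
}

// Hypothetical helper: sum the lengths of every text part that will be
// sent in the prompt, skipping parts that are absent.
function signalsFromPrompt(
  parts: ReadonlyArray<string | undefined>,
  sourceCount: number,
  multiEntity = false,
): RouteSignals {
  let inputChars = 0;
  for (const part of parts) {
    inputChars += part?.length ?? 0;
  }
  // Same coercion rule the PR describes for source counts: NaN and
  // negative values become 0.
  const safeSources = Number.isFinite(sourceCount) && sourceCount > 0 ? sourceCount : 0;
  return { inputChars, sourceCount: safeSources, multiEntity };
}

// Illustrative values only; in the credibility-enrichment branch these
// would be the query, the synthesized answer, and the local context.
const query = "who runs acme?";
const synthesizedAnswer = "x".repeat(3000);
const localContext: string | undefined = undefined;

const signals = signalsFromPrompt([query, synthesizedAnswer, localContext], 0);
console.log(signals.inputChars); // query.length + 3000
```

The helper stays pure, so it would not weaken the replay-safety property. Whether a larger `inputChars` should be allowed to change routing for this branch is a behavior decision for the author, since it departs from the "same model as before" guarantee for some inputs.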


vercel Bot deployed to Preview June 2, 2026 07:34 View deployment

augmentcode Bot reviewed Jun 2, 2026


HomenShum enabled auto-merge (squash) June 2, 2026 07:37

HomenShum wants to merge 1 commit into main from feat/llm-router-search
