Commit e80c7ae
feat: add generic /v1/completions client and token debug visualization (#428)
* feat: add generic /v1/completions client and token debug visualization
Add a generic FireworksV1CompletionsClient that handles local tokenization
via HuggingFace transformers and calls the /v1/completions endpoint with
token-in/token-out. Tool-call parsing is pluggable via a callback rather
than hardcoded to any specific domain.
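
The pluggable tool-call parsing described above might look roughly like the sketch below. It is illustrative only and does not reproduce the real FireworksV1CompletionsClient: the class name `CompletionPostprocessor`, the `tool_call_parser` argument, and the `<tool_call>` tag format are all assumptions chosen for the example.

```python
import json
import re
from typing import Any, Callable, Dict, Optional

# Hypothetical callback type: takes decoded completion text and returns a
# parsed tool call, or None when the text contains no tool call.
ToolCallParser = Callable[[str], Optional[Dict[str, Any]]]


class CompletionPostprocessor:
    """Hypothetical stand-in for the part of a client that handles output."""

    def __init__(self, tool_call_parser: Optional[ToolCallParser] = None) -> None:
        self._parser = tool_call_parser

    def postprocess(self, completion_text: str) -> Dict[str, Any]:
        # The client stays domain-agnostic: it only invokes the callback
        # if the caller supplied one.
        tool_call = self._parser(completion_text) if self._parser else None
        return {"text": completion_text, "tool_call": tool_call}


_TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def json_tag_parser(text: str) -> Optional[Dict[str, Any]]:
    """Example parser (assumed format): JSON wrapped in <tool_call> tags."""
    match = _TOOL_CALL_RE.search(text)
    if match is None:
        return None
    try:
        parsed = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None
```

A caller with a different tool-call format would pass its own function instead of `json_tag_parser`; omitting the callback leaves `tool_call` as `None`.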
Also add TokenDebugView component for the evaluation dashboard with:
- Text view: readable colored text with mask overlay (prompt vs completion)
- Episode view: per-token visualization with mask or logprob coloring
- Turns view: per-turn breakdown of prompt/completion tokens
- Token ID chips with hover tooltips showing detokenized text and logprob
- Smooth gradient logprob coloring (green=high confidence, red=low)
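
One plausible way to compute a smooth green-to-red logprob gradient is sketched below in Python for illustration. The actual TokenDebugView component lives in the vite-app frontend and may use a different mapping; the function name `logprob_to_rgb` and the linear interpolation over probability are assumptions.

```python
import math
from typing import Tuple


def logprob_to_rgb(logprob: float) -> Tuple[int, int, int]:
    """Map a token logprob to an (R, G, B) color.

    Assumed scheme: convert the logprob to a probability in (0, 1], then
    blend linearly from red (low confidence) to green (high confidence).
    """
    # Logprobs should be <= 0; clamp so stray positive values stay in range.
    probability = math.exp(min(logprob, 0.0))
    red = round(255 * (1.0 - probability))
    green = round(255 * probability)
    return (red, green, 0)
```

Interpolating over probability rather than raw logprob keeps very unlikely tokens from dominating the color scale, at the cost of compressing differences among low-probability tokens.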
Made-with: Cursor
* fix: address CI failures and review comments
- Rewrite tests to match the generic client API (remove references to
old domain-specific methods like _parse_tool_call_with_optional_fallback)
- Fix TokenDebugSection guard to also check extra?.full_episode
- Fix zero reward styled as red/negative — now uses neutral gray
- Fix tools=[] vs None: explicit empty list no longer falls back to
default_tools
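
The tools=[] vs None distinction can be illustrated with a small sketch. The helper name `resolve_tools` is hypothetical and is not taken from the client code; it only shows why a truthiness check such as `tools or default_tools` conflates the two cases.

```python
from typing import Any, Dict, List, Optional

Tool = Dict[str, Any]


def resolve_tools(tools: Optional[List[Tool]], default_tools: List[Tool]) -> List[Tool]:
    """Return the tools to send with a request.

    None means "caller did not specify", so fall back to the defaults.
    An explicit empty list means "no tools" and must be preserved; a
    truthiness check (`tools or default_tools`) would wrongly replace it.
    """
    if tools is None:
        return list(default_tools)
    return list(tools)
```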
Made-with: Cursor
* fix: pass thinking kwargs in build_assistant_turn_token_ids, handle missing fullEpisode
- build_assistant_turn_token_ids now passes _thinking_kwargs() to
apply_chat_template, consistent with _build_prompt_token_ids and
build_tool_response_suffix_token_ids
- View selector no longer silently renders nothing when fullEpisode is
null and user selected text/episode view — falls back to turns view
or shows a placeholder message
Made-with: Cursor
* fix: guard against None after Mapping extraction, prevent request_params overriding core fields
- Add second None check in _normalize_token_id_sequence after extracting
from a Mapping, since .get() can return None even when the key exists
- Move request_params spread before explicit keys (model, prompt,
temperature, max_tokens, logprobs) so they cannot be silently
overridden by stray entries in request_params
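
Both fixes can be sketched as follows. This is a simplified illustration rather than the code in the commit: the function names, the `"token_ids"` key, and the parameter types are assumptions.

```python
from collections.abc import Mapping
from typing import Any, Dict, List, Optional


def normalize_token_ids(value: Any, key: str = "token_ids") -> List[int]:
    """Coerce a raw value into a list of token IDs (assumed shape)."""
    if value is None:
        return []
    if isinstance(value, Mapping):
        value = value.get(key)
        # Second check: the key may be present with an explicit None value.
        if value is None:
            return []
    return [int(token_id) for token_id in value]


def build_payload(
    model: str,
    prompt_token_ids: List[int],
    temperature: float,
    max_tokens: int,
    logprobs: Any,
    request_params: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a request body where core fields always win.

    In a dict literal, later keys override earlier ones, so spreading
    request_params first lets the explicit keys take precedence.
    """
    return {
        **(request_params or {}),
        "model": model,
        "prompt": prompt_token_ids,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "logprobs": logprobs,
    }
```

With this ordering, a stray `"model"` entry in `request_params` is dropped in favor of the explicit argument, while unrelated extras such as `"top_p"` still pass through.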
Made-with: Cursor
1 parent 4aca272 commit e80c7ae
5 files changed (+1256, -1) under:
- eval_protocol/integrations
- tests
- vite-app/src/components