feat(tools): add automated LLM model sync tool and update provider profiles #651

Open
asclearuc wants to merge 2 commits into develop from feat/llm-providers-discovery-4

Conversation


@asclearuc asclearuc commented Apr 10, 2026

Summary

  • Add tools/src/sync_models.py — fetches available models from provider APIs, smoke-tests new ones, and merges results into each node's services.json with smart deprecation (missing models marked deprecated: true) and three-source token limit resolution (provider API → OpenRouter → LiteLLM)
  • Add builder models:update --models="<args>" build command that runs the sync tool then formats all services.json files with Prettier; add weekly CI workflow that opens a PR with the sync report as the body
  • Update all LLM node services.json profiles with current model lists and token limits; add timed protected_profiles (["key", "YYYY-MM-DD"]) to prevent short-lived aliases from being wrongly deprecated
  • Add modelSource field to every profile tracking where the model ID was discovered — "provider" (confirmed by the live provider API), "openrouter" (sourced from OpenRouter only), "litellm" (sourced from LiteLLM database only), or "manual" (hand-curated, no API confirmation). Profiles without the field are backfilled as "manual"; profiles with "openrouter" or "litellm" are upgraded to "provider" when the provider API subsequently confirms them
  • Remove OpenRouter-only model profiles from llm_openai/services.json (model IDs that exist on the OpenRouter routing layer but not on the native OpenAI API)
  • Live API tests (test_sync_live.py) treat missing provider/manual model IDs as test failures; missing openrouter/litellm IDs emit a non-failing warning
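
The three-source token limit resolution described above can be sketched roughly as follows. This is a minimal illustration of the precedence order (provider API → OpenRouter → LiteLLM), not the actual `sync_models.py` API; all names and sample values are hypothetical:

```python
# Illustrative sketch of three-source token limit resolution.
# Sources are checked in precedence order; the first that knows the
# model wins. An explicit 0 is treated as a real value, not "missing".

def resolve_token_limit(model_id, provider_api, openrouter, litellm, default=None):
    """Return the first token limit found, in precedence order."""
    for source in (provider_api, openrouter, litellm):
        limit = source.get(model_id)
        if limit is not None:
            return limit
    return default

# Hypothetical sample data for demonstration only.
provider_api = {"gpt-x": 128000}
openrouter = {"gpt-x": 131072, "other": 32000}
litellm = {"other": 16000, "legacy": 8192}

assert resolve_token_limit("gpt-x", provider_api, openrouter, litellm) == 128000
assert resolve_token_limit("other", provider_api, openrouter, litellm) == 32000
assert resolve_token_limit("legacy", provider_api, openrouter, litellm) == 8192
assert resolve_token_limit("missing", provider_api, openrouter, litellm, default=4096) == 4096
```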

Type

feature

Testing

  • Tests added or updated
  • Tested locally
  • ./builder test passes

Checklist

  • Commit messages follow conventional commits
  • No secrets or credentials included
  • Wiki updated (if applicable)
  • Breaking changes documented (if applicable)

Linked Issue

Fixes #643

Summary by CodeRabbit

  • New Features

    • Automated LLM model synchronization workflow (weekly + manual) and a CLI/tool to discover, smoke-test, and sync provider model profiles.
    • Many new preconfigured models added across providers (Anthropic, DeepSeek, Gemini, Mistral, OpenAI, Perplexity, Qwen, xAI) with token limits and selection support.
  • Documentation

    • Added user guide for the model sync tool.
  • Tests

    • New unit and live-provider tests covering sync logic and smoke tests.
  • Chores

    • Added Python tooling, task entries, and CI integration for model sync.

@github-actions github-actions bot added docs Documentation ci/cd CI/CD and build system builder labels Apr 10, 2026


coderabbitai bot commented Apr 10, 2026

📝 Walkthrough

Adds a new LLM model synchronization system: CLI tool, provider handlers, merge/patch/reporting logic, tests, docs, a weekly GitHub Actions workflow, and updates many provider service JSONs to include model metadata and new profiles. (Dry-run by default; can apply changes and create PRs.)

Changes

Cohort / File(s): Summary

- Workflow (.github/workflows/sync-models.yml): New weekly/manual workflow that runs the sync tool (dry-run then apply) and opens an automated PR (chore/sync-models) with labels.
- Core sync tool & config (tools/src/sync_models.py, tools/src/sync_models.config.json, tools/SYNC_MODELS.md, tools/requirements.txt): New CLI entry, config file, documentation, and Python deps for the model discovery/sync pipeline.
- Core libraries (tools/src/core/merger.py, patcher.py, reporter.py, smoke.py, util.py): New smart-merge logic, comment-preserving JSON patcher, reporting/PR-body formatting, smoke-test runners with retries, and a retry-classification util.
- Provider implementations (tools/src/providers/: anthropic.py, deepseek.py, embedding_openai.py, gemini.py, mistral.py, openai.py, perplexity.py, qwen.py, xai.py): New CloudProvider base and concrete providers for each LLM node: client creation, model fetching, ID normalization, and provider-specific logic.
- Tasks & build integration (tools/scripts/tasks.js, scripts/build.js, scripts/lib/registry.js): Adds a models:update task and a --models CLI flag, and expands the task discovery glob to include tools/**/scripts/tasks.js.
- Node service JSON updates (nodes/src/nodes/llm_*/services.json: llm_anthropic, llm_deepseek, llm_gemini, llm_mistral, llm_openai, llm_perplexity, llm_qwen, llm_xai): Adds modelSource and modelOutputTokens/modelTotalTokens metadata, deprecation/migration entries, and many new preconfigured profiles; extends fields.*.profile enums/conditionals and adds matching field objects.
- Patcher / test infra (tools/test/*, tools/test/conftest.py, tools/test/markers.py, tools/test/test_sync_live.py, tools/test/test_sync_logic.py): Pytest fixtures and markers, offline unit tests for merge/patch/smoke/report logic, and live-provider validation tests (skippable via env markers).

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant GHA as GitHub Actions
    participant CLI as sync_models.py
    participant Config as sync_models.config.json
    participant Provider as Provider APIs (OpenAI/Anthropic/...)
    participant Smoke as Smoke Tester
    participant Merger as Merger Logic
    participant Patcher as JSON Patcher
    participant Repo as services.json files

    GHA->>CLI: invoke (--all --apply --pr-body)
    CLI->>Config: load config
    CLI->>Repo: read current profiles
    loop per provider
      CLI->>Provider: fetch models (API / OpenRouter / LiteLLM)
      Provider-->>CLI: model list (+context)
      CLI->>Smoke: smoke-test new models
      Smoke-->>CLI: pass/skip/error
      CLI->>Merger: merge models with current profiles
      Merger-->>CLI: added/updated/deprecated sets
      CLI->>Patcher: patch services.json (preserve comments)
      Patcher-->>Repo: write updated file
    end
    CLI->>CLI: render SyncReport (console or PR body)
    CLI->>GHA: emit PR body / env output
    GHA->>GHA: create PR with labels
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

module:nodes, module:ai

Suggested reviewers

  • jmaionchi
  • stepmikhaylov

Poem

🐇 Hop, hop—new models land each week,
I fetch and test while devs take a peek.
I merge with care and patch with grace,
Comments kept so nothing's erased.
A PR blooms—automated delight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

- Docstring Coverage ⚠️ Warning: docstring coverage is 69.06%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (4 passed)

- Description Check ✅ Passed: check skipped; CodeRabbit's high-level summary is enabled.
- Title Check ✅ Passed: the title clearly describes the main change: adding an automated LLM model sync tool and updating provider profiles.
- Linked Issues Check ✅ Passed: all key objectives from issue #643 are met: the sync tool fetches from provider APIs, is dry-run by default with an --apply flag, preserves existing data, marks missing models as deprecated, resolves token limits, and includes a CI/CD workflow.
- Out of Scope Changes Check ✅ Passed: all changes directly support the sync tool implementation and model profile updates; no unrelated modifications detected.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 20

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nodes/src/nodes/llm_anthropic/services.json`:
- Around line 139-194: Remove the OpenRouter-only model aliases from the
Anthropic services catalog: delete any entries in
nodes/src/nodes/llm_anthropic/services.json whose "modelSource" is "openrouter"
(examples: "claude-opus-4-6-fast", "claude-opus-4-1", "claude-3-5-haiku", etc.),
because anthropic.py passes the model ID straight into ChatAnthropic() and these
aliases are not valid Anthropic model IDs; if OpenRouter-backed models are
desired, add them to the llm_openrouter node instead and keep only native
Anthropic model entries in services.json.

In `@nodes/src/nodes/llm_deepseek/services.json`:
- Around line 103-106: The profile for model "deepseek-reasoner" in
services.json is marked deprecated due solely to OpenRouter not listing it,
which is incorrect because cloud-reasoner still uses the native DeepSeek API;
fix by removing or clearing the "deprecated": true flag and the
OpenRouter-specific "migration" text for the "deepseek-reasoner" entry (or
replace the migration text with one that indicates deprecation will occur only
when the native DeepSeek API endpoint is unreachable), ensuring the "model":
"deepseek-reasoner" and "modelTotalTokens" entries remain unchanged.

In `@nodes/src/nodes/llm_gemini/services.json`:
- Around line 167-183: Add the missing "modelSource": "manual" field to the
deprecated Gemini profile objects shown in the diff so they conform to the new
sync metadata contract — specifically update the "gemini-3-pro-preview" and
"gemini-3-pro-image" entries (and the other deprecated Gemini entries modified
elsewhere in this diff) by adding modelSource: "manual" alongside the existing
keys (title, model, modelTotalTokens, modelOutputTokens, deprecated, migration,
apikey).
- Around line 185-192: The "gemini-2_0-flash" profile in services.json points to
the wrong model and has an inconsistent migration note: update the
"gemini-2_0-flash" object so its "model" field matches the intended 2.0 model ID
if you want to preserve a deprecated compatibility alias (and add an explicit
"deprecated": true flag and keep a migration message pointing to
"gemini-2.5-flash"), otherwise remove the "migration" text and set "model" to
the correct 2.5 flash ID (or delete the entire "gemini-2_0-flash" entry if you
don't need the alias); key symbols to change are the "gemini-2_0-flash" profile,
its "model" property, and the "migration" property in services.json.

In `@nodes/src/nodes/llm_perplexity/services.json`:
- Around line 54-60: The "sonar-reasoning" profile is missing the new
first-class metadata field modelSource; update the "sonar-reasoning" entry in
services.json to include "modelSource": "manual" (alongside its existing keys
like "title", "model", "modelTotalTokens", "deprecated", "migration", and
"apikey") so it matches the new schema and backfill behavior.

In `@nodes/src/nodes/llm_xai/services.json`:
- Around line 98-103: The test fixture profiles still reference the deprecated
model "grok-4-1-fast-reasoning" in the test.profiles section, which will
exercise a retired path by default; update test.profiles to remove or replace
"grok-4-1-fast-reasoning" with a currently supported model key (choose one of
the non-deprecated model entries in the same services.json) so fixtures use an
active model, and ensure any tests or fixtures referencing that key are updated
accordingly.

In `@tools/scripts/tasks.js`:
- Around line 15-16: NODES_GLOB currently glob-matches all JSON files under
nodes/src/nodes/** which lets models:update rewrite unrelated node metadata;
restrict the glob to only services.json by changing the NODES_GLOB value (the
constant named NODES_GLOB created with path.join and PROJECT_ROOT) so it targets
'**/services.json' instead of '*.json', and apply the same narrower pattern
where the same glob is reused around lines 51–55 to ensure Prettier and
models:update only touch services.json files.

In `@tools/src/core/merger.py`:
- Around line 459-474: The code incorrectly treats 0 as missing by using truthy
checks (the "a or b" pattern and >0 checks), so preserve explicit zeroes by
changing the override selection to test dictionary membership/None explicitly
(e.g., check if _token_lookup_id in output_token_overrides or model_id in
output_token_overrides and then assign api_output_tokens =
int(output_token_overrides[key]) when present) and by replacing any "> 0" update
guard with a None-check (allow 0 as a valid value) in the existing-profile
update path; update logic references: _out_override, output_token_overrides,
_token_lookup_id, model_id, _api_entry_out, _api_entry_source, _or_out,
_litellm_out, and default_output_tokens.
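
The zero-preservation fix suggested above can be sketched as follows; the function and dictionary names are illustrative, not the actual merger.py symbols:

```python
# Sketch of the fix: membership test instead of truthiness, so an
# explicit 0 override (e.g. the embedding sentinel) is preserved.

def pick_output_tokens(model_id, overrides, fallback):
    """Return the override if the key is present (even when 0), else fallback."""
    if model_id in overrides:          # membership, not `overrides.get(id) or ...`
        return int(overrides[model_id])
    return fallback

# Hypothetical data for demonstration only.
overrides = {"embed-small": 0, "chat-big": 32000}
assert pick_output_tokens("embed-small", overrides, 4096) == 0   # 0 survives
assert pick_output_tokens("chat-big", overrides, 4096) == 32000
assert pick_output_tokens("unknown", overrides, 4096) == 4096
```

With the `a or b` pattern, the first assertion would instead yield 4096, which is exactly the bug described.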

In `@tools/src/core/reporter.py`:
- Around line 163-165: The early return emits "_No changes detected._" even when
providers were skipped or errored; update the condition around
report.has_any_changes() to also ensure no provider was skipped, errored, or
produced warnings/errors before returning. Replace the current check using
report.providers and p.skipped with a comprehensive check such as verifying not
any(p.skipped or getattr(p, 'errored', False) or getattr(p, 'failed', False) or
getattr(p, 'errors', None) or getattr(p, 'warnings', None) for p in
report.providers) so the message is only returned when there truly are no
changes and no skipped/errored/warning-producing providers.
- Around line 167-173: format_pr_body currently treats any pr.warning as a full
skip and continues, hiding real mutations; update format_pr_body so it does not
immediately `continue` when `pr.warning` is present—only treat it as skipped
when `pr.skipped` is true or when there are no changes—remove the early
`continue` after checking `pr.warning`, and instead emit the warning text (use
`pr.warning`) above the provider section and then proceed to the existing logic
that checks `pr.has_changes()` and `pr.skipped` to render
added/updated/deprecated profiles for `pr.provider`.
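
A minimal sketch of the suggested reporter behavior, using hypothetical record shapes rather than the real SyncReport classes: warnings are surfaced, but only a true skip or an empty change set suppresses a provider section:

```python
# Sketch: warnings are emitted, then rendering continues; a section is
# only omitted when the provider was skipped or produced no changes.

def format_pr_body(report):
    lines = []
    for pr in report:
        if pr.get("warning"):
            lines.append(f"⚠️ {pr['warning']}")       # surface the warning...
        if pr.get("skipped") or not pr.get("changes"):
            continue                                   # ...but only skip here
        lines.append(f"### {pr['name']}")
        lines.extend(f"- {c}" for c in pr["changes"])
    return "\n".join(lines) if lines else "_No changes detected._"

# Hypothetical report entries for demonstration only.
report = [
    {"name": "openai", "warning": "rate limited once", "changes": ["added gpt-x"]},
    {"name": "qwen", "skipped": True, "changes": []},
]
body = format_pr_body(report)
assert "rate limited once" in body
assert "added gpt-x" in body   # the warning no longer hides real mutations
assert "qwen" not in body
```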

In `@tools/src/core/util.py`:
- Line 111: The util.py retry helper currently returns True for unknown
exceptions which conflicts with ChatBase.is_retryable_error() that defaults to
False; update the function (is_retryable_error in tools/src/core/util.py) to
return False for unmatched/unknown exceptions, preserve existing checks for
recognized retryable errors, and update the docstring to state the default False
behavior to keep it in sync with
packages/ai/src/ai/common/chat.py::ChatBase.is_retryable_error().
- Around line 32-48: The current non_retryable_patterns list in
is_retryable_error includes the bare substring 'forbidden', which over-matches
parameter validation messages like 'extra_forbidden'; update is_retryable_error
to avoid this by making the 'forbidden' check more specific — either replace the
plain 'forbidden' entry with a more precise token (e.g., '403 forbidden' or
'access forbidden') or change the matching logic for that pattern to use a
word-boundary/regex match (e.g., re.search(r'\bforbidden\b', error_str,
flags=re.I)) so 'extra_forbidden' is not matched; keep the rest of
non_retryable_patterns unchanged and ensure callers like smoke.py that
explicitly check 'extra_forbidden' still behave the same.
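
Both util.py suggestions can be sketched together: a word-boundary match keeps 'extra_forbidden' (a parameter validation error) from being mistaken for an HTTP 403, and unknown errors default to non-retryable. This is an illustrative stand-in, not the real is_retryable_error implementation:

```python
import re

def is_retryable_error(exc):
    """Classify an error for retry. Unknown errors default to False,
    matching ChatBase.is_retryable_error()."""
    error_str = str(exc).lower()
    # \b avoids matching inside 'extra_forbidden' (underscore is a word char).
    if re.search(r"\bforbidden\b", error_str):
        return False
    retryable_patterns = ("rate limit", "timeout", "temporarily unavailable", "503")
    if any(p in error_str for p in retryable_patterns):
        return True
    return False  # default: do NOT retry unrecognized errors

assert re.search(r"\bforbidden\b", "403 forbidden") is not None
assert re.search(r"\bforbidden\b", "extra_forbidden") is None
assert is_retryable_error(Exception("429 rate limit exceeded")) is True
assert is_retryable_error(Exception("403 Forbidden")) is False
assert is_retryable_error(Exception("something novel")) is False
```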

In `@tools/src/providers/base.py`:
- Around line 583-585: The merge is using raw global_protected_profiles which
may contain dated entries (lists/tuples) and causes unhashable or incorrect
keys; instead run global_protected_profiles through _active_protected_profiles()
and then merge. Concretely, replace protected.update(global_protected_profiles)
with protected.update(_active_protected_profiles(global_protected_profiles)) so
global entries are parsed/filtered the same way as the local protected_profiles
before updating the protected set/dict.

In `@tools/src/providers/mistral.py`:
- Around line 20-28: Docstring for MistralProvider is misleading: it claims
"Filters out embedding and moderation models" but fetch_models() does not
perform filtering (filtering comes from model_filter in config). Update the
MistralProvider class docstring to remove or rephrase that sentence and clarify
that model filtering is handled by the model_filter configuration (or
elsewhere), referencing the MistralProvider class and its fetch_models() method
so readers know where to look.

In `@tools/src/providers/openai.py`:
- Around line 43-45: Remove the temporary DEBUG comment text ("DEBUG lines below
log the raw model object so we can see what fields are actually available —
remove once confirmed.") from tools/src/providers/openai.py; delete that
standalone debugging-note comment (and any other leftover DEBUG/TODO comment in
the same function or near the code that queries /v1/models) so the file contains
no stray debug comments before merging.

In `@tools/src/sync_models.config.json`:
- Around line 310-313: The protected_profiles entry for
"grok-4-1-fast-reasoning" is expired; update the ["grok-4-1-fast-reasoning",
"2025-10-09"] item in the protected_profiles array so the alias remains
protected—either set a future ISO date well beyond 2026 or replace the expiry
with a sentinel value (e.g., a far-future date or a string indicating permanent
protection) consistent with how other entries are handled, ensuring you modify
the exact ["grok-4-1-fast-reasoning", "..."] tuple.

In `@tools/src/sync_models.py`:
- Around line 205-207: The default_output_tokens is using the chat fallback
(4096) which causes sync_provider()/merge() to assign that value to embedding
providers (e.g., embedding_openai) and corrupt modelOutputTokens in
services.json; change the code that sets default_output_tokens to read the
embedding default from config.get('model_output_tokens', {}).get('defaults',
{}).get('embedding', <appropriate-fallback>) instead of the hard-coded chat
value, and ensure sync_provider()/merge() uses that embedding default when the
provider type or name indicates an embedding model so the embedding sentinel
(0/omitted) is preserved.
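
The per-kind default lookup suggested above can be sketched as follows; the config path comes from the comment itself, while the function name and fallback values are illustrative:

```python
def default_output_tokens(config, provider_kind):
    """Pick the per-kind default so embedding providers do not inherit
    the 4096 chat fallback."""
    defaults = config.get("model_output_tokens", {}).get("defaults", {})
    if provider_kind == "embedding":
        return defaults.get("embedding", 0)  # 0 = embedding sentinel / omitted
    return defaults.get("chat", 4096)

# Hypothetical config for demonstration only.
config = {"model_output_tokens": {"defaults": {"chat": 4096, "embedding": 0}}}
assert default_output_tokens(config, "embedding") == 0
assert default_output_tokens(config, "chat") == 4096
assert default_output_tokens({}, "embedding") == 0   # sentinel even when unset
```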

In `@tools/SYNC_MODELS.md`:
- Around line 10-13: The unlabeled fenced code blocks in tools/SYNC_MODELS.md
(examples like the two python invocation lines and the other blocks around lines
75-77 and 111-121) need a language tag to satisfy markdownlint MD040; update
each opening fence from ``` to a labeled fence such as ```bash (or ```text for
non-shell output) so the three shown blocks and the ones at the other referenced
ranges are tagged consistently.

In `@tools/test/conftest.py`:
- Around line 38-43: The try/except currently swallows all errors when importing
or calling load_dotenv, hiding missing python-dotenv or malformed .env problems;
change the block so you only silence ImportError for "from dotenv import
load_dotenv" but if load_dotenv(_TOOLS_TEST.parent.parent / '.env') raises, emit
a warning or log (use the warnings module or logging) including the exception
details and the path (_TOOLS_TEST.parent.parent / '.env') so failures to load
ROCKETRIDE_APIKEY_* are visible; keep the import handling limited to ImportError
and ensure the load failure path reports the error rather than pass.
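
The narrowed error handling could look roughly like this sketch (a hypothetical helper, not the actual conftest.py code): only ImportError is silenced, and a failing load is reported with the path so missing ROCKETRIDE_APIKEY_* values are visible:

```python
import warnings
from pathlib import Path

def load_test_env(env_path: Path) -> bool:
    """Load .env if python-dotenv is available. Only a missing package is
    silenced; a failing load_dotenv() call emits a warning instead of pass."""
    try:
        from dotenv import load_dotenv
    except ImportError:
        return False  # dotenv not installed: rely on the real environment
    try:
        load_dotenv(env_path)
        return True
    except Exception as exc:
        warnings.warn(f"could not load {env_path}: {exc}")
        return False
```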

In `@tools/test/test_sync_live.py`:
- Around line 105-109: The test helper _fetch_gemini_model_ids is importing the
deprecated package; change the import to use the same SDK as the provider by
replacing "import google.generativeai as genai" with "from google import genai"
(remove the type: ignore), and keep the rest of the logic using genai's model
listing so the call that collects model names ({m.name for m in
genai.list_models()}) uses the current SDK used by
tools/src/providers/gemini.py.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5191fce2-4d1c-4b20-a1ce-5ae4d875a730

📥 Commits

Reviewing files that changed from the base of the PR and between 09943bf and c83122f.

📒 Files selected for processing (39)
  • .github/workflows/sync-models.yml
  • nodes/src/nodes/llm_anthropic/services.json
  • nodes/src/nodes/llm_deepseek/services.json
  • nodes/src/nodes/llm_gemini/services.json
  • nodes/src/nodes/llm_mistral/services.json
  • nodes/src/nodes/llm_openai/services.json
  • nodes/src/nodes/llm_perplexity/services.json
  • nodes/src/nodes/llm_qwen/services.json
  • nodes/src/nodes/llm_xai/services.json
  • scripts/build.js
  • scripts/lib/registry.js
  • tools/SYNC_MODELS.md
  • tools/requirements.txt
  • tools/scripts/tasks.js
  • tools/src/__init__.py
  • tools/src/core/__init__.py
  • tools/src/core/merger.py
  • tools/src/core/patcher.py
  • tools/src/core/reporter.py
  • tools/src/core/smoke.py
  • tools/src/core/util.py
  • tools/src/providers/__init__.py
  • tools/src/providers/anthropic.py
  • tools/src/providers/base.py
  • tools/src/providers/deepseek.py
  • tools/src/providers/embedding_openai.py
  • tools/src/providers/gemini.py
  • tools/src/providers/mistral.py
  • tools/src/providers/openai.py
  • tools/src/providers/perplexity.py
  • tools/src/providers/qwen.py
  • tools/src/providers/xai.py
  • tools/src/sync_models.config.json
  • tools/src/sync_models.py
  • tools/test/__init__.py
  • tools/test/conftest.py
  • tools/test/markers.py
  • tools/test/test_sync_live.py
  • tools/test/test_sync_logic.py

Comment on lines +139 to 194
"claude-opus-4-6-fast": {
"title": "Claude Opus 4.6 Fast",
"model": "claude-opus-4-6-fast",
"modelSource": "openrouter",
"modelTotalTokens": 1000000, // openrouter
"modelOutputTokens": 128000, // openrouter
"apikey": ""
},
"claude-opus-4-1": {
"title": "Claude Opus 4.1",
"model": "claude-opus-4-1",
"modelSource": "openrouter",
"modelTotalTokens": 200000, // openrouter
"modelOutputTokens": 32000, // openrouter
"apikey": ""
},
"claude-opus-4": {
"title": "Claude Opus 4",
"model": "claude-opus-4",
"modelSource": "openrouter",
"modelTotalTokens": 200000, // openrouter
"modelOutputTokens": 32000, // openrouter
"apikey": ""
},
"claude-sonnet-4": {
"title": "Claude Sonnet 4",
"model": "claude-sonnet-4",
"modelSource": "openrouter",
"modelTotalTokens": 200000, // openrouter
"modelOutputTokens": 64000, // openrouter
"apikey": ""
},
"claude-3-7-sonnet": {
"title": "Claude 3.7 Sonnet",
"model": "claude-3-7-sonnet",
"modelSource": "openrouter",
"modelTotalTokens": 200000, // openrouter
"modelOutputTokens": 64000, // openrouter
"apikey": ""
},
"claude-3-5-haiku": {
"title": "Claude 3.5 Haiku",
"model": "claude-3-5-haiku",
"modelSource": "openrouter",
"modelTotalTokens": 200000, // openrouter
"modelOutputTokens": 8192, // openrouter
"apikey": ""
},
"claude-3-haiku": {
"title": "Claude 3 Haiku",
"model": "claude-3-haiku",
"modelSource": "openrouter",
"modelTotalTokens": 200000, // openrouter
"modelOutputTokens": 4096, // openrouter
"apikey": ""
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo '--- Anthropic-related runtime files ---'
fd -i 'anthropic' nodes tools scripts -t f | sort

echo
echo '--- Look for OpenRouter routing or model-id translation for Anthropic ---'
rg -n -C3 'openrouter|base_url|claude-opus-4-6-fast|claude-opus-4-1|claude-opus-4|claude-sonnet-4|claude-3-7-sonnet|claude-3-5-haiku|claude-3-haiku' nodes tools scripts

Repository: rocketride-org/rocketride-server

Length of output: 50391


🏁 Script executed:

cat -n nodes/src/nodes/llm_anthropic/anthropic.py

Repository: rocketride-org/rocketride-server

Length of output: 5055


🏁 Script executed:

cat -n tools/src/providers/anthropic.py | head -100

Repository: rocketride-org/rocketride-server

Length of output: 3477


🏁 Script executed:

# Search for how modelSource is used in runtime execution
rg -n 'modelSource.*openrouter|anthropic.*openrouter' nodes --type python -C 2

Repository: rocketride-org/rocketride-server

Length of output: 106


Remove OpenRouter-only aliases from the native Anthropic node.

The profiles added at lines 139-194 are marked "modelSource": "openrouter" but the Anthropic runtime does not handle OpenRouter profiles. The anthropic.py node passes the model ID directly to ChatAnthropic() without any OpenRouter routing, URL rewriting, or API key translation. Additionally, these IDs (e.g., claude-opus-4-6-fast, claude-opus-4-1, claude-3-5-haiku) are OpenRouter aliases that lack the version timestamps required by Anthropic's API (e.g., claude-opus-4-6-20250514). Selecting any of these profiles will pass an invalid model ID to Anthropic and fail at invocation time.

Per project precedent (as noted in the PR summary, analogous cleanup was done for llm_openai), remove these OpenRouter-only aliases from llm_anthropic/services.json. If OpenRouter model access is intended, use the dedicated llm_openrouter node instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/llm_anthropic/services.json` around lines 139 - 194, Remove
the OpenRouter-only model aliases from the Anthropic services catalog: delete
any entries in nodes/src/nodes/llm_anthropic/services.json whose "modelSource"
is "openrouter" (examples: "claude-opus-4-6-fast", "claude-opus-4-1",
"claude-3-5-haiku", etc.), because anthropic.py passes the model ID straight
into ChatAnthropic() and these aliases are not valid Anthropic model IDs; if
OpenRouter-backed models are desired, add them to the llm_openrouter node
instead and keep only native Anthropic model entries in services.json.

Comment on lines +103 to +106
"model": "deepseek-reasoner",
"modelTotalTokens": 128000,
"deprecated": true,
"migration": "Model no longer listed in OpenRouter. Please select a current model.",

⚠️ Potential issue | 🟠 Major

Avoid deprecating the default native profile from an OpenRouter-only signal.

cloud-reasoner still targets https://api.deepseek.com/v1, but the migration text deprecates it because OpenRouter no longer lists it. That makes this node default to a profile marked unavailable even though the native DeepSeek API is the actual backend here. Either keep this profile active until the DeepSeek API check says it is gone, or switch the defaults in the same change.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/llm_deepseek/services.json` around lines 103 - 106, The
profile for model "deepseek-reasoner" in services.json is marked deprecated due
solely to OpenRouter not listing it, which is incorrect because cloud-reasoner
still uses the native DeepSeek API; fix by removing or clearing the
"deprecated": true flag and the OpenRouter-specific "migration" text for the
"deepseek-reasoner" entry (or replace the migration text with one that indicates
deprecation will occur only when the native DeepSeek API endpoint is
unreachable), ensuring the "model": "deepseek-reasoner" and "modelTotalTokens"
entries remain unchanged.

Comment on lines 167 to +183
"gemini-3-pro-preview": {
  "title": "Gemini 3 Pro Preview",
  "model": "models/gemini-3-pro-preview",
  "modelTotalTokens": 1048576,
  "modelOutputTokens": 65536,
  "deprecated": true,
  "migration": "Please use 'gemini-3.1-pro-preview' instead",
  "apikey": ""
},
"gemini-3-pro-image": {
  "title": "Gemini 3 Pro Image",
  "model": "models/gemini-3-pro-image",
  "modelTotalTokens": 98304,
  "modelOutputTokens": 32768,
  "deprecated": true,
  "migration": "Please use 'gemini-3-pro-image-preview' instead",
  "apikey": ""

⚠️ Potential issue | 🟡 Minor

Backfill modelSource on the deprecated Gemini profiles too.

This PR makes modelSource a first-class field, but these modified deprecated entries still omit it. Please backfill them as "manual" as well so the file stays consistent with the new sync metadata contract.

Also applies to: 194-200

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/llm_gemini/services.json` around lines 167 - 183, Add the
missing "modelSource": "manual" field to the deprecated Gemini profile objects
shown in the diff so they conform to the new sync metadata contract —
specifically update the "gemini-3-pro-preview" and "gemini-3-pro-image" entries
(and the other deprecated Gemini entries modified elsewhere in this diff) by
adding modelSource: "manual" alongside the existing keys (title, model,
modelTotalTokens, modelOutputTokens, deprecated, migration, apikey).
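
The backfill behavior the PR describes (profiles without the field become "manual") can be sketched as a one-pass default; function and profile names here are illustrative:

```python
def backfill_model_source(profiles):
    """Backfill missing modelSource as 'manual', per the sync metadata
    contract; existing values are left untouched."""
    for profile in profiles.values():
        profile.setdefault("modelSource", "manual")
    return profiles

# Hypothetical profiles for demonstration only.
profiles = {
    "gemini-3-pro-preview": {"model": "models/gemini-3-pro-preview", "deprecated": True},
    "gemini-2.5-flash": {"model": "models/gemini-2.5-flash", "modelSource": "provider"},
}
backfill_model_source(profiles)
assert profiles["gemini-3-pro-preview"]["modelSource"] == "manual"
assert profiles["gemini-2.5-flash"]["modelSource"] == "provider"
```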

Comment on lines 185 to +192
"gemini-2_0-flash": {
  "title": "Gemini 2.0 Flash",
  "model": "models/gemini-2.5-flash",
  "modelSource": "manual",
  "modelTotalTokens": 1048576,
  "modelOutputTokens": 65535,
  "deprecated": true,
  "migration": "Please use 'gemini-2.5-flash' instead",
  "apikey": ""
⚠️ Potential issue | 🟠 Major

gemini-2_0-flash now points at the wrong model.

This profile is labeled "Gemini 2.0 Flash", but its model is models/gemini-2.5-flash. That silently changes behavior for any existing config that still selects this alias. If this entry is meant to be a deprecated compatibility profile, keep the old 2.0 model ID here and mark it deprecated; otherwise remove the migration text.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/llm_gemini/services.json` around lines 185 - 192, The
"gemini-2_0-flash" profile in services.json points to the wrong model and has an
inconsistent migration note: update the "gemini-2_0-flash" object so its "model"
field matches the intended 2.0 model ID if you want to preserve a deprecated
compatibility alias (and add an explicit "deprecated": true flag and keep a
migration message pointing to "gemini-2.5-flash"), otherwise remove the
"migration" text and set "model" to the correct 2.5 flash ID (or delete the
entire "gemini-2_0-flash" entry if you don't need the alias); key symbols to
change are the "gemini-2_0-flash" profile, its "model" property, and the
"migration" property in services.json.

Comment on lines 54 to 60
"sonar-reasoning": {
"model": "sonar-reasoning",
"title": "Sonar Reasoning",
"model": "sonar-reasoning",
"modelTotalTokens": 128000,
"deprecated": true,
"migration": "Model no longer listed in OpenRouter. Please select a current model.",
"apikey": ""
⚠️ Potential issue | 🟡 Minor

Keep deprecated profiles on the new schema.

sonar-reasoning is the only changed profile here without modelSource, even though the sync pipeline now treats that field as first-class metadata and backfills missing legacy entries to 'manual'. Adding it here avoids leaving one profile in the old shape.

Suggested fix
 			"sonar-reasoning": {
 				"title": "Sonar Reasoning",
 				"model": "sonar-reasoning",
+				"modelSource": "manual",
 				"modelTotalTokens": 128000,
 				"deprecated": true,
 				"migration": "Model no longer listed in OpenRouter. Please select a current model.",
 				"apikey": ""
 			},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/llm_perplexity/services.json` around lines 54 - 60, The
"sonar-reasoning" profile is missing the new first-class metadata field
modelSource; update the "sonar-reasoning" entry in services.json to include
"modelSource": "manual" (alongside its existing keys like "title", "model",
"modelTotalTokens", "deprecated", "migration", and "apikey") so it matches the
new schema and backfill behavior.

Comment on lines +310 to +313
"protected_profiles": [
["custom", "2126-04-09"],
["grok-4-1-fast-reasoning", "2025-10-09"],
["grok-4-1-fast-non-reasoning", "2026-10-09"]
⚠️ Potential issue | 🟠 Major

grok-4-1-fast-reasoning protection is already expired.

On April 10, 2026, the ["grok-4-1-fast-reasoning", "2025-10-09"] entry is inactive, so the OpenRouter fallback can deprecate this alias despite the comment saying it should stay protected. Bump the expiry or make it effectively permanent if this alias still needs shielding.

Suggested fix
-				["grok-4-1-fast-reasoning", "2025-10-09"],
+				["grok-4-1-fast-reasoning", "2026-10-09"],
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/src/sync_models.config.json` around lines 310 - 313, The
protected_profiles entry for "grok-4-1-fast-reasoning" is expired; update the
["grok-4-1-fast-reasoning", "2025-10-09"] item in the protected_profiles array
so the alias remains protected—either set a future ISO date well beyond 2026 or
replace the expiry with a sentinel value (e.g., a far-future date or a string
indicating permanent protection) consistent with how other entries are handled,
ensuring you modify the exact ["grok-4-1-fast-reasoning", "..."] tuple.

Comment on lines +205 to +207
title_mappings = config.get('title_mappings', {})
output_token_overrides = config.get('model_output_tokens', {}).get('overrides', {})
default_output_tokens = config.get('model_output_tokens', {}).get('defaults', {}).get('chat', 4096)
⚠️ Potential issue | 🟠 Major

Load the embedding output-token default instead of hard-coding the chat default.

sync_provider() always passes the chat fallback (4096) into merge(). For embedding_openai, that turns the embedding sentinel (0 / omitted field) into a bogus modelOutputTokens value and pollutes services.json.

💡 Suggested fix
-    default_output_tokens = config.get('model_output_tokens', {}).get('defaults', {}).get('chat', 4096)
+    output_defaults = config.get('model_output_tokens', {}).get('defaults', {})
+    default_output_tokens = (
+        output_defaults.get('embedding', 0)
+        if provider_name.startswith('embedding_')
+        else output_defaults.get('chat', 4096)
+    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/src/sync_models.py` around lines 205 - 207, The default_output_tokens
is using the chat fallback (4096) which causes sync_provider()/merge() to assign
that value to embedding providers (e.g., embedding_openai) and corrupt
modelOutputTokens in services.json; change the code that sets
default_output_tokens to read the embedding default from
config.get('model_output_tokens', {}).get('defaults', {}).get('embedding',
<appropriate-fallback>) instead of the hard-coded chat value, and ensure
sync_provider()/merge() uses that embedding default when the provider type or
name indicates an embedding model so the embedding sentinel (0/omitted) is
preserved.

Comment on lines +10 to +13
```
python tools/src/sync_models.py --provider <PROVIDER> [--provider <PROVIDER> ...]
python tools/src/sync_models.py --all
```
⚠️ Potential issue | 🟡 Minor

Add languages to the unlabeled fenced blocks.

These three fences will keep tripping markdownlint MD040. Tag them as bash or text so the docs stay lint-clean.

Also applies to: 75-77, 111-121

🧰 Tools
🪛 markdownlint-cli2 (0.22.0)

[warning] 10-10: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/SYNC_MODELS.md` around lines 10 - 13, The unlabeled fenced code blocks
in tools/SYNC_MODELS.md (examples like the two python invocation lines and the
other blocks around lines 75-77 and 111-121) need a language tag to satisfy
markdownlint MD040; update each opening fence from ``` to a labeled fence such
as ```bash (or ```text for non-shell output) so the three shown blocks and the
ones at the other referenced ranges are tagged consistently.

Comment on lines +38 to +43
try:
from dotenv import load_dotenv

load_dotenv(_TOOLS_TEST.parent.parent / '.env')
except Exception:
pass
⚠️ Potential issue | 🟠 Major

Don’t silently swallow .env loading failures.

If python-dotenv is missing or the repo .env is malformed, this leaves every ROCKETRIDE_APIKEY_* unset and the live-provider tests quietly skip. Catch only the expected import case, or at least warn on load failures so lost coverage is visible.

Suggested fix
-try:
-    from dotenv import load_dotenv
-
-    load_dotenv(_TOOLS_TEST.parent.parent / '.env')
-except Exception:
-    pass
+try:
+    from dotenv import load_dotenv
+except ImportError:
+    load_dotenv = None
+
+if load_dotenv is not None:
+    env_path = _TOOLS_TEST.parent.parent / '.env'
+    try:
+        load_dotenv(env_path)
+    except OSError as exc:
+        print(f'Warning: failed to load {env_path}: {exc}', file=sys.stderr)
🧰 Tools
🪛 Ruff (0.15.9)

[error] 42-43: try-except-pass detected, consider logging the exception

(S110)


[warning] 42-42: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/test/conftest.py` around lines 38 - 43, The try/except currently
swallows all errors when importing or calling load_dotenv, hiding missing
python-dotenv or malformed .env problems; change the block so you only silence
ImportError for "from dotenv import load_dotenv" but if
load_dotenv(_TOOLS_TEST.parent.parent / '.env') raises, emit a warning or log
(use the warnings module or logging) including the exception details and the
path (_TOOLS_TEST.parent.parent / '.env') so failures to load
ROCKETRIDE_APIKEY_* are visible; keep the import handling limited to ImportError
and ensure the load failure path reports the error rather than pass.

Comment on lines +105 to +109
def _fetch_gemini_model_ids(api_key: str) -> Set[str]:
import google.generativeai as genai # type: ignore[import]

genai.configure(api_key=api_key)
return {m.name for m in genai.list_models()}
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Official Google Gen AI Python SDK docs: for the `google-genai` package, what is the supported Python import path and API for listing models? Is it `from google import genai` with `client.models.list()`, or `import google.generativeai` with `list_models()`?

💡 Result:

The official Google Gen AI Python SDK is the google-genai package (install via pip install google-genai). The supported import path is from google import genai; create a client with client = genai.Client(...) (or genai.Client() if using environment variables), and list models with client.models.list() (e.g. for model in client.models.list(): ...). The alternative import google.generativeai with list_models() is from the deprecated/legacy google-generativeai package, which ended support on November 30, 2025, and is no longer recommended or actively maintained.

🏁 Script executed:

cat tools/src/providers/gemini.py | head -70

Repository: rocketride-org/rocketride-server



Use the same Gemini SDK as the provider implementation.

This helper uses the deprecated google-generativeai package with import google.generativeai, while tools/src/providers/gemini.py correctly uses the current google-genai package with from google import genai. The deprecated package ended support on November 30, 2025. Update the test to match the provider implementation:

Suggested fix
 def _fetch_gemini_model_ids(api_key: str) -> Set[str]:
-    import google.generativeai as genai  # type: ignore[import]
-
-    genai.configure(api_key=api_key)
-    return {m.name for m in genai.list_models()}
+    from google import genai  # type: ignore[import]
+
+    client = genai.Client(api_key=api_key)
+    return {m.name for m in client.models.list()}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/test/test_sync_live.py` around lines 105 - 109, The test helper
_fetch_gemini_model_ids is importing the deprecated package; change the import
to use the current SDK used by tools/src/providers/gemini.py by replacing
"import google.generativeai as genai" with "from google import genai" (remove
the type: ignore), create a client with genai.Client(api_key=api_key), and
collect the model names via {m.name for m in client.models.list()} instead of
genai.list_models().

Collaborator

@kwit75 kwit75 left a comment


Review — sync_models automation tool

Alexandru, this is a substantial and well-thought-out piece of work. The architecture is solid: config-driven provider registry, three-source token resolution with clear priority, smart merge that preserves manual edits, and the protected_profiles / timed-expiry mechanism is clever. The documentation in SYNC_MODELS.md is genuinely excellent. I have a list of things to address before merge — most align with CodeRabbit's findings, but several are additional concerns.


Blockers (must fix)

1. OpenRouter-only aliases in llm_anthropic/services.json

CodeRabbit caught this and it is correct. The profiles at the bottom (claude-opus-4-6-fast, claude-opus-4-1, claude-opus-4, claude-sonnet-4, claude-3-7-sonnet, claude-3-5-haiku, claude-3-haiku) are all marked "modelSource": "openrouter" — these are OpenRouter routing aliases, not native Anthropic model IDs. The Anthropic node passes the model ID directly to ChatAnthropic() with no OpenRouter routing, so selecting any of these profiles will fail at invocation time. This is the exact pattern you correctly removed from llm_openai (as noted in the PR summary). Please apply the same treatment here: remove OpenRouter-only aliases from the native Anthropic node.

2. DeepSeek cloud-reasoner deprecated by OpenRouter signal, but it is the default profile

CodeRabbit flagged this. If you are running with no DeepSeek API key (which the CI workflow may encounter), the OpenRouter fallback may not list cloud-reasoner, causing it to be marked deprecated even though it works fine on the native DeepSeek API. This would leave the node defaulting to a deprecated profile. Either protect it in config, or only deprecate when the signal comes from the actual provider API (not a fallback source).

More generally: deprecation from a fallback source (OpenRouter/LiteLLM) should be treated differently from deprecation confirmed by the provider's own API. A model missing from OpenRouter does not mean the provider discontinued it. Consider either (a) not deprecating at all from fallback sources, or (b) adding a "deprecationSource" field and only hard-deprecating when confirmed by the provider API. This is the biggest architectural concern I have — the current behavior can cause false deprecations for any provider where the API key is missing.

3. Workflow: no base branch specified in peter-evans/create-pull-request

The workflow step using peter-evans/create-pull-request@v7 does not specify a base branch. By default it uses the branch checked out in the workflow, which is whatever the default branch is. This should explicitly set base: develop to match our branching model. Without it, if someone changes the repo default branch, synced PRs could target the wrong branch.
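A minimal sketch of the fix; the step name, branch, and title below are placeholders, only the `base` input is the point:

```yaml
- name: Create sync PR
  uses: peter-evans/create-pull-request@v7
  with:
    base: develop
    branch: chore/sync-models
    title: "chore(models): weekly model sync"
```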

4. Workflow: no artifact retention / storage budget consideration

We literally just hit 100% Actions storage. The workflow installs litellm (which pulls in a big dependency tree) and runs pip install on every run. There are no caching steps for pip, so each weekly run re-downloads everything. More importantly:

  • Add actions/cache for the pip dependencies to save bandwidth and time
  • If the workflow ever creates artifacts (logs, reports), set retention to 1 day
  • Consider whether the dry-run + apply two-step is necessary in CI — the dry-run step doubles the API calls (and token spend on smoke tests). If the script handles errors gracefully (which it does via exit code 1), the dry-run may be redundant in CI where the apply step itself will fail fast on errors
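The first two bullets could be sketched like this, assuming the workflow installs from tools/requirements.txt and writes a sync-report.md (both paths are guesses):

```yaml
- uses: actions/cache@v4
  with:
    path: ~/.cache/pip
    key: pip-${{ runner.os }}-${{ hashFiles('tools/requirements.txt') }}

- uses: actions/upload-artifact@v4
  if: always()
  with:
    name: sync-report
    path: sync-report.md
    retention-days: 1
```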

5. is_retryable_error() defaults to True — inconsistent with engine

CodeRabbit caught this: tools/src/core/util.py defaults to return True for unrecognized exceptions, but the engine's ChatBase.is_retryable_error() defaults to return False. This means unknown errors in smoke tests retry 3 times instead of failing fast. The docstring says these must be kept in sync. Please fix the default to return False.
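A minimal sketch of the suggested default; the recognized-error checks are placeholders for whatever tools/src/core/util.py actually matches on:

```python
def is_retryable_error(exc: Exception) -> bool:
    # Explicitly retryable: transient network / rate-limit style failures.
    if isinstance(exc, (TimeoutError, ConnectionError)):
        return True
    message = str(exc).lower()
    if "rate limit" in message or "429" in message:
        return True
    # Unknown errors fail fast, matching the engine's ChatBase default.
    return False


print(is_retryable_error(TimeoutError("timed out")))  # True
print(is_retryable_error(ValueError("bad payload")))  # False
```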

6. global_protected_profiles not defensively parsed in _run_merge()

In providers/base.py:_run_merge(), the global protected profiles list is passed to protected.update(global_protected_profiles). I see that sync_models.py:sync_provider() calls _active_protected_profiles() on the global list before passing it in, so the current call path is safe. However, _run_merge() is a public method on CloudProvider — if anyone calls it directly with raw config data containing timed entries like ["custom", "2126-04-09"], it would break. Either add a defensive parse in _run_merge(), or at minimum document the contract that global_protected_profiles must be pre-parsed.
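A defensive parse could be as small as the sketch below, assuming the config format from sync_models.config.json: plain string keys, or ["key", "YYYY-MM-DD"] pairs where the date is the protection expiry (the function name is illustrative):

```python
from datetime import date


def active_protected_profiles(entries, today=None):
    """Accept raw config entries: strings or ["key", "YYYY-MM-DD"] pairs."""
    today = today or date.today()
    active = set()
    for entry in entries:
        if isinstance(entry, str):  # untimed entry: always protected
            active.add(entry)
            continue
        key, expiry = entry          # timed entry with ISO expiry date
        if date.fromisoformat(expiry) >= today:
            active.add(key)
    return active


entries = [["custom", "2126-04-09"], ["grok-4-1-fast-reasoning", "2025-10-09"]]
print(sorted(active_protected_profiles(entries, today=date(2026, 4, 10))))
# ['custom']
```

Calling this at the top of _run_merge() makes the method safe regardless of whether the caller pre-parsed the list.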


Should fix (strongly recommended)

7. Reporter: "No changes detected" when all providers errored

format_pr_body() can return "No changes detected" even when every provider failed or was skipped. This would create a misleading PR body. Check for errors/warnings before emitting the all-clear.

8. Reporter: warning-backed syncs lose their changes in PR body

format_pr_body() treats any provider with a warning as fully skipped (continue), even if the fallback source produced real adds/updates/deprecations. This hides actual changes in the generated PR body, undermining the human review that this tool is built for.
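A sketch addressing both reporter issues above: count real changes even on warning-backed syncs, and never emit the all-clear when providers failed. The `results` shape is an assumption, not the PR's actual reporter API.

```python
def format_pr_body(results: list[dict]) -> str:
    lines, changes, problems = [], 0, 0
    for r in results:
        if r.get("error"):
            problems += 1
            lines.append(f"- {r['provider']}: ERROR: {r['error']}")
            continue
        if r.get("warning"):
            # Warning does not skip the provider: fallback sources may
            # still have produced real adds/updates/deprecations.
            problems += 1
            lines.append(f"- {r['provider']}: WARNING: {r['warning']}")
        delta = len(r.get("added", [])) + len(r.get("deprecated", []))
        if delta:
            changes += delta
            lines.append(f"- {r['provider']}: {delta} change(s)")
    if not changes and not problems:
        return "No changes detected."
    return "\n".join(lines)


print(format_pr_body([{"provider": "deepseek", "warning": "no API key",
                       "added": [], "deprecated": ["cloud-reasoner"]}]))
```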

9. gemini-2_0-flash profile points to models/gemini-2.5-flash

A profile labeled "Gemini 2.0 Flash" with key gemini-2_0-flash has its model field changed to models/gemini-2.5-flash. This silently changes behavior for anyone using this profile. If it is a deprecated alias, keep the old model ID and mark deprecated.

10. Expired protection: grok-4-1-fast-reasoning

The protected_profiles entry ["grok-4-1-fast-reasoning", "2025-10-09"] is already expired (April 2026 now). If it should still be protected, bump the date. If intentionally expired, update the xAI test.profiles reference (line 364) since it still points to this now-deprecated model.

11. Prettier glob too broad in tools/scripts/tasks.js

NODES_GLOB matches nodes/src/nodes/**/*.json which reformats all JSON files under nodes, not just services.json. Scope to **/services.json.

12. Embedding output token default

sync_provider() always passes the chat default (4096) for output tokens. For embedding_openai, this writes a bogus modelOutputTokens value. Add provider-type awareness or a per-provider output token default in config.

13. Backfill modelSource on deprecated Gemini profiles

The modified deprecated entries in Gemini services.json still omit modelSource. Backfill as "manual" for consistency with the new schema contract.


Suggestions (nice to have)

14. Pin litellm version in requirements.txt — it is a fast-moving dependency. An unpinned version means the weekly CI could break on a litellm release. Pin to a known-good version.

15. Document smoke test costs — smoke tests make real API calls. For 8 providers with new models, this has a per-run cost. Worth noting in SYNC_MODELS.md and considering a --no-smoke flag.
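The flag could slot into the existing CLI like this; the --no-smoke name and wiring are a suggestion, not an existing option:

```python
import argparse

parser = argparse.ArgumentParser(prog="sync_models.py")
parser.add_argument("--provider", action="append", default=[])
parser.add_argument("--all", action="store_true")
parser.add_argument("--no-smoke", action="store_true",
                    help="skip live smoke tests for newly discovered models")

args = parser.parse_args(["--all", "--no-smoke"])
print(args.no_smoke)  # True
```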

16. Clean up DEBUG comment in tools/src/providers/openai.py — lines 45+ have a "DEBUG — remove once confirmed" comment with no actual debug code. Clean it up.

17. Checklist items: ./builder test and conventional commits are unchecked. Please address before merge.


Summary

The core architecture is good — smart merge (not overwrite), deprecate-not-delete, protected profiles with expiry, human review via PR. The modelSource provenance tracking is a nice touch, and the three-source token resolution is well designed.

The main risk areas are: (1) false deprecation from fallback sources which can silently break default profiles for nodes, and (2) OpenRouter aliases in the Anthropic node which will fail at runtime. Fix those plus the workflow concerns and this is in good shape.

Contributor

@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
tools/scripts/tasks.js (1)

12-12: ⚠️ Potential issue | 🟡 Minor

Restrict formatting target to services.json only.

Line 12 currently matches every JSON file under nodes/src/nodes/**, so models:update can rewrite unrelated JSON files. Scope this to services.json, which is what the sync flow mutates.

Proposed fix
-const NODES_GLOB = path.join(PROJECT_ROOT, 'nodes', 'src', 'nodes', '**', '*.json');
+const NODES_GLOB = path.join(PROJECT_ROOT, 'nodes', 'src', 'nodes', '**', 'services.json');

Also applies to: 48-48

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/scripts/tasks.js` at line 12, NODES_GLOB currently matches all JSON
files under the nodes folder causing models:update to rewrite unrelated files;
change the glob so it targets only services.json (e.g., replace NODES_GLOB =
path.join(PROJECT_ROOT, 'nodes', 'src', 'nodes', '**', '*.json') with a pattern
that ends with 'services.json') and make the same change for the other
occurrence referenced around the second spot so both uses only operate on
services.json; update any variables that reference NODES_GLOB to ensure they
still point to the renamed/filtered constant if you rename it.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: ff39a5a3-6d5d-4499-a0d3-8d32a621355b

📥 Commits

Reviewing files that changed from the base of the PR and between c83122f and 6ec3788.

📒 Files selected for processing (3)
  • scripts/build.js
  • scripts/lib/registry.js
  • tools/scripts/tasks.js


Labels

builder · ci/cd (CI/CD and build system) · docs (Documentation)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Dynamic Model Discovery for LLM Nodes

2 participants