Fix frontend SaaS detection broken after #78 #81
Closed
juanmichelini wants to merge 3 commits into
PR #78 pointed search_frontend_for_model() at the wrong SDK file. VERIFIED_OPENHANDS_MODELS lives in openhands-sdk/openhands/sdk/llm/utils/verified_models.py, not model_features.py (which only contains feature-flag allow-lists). As a result, models that only exist in verified_models.py (including DeepSeek-V3.2-Reasoner, GLM-5.1, MiniMax-M2.1, MiniMax-M2.7, Nemotron-3-Nano, Nemotron-3-Super and Qwen3.6-Plus) reported frontend_support_timestamp=null on the dashboard.

Also fixes the SaaS check for Qwen3.6-Plus: the SaaS catalog and the SDK use the spelling 'qwen3-6-plus' (hyphen, no dot), but the model's aliases only listed 'dashscope/qwen3.6-plus' / 'qwen3.6-plus', so the SaaS verified-model lookup never matched.

Regression tests:
- search_frontend_for_model targets verified_models.py (and not model_features.py).
- Qwen3.6-Plus aliases include 'qwen3-6-plus'.
- check_saas_verified_model('Qwen3.6-Plus') matches a SaaS catalog containing 'openhands/qwen3-6-plus'.

Fixes #79

Co-authored-by: openhands <openhands@all-hands.dev>
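The alias mismatch described in this commit can be illustrated with a minimal sketch. The helper names `bare_name` and `matches_catalog` are hypothetical and are not taken from `scripts/track_llm_support.py`; the sketch only assumes that the lookup strips an optional `provider/` prefix and compares the remaining names case-insensitively.

```python
from __future__ import annotations


def bare_name(model_id: str) -> str:
    """Drop an optional 'provider/' prefix and lowercase the rest (assumed rule)."""
    return model_id.rsplit("/", 1)[-1].lower()


def matches_catalog(aliases: list[str], catalog: list[str]) -> bool:
    """Return True if any alias shares a bare name with any catalog entry."""
    catalog_names = {bare_name(entry) for entry in catalog}
    return any(bare_name(alias) in catalog_names for alias in aliases)


# Catalog spelling and aliases as quoted in the commit message.
catalog = ["openhands/qwen3-6-plus"]
old_aliases = ["dashscope/qwen3.6-plus", "qwen3.6-plus"]
new_aliases = old_aliases + ["qwen3-6-plus"]

print(matches_catalog(old_aliases, catalog))  # False: only dotted spellings
print(matches_catalog(new_aliases, catalog))  # True: hyphenated alias matches
```

Under these assumptions the dotted aliases can never equal the hyphenated catalog entry, which is why adding the extra alias is enough to restore the match.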
…d-saas-detection-79
The scheduled tracker action runs without a SaaS API key, so `check_saas_verified_model` was returning None for every model and `frontend_saas_available` was being coerced to False across the board (see run 25711661321 / draft PR #85, which flipped 21 of 28 models from true to false).

The SDK's `VERIFIED_OPENHANDS_MODELS` in `openhands-sdk/openhands/sdk/llm/utils/verified_models.py` is the public source of truth for the openhands-provider 'Verified' subsection on both app.all-hands.dev and a self-hosted OpenHands build. Reading it from the cloned SDK repo (we already clone it for the frontend search) lets the tracker confirm SaaS availability with no credentials.

`check_saas_verified_model` now:
1. Reads `VERIFIED_OPENHANDS_MODELS` from the SDK clone, which is public, needs no auth, and is always available in the action.
2. Falls through to the existing SaaS API call only as a best-effort supplement for the 'Others' subsection (DB-only entries).
3. Returns None only when *both* sources are unreachable; if the SDK list is fetched and the model isn't in it (or in SaaS), the answer is definitively False.

Live smoke test (no $OPENHANDS_CLOUD_API_KEY / $LLM_API_KEY):
    Qwen3.6-Plus -> True
    GPT-5.5 -> True
    trinity-large-thinking -> True
    Kimi-K2.6 -> True
    Qwen3-Coder-Next -> False (correctly absent from SDK list)
    DeepSeek-V3.2-Reasoner -> True
    GLM-5.1 -> True
    Nemotron-3-Super -> True

Tests:
- New `TestCheckSaasVerifiedModelViaSdk` covers the four SDK/SaaS-reachability combinations (SDK hits, SDK miss + SaaS unreachable -> False, both unreachable -> None, Others supplements SDK).
- New `TestFetchSdkVerifiedOpenhandsModels` parses a fixture `verified_models.py` from disk.
- Existing `TestCheckSaasVerifiedModel` SaaS-API tests get an autouse fixture that stubs the SDK reader to None, so they continue to exercise the SaaS-API branch as before.

Co-authored-by: openhands <openhands@all-hands.dev>
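The three-step resolution order in this commit can be sketched as a small tri-state function. This is an illustration under stated assumptions, not the code in `scripts/track_llm_support.py`: the function name `resolve_saas_availability`, its parameters, and the convention that `None` stands for an unreachable source are all hypothetical, and the sample lists are made up for the example.

```python
from __future__ import annotations

from typing import Optional


def bare_name(model_id: str) -> str:
    """Drop an optional 'provider/' prefix and lowercase the rest (assumed rule)."""
    return model_id.rsplit("/", 1)[-1].lower()


def resolve_saas_availability(
    aliases: list[str],
    sdk_models: Optional[list[str]],
    saas_models: Optional[list[str]],
) -> Optional[bool]:
    """Combine two sources; None for a source means it could not be reached."""
    wanted = {bare_name(alias) for alias in aliases}

    # 1. The SDK list is consulted first because it needs no credentials.
    if sdk_models is not None and wanted & {bare_name(m) for m in sdk_models}:
        return True

    # 2. The SaaS catalog only supplements the SDK list ('Others' entries).
    if saas_models is not None and wanted & {bare_name(m) for m in saas_models}:
        return True

    # 3. Unknown only when neither source answered; otherwise a definite False.
    if sdk_models is None and saas_models is None:
        return None
    return False


# Hypothetical sample data for illustration only.
sdk = ["qwen3-6-plus", "glm-5.1"]
print(resolve_saas_availability(["qwen3-6-plus"], sdk, None))       # True
print(resolve_saas_availability(["qwen3-coder-next"], sdk, None))   # False
print(resolve_saas_availability(["qwen3-coder-next"], None, None))  # None
```

Keeping `None` distinct from `False` is the point of the design: a missing API key should leave the answer unknown rather than flip a model to unavailable.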
Fixes #79.
What was broken
After merging #78, several models on https://openhands-llm-support-tracker.vercel.app/ regressed to `frontend_support_timestamp = null` and/or `frontend_saas_available = false` even though they are perfectly usable in app.all-hands.dev (SaaS) and in a self-hosted OpenHands build.

Concretely, the dashboard currently shows the following regressions (excluding `GPT-5.5` and `trinity-large-thinking`, which were added very recently and may legitimately not yet be in a tagged release):

- `DeepSeek-V3.2-Reasoner`
- `GLM-5.1`
- `MiniMax-M2.1`
- `MiniMax-M2.7`
- `Nemotron-3-Nano`
- `Nemotron-3-Super`
- `Qwen3.6-Plus`

Root cause
Two independent bugs introduced (or surfaced) by #78:
Bug A: wrong SDK file path

PR #78 made `search_frontend_for_model()` look at `openhands-sdk/openhands/sdk/llm/utils/model_features.py`, but `VERIFIED_OPENHANDS_MODELS` actually lives in `openhands-sdk/openhands/sdk/llm/utils/verified_models.py`. `model_features.py` only contains feature-flag allow-lists (extended thinking, prompt cache, etc.). Some model names happen to appear there because they belong to those allow-lists, but that overlap is incidental and incomplete: every model that does not appear in `model_features.py` showed up as "no frontend support".

Bug B: missing SaaS alias for `Qwen3.6-Plus`

The SaaS catalog at `/api/v1/config/models/search?provider__eq=openhands` lists this model as `openhands/qwen3-6-plus` (hyphen, no dot), matching the SDK's `VERIFIED_QWEN_MODELS` spelling. The tracker only knew about `dashscope/qwen3.6-plus` and `qwen3.6-plus`, so the bare-name comparison never matched.

Fix
- `scripts/track_llm_support.py`: point `search_frontend_for_model()` at `openhands-sdk/openhands/sdk/llm/utils/verified_models.py` (and update the docstring).
- `scripts/track_llm_support.py`: add `"qwen3-6-plus"` to `MODEL_ALIASES["Qwen3.6-Plus"]`.

Tests
Added/updated regression tests in `tests/test_track_llm_support.py`:

- `test_finds_model_in_sdk_verified_openhands_models` now asserts the SDK search hits `verified_models.py` using `Nemotron-3-Super` (a model that only exists in `verified_models.py`).
- `test_search_targets_verified_models_not_model_features`: new test that inspects the actual paths passed to `git log --` and fails fast if the search ever regresses back to `model_features.py`.
- `test_picks_earliest_across_both_repos`: updated to dispatch on `verified_models.py`.
- `test_qwen36_plus_has_aliases`: now asserts `qwen3-6-plus` is in the alias set.
- `test_qwen36_plus_matches_sdk_spelling`: new SaaS-side regression test that feeds a fake catalog containing `openhands/qwen3-6-plus` and confirms `check_saas_verified_model("Qwen3.6-Plus")` returns `True`.

All 91 tracker tests pass locally:
After this lands
Once the scheduled (or a manual) tracker run refreshes `frontend/public/all_models.json`, the dashboard should show `frontend ✅` and `saas ✅` for every model except `GPT-5.5`, `Kimi-K2.6`, `Qwen3-Coder-Next`, and `trinity-large-thinking`, which are genuinely absent from the SaaS catalog at the moment.

This pull request was created by an AI agent (OpenHands) on behalf of @juanmichelini.