Skip to content

Use SDK agent settings and model switch UI#457

Merged
neubig merged 10 commits into
mainfrom
codex/use-sdk-conversation-settings
May 23, 2026
Merged

Use SDK agent settings and model switch UI#457
neubig merged 10 commits into
mainfrom
codex/use-sdk-conversation-settings

Conversation

@neubig
Copy link
Copy Markdown
Member

@neubig neubig commented May 15, 2026

Summary

  • send agent_settings to the local agent server instead of constructing a concrete agent in the frontend
  • preserve frontend-owned runtime pieces: workspace, initial message, plugin sources, custom secret lookups, and local runtime tool names
  • leave condenser/default tool construction to the server SDK create_agent() path so SDK default tools like SwitchLLMTool are included when available
  • reuse the /model slash-command model message path for successful SwitchLLMTool calls, while still rendering failed switch observations as chat errors
  • update conversation cache invalidation/model metadata after profile switches so the sidebar and conversation details stay in sync
  • add focused adapter, rendering, and cache regression coverage

Status

The required agent-server support shipped in OpenHands/software-agent-sdk#3261 (openhands-agent-server==1.22.1), so this PR is no longer blocked on the SDK release.

Related regression coverage for the release behavior: OpenHands/software-agent-sdk#3264.

Verification

  • npm test -- --run __tests__/api/agent-server-adapter.test.ts
  • npm test -- --run __tests__/components/conversation-events/chat/event-content-helpers/should-render-event.test.ts __tests__/components/features/chat/model-messages.test.tsx src/hooks/mutation/conversation-mutation-utils.test.ts
  • npm run typecheck
  • npm run build

Notes

Also filed OpenHands/typescript-client#164 to generate request/response types from the agent-server OpenAPI schema, so downstream frontends do not need to keep local Record<string, unknown> payload boundaries for server-owned SDK models.

Related issue: #699


🐳 Docker images for this PR

GHCR package: https://github.com/OpenHands/agent-canvas/pkgs/container/agent-canvas

Component Value
Image ghcr.io/openhands/agent-canvas
Architectures amd64, arm64
Agent Server ghcr.io/openhands/agent-server:1.23.0-python
Automation openhands-automation==1.0.0a3
Commit 3809f8286bd476c958e4b25730d93e0ecbd74436

Pull (multi-arch manifest)

# Multi-arch manifest — Docker automatically pulls the correct architecture
docker pull ghcr.io/openhands/agent-canvas:sha-3809f82

Run

docker run -it --rm \
  -p 8000:8000 \
  ghcr.io/openhands/agent-canvas:sha-3809f82

All tags pushed for this build

ghcr.io/openhands/agent-canvas:sha-3809f82-amd64
ghcr.io/openhands/agent-canvas:codex-use-sdk-conversation-settings-amd64
ghcr.io/openhands/agent-canvas:pr-457-amd64
ghcr.io/openhands/agent-canvas:sha-3809f82-arm64
ghcr.io/openhands/agent-canvas:codex-use-sdk-conversation-settings-arm64
ghcr.io/openhands/agent-canvas:pr-457-arm64
ghcr.io/openhands/agent-canvas:sha-3809f82
ghcr.io/openhands/agent-canvas:codex-use-sdk-conversation-settings
ghcr.io/openhands/agent-canvas:pr-457

About Multi-Architecture Support

  • Each tag (e.g., sha-3809f82) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., sha-3809f82-amd64) are also available if needed

@vercel
Copy link
Copy Markdown

vercel Bot commented May 15, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agent-canvas Ready Ready Preview, Comment May 23, 2026 2:07am

Request Review

Copy link
Copy Markdown
Contributor

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taste Rating: 🟡 Acceptable - Solid architectural improvement delegating agent creation to the SDK.

Architecture Assessment:
Moving agent construction from frontend to SDK is the right direction—the frontend should describe what it wants through settings rather than how to build it. The code quality is good with proper type safety, test coverage, and clear documentation of the breaking change.

Key Observations:

  1. Tool merging behavior - The new getAgentTools() uses Map-based merging where defaults are always included and configured tools can override/extend them. This is more flexible than the old behavior (fixed defaults only), but the semantics should be documented.

  2. Condenser delegation - Frontend previously built full condenser config (kind: "LLMSummarizingCondenser", llm: {...}), now just passes raw settings (enabled, max_size). The description confirms this is intentional SDK delegation, but verify the SDK creates equivalent configs.

  3. Behavioral equivalence - This changes who creates the agent but shouldn't change what agent is created. However, since agent creation logic moves to the SDK, there's inherent uncertainty until tested against a live agent-server with the new SDK.

Risk Assessment:
⚠️ Risk Level: 🟡 MEDIUM

  • Breaking change to conversation start payload (properly documented and blocked on SDK release)
  • Moves core agent creation logic from frontend to SDK
  • Behavioral equivalence depends on SDK correctly interpreting the new agent_settings format
  • Recommendation: Per eval/benchmark risk policy, this PR should undergo lightweight evaluation testing against a real agent-server with the SDK changes before merging, since agent creation logic has moved and could plausibly affect agent behavior.

Verdict: ✅ Code is solid, tests pass, breaking change is documented. Request human review for eval/benchmark validation before merge.

Comment thread src/api/agent-server-adapter.ts
Comment thread src/api/agent-server-adapter.ts
@neubig
Copy link
Copy Markdown
Member Author

neubig commented May 15, 2026

@OpenHands /iterate

@openhands-ai
Copy link
Copy Markdown

openhands-ai Bot commented May 15, 2026

I'm on it! neubig can track my progress at all-hands.dev

Co-authored-by: openhands <openhands@all-hands.dev>
Copy link
Copy Markdown
Member Author

neubig commented May 15, 2026

Addressed the review feedback in 6c5f9e5 and resolved the two inline threads. Local verification passed: targeted adapter test, typecheck, build, and lint. Ready for another look.

This comment was created by an AI agent (OpenHands) on behalf of the PR author.

Copy link
Copy Markdown
Contributor

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taste Rating: 🟡 Acceptable - Solid architectural improvement but requires eval verification

Architecture Assessment:
Delegating agent construction to the SDK is the right direction — the frontend should describe what it wants through settings rather than how to build it. Code quality is good with proper type safety, focused tests, and previous review feedback addressed.

Eval-Risk Flag:
This PR changes core agent initialization behavior (tool construction, condenser setup, agent creation all delegated to SDK). According to our review guidelines, changes affecting agent behavior must be validated with evals before merge.

Recommendation: Run lightweight evals to confirm this delegation doesn't introduce behavioral regressions, then merge. The architectural direction is sound.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM
    Changes agent initialization flow by delegating to SDK. Potential for behavioral differences in tool availability, condenser behavior, or agent defaults. Risk is mitigated by focused regression test and SDK version pin, but eval verification is recommended.

VERDICT:
✅ Worth merging after eval verification

KEY INSIGHT:
This correctly shifts the responsibility boundary — frontends send declarative settings, the SDK owns agent construction.

@openhands-ai
Copy link
Copy Markdown

openhands-ai Bot commented May 15, 2026

Since my last summary, there have been no additional code changes or tool actions.

Final status:

  • ✅ The /iterate request was completed: PR Use SDK agent settings and model switch UI #457 was inspected, updated, pushed, and monitored.
  • ✅ The CI/Prettier failure was fixed.
  • ✅ The two inline review suggestions were addressed with concise comments.
  • ✅ The review threads were replied to and resolved.
  • ✅ Verification passed locally and on GitHub checks.
  • ✅ The PR remains mergeable with no unresolved review threads.
  • ✅ No extraneous changes were made beyond the targeted formatting/comment updates.

Remaining note:

  • The PR is still conceptually gated on the SDK release / lightweight eval verification noted by the reviewer and PR description; no further frontend code changes were identified as necessary.

@neubig neubig changed the title Use SDK agent settings for local conversation start Use SDK agent settings and model switch UI May 15, 2026
@neubig
Copy link
Copy Markdown
Member Author

neubig commented May 15, 2026

@OpenHands /iterate

@openhands-ai
Copy link
Copy Markdown

openhands-ai Bot commented May 15, 2026

I'm on it! neubig can track my progress at all-hands.dev

Co-authored-by: openhands <openhands@all-hands.dev>
Copy link
Copy Markdown
Member Author

neubig commented May 15, 2026

Fixed the CI Prettier failure in c0203cc and verified locally with npm run lint. I’ll keep monitoring the new checks.\n\n_This comment was created by an AI agent (OpenHands) on behalf of the PR author._

Co-authored-by: openhands <openhands@all-hands.dev>
Copy link
Copy Markdown
Member Author

neubig commented May 15, 2026

Merged the latest main into the PR to clear the merge conflict, resolved the adapter conflicts by keeping the new agent_settings request shape, and verified locally with:\n\n- npm test -- --run __tests__/api/agent-server-adapter.test.ts\n- npm test -- --run __tests__/components/conversation-events/chat/event-content-helpers/should-render-event.test.ts __tests__/components/features/chat/model-messages.test.tsx src/hooks/mutation/conversation-mutation-utils.test.ts\n- npm run lint\n- npm run build\n\nMonitoring the new checks on 1e1b298.\n\n_This comment was created by an AI agent (OpenHands) on behalf of the PR author._

Copy link
Copy Markdown
Member Author

neubig commented May 15, 2026

CI is green on the latest SHA 1e1b298: CI, snapshot tests, PR artifacts, and Vercel all passed. The branch is now mergeable with zero unresolved review threads.\n\nRemaining note: the existing review-bot comment is still a COMMENTED review recommending lightweight eval verification before merge; no code-change threads are open.\n\n_This comment was created by an AI agent (OpenHands) on behalf of the PR author._

@openhands-ai
Copy link
Copy Markdown

openhands-ai Bot commented May 15, 2026

OpenHands encountered an error: Sandbox not running: 80f5d20f57bb4621ab7c3dca811e340d

See the conversation for more information.

@neubig
Copy link
Copy Markdown
Member Author

neubig commented May 16, 2026

@OPEN /iterate

@neubig
Copy link
Copy Markdown
Member Author

neubig commented May 16, 2026

@OpenHands /iterate

@openhands-ai
Copy link
Copy Markdown

openhands-ai Bot commented May 16, 2026

I'm on it! neubig can track my progress at all-hands.dev

…ation-settings

# Conflicts:
#	src/api/agent-server-adapter.ts

Co-authored-by: openhands <openhands@all-hands.dev>
Copy link
Copy Markdown
Member Author

neubig commented May 16, 2026

Merged the latest main into the PR to clear the merge conflict and resolved the adapter conflict by preserving the SDK agent_settings request shape while keeping load_public_skills tied to the frontend VITE_LOAD_PUBLIC_SKILLS setting.

Verification on f18ef7a:

  • npm test -- --run __tests__/api/agent-server-adapter.test.ts
  • npm test -- --run __tests__/components/conversation-events/chat/event-content-helpers/should-render-event.test.ts __tests__/components/features/chat/model-messages.test.tsx src/hooks/mutation/conversation-mutation-utils.test.ts
  • npm run typecheck
  • npm run build
  • npm run lint

GitHub checks are green for the latest SHA (CI, snapshot tests, PR artifacts, and Vercel), the branch is mergeable, and there are zero unresolved review threads.

Remaining note: the prior review-bot COMMENTED review still recommends lightweight eval verification before merge; no code-change review threads are open.

This comment was created by an AI agent (OpenHands) on behalf of the PR author.

@openhands-ai
Copy link
Copy Markdown

openhands-ai Bot commented May 16, 2026

Since my last summary, there have been no additional code changes or tool actions.

Final status:

  • ✅ The /iterate request was completely addressed.
  • ✅ PR Use SDK agent settings and model switch UI #457 was updated by merging latest main, resolving the only conflict, verifying locally, pushing the fix, and monitoring GitHub checks.
  • ✅ All required GitHub checks are green on f18ef7a.
  • ✅ The PR is CLEAN and MERGEABLE.
  • ✅ There are zero unresolved review threads.
  • ✅ The changes were concise: only the merge resolution needed for the PR branch was added; no extraneous local changes remain.
  • ✅ A final PR comment was posted with the verification summary and AI disclosure.

Remaining note: the existing review-bot COMMENTED review still recommends lightweight eval verification before merge, but no code-change threads or failing checks remain.

@neubig
Copy link
Copy Markdown
Member Author

neubig commented May 23, 2026

@OpenHands /iterate

@openhands-ai
Copy link
Copy Markdown

openhands-ai Bot commented May 23, 2026

I'm on it! neubig can track my progress at all-hands.dev

…ation-settings

Co-authored-by: openhands <openhands@all-hands.dev>
…ation-settings

Co-authored-by: openhands <openhands@all-hands.dev>
@github-actions
Copy link
Copy Markdown
Contributor

📸 Snapshot Test Report

Warning

Snapshot comparison step crashed (timeout, OOM, or runner error) — diff results below may be incomplete or absent.
Check the CI logs for the full error output (look for the "Run snapshot comparison" step).

❌ 1 snapshot differ from the main branch baseline. Add the update-snapshots label to acknowledge intentional changes.

Category Count
🔴 Changed 1
🆕 New 0
✅ Unchanged 72
Total 73

How to resolve:

  • Unintentional diffs — the baselines on main may have moved since this branch was created. Merge the latest main into this branch and re-run CI.
  • Intentional changes — add the update-snapshots label. CI will pass and the new screenshots become the baseline when this PR merges.
🔴 Changed snapshots (1)

backends-extended

backend-after-switch

Expected (main) Actual (PR) Diff
expected actual diff
✅ Unchanged snapshots (72)

archived-conversation

  • conversation-panel-with-archived-badges
  • conversation-view-archived
  • conversation-view-sandbox-error

automations

  • automations-delete-modal
  • automations-list-active-inactive
  • automations-no-automations
  • automations-search-no-results

backends-extended

  • backend-add-blank-disabled
  • backend-add-cloud-advanced-open
  • backend-add-cloud-no-key-disabled
  • backend-add-cloud-with-key-enabled
  • backend-add-form-partially-filled
  • backend-add-invalid-url-disabled
  • backend-add-local-ready
  • backend-add-name-only-disabled
  • backend-add-two-column-layout
  • backend-add-whitespace-host-disabled
  • backend-cancel-nothing-saved
  • backend-dropdown-two-backends
  • backend-edit-prefilled
  • backend-manage-after-removal
  • backend-manage-two-listed
  • backend-remove-cancelled
  • backend-remove-confirmation
  • backend-switch-overlay

backends

  • backend-add-modal
  • backend-manage-modal
  • backend-selector-open

changes-tab

  • changes-deleted-file
  • changes-diff-viewer
  • changes-empty

collapsible-thinking

  • reasoning-content-collapsed
  • reasoning-content-expanded
  • think-action-collapsed
  • think-action-expanded

mcp-page

  • mcp-custom-server-1-editor-open
  • mcp-custom-server-2-url-filled
  • mcp-custom-server-3-all-filled
  • mcp-custom-server-4-installed
  • mcp-custom-server-editor
  • mcp-empty-installed
  • mcp-search-filtered
  • mcp-slack-install-1-marketplace
  • mcp-slack-install-2-modal
  • mcp-slack-install-3-filled
  • mcp-slack-install-4-installed

onboarding

  • onboarding-step-0-choose-agent
  • onboarding-step-1-check-backend
  • onboarding-step-2-setup-llm
  • onboarding-step-3-say-hello

projects-workspace-browser

  • projects-workspace-browser

settings-page

  • add-backend-modal
  • analytics-consent-modal
  • home-screen
  • settings-app-page
  • settings-page

settings-secrets

  • secrets-add-form-filled
  • secrets-add-form
  • secrets-after-save
  • secrets-delete-confirm
  • secrets-list

settings-verification

  • condenser-settings
  • verification-settings-off
  • verification-settings-on

sidebar

  • sidebar-collapsed
  • sidebar-conversation-panel
  • sidebar-filter-menu

skills-page

  • skills-empty
  • skills-loaded
  • skills-no-match
  • skills-search-filtered
  • skills-type-filter

Generated by the Snapshot Tests workflow. This comment was created by an AI agent (OpenHands) on behalf of the repo maintainers.

@neubig neubig merged commit 93512c7 into main May 23, 2026
13 of 14 checks passed
@neubig neubig deleted the codex/use-sdk-conversation-settings branch May 23, 2026 02:23
@openhands-ai
Copy link
Copy Markdown

openhands-ai Bot commented May 23, 2026

Since my last summary, there were no additional tool actions or code changes.

Final status:

  • ✅ The /iterate request has been completed to the relevant stop condition: PR Use SDK agent settings and model switch UI #457 is merged.
  • ✅ Review threads were confirmed resolved (0 unresolved).
  • ✅ The branch was updated with merge-resolution work and verified locally before merge.
  • ⚠️ One late follow-up commit (85e464ed) was pushed to the remote branch after the PR had already merged; it was not included in the merged PR. I did not take further remote cleanup action because deleting a remote branch would be destructive without explicit instruction.
  • ✅ No further changes are needed for the merged PR itself.

Overall: PR #457 was successfully driven through conflict resolution, local verification, CI monitoring, and merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants