fix: surface friendly error message when LLM API key is invalid #3413
erisfully wants to merge 6 commits into
When LLMAuthenticationError propagates up from the LLM call, the generic except-Exception handler was baking the raw litellm error string (e.g. the full AnthropicException JSON) into the ConversationErrorEvent detail field, which was shown verbatim as a toast in the UI. Add an explicit except LLMAuthenticationError clause before the catch-all in both run() and arun(), emitting a user-readable message instead. The raw exception is still re-raised as ConversationRunError so logs remain unaffected.
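The handler ordering described above can be sketched as follows. This is a minimal illustration, not the code in local_conversation.py: the three classes are local stand-ins that only borrow the SDK names, and run_step, step, and emit are hypothetical helpers introduced for the example.

```python
from dataclasses import dataclass


class LLMAuthenticationError(Exception):
    """Stand-in for the SDK exception of the same name."""


class ConversationRunError(Exception):
    """Stand-in for the wrapper error raised by run() and arun()."""


@dataclass
class ConversationErrorEvent:
    """Stand-in for the UI-facing error event."""

    code: str
    detail: str


FRIENDLY_AUTH_DETAIL = (
    "Your LLM API key appears to be invalid or has expired. "
    "Please update it in Settings."
)


def run_step(step, emit):
    """Run one step (hypothetical helper), emitting an event before re-raising."""
    try:
        return step()
    except LLMAuthenticationError as e:
        # The specific handler must come first, otherwise the catch-all wins.
        emit(ConversationErrorEvent(code=type(e).__name__, detail=FRIENDLY_AUTH_DETAIL))
        # Chain the original exception so logs still show the raw provider error.
        raise ConversationRunError(str(e)) from e
    except Exception as e:
        # Every other error keeps the previous behavior: detail=str(e).
        emit(ConversationErrorEvent(code=type(e).__name__, detail=str(e)))
        raise ConversationRunError(str(e)) from e
```

Because Python tries except clauses top to bottom, placing the LLMAuthenticationError clause after except Exception would make it unreachable, which is why the ordering matters here.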
Python API breakage checks: ✅ PASSED
REST API breakage checks (OpenAPI): ✅ PASSED
Four tests covering both the sync (run) and async (arun) paths:
- ConversationRunError is still raised (logs unaffected)
- ConversationErrorEvent.detail contains a friendly message
- ConversationErrorEvent.detail does NOT contain the raw litellm string
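A condensed sketch of how those three assertions might be applied to both entry points. It is not the PR's test file: FakeConversation, check_entry_point, and the two test names are hypothetical, the exception classes are local stand-ins, and the real tests presumably drive the actual SDK conversation rather than a fake.

```python
import asyncio

RAW = "litellm.AuthenticationError: AnthropicException - invalid x-api-key"
FRIENDLY = (
    "Your LLM API key appears to be invalid or has expired. "
    "Please update it in Settings."
)


class LLMAuthenticationError(Exception):
    """Stand-in for the SDK exception of the same name."""


class ConversationRunError(Exception):
    """Stand-in for the wrapper error raised by run() and arun()."""


class FakeConversation:
    """Hypothetical conversation whose every run fails with an auth error."""

    def __init__(self):
        self.error_details = []  # detail strings of emitted error events

    def run(self):
        try:
            raise LLMAuthenticationError(RAW)
        except LLMAuthenticationError as e:
            self.error_details.append(FRIENDLY)
            raise ConversationRunError(RAW) from e

    async def arun(self):
        # In this sketch the async path simply mirrors the sync one.
        self.run()


def check_entry_point(call):
    """Apply the three assertions to one entry point and return the detail."""
    conv = FakeConversation()
    try:
        call(conv)
    except ConversationRunError as err:
        assert isinstance(err.__cause__, LLMAuthenticationError)
    else:
        raise AssertionError("ConversationRunError was not raised")
    detail = conv.error_details[-1]
    assert "API key" in detail
    assert "litellm" not in detail and "AnthropicException" not in detail
    return detail


def test_run_hides_raw_error():
    check_entry_point(lambda conv: conv.run())


def test_arun_hides_raw_error():
    check_entry_point(lambda conv: asyncio.run(conv.arun()))
```

Sharing one checker between the sync and async tests keeps the two paths held to the same contract, so a regression in only one of them shows up as a single failing test.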
all-hands-bot
left a comment
⚠️ QA Report: PASS WITH ISSUES
The invalid/expired LLM API key path was exercised through the SDK with a real Anthropic invalid-key response; the PR fixes the UI-facing detail in both sync and async paths, but CI currently has a failing pre-commit check.
Does this PR achieve its stated goal?
Yes. On origin/main, the same invalid-key conversation emitted ConversationErrorEvent.detail containing the raw litellm.AuthenticationError: AnthropicException ... invalid x-api-key for both run() and arun(). On PR commit 4e851294b0ac38b4844305b1c28c9a179eab370f, both paths emitted "Your LLM API key appears to be invalid or has expired. Please update it in Settings." while still raising ConversationRunError with LLMAuthenticationError as the cause.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully; no tests, linters, or pre-commit hooks were run locally. |
| CI Status | 1 failing (Pre-commit checks/pre-commit), 8 pending, and 14 skipped checks. |
| Functional Verification | ✅ Real SDK Conversation.run() and Conversation.arun() calls with a bogus Anthropic key produced the expected before/after behavior. |
Functional Verification
Test 1: Invalid LLM API key surfaces a friendly UI-facing conversation error
Step 1 — Reproduce / establish baseline without the fix:
I checked out origin/main at c6347949 and ran a temporary SDK script that creates LLM(model="anthropic/claude-3-haiku-20240307", api_key="qa-invalid-key-not-secret"), builds an Agent and Conversation, sends the message "Say hello once.", then executes both conv.run() and conv.arun().
Ran git fetch origin main && git checkout --detach origin/main && uv run python /tmp/qa_llm_auth_behavior.py 2>&1 | tee /tmp/qa_llm_auth_base.log, then extracted the behavior lines:
--- sync run() ---
raised=ConversationRunError
cause=LLMAuthenticationError
cause_is_llm_auth=True
error_event_count=1
event_code=LLMAuthenticationError
event_detail=litellm.AuthenticationError: AnthropicException - {"type":"error","error":{"type":"authentication_error","message":"invalid x-api-key"},"request_id":"req_011CbTwhhoRE8xeYhptJHHEE"}
detail_contains_raw_marker=True
--- async arun() ---
raised=ConversationRunError
cause=LLMAuthenticationError
cause_is_llm_auth=True
error_event_count=1
event_code=LLMAuthenticationError
event_detail=litellm.AuthenticationError: AnthropicException - {"type":"error","error":{"type":"authentication_error","message":"invalid x-api-key"},"request_id":"req_011CbTwhicXeyr9GYuRAPAsq"}
detail_contains_raw_marker=True
This confirms the reported bug exists on main: the UI-facing ConversationErrorEvent.detail includes raw provider/litellm authentication text, including AnthropicException and invalid x-api-key.
Step 2 — Apply the PR's changes:
I checked out the PR branch at 4e851294b0ac38b4844305b1c28c9a179eab370f.
Step 3 — Re-run with the fix in place:
Ran git checkout fix/llm-auth-error-friendly-message-3411 && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_llm_auth_behavior.py 2>&1 | tee /tmp/qa_llm_auth_pr.log, then extracted the behavior lines:
--- sync run() ---
raised=ConversationRunError
cause=LLMAuthenticationError
cause_is_llm_auth=True
error_event_count=1
event_code=LLMAuthenticationError
event_detail=Your LLM API key appears to be invalid or has expired. Please update it in Settings.
detail_contains_raw_marker=False
--- async arun() ---
raised=ConversationRunError
cause=LLMAuthenticationError
cause_is_llm_auth=True
error_event_count=1
event_code=LLMAuthenticationError
event_detail=Your LLM API key appears to be invalid or has expired. Please update it in Settings.
detail_contains_raw_marker=False
This shows the fix works in both user-facing entry points: the raw provider error is no longer placed in ConversationErrorEvent.detail, the friendly actionable message is present, and the raised exception chain remains intact for logs/debugging.
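Behavior lines in the key=value format shown above lend themselves to a small parser, so the before/after comparison can be checked mechanically. The sketch below is an assumption about how such a check could look; parse_behavior_lines is a hypothetical helper, not part of the QA script, and the sample log is copied from the PR-branch output above.

```python
def parse_behavior_lines(text):
    """Split key=value lines into one dict per '--- name ---' section."""
    sections = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("---") and line.endswith("---"):
            # "--- sync run() ---" becomes the section name "sync run()".
            current = line.strip("- ")
            sections[current] = {}
        elif current is not None and "=" in line:
            # Split on the first "=" only, so values may contain "=" too.
            key, _, value = line.partition("=")
            sections[current][key] = value
    return sections


PR_LOG = """\
--- sync run() ---
raised=ConversationRunError
cause=LLMAuthenticationError
event_detail=Your LLM API key appears to be invalid or has expired. Please update it in Settings.
detail_contains_raw_marker=False
--- async arun() ---
raised=ConversationRunError
cause=LLMAuthenticationError
event_detail=Your LLM API key appears to be invalid or has expired. Please update it in Settings.
detail_contains_raw_marker=False
"""

parsed = parse_behavior_lines(PR_LOG)
for name, fields in parsed.items():
    # Both entry points must still raise and must not leak the raw marker.
    assert fields["raised"] == "ConversationRunError", name
    assert fields["detail_contains_raw_marker"] == "False", name
```

Values stay as strings (for example "False"), since the log format carries no type information.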
Issues Found
- 🟠 Issue: CI is not green at review time: Pre-commit checks/pre-commit is failing, and 8 checks are still pending. Functional QA found no behavior issue with the PR goal, and I did not rerun CI-owned tests/linters locally per QA instructions.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
⚠️ QA Report: PASS WITH ISSUES
Functionally, the PR does achieve its invalid/expired LLM API-key goal for both sync and async SDK conversation runs; GitHub currently reports one failing CI check.
Does this PR achieve its stated goal?
Yes. I exercised the SDK as a user would by creating a real LLM with an intentionally invalid Anthropic API key, sending a message through Conversation, and running both run() and arun(). On main, the UI-facing ConversationErrorEvent.detail contained the raw litellm.AuthenticationError: AnthropicException ... invalid x-api-key provider blob; on commit 4e851294b0ac38b4844305b1c28c9a179eab370f, the same flow emitted exactly "Your LLM API key appears to be invalid or has expired. Please update it in Settings." while still raising ConversationRunError caused by LLMAuthenticationError.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully with uv sync --dev. |
| CI Status | gh pr checks showed 32 passing, 1 pending (qa-changes), 16 skipped, and 1 failing (Pre-commit checks). |
| Functional Verification | ✅ Real SDK conversations with an invalid Anthropic key confirmed the before/after behavior in both sync and async paths. |
Functional Verification
Test 1: Invalid LLM API key surfaces friendly UI-facing error detail
Step 1 — Reproduce / establish baseline without the fix:
Checked out origin/main (c6347949c4dacbdf9db364fc902e2be216599747) and ran a temporary SDK script that:
- constructs LLM(model="anthropic/claude-3-5-haiku-20241022", api_key="invalid-openhands-qa-key")
- creates Agent(tools=[]) and Conversation(...)
- sends "Say hello in one short sentence."
- calls both conv.run() and await conv.arun()
- prints the observed ConversationErrorEvent details and raised exception cause
Ran:
git checkout --detach origin/main
uv run python /tmp/qa_llm_auth_check.py > /tmp/qa_base_stdout.txt 2> /tmp/qa_base_stderr.txt
tail -80 /tmp/qa_base_stdout.txt
Observed excerpt:
{
"sync": {
"raised": {
"type": "ConversationRunError",
"cause_type": "LLMAuthenticationError",
"cause_contains_anthropic": true
},
"execution_status": "ConversationExecutionStatus.ERROR",
"error_count": 1,
"last_error_code": "LLMAuthenticationError",
"last_error_detail": "litellm.AuthenticationError: AnthropicException - {"type":"error","error":{"type":"authentication_error","message":"invalid x-api-key"},"request_id":"req_011CbTx8teE3rm3o6aTGv9v6"}",
"friendly_exact_match": false,
"raw_provider_fragment_present": true
},
"async": {
"raised": {
"type": "ConversationRunError",
"cause_type": "LLMAuthenticationError",
"cause_contains_anthropic": true
},
"last_error_code": "LLMAuthenticationError",
"last_error_detail": "litellm.AuthenticationError: AnthropicException - {"type":"error","error":{"type":"authentication_error","message":"invalid x-api-key"},"request_id":"req_011CbTx8uBy8WtTtidqNEAMW"}",
"friendly_exact_match": false,
"raw_provider_fragment_present": true
}
}
This confirms the reported bug existed on the base branch: both sync and async conversation runs exposed the raw provider/litellm authentication blob in the event detail that the UI consumes.
Step 2 — Apply the PR's changes:
Checked out the PR commit:
git checkout --detach 4e851294b0ac38b4844305b1c28c9a179eab370f
Step 3 — Re-run with the fix in place:
Ran the same SDK script:
uv run python /tmp/qa_llm_auth_check.py > /tmp/qa_pr_stdout.txt 2> /tmp/qa_pr_stderr.txt
tail -80 /tmp/qa_pr_stdout.txt
Observed excerpt:
{
"sync": {
"raised": {
"type": "ConversationRunError",
"cause_type": "LLMAuthenticationError",
"cause_contains_anthropic": true
},
"execution_status": "ConversationExecutionStatus.ERROR",
"error_count": 1,
"last_error_code": "LLMAuthenticationError",
"last_error_detail": "Your LLM API key appears to be invalid or has expired. Please update it in Settings.",
"friendly_exact_match": true,
"raw_provider_fragment_present": false
},
"async": {
"raised": {
"type": "ConversationRunError",
"cause_type": "LLMAuthenticationError",
"cause_contains_anthropic": true
},
"execution_status": "ConversationExecutionStatus.ERROR",
"error_count": 1,
"last_error_code": "LLMAuthenticationError",
"last_error_detail": "Your LLM API key appears to be invalid or has expired. Please update it in Settings.",
"friendly_exact_match": true,
"raw_provider_fragment_present": false
}
}
This confirms the fix works end-to-end through the SDK conversation path: the UI-facing event detail is friendly and actionable, while the raised ConversationRunError still preserves LLMAuthenticationError as its cause for logging/debugging.
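The friendly_exact_match and raw_provider_fragment_present fields in the reports above could be derived from the event detail roughly as follows. This is a sketch under assumptions: the QA script itself is not shown in this thread, so summarize_detail and the RAW_MARKERS list are hypothetical, chosen from the fragments the reports mention.

```python
import json

FRIENDLY = (
    "Your LLM API key appears to be invalid or has expired. "
    "Please update it in Settings."
)

# Assumed markers of a leaked provider error; the real script's list is unknown.
RAW_MARKERS = ("litellm.AuthenticationError", "AnthropicException", "invalid x-api-key")


def summarize_detail(detail):
    """Build the detail-related report fields for one observed error event."""
    return {
        "last_error_detail": detail,
        "friendly_exact_match": detail == FRIENDLY,
        "raw_provider_fragment_present": any(m in detail for m in RAW_MARKERS),
    }


report = {
    "before": summarize_detail(
        'litellm.AuthenticationError: AnthropicException - {"type":"error"}'
    ),
    "after": summarize_detail(FRIENDLY),
}
# json.dumps escapes the inner quotes, so the printed report is valid JSON.
print(json.dumps(report, indent=2))
```

Checking for an exact match as well as for raw fragments guards against a partial fix, such as a friendly prefix with the provider blob still appended.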
Issues Found
- 🟠 Issue: GitHub currently reports Pre-commit checks as failing. I did not rerun or diagnose it because this QA pass was explicitly scoped away from tests/linters/formatters, but the PR should have green CI before merge.
This QA review was created by an AI agent (OpenHands) on behalf of the user.
Fixes #3411
Problem
When a user has an invalid or expired LLM API key, LLMAuthenticationError propagated all the way up to the generic except Exception handler in both run() and arun(). That handler used detail=str(e), which baked the raw litellm error string (e.g. the full AnthropicException JSON blob) directly into the ConversationErrorEvent, which was then shown verbatim as a toast in the UI.
The result: every message the user sent failed with an opaque, provider-internal error string and no hint that their API key was the problem.
Fix
Add an explicit except LLMAuthenticationError clause before the catch-all except Exception in both run() and arun() in local_conversation.py. The new handler emits a ConversationErrorEvent with a clear, actionable detail message:
Your LLM API key appears to be invalid or has expired. Please update it in Settings.
The raw exception is still re-raised as ConversationRunError so server logs are unaffected.
Changes
openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py
- Import LLMAuthenticationError from openhands.sdk.llm.exceptions
- Add except LLMAuthenticationError handler in run() (sync path)
- Add except LLMAuthenticationError handler in arun() (async path)
Before / After
Before: the toast shows the raw litellm error.
After: the toast shows the actionable message.
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
- eclipse-temurin:17-jdk
- nikolaik/python-nodejs:python3.13-nodejs22-slim
- golang:1.21-bookworm
Pull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:e0fd81b-python
Run
All tags pushed for this build
About Multi-Architecture Support
- Each variant tag (e.g. e0fd81b-python) is a multi-arch manifest supporting both amd64 and arm64
- Architecture-specific tags (e.g. e0fd81b-python-amd64) are also available if needed