Skip to content

[codex] Accept ACP user messages during async turns#3376

Open
neubig wants to merge 27 commits into
mainfrom
codex/acp-live-message-deltas
Open

[codex] Accept ACP user messages during async turns#3376
neubig wants to merge 27 commits into
mainfrom
codex/acp-live-message-deltas

Conversation

@neubig
Copy link
Copy Markdown
Member

@neubig neubig commented May 24, 2026

Summary

  • include the condenser LLM usage-id/metrics fix from fix(sdk): assign condenser LLM usage id #3368 so condenser settings do not collide with the agent LLM in the agent server
  • stream ACP assistant text through transient StreamingDeltaEvents, including ACP providers that invoke token callbacks with plain string chunks
  • release the conversation state lock around long ACP astep() awaits so websocket/API user messages can be persisted immediately
  • wrap ACP async event callbacks with short state-lock acquires for emitted events, and send ACP session/cancel when an in-flight turn is interrupted
  • when a new user message arrives during a running ACP turn, interrupt the current prompt and restart so the new message is processed instead of staying stuck in "sending"

Regression Tests

  • test_acp_string_token_callback_publishes_delta fails on the prior implementation because ACP plain-string token callbacks raised before publishing any StreamingDeltaEvent
  • test_acp_arun_accepts_user_message_while_step_is_in_flight fails on main because send_message() waits behind the in-flight ACP prompt
  • server-level send-message tests cover interrupting and restarting ACP runs when a user message arrives during execution
  • tests/sdk/test_settings.py -k condenser covers the copied condenser LLM usage-id behavior from fix(sdk): assign condenser LLM usage id #3368

Validation

  • uv run pytest -q tests/sdk/test_settings.py -k condenser
  • uv run pytest -q tests/agent_server/test_event_streaming.py tests/agent_server/test_event_service.py::TestEventServiceSendMessage tests/sdk/agent/test_acp_agent.py::TestACPAgentAstep tests/sdk/conversation/local/test_conversation_send_message.py
  • uv run ruff check openhands-sdk/openhands/sdk/settings/model.py tests/sdk/test_settings.py openhands-agent-server/openhands/agent_server/event_service.py tests/agent_server/test_event_streaming.py

Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant Architectures Base Image Docs / Tags
java amd64, arm64 eclipse-temurin:17-jdk Link
python amd64, arm64 nikolaik/python-nodejs:python3.13-nodejs22-slim Link
golang amd64, arm64 golang:1.21-bookworm Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:c988567-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-c988567-python \
  ghcr.io/openhands/agent-server:c988567-python

All tags pushed for this build

ghcr.io/openhands/agent-server:c988567-golang-amd64
ghcr.io/openhands/agent-server:c988567337de340b5f55ce0c3443aff7327ca171-golang-amd64
ghcr.io/openhands/agent-server:codex-acp-live-message-deltas-golang-amd64
ghcr.io/openhands/agent-server:c988567-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:c988567-golang-arm64
ghcr.io/openhands/agent-server:c988567337de340b5f55ce0c3443aff7327ca171-golang-arm64
ghcr.io/openhands/agent-server:codex-acp-live-message-deltas-golang-arm64
ghcr.io/openhands/agent-server:c988567-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:c988567-java-amd64
ghcr.io/openhands/agent-server:c988567337de340b5f55ce0c3443aff7327ca171-java-amd64
ghcr.io/openhands/agent-server:codex-acp-live-message-deltas-java-amd64
ghcr.io/openhands/agent-server:c988567-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:c988567-java-arm64
ghcr.io/openhands/agent-server:c988567337de340b5f55ce0c3443aff7327ca171-java-arm64
ghcr.io/openhands/agent-server:codex-acp-live-message-deltas-java-arm64
ghcr.io/openhands/agent-server:c988567-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:c988567-python-amd64
ghcr.io/openhands/agent-server:c988567337de340b5f55ce0c3443aff7327ca171-python-amd64
ghcr.io/openhands/agent-server:codex-acp-live-message-deltas-python-amd64
ghcr.io/openhands/agent-server:c988567-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:c988567-python-arm64
ghcr.io/openhands/agent-server:c988567337de340b5f55ce0c3443aff7327ca171-python-arm64
ghcr.io/openhands/agent-server:codex-acp-live-message-deltas-python-arm64
ghcr.io/openhands/agent-server:c988567-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:c988567-golang
ghcr.io/openhands/agent-server:c988567337de340b5f55ce0c3443aff7327ca171-golang
ghcr.io/openhands/agent-server:codex-acp-live-message-deltas-golang
ghcr.io/openhands/agent-server:c988567-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:c988567-java
ghcr.io/openhands/agent-server:c988567337de340b5f55ce0c3443aff7327ca171-java
ghcr.io/openhands/agent-server:codex-acp-live-message-deltas-java
ghcr.io/openhands/agent-server:c988567-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:c988567-python
ghcr.io/openhands/agent-server:c988567337de340b5f55ce0c3443aff7327ca171-python
ghcr.io/openhands/agent-server:codex-acp-live-message-deltas-python
ghcr.io/openhands/agent-server:c988567-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

  • Each variant tag (e.g., c988567-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., c988567-python-amd64) are also available if needed

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 24, 2026

Python API breakage checks — ✅ PASSED

Result:PASSED

Action log

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 24, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Result:PASSED

Action log

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 24, 2026

Coverage

Coverage Report •
FileStmtsMissCoverMissing
openhands-agent-server/openhands/agent_server
   event_service.py52110879%96–97, 127, 130–131, 135–136, 143, 147, 153, 163–167, 170–173, 232, 253–254, 325, 376, 386, 410–411, 415, 423, 426, 461, 472, 479, 485, 539, 541, 545–547, 551, 560–561, 563, 567, 573, 575, 622, 652, 655, 712, 733, 917–920, 924, 933, 935–936, 940, 954–958, 960, 984, 989–992, 996–999, 1007–1010, 1016–1019, 1065–1066, 1068–1075, 1077–1078, 1087–1088, 1090–1091, 1098–1099, 1101–1102, 1122, 1128, 1134, 1143–1144
openhands-sdk/openhands/sdk/agent
   acp_agent.py8089088%358–360, 490–491, 524, 526, 530, 534, 542, 605–606, 611, 678, 833, 836–837, 854–855, 884, 889, 907, 917, 950–953, 1157–1160, 1164–1166, 1169–1173, 1175, 1328, 1342–1345, 1353, 1357, 1361–1362, 1368–1369, 1381–1382, 1385, 1394–1395, 1419, 1423–1425, 1429–1430, 1462, 1830, 1834, 1842–1844, 1882–1883, 1886, 1894–1896, 1898, 1900, 1904, 1907, 1916–1918, 1920, 1948–1950, 1961, 1965–1966, 2065–2066
openhands-sdk/openhands/sdk/conversation/impl
   local_conversation.py6065690%313, 318, 462, 508, 545, 561, 626, 775–776, 851–852, 855, 973, 984–987, 994–995, 998, 1004–1005, 1008, 1014, 1061, 1064, 1068–1069, 1073–1074, 1077, 1084, 1101, 1105, 1108, 1197, 1205, 1209–1211, 1218, 1239, 1324, 1329, 1439, 1441, 1445–1446, 1457–1458, 1483, 1678, 1682, 1752, 1759–1760
TOTAL28650666876% 

@neubig neubig marked this pull request as ready for review May 24, 2026 12:25
@neubig neubig requested a review from simonrosenberg May 24, 2026 12:25
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean approach to the live-streaming problem: reusing the existing StreamingDeltaEvent / _pub_sub path for ACP deltas is exactly right — transient chunks don't belong in state.events, and the new helper _publish_streaming_delta_from_thread correctly encapsulates the cross-thread scheduling so neither the LLM token path nor the ACP path has to repeat the run_coroutine_threadsafe / suppress boilerplate. Test coverage is solid (wiring, unwiring, and actual delta delivery are all exercised). Two items worth a look before merge are called out below.

This review was generated by an AI agent (OpenHands) on behalf of the repository owner via OpenHands Automation.

Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py Outdated
Comment thread openhands-agent-server/openhands/agent_server/event_service.py Outdated
@neubig neubig changed the title [codex] Stream ACP assistant message deltas [codex] Accept ACP user messages during async turns May 24, 2026
@neubig neubig force-pushed the codex/acp-live-message-deltas branch from dc3d0a3 to ca04e8c Compare May 24, 2026 13:51
Copy link
Copy Markdown
Member Author

neubig commented May 24, 2026

Merged main into this branch and resolved the conflict in EventService.send_message() by preserving the ACP interrupt behavior while keeping the newer rerun-request handling from main. Local verification: env -u GITHUB_TOKEN uv run pytest tests/agent_server/test_event_service.py tests/agent_server/test_event_streaming.py tests/sdk/agent/test_acp_agent.py tests/sdk/conversation/local/test_conversation_send_message.py -q → 289 passed.\n\n_This comment was generated by an AI agent (OpenHands) on behalf of neubig._

@neubig neubig added the review-this This label triggers a PR review by OpenHands label May 24, 2026 — with OpenHands AI
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Acceptable, but I found two ACP concurrency races that should be addressed before merge: one can replay a just-arrived user prompt, and one can let cancelled prompt updates leak into the next turn. Risk: 🟡 medium, since this changes conversation loop/cancellation behavior.

This review was generated by an AI agent (OpenHands) on behalf of the user.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26368622109

Comment thread openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py Outdated
Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py Outdated
Co-authored-by: openhands <openhands@all-hands.dev>
Copy link
Copy Markdown
Member Author

neubig commented May 24, 2026

Addressed the two ACP async concurrency review items in 6c89026 and verified locally with ruff plus the full EventService test file (83 passed). Ready for another automated review.\n\n_This comment was generated by an AI agent (OpenHands) on behalf of the user._

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
@neubig neubig removed the review-this This label triggers a PR review by OpenHands label May 24, 2026
@neubig neubig added the review-this This label triggers a PR review by OpenHands label May 24, 2026 — with OpenHands AI
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Acceptable direction, but I found a few ACP concurrency ordering gaps that should be addressed before merge. Risk: 🟡 medium because this changes async conversation loop/cancellation behavior.

This review was generated by an AI agent (OpenHands) on behalf of the user.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26369467316

Comment thread openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py
Comment thread openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py Outdated
Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py Outdated
Co-authored-by: openhands <openhands@all-hands.dev>
Copy link
Copy Markdown
Member Author

neubig commented May 24, 2026

Addressed the latest ACP concurrency review in c9bbb68: atomic async finalization, finish-gap queued-message reconciliation, and cancellation drain-before-failure ordering. Verified locally with ruff, pyright, and targeted ACP async tests.\n\n_This comment was generated by an AI agent (OpenHands) on behalf of the user._

@neubig neubig removed the review-this This label triggers a PR review by OpenHands label May 24, 2026
@neubig neubig added the review-this This label triggers a PR review by OpenHands label May 24, 2026 — with OpenHands AI
@neubig
Copy link
Copy Markdown
Member Author

neubig commented May 25, 2026

Added a follow-up fix in c3749ff after investigating conversation 1156d472-2173-4deb-bc07-ccd433337e92.

Root cause: after two rapid user interruptions, the ACP bridge timed out waiting for the cancelled prompt to drain. _restart_session_after_drain_timeout() then removed acp_session_id/acp_session_cwd from ConversationState.agent_state, so the restarted Codex ACP subprocess called new_session instead of load_session. That created a new Codex rollout with only the latest user message, losing the previous thread.

Fix: preserve the persisted ACP session id/cwd across restart-after-drain-timeout so the restarted subprocess attempts load_session first. The patch also tracks whether load_session actually succeeded, so stale acp_suffix_installed markers are cleared when we truly fall back to a replacement session.

Verification:

  • uv run pytest -q tests/sdk/agent/test_acp_agent.py -k "SessionIdPersistence or init_state_sets_installed_when_suffix_marker_persisted"
  • uv run pytest -q tests/sdk/agent/test_acp_agent.py
  • uv run ruff check openhands-sdk/openhands/sdk/agent/acp_agent.py tests/sdk/agent/test_acp_agent.py
  • uv run pre-commit run --files openhands-sdk/openhands/sdk/agent/acp_agent.py tests/sdk/agent/test_acp_agent.py

@neubig neubig requested a review from all-hands-bot May 25, 2026 02:02
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The iterative hardening across this PR (FIFO message tracking, shielded drain, deferred cancel-path restart, suffix-marker clearing on fallback) is solid work. Earlier rounds of review + fixes addressed the major concurrency races. Three remaining points below — two in production code, one in tests.

This review was generated by an AI agent (OpenHands) on behalf of the user via OpenHands Automation.

Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py Outdated
Comment thread openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py Outdated
Comment thread tests/sdk/conversation/local/test_conversation_send_message.py Outdated
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ QA Report: PASS

Verified ACP live message behavior through a real stdio ACP JSON-RPC harness; the PR achieves its stated goal with no QA issues found.

Does this PR achieve its stated goal?

Yes. I exercised the changed SDK and agent-server paths with a minimal ACP server instead of relying on unit tests: on the base commit, ACP send_message() waited behind an in-flight prompt and ACP string chunks did not produce streaming deltas; on the PR, the intervening user message persisted immediately, the queued ACP prompt ran, streaming deltas were published, and EventService sent session/cancel before restarting with the replacement prompt. The condenser settings path also constructs separate default and condenser LLM usage IDs on the PR.

Phase Result
Environment Setup uv sync --dev --frozen completed (Checked 235 packages in 2ms)
CI Status 🟡 Core build/test/pre-commit checks are green; OpenHands pr-review and qa-changes automation checks are still in progress
Functional Verification ✅ ACP queued messages, streaming deltas, interrupt/cancel restart, and condenser usage IDs verified
Functional Verification

Test 1: Local ACP conversation accepts a user message while a prompt is in flight

Step 1 — Reproduce baseline without the PR:
Checked out origin/main (2aa5256e2147a3252be8d1f96600f627ec27abbb) and ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_harness.py local_queue:

QA_RESULT={"cancel_events": [], "completed_within_250ms": false, "mode": "local_queue", "prompts": ["initial request", "intervening request"], "second_prompt_auto": true, "send_elapsed_if_prompt": null, "send_elapsed_total": 2.7864337420000425}

This confirms the old behavior: the intervening send_message() did not persist promptly; it waited ~2.8s for the in-flight ACP prompt/usage wait to complete.

Step 2 — Apply the PR's changes:
Checked out c3749ffa83e04f8f64ff102aa5529ef5c65e591c.

Step 3 — Re-run with the fix in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_harness.py local_queue:

QA_RESULT={"cancel_events": [], "completed_within_250ms": true, "mode": "local_queue", "prompts": ["initial request", "intervening request"], "second_prompt_auto": true, "send_elapsed_if_prompt": 0.0020447240000294187, "send_elapsed_total": 0.0020454750000453714}

This shows the PR fixes the core async-turn issue: the new user message persisted in ~2ms while the first ACP prompt was still active, and the second ACP prompt was processed automatically in FIFO order.

Test 2: ACP plain-string token callbacks publish streaming deltas

Step 1 — Reproduce baseline without the PR:
Checked out origin/main and ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_harness.py event_streaming:

QA_RESULT={"delta_contents": [], "delta_count": 0, "mode": "event_streaming", "prompts": ["stream this"]}

This confirms the previous agent-server streaming path did not surface ACP plain-string chunks as StreamingDeltaEvents.

Step 2 — Apply the PR's changes:
Checked out c3749ffa83e04f8f64ff102aa5529ef5c65e591c.

Step 3 — Re-run with the fix in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_harness.py event_streaming:

QA_RESULT={"delta_contents": ["delta for stream this"], "delta_count": 1, "mode": "event_streaming", "prompts": ["stream this"]}

This shows the ACP string chunk reached subscribers as a transient streaming delta.

Test 3: Agent-server interrupts an in-flight ACP turn and sends session/cancel

Step 1 — Reproduce baseline without the PR:
Checked out origin/main and ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_harness.py event_interrupt:

QA_RESULT={"cancel_count": 0, "mode": "event_interrupt", "prompts": ["initial long request", "replacement request"], "replacement_prompt_seen": true, "send_elapsed": 7.073722684000018}

This shows the old EventService path did not send session/cancel; the second send_message(run=True) waited for the first long ACP prompt and then for the replacement run.

Step 2 — Apply the PR's changes:
Checked out c3749ffa83e04f8f64ff102aa5529ef5c65e591c.

Step 3 — Re-run with the fix in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_harness.py event_interrupt:

QA_RESULT={"cancel_count": 1, "mode": "event_interrupt", "prompts": ["initial long request", "replacement request"], "replacement_prompt_seen": true, "send_elapsed": 2.009888352999951}

This confirms the PR sends ACP session/cancel during the interrupted turn and restarts against the replacement user prompt instead of waiting for the old long prompt to complete.

Test 4: Condenser LLM usage IDs stay separated

Step 1 — Establish baseline:
On origin/main, ran a real settings construction script using OpenHandsAgentSettings, CondenserSettings(enabled=True), and create_agent_from_settings(...):

Agent
agent_llm_usage_id= default
condenser= LLMSummarizingCondenser
all_llm_usage_ids= ['default', 'condenser']

Step 2 — Apply the PR's changes and re-run:
On c3749ffa83e04f8f64ff102aa5529ef5c65e591c, ran the same script:

Agent
agent_llm_usage_id= default
condenser= LLMSummarizingCondenser
all_llm_usage_ids= ['default', 'condenser']

This verifies the PR branch exposes distinct runtime usage IDs for agent and condenser LLMs; the current origin/main already has the same behavior, so this was a no-regression check for the copied fix.

Issues Found

None.


This review was created by an AI agent (OpenHands) on behalf of the user.

Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Acceptable direction, but I found a few ACP async queue/cancellation edge cases that should be addressed before merge. Risk: 🟡 medium because this changes agent/conversation loop behavior; a human maintainer should decide eval coverage before approval.

This review was generated by an AI agent (OpenHands) on behalf of the user.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26379406959

Comment thread openhands-agent-server/openhands/agent_server/event_service.py Outdated
Comment thread openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py Outdated
Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py Outdated
Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py Outdated
Comment thread tests/sdk/agent/test_acp_agent.py Outdated
@neubig neubig added review-this This label triggers a PR review by OpenHands and removed review-this This label triggers a PR review by OpenHands labels May 25, 2026
@neubig neubig added review-this This label triggers a PR review by OpenHands and removed review-this This label triggers a PR review by OpenHands labels May 25, 2026
@neubig neubig added review-this This label triggers a PR review by OpenHands and removed review-this This label triggers a PR review by OpenHands labels May 25, 2026
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Acceptable direction, but I found two ACP cancellation races that can strand a just-appended user message or desync local history from the ACP session. Risk: 🟡 medium because this changes conversation loop/cancellation behavior; please have a human maintainer decide eval coverage before merge.

This review was generated by an AI agent (OpenHands) on behalf of the user.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26411480756

Comment thread openhands-agent-server/openhands/agent_server/event_service.py Outdated
Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py Outdated
@neubig neubig added review-this This label triggers a PR review by OpenHands and removed review-this This label triggers a PR review by OpenHands labels May 25, 2026
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Taste Rating: Acceptable, but I found a few ACP async edge cases worth fixing before merge.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM — this changes ACP cancellation/restart and conversation loop behavior, so a human maintainer should decide eval coverage before approval.

This review was generated by an AI agent (OpenHands) on behalf of the user.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26412087756

Comment thread openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py
Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py Outdated
Comment thread openhands-agent-server/openhands/agent_server/event_service.py
@neubig neubig added review-this This label triggers a PR review by OpenHands and removed review-this This label triggers a PR review by OpenHands labels May 25, 2026
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Taste Rating: Acceptable, but I found a few remaining ACP async races to address. Risk: 🟡 MEDIUM because this changes ACP conversation-loop/cancellation behavior; a human maintainer should decide eval coverage before merge.

This review was generated by an AI agent (OpenHands) on behalf of the user.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26415333311

Comment thread openhands-agent-server/openhands/agent_server/event_service.py
Comment thread openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py
Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py Outdated
@neubig neubig added review-this This label triggers a PR review by OpenHands and removed review-this This label triggers a PR review by OpenHands labels May 25, 2026
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Acceptable direction, but I found a few remaining ACP async edge cases around restart/cancellation and cursor handling. Risk: 🟡 MEDIUM because this changes ACP conversation-loop/cancellation behavior; a human maintainer should decide eval coverage before approval.

This review was generated by an AI agent (OpenHands) on behalf of the user.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26416390628

interrupted_acp = True
await self.interrupt(internal_acp_rerun=True)
try:
await self.run()
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Important: After an internal ACP supersede, this coroutine awaits interrupt(internal_acp_rerun=True) and then always calls run(). If the user explicitly presses Stop/Pause while that internal interrupt is draining, that explicit call clears the rerun flags, but this suspended send_message() still resumes here and restarts from PAUSED. Please re-check an internal rerun generation/flag after interrupt() returns, or let _run_and_publish own the restart, so explicit user stop intent wins.

# within our short grace window; it does not prove the ACP server lost
# its persisted session. Preserve the session id so the restarted
# subprocess can load_session() and retain conversation memory.
self._restart_session_on_next_turn = False
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Important: This clears _restart_session_on_next_turn before the replacement session has actually initialized. If _cleanup()/init_state() fails, the exception escapes before the prompt try block, the agent is left cleaned up, _agent_ready remains true, and future runs won't retry the deferred restart. Clear the flag only after init_state() succeeds, or restore it on failure.

)
raise
if drain_result.completed and drain_result.error is not None:
self._emit_turn_error(drain_result.error, state, on_event)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Important: In the explicit cancellation path, a drained prompt error is surfaced as an ACP error before re-raising CancelledError. Normal Stop can therefore append ACP error/ConversationErrorEvent before arun() marks the conversation PAUSED; cancellation during retry backoff can also surface a stale retriable attempt because prompt_future still points at the handled failure. For user cancellation, close in-flight tool cards/quarantine the session as needed and re-raise without emitting a turn error.

event
for event in self._state.events
if isinstance(event, MessageEvent)
and event.source == "user"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Important: Stop-hook feedback is appended above as MessageEvent(source="environment", role="user"), but the ACP prompt queue filters only source == "user". After an ACP turn finishes and a stop hook returns should_stop=False, the cursor is already current, so this loop finds no prompt for the feedback and finishes anyway. Include the feedback event in the ACP queue, or otherwise send it to ACP, before allowing FINISHED.

user_messages[0] if user_messages else None
)
else:
last_prompt_index = next(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Important: If persisted acp_last_prompt_user_message_id no longer exists in user_messages (for example after condensation/truncation or a bad persisted state), last_prompt_index is None and no prompt is selected. The no-prompt branch then sees last_user_message_id != last_acp_prompt_user_message_id, sets RUNNING, and continues without incrementing iteration, causing a tight loop. Treat a missing cursor as a recoverable mismatch by resetting/repairing it or selecting a deterministic next prompt.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

review-this This label triggers a PR review by OpenHands

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants