feat: add Conversation.fork() as a first-class SDK primitive #2841
Add fork() to BaseConversation (abstract), LocalConversation, and
RemoteConversation. The method deep-copies the event log, agent config,
workspace metadata, and runtime state into a new conversation with its
own ID and persistence directory. By default metrics start fresh on the
fork; set reset_metrics=False to carry them over.
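The copy-and-reset semantics described above can be sketched with a toy model. This is a minimal illustration, not the SDK implementation: `ToyConversation` and its fields are hypothetical stand-ins, and only the `conversation_id` and `reset_metrics` parameter names come from the PR.

```python
import copy
import uuid
from dataclasses import dataclass, field


@dataclass
class ToyConversation:
    """Hypothetical stand-in for a conversation; not the SDK class."""

    events: list[dict] = field(default_factory=list)
    agent_state: dict = field(default_factory=dict)
    accumulated_cost: float = 0.0
    id: uuid.UUID = field(default_factory=uuid.uuid4)

    def fork(self, *, conversation_id=None, reset_metrics=True):
        # Deep-copy mutable state so the fork cannot alias the source.
        return ToyConversation(
            events=copy.deepcopy(self.events),
            agent_state=copy.deepcopy(self.agent_state),
            accumulated_cost=0.0 if reset_metrics else self.accumulated_cost,
            id=conversation_id or uuid.uuid4(),
        )


source = ToyConversation(
    events=[{"id": "msg-1"}, {"id": "msg-2"}],
    agent_state={"current_step": "analysis"},
    accumulated_cost=0.05,
)
fresh = source.fork()                        # metrics start at zero
carried = source.fork(reset_metrics=False)   # metrics carried over
fresh.events.append({"id": "fork-msg-1"})    # only the fork grows
```

In this sketch the fork gets a new ID unless one is passed explicitly, and appending to the fork leaves the source's event list at its original length.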
Expose the primitive through the agent-server REST API as
POST /api/conversations/{id}/fork with an optional
ForkConversationRequest body (id, title, tags, reset_metrics).
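A request to this endpoint might be built as follows. This is a sketch using only the standard library: the base URL, port, and conversation ID are placeholder assumptions, the helper `build_fork_request` is hypothetical, and a real deployment may require authentication headers that are not shown here. The path and body field names are taken from the description above.

```python
import json
import urllib.request

# Assumption: placeholder server address and conversation ID.
BASE_URL = "http://localhost:8000"
SOURCE_ID = "ce82ec46-f25e-4af0-a9e1-38e68195c89b"


def build_fork_request(base_url, conversation_id, **body):
    """Build (but do not send) a POST request for the fork endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/api/conversations/{conversation_id}/fork",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_fork_request(
    BASE_URL,
    SOURCE_ID,
    title="Human follow-up",
    tags={"type": "manual"},
    reset_metrics=False,
)
# Sending it would be: urllib.request.urlopen(request)
```

All body fields are optional, so calling the helper with no keyword arguments produces an empty JSON object.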
Closes #2840
Co-authored-by: openhands <openhands@all-hands.dev>
Python API breakage checks: ✅ PASSED
REST API breakage checks (OpenAPI): ✅ PASSED
Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
🟡 Acceptable - Solves a real problem with a pragmatic approach, but has correctness and API consistency issues that should be addressed.
[RISK ASSESSMENT]
New feature with limited blast radius, but potential for data corruption via shared mutable objects and API surface inconsistency between local/remote implementations that could confuse users.
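The shared-mutable-object risk can be illustrated with plain Python containers. This is a generic sketch of shallow versus deep copying, not code from the PR; the event dictionaries are made up for the example.

```python
import copy

source_events = [{"id": "msg-1", "content": ["hello"]}]

# Shallow copy: a new list, but the event dicts are shared with the source.
shallow_fork = list(source_events)
shallow_fork[0]["content"].append("edited in fork")
leaked = "edited in fork" in source_events[0]["content"]  # mutation reached the source

# Deep copy: every nested object is duplicated, so the source stays intact.
source_events = [{"id": "msg-1", "content": ["hello"]}]
deep_fork = copy.deepcopy(source_events)
deep_fork[0]["content"].append("edited in fork")
isolated = source_events[0]["content"] == ["hello"]
```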
- Deep-copy events via model_copy(deep=True) so source stays immutable
- Deep-copy agent_state via copy.deepcopy for mutable values
- RemoteConversation.fork() now raises NotImplementedError when agent is passed (server doesn't support agent replacement yet)
Co-authored-by: openhands <openhands@all-hands.dev>
Addressed all review feedback in 71b9751. Ready for another look.
all-hands-bot
left a comment
✅ QA Report: PASS
Successfully verified Conversation.fork() creates independent conversation copies with full event history preservation and configurable metric handling.
Does this PR achieve its stated goal?
Yes. The PR set out to add fork() as a first-class SDK primitive for branching off conversations while preserving full event history, and it delivers on all claims:
- ✅ Abstract method added: fork() is now part of the BaseConversation contract
- ✅ LocalConversation implementation: Deep-copies events, agent config (via serialization to avoid thread-lock issues), workspace metadata, and runtime state (agent_state, activated_knowledge_skills)
- ✅ RemoteConversation implementation: Delegates to server API and returns new RemoteConversation instance
- ✅ REST API endpoint: POST /api/conversations/{id}/fork with ForkConversationRequest body
- ✅ Comprehensive testing: 12 unit tests covering all fork behaviors
- ✅ Functional verification: End-to-end testing confirms all claimed features work correctly
The implementation enables downstream apps to branch off system-generated conversations (CI runs, scheduled pipelines) for human follow-up exploration without contaminating the original audit trail — exactly the use case described in the PR.
| Phase | Result |
|---|---|
| Environment Setup | ✅ uv sync completed, 12 fork tests available |
| CI & Tests | ✅ sdk-tests: SUCCESS, tools-tests: SUCCESS, pre-commit: SUCCESS, 12/12 fork tests pass |
| Functional Verification | ✅ All claimed fork behaviors verified end-to-end |
Functional Verification
Test 1: Baseline — fork() does not exist on main branch
Before (main branch):
$ git show origin/main:openhands-sdk/openhands/sdk/conversation/base.py | grep "def fork"
(exit code 1: method does not exist)
This confirms fork() is a new feature, not a modification of existing behavior.
After (PR branch):
$ git show HEAD:openhands-sdk/openhands/sdk/conversation/base.py | grep -A 3 "def fork"
def fork(
    self,
    *,
    conversation_id: ConversationID | None = None,
✅ fork() is now an abstract method on BaseConversation.
Test 2: Unit tests for LocalConversation.fork()
Command:
$ uv run pytest tests/sdk/conversation/local/test_fork.py -v
Output:
tests/sdk/conversation/local/test_fork.py::test_fork_creates_new_id PASSED [ 8%]
tests/sdk/conversation/local/test_fork.py::test_fork_with_explicit_id PASSED [ 16%]
tests/sdk/conversation/local/test_fork.py::test_fork_copies_events PASSED [ 25%]
tests/sdk/conversation/local/test_fork.py::test_fork_source_unmodified PASSED [ 33%]
tests/sdk/conversation/local/test_fork.py::test_fork_execution_status_is_idle PASSED [ 41%]
tests/sdk/conversation/local/test_fork.py::test_fork_resets_metrics_by_default PASSED [ 50%]
tests/sdk/conversation/local/test_fork.py::test_fork_preserves_metrics_when_requested PASSED [ 58%]
tests/sdk/conversation/local/test_fork.py::test_fork_copies_agent_state PASSED [ 66%]
tests/sdk/conversation/local/test_fork.py::test_fork_accepts_replacement_agent PASSED [ 75%]
tests/sdk/conversation/local/test_fork.py::test_fork_with_tags PASSED [ 83%]
tests/sdk/conversation/local/test_fork.py::test_fork_with_title_sets_tag PASSED [ 91%]
tests/sdk/conversation/local/test_fork.py::test_fork_shares_workspace PASSED [100%]
12 passed in 0.09s
✅ All 12 unit tests pass, covering:
- Auto-generated and explicit fork IDs
- Event copying
- Source immutability
- Execution status reset to idle
- Metric reset and preservation
- Agent state copying
- Agent replacement
- Tags and title
- Workspace sharing
Test 3: End-to-end functional verification
Created a realistic scenario with a conversation containing:
- 2 message events (user + assistant)
- Agent state: {"task_progress": 0.5, "current_step": "analysis"}
- Metrics: $0.05 accumulated cost, 100 prompt tokens, 50 completion tokens
Test 3a: Basic fork with auto-generated ID
✓ Forked conversation: 3e1485d2-5a1e-4516-b7b4-c2f379461371
Source ID: ce82ec46-f25e-4af0-a9e1-38e68195c89b
Fork ID: 3e1485d2-5a1e-4516-b7b4-c2f379461371
✓ IDs are different
✓ Events copied: 2
✓ Agent state copied: {'task_progress': 0.5, 'current_step': 'analysis'}
✓ Metrics reset: $0.0
✓ Execution status: ConversationExecutionStatus.IDLE
✅ Fork created with unique ID, events and agent state copied, metrics reset by default, execution status is idle.
Test 3b: Fork with metrics preservation
✓ Fork with preserved metrics: 53041ca9-d814-4998-92c5-5071cf27b5e6
Source cost: $0.05
Fork cost: $0.05
✓ Metrics preserved
✅ When reset_metrics=False, cost metrics are preserved in the fork.
Test 3c: Fork with custom ID, title, and tags
✓ Fork with custom params: b3f38780-2509-4311-8c2b-9f1384ac510d
✓ Custom ID: b3f38780-2509-4311-8c2b-9f1384ac510d
✓ Title tag: Forked Investigation
✓ Custom tags: forked=true, priority=high
✅ Custom ID, title, and tags are correctly applied to the fork.
Test 3d: Fork with replacement agent
✓ Fork with replacement agent: abfbed38-63f5-4941-8089-16ba59e44bb8
Source model: gpt-4o-mini
Fork model: gpt-4o
✓ Agent replaced
✅ Fork can use a different agent while preserving events from the source.
Test 3e: Isolation — fork modifications don't affect source
✓ Modified fork: c8f63514-ff95-46f6-be67-01f2c48ee753
Source events: 2
Fork events: 3
✓ Source events unchanged
Source agent_state: {'task_progress': 0.5, 'current_step': 'analysis'}
Fork agent_state: {'task_progress': 0.5, 'current_step': 'analysis', 'new_key': 'new_value'}
✓ Source agent state unchanged
✅ Adding events to the fork does not modify the source. Agent state modifications in the fork do not affect the source. This confirms true isolation.
Test 3f: Workspace sharing
✓ Fork workspace check: af00852c-e278-4d90-b629-240c389292e1
Source workspace: /tmp/tmp8b9ocsje
Fork workspace: /tmp/tmp8b9ocsje
✓ Workspace shared
✅ Fork shares the same workspace directory as the source (as designed).
Test 3g: Independent persistence directories
✓ Persistence paths:
Source: /tmp/tmp8b9ocsje/ce82ec46f25e4af0a9e138e68195c89b
Fork: /tmp/tmp8b9ocsje/7c5eb130b5274ad28519ecec61addfae/7c5eb130b5274ad28519ecec61addfae
✓ Fork has independent persistence directory
✓ Fork persistence is in same workspace
✅ Fork has its own persistence directory (for independent state storage) while remaining in the same workspace.
Test 4: Implementation details verification
Agent deep-copy uses serialization round-trip:
# From local_conversation.py lines 343-348
agent_cls = type(self.agent)
fork_agent = agent_cls.model_validate(
    self.agent.model_dump(context={"expose_secrets": True}),
)
✅ Implementation matches PR claim: uses model_validate(model_dump(...)) instead of model_copy(deep=True) to avoid TypeError: cannot pickle '_thread.lock' object.
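Why a serialization round-trip sidesteps the lock problem can be shown without Pydantic. The sketch below uses a hypothetical `ToyAgent` class with hand-written `dump` and `validate` methods standing in for `model_dump` and `model_validate`; it is an analogy under those assumptions, not the SDK's agent class.

```python
import copy
import threading


class ToyAgent:
    """Hypothetical stand-in for an agent holding a private thread lock."""

    def __init__(self, model):
        self.model = model
        self._lock = threading.Lock()  # locks cannot be deep-copied

    def dump(self):
        # Serialize only the declared configuration, not private runtime state.
        return {"model": self.model}

    @classmethod
    def validate(cls, data):
        # Rebuild a fresh instance (with its own new lock) from plain data.
        return cls(**data)


agent = ToyAgent(model="gpt-4o-mini")

try:
    copy.deepcopy(agent)
    deepcopy_failed = False
except TypeError:
    deepcopy_failed = True  # deepcopy falls back to pickling the lock, which fails

# Round-trip through plain data: the lock never enters the copy path.
fork_agent = type(agent).validate(agent.dump())
```

The round-trip copies only what the dump exposes, so any private runtime state is rebuilt from scratch rather than duplicated.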
Test 5: CI validation
SDK tests:
sdk-tests: COMPLETED - SUCCESS
✅ All SDK tests pass, including the 12 new fork tests.
Other relevant checks:
tools-tests: COMPLETED - SUCCESS
pre-commit: COMPLETED - SUCCESS
Python API: COMPLETED - SUCCESS
REST API (OpenAPI): COMPLETED - SUCCESS
build-binary-and-test (ubuntu-latest): COMPLETED - SUCCESS
✅ No regressions detected in related test suites.
Issues Found
None.
Conclusion: This PR successfully adds Conversation.fork() as a first-class SDK primitive with all claimed functionality working correctly. The implementation is clean, well-tested, and ready for merge.
all-hands-bot
left a comment
🟡 Acceptable - Solid implementation of a useful feature, but has a correctness bug in the remote fork path and missing test coverage.
[RISK ASSESSMENT]
New feature with limited blast radius, but has a data inconsistency bug in RemoteConversation.fork() and lacks test coverage for remote fork and server endpoint. Not in the eval-risk category since it's a pure addition that doesn't affect existing conversation behavior.
all-hands-bot
left a comment
✅ QA Report: PASS
Successfully verified conversation fork functionality works end-to-end. The PR delivers on its promise to enable branching conversations for human follow-up without contaminating the original audit trail.
Does this PR achieve its stated goal?
Yes. The PR successfully adds Conversation.fork() as a first-class SDK primitive that enables downstream apps to branch off system-generated conversations (CI runs, scheduled pipelines, agent-on-agent tasks) for human exploration while preserving the original audit trail intact. Testing confirms events, agent state, and runtime state are deep-copied, the fork operates independently, and metrics can be reset or preserved as needed.
| Phase | Result |
|---|---|
| Environment Setup | ✅ Dependencies installed, project builds successfully |
| CI & Tests | ✅ 19/19 core checks pass, 12/12 fork unit tests pass |
| Functional Verification | ✅ Fork creates independent conversations, deep-copy works, REST API functional |
Functional Verification
Test 1: LocalConversation.fork() — Core Behavior
Setup — Create source conversation with history:
src = Conversation(agent=agent, persistence_dir=tmpdir, workspace=tmpdir)
src.state.events.append(MessageEvent(id="msg-1", ...))
src.state.events.append(MessageEvent(id="msg-2", ...))
src._state.agent_state = {"deployment_id": "deploy-123", "status": "success"}
src._state.stats.usage_to_metrics["test"] = Metrics(accumulated_cost=2.5)
Source conversation created with:
- 2 events
- Agent state: {deployment_id: deploy-123, status: success}
- Cost: $2.50
Action — Fork with reset_metrics=True (default):
fork1 = src.fork(
    title="Human follow-up on deploy-123",
    tags={"type": "manual", "parent": "ci-run"},
)
Result:
Fork conversation ID: 84d8da92-d079-401a-b666-54afae00592a (different from source)
Events: 2 (copied from source)
Agent state: {deployment_id: deploy-123, status: success} (deep-copied)
Cost: $0.0 (reset as expected)
Tags: {type: manual, parent: ci-run, title: Human follow-up on deploy-123}
✅ Fork created with new ID, events copied, metrics reset, title and tags applied.
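The tag result above, where the title ends up as a `title` entry alongside the caller's tags, is consistent with a merge like the one sketched here. The helper `merge_fork_tags` is hypothetical; the PR text does not show the actual merge code, so this only mirrors the observed output.

```python
def merge_fork_tags(tags=None, title=None):
    """Hypothetical helper: return fork tags with the title stored under "title"."""
    merged = dict(tags or {})  # copy so the caller's dict is not mutated
    if title is not None:
        merged["title"] = title
    return merged


requested = {"type": "manual", "parent": "ci-run"}
fork_tags = merge_fork_tags(requested, title="Human follow-up on deploy-123")
```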
Test 2: Fork Independence — Mutation Isolation
Action — Modify fork without affecting source:
fork1.state.events.append(MessageEvent(id="fork-msg-1", ...))
fork1._state.agent_state["investigation"] = "active"
Result:
Source events: 2 (unchanged)
Fork events: 3 (added 1)
Source agent state: {deployment_id: deploy-123, status: success}
Fork agent state: {deployment_id: deploy-123, status: success, investigation: active}
✅ Fork modifications do NOT affect source — deep-copy works correctly.
Test 3: Metrics Preservation with reset_metrics=False
Action — Fork with metrics preserved:
fork2 = src.fork(title="Another fork with metrics", reset_metrics=False)
Result:
Fork2 cost: $2.5 (preserved from source)
✅ Metrics are preserved when reset_metrics=False.
Test 4: Event Deep-Copy Isolation
Action — Verify events are different objects:
src_event = src.state.events[0]
fork1_event = fork1.state.events[0]
assert src_event.id == fork1_event.id # Same data
assert src_event is not fork1_event # Different objects
Result:
Source event: id=msg-1, obj=139942274651536
Fork event: id=msg-1, obj=139942274656016
✅ Events are deep-copied (different object IDs, same data).
Test 5: Additional Behaviors
Workspace sharing:
Both conversations share workspace: /tmp/tmpokgwgw93
✅ Fork shares workspace with source.
Execution status:
Fork execution status: idle
✅ Fork starts in idle status.
Custom conversation ID:
custom_id = uuid.uuid4()
fork3 = src.fork(conversation_id=custom_id)
assert fork3.id == custom_id
✅ Custom ID is honored.
Test 6: Unit Tests Coverage
Ran the full fork test suite:
uv run pytest tests/sdk/conversation/local/test_fork.py -v
Result:
12 passed in 0.10s
test_fork_creates_new_id PASSED
test_fork_with_explicit_id PASSED
test_fork_copies_events PASSED
test_fork_source_unmodified PASSED
test_fork_execution_status_is_idle PASSED
test_fork_resets_metrics_by_default PASSED
test_fork_preserves_metrics_when_requested PASSED
test_fork_copies_agent_state PASSED
test_fork_accepts_replacement_agent PASSED
test_fork_with_tags PASSED
test_fork_with_title_sets_tag PASSED
test_fork_shares_workspace PASSED
✅ All 12 unit tests pass.
Test 7: REST API Implementation
Verified the REST API endpoint is properly implemented:
Endpoint: POST /api/conversations/{id}/fork
Implementation verified:
- ✅ ForkConversationRequest model added with proper validation
- ✅ conversation_router.fork_conversation() endpoint implemented
- ✅ conversation_service.fork_conversation() delegates to LocalConversation.fork()
- ✅ Server persists fork and returns new ConversationInfo
- ✅ OpenAPI schema validation passed (CI check: Agent Server/Check OpenAPI Schema ✅)
RemoteConversation.fork() implementation:
# Sends POST request to server /api/conversations/{id}/fork
resp = _send_request(self._client, "POST",
                     f"{base_path}/{self._id}/fork", json=body)
fork_info = resp.json()
return RemoteConversation(agent=fork_agent,
                          conversation_id=fork_uuid, ...)
✅ RemoteConversation properly delegates to server API.
Issues Found
None.
- Fix tags inconsistency: RemoteConversation.fork() now uses tags from the server response (which includes merged title) instead of the raw input kwargs
- Add RemoteConversation.fork() tests (POST request, server tags, agent param rejection, body fields)
- Add fork endpoint tests (201 success, 404 not found)
- Add event deep-copy isolation test for LocalConversation.fork()
Co-authored-by: openhands <openhands@all-hands.dev>
Addressed both remaining review comments in 06919d4: fixed tags inconsistency bug in RemoteConversation.fork() and added comprehensive tests for remote fork, server endpoint, and deep-copy event isolation. Ready for another look.
all-hands-bot
left a comment
🟢 Good taste - Clean implementation of a useful feature with proper deep-copying and good test coverage.
[RISK ASSESSMENT]
New feature with persistence implications (creates new conversation directories, copies state). Well-tested with 18 tests covering local/remote implementations and endpoint behavior. Limited blast radius since it only affects code that explicitly calls fork(). Not in eval-risk category - pure API addition that doesn't modify agent behavior or planning logic.
all-hands-bot
left a comment
✅ QA Report: PASS
Verified: Conversation.fork() successfully creates independent conversation copies that preserve event history without contaminating the source audit trail.
Does this PR achieve its stated goal?
Yes. The PR goal was to enable downstream apps to let humans branch off system-generated conversations (CI runs, scheduled pipelines) for follow-up exploration without contaminating the original audit trail. The implementation delivers on this by deep-copying all events, creating an independent fork with its own ID and persistence directory, preserving runtime state (agent_state, activated_knowledge_skills), and optionally resetting metrics for fresh cost tracking. Testing confirmed that modifications to a fork do not affect the source conversation.
| Phase | Result |
|---|---|
| Environment Setup | ✅ Build successful, dependencies installed |
| CI & Tests | ✅ 19/19 fork tests pass (13 local + 4 remote + 2 router), pre-commit/API checks pass |
| Functional Verification | ✅ End-to-end fork workflow verified with realistic CI → debug session scenario |
Functional Verification
Test 1: LocalConversation.fork() end-to-end
Step 1 — Establish baseline (create source conversation):
Created a conversation simulating a CI run with:
- 2 events (test trigger + results)
- Agent state: {"last_test_run": "2024-04-16T02:00:00Z", "failed_tests": [...]}
- Accumulated cost: $0.05
- Tags: {"type": "ci-run", "build": "1234"}
Step 2 — Fork for human follow-up:
Ran fork_conv = source_conv.fork(title="Debug test failures from CI run #1234", tags={"type": "debug-session", "user": "developer"}, reset_metrics=True)
Result:
Forked conversation 61e6915e-00a9-48e7-8862-ba232747dd39 → 67ee52d7-6159-4172-a3c7-8e5020d7fb56
(2 events copied, reset_metrics=True)
Fork received:
- Unique ID: 67ee52d7-6159-4172-a3c7-8e5020d7fb56
- All 2 events from source
- Deep copy of agent_state
- Reset cost: $0.00
- Merged tags: {"type": "debug-session", "user": "developer", "title": "Debug test failures..."}
Step 3 — Verify independence:
Added new event to fork and modified fork's agent_state to include "debug_session_started": True
Result:
- Source events: 2 (unchanged)
- Fork events: 3 (new event added)
- Source agent_state: {"last_test_run": ..., "failed_tests": [...]} (unchanged)
- Fork agent_state: {..., "debug_session_started": True} (modified)
✓ Confirmed: Fork is fully independent; source audit trail remains uncontaminated.
Test 2: Metrics preservation option
Step 1 — Fork with reset_metrics=False:
Ran fork_with_metrics = source_conv.fork(title="Debug session with cost tracking", reset_metrics=False)
Result:
Forked conversation 61e6915e-00a9-48e7-8862-ba232747dd39 → 47146d51-2b95-4204-9404-98f007791a24
(2 events copied, reset_metrics=False)
- Source cost: $0.05
- Fork cost: $0.05 (preserved)
✓ Confirmed: Metrics are preserved when reset_metrics=False.
Test 3: Unit test coverage
All 19 fork tests passed:
tests/sdk/conversation/local/test_fork.py::test_fork_creates_new_id PASSED
tests/sdk/conversation/local/test_fork.py::test_fork_with_explicit_id PASSED
tests/sdk/conversation/local/test_fork.py::test_fork_copies_events PASSED
tests/sdk/conversation/local/test_fork.py::test_fork_source_unmodified PASSED
tests/sdk/conversation/local/test_fork.py::test_fork_execution_status_is_idle PASSED
tests/sdk/conversation/local/test_fork.py::test_fork_resets_metrics_by_default PASSED
tests/sdk/conversation/local/test_fork.py::test_fork_preserves_metrics_when_requested PASSED
tests/sdk/conversation/local/test_fork.py::test_fork_copies_agent_state PASSED
tests/sdk/conversation/local/test_fork.py::test_fork_accepts_replacement_agent PASSED
tests/sdk/conversation/local/test_fork.py::test_fork_with_tags PASSED
tests/sdk/conversation/local/test_fork.py::test_fork_with_title_sets_tag PASSED
tests/sdk/conversation/local/test_fork.py::test_fork_shares_workspace PASSED
tests/sdk/conversation/local/test_fork.py::test_fork_event_deep_copy_isolation PASSED
tests/sdk/conversation/remote/test_remote_fork.py::test_remote_fork_sends_post_request PASSED
tests/sdk/conversation/remote/test_remote_fork.py::test_remote_fork_uses_server_returned_tags PASSED
tests/sdk/conversation/remote/test_remote_fork.py::test_remote_fork_raises_on_agent_param PASSED
tests/sdk/conversation/remote/test_remote_fork.py::test_remote_fork_passes_body_fields PASSED
tests/agent_server/test_conversation_router.py::test_fork_conversation_success PASSED
tests/agent_server/test_conversation_router.py::test_fork_conversation_not_found PASSED
Tests cover:
- ID generation (auto and explicit)
- Event copying and deep-copy isolation
- Source immutability
- Execution status (starts idle)
- Metrics (reset/preserve)
- Agent state copying and replacement
- Tags and title handling
- Workspace sharing
- REST API endpoint (success and 404)
- RemoteConversation delegation
Issues Found
None.
Demonstrates the key fork() use cases:
1. Basic fork: branch off after some interaction
2. Independent continuation: fork runs without affecting source
3. Agent replacement: fork with a different agent (tool-change)
4. Tags and title: metadata on the fork
Co-authored-by: openhands <openhands@all-hands.dev>
Rewrote example to demonstrate fork() end-to-end without needing LLM credentials:
1. Basic fork: same events, different ID, title tag
2. Source isolation: fork changes don't affect source
3. Deep-copy isolation: event lists are independent
4. Agent replacement: fork with a different agent (A/B testing)
5. Metrics reset: reset_metrics flag behavior
Evidence of successful run included in .pr/ (auto-removed on merge).
Co-authored-by: openhands <openhands@all-hands.dev>
✅ PR Artifacts Cleaned Up
Add SDK guide page for the new Conversation.fork() primitive that lets
users branch off an existing conversation for follow-up exploration
without contaminating the original audit trail.
Covers:
- Basic usage (fork, source isolation, deep-copy semantics)
- Fork with a different agent (A/B testing, tool-change)
- Tags, metadata, and metrics reset
- Agent-server REST endpoint (POST /api/conversations/{id}/fork)
- Full ready-to-run example (no LLM calls needed)
Added to Conversation Features nav group in docs.json.
Related SDK PR: OpenHands/software-agent-sdk#2841
Co-authored-by: openhands <openhands@all-hands.dev>
Rewrote example to use actual LLM calls (via LLM_API_KEY / LLM_BASE_URL env vars) instead of a fake demo key. The example now:
1. Runs the source conversation with a real agent (echo hello-from-source)
2. Forks and continues with a new message (echo hello-from-fork)
3. Verifies source isolation (source events unchanged)
4. Forks with an alternate agent (A/B testing scenario)
Evidence of successful run against openhands/claude-haiku included in .pr/.
Co-authored-by: openhands <openhands@all-hands.dev>
The fork was created with visualizer=None, so fork.run() produced no console output. Fixed to inherit the source's visualizer type so forked conversations show the same agent action/observation output as the source. Updated example evidence shows all three runs producing full visualizer output. Co-authored-by: openhands <openhands@all-hands.dev>
The RemoteEventsList default callback was including FULL_STATE_KEY ConversationStateUpdateEvents delivered over WebSocket that are NOT stored in the server-side EventLog. This caused the client-side event count to diverge from the server, breaking fork event-count parity in RemoteConversation.fork(). Co-authored-by: openhands <openhands@all-hands.dev>
This reverts commit 858cbfe.
The source conversation's client-side event list includes transient WebSocket-only events (e.g. full-state snapshots) that are not persisted to the EventLog. The fork copies only persisted events, so exact count parity is not expected. Replace the strict equality assertion with a >0 check and verify the fork grows after its own run() instead. Co-authored-by: openhands <openhands@all-hands.dev>
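The reasoning behind the relaxed assertion can be sketched with a toy event list. Everything here is hypothetical: `ToyEvent` and its `persisted` flag are illustration devices, not SDK types, and the final append merely simulates the fork's own run().

```python
from dataclasses import dataclass


@dataclass
class ToyEvent:
    """Hypothetical event; `persisted` marks whether the server stored it."""

    id: str
    persisted: bool = True


# Client-side view of the source: includes a transient WebSocket-only snapshot.
client_side_events = [
    ToyEvent("msg-1"),
    ToyEvent("full-state-snapshot", persisted=False),
    ToyEvent("msg-2"),
]

# A fork built from the server-side log only sees persisted events.
fork_events = [e for e in client_side_events if e.persisted]

# Strict equality with the client-side count would fail here, so the
# weaker invariants are checked instead: non-empty, then grows after a run.
count_before_run = len(fork_events)
fork_events.append(ToyEvent("fork-msg-1"))  # simulate the fork's own run()
```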
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
@OpenHands review the latest data of this PR and then help me update this documentation PR.
I'm on it! xingyaoww can track my progress at all-hands.dev
Summary of Work
I reviewed the final merged state of PR #2841 (
Checklist: Was the request fully addressed?
What was changed (commit
* docs: add Conversation.fork() guide
Add SDK guide page for the new Conversation.fork() primitive that lets
users branch off an existing conversation for follow-up exploration
without contaminating the original audit trail.
Covers:
- Basic usage (fork, source isolation, deep-copy semantics)
- Fork with a different agent (A/B testing, tool-change)
- Tags, metadata, and metrics reset
- Agent-server REST endpoint (POST /api/conversations/{id}/fork)
- Full ready-to-run example (no LLM calls needed)
Added to Conversation Features nav group in docs.json.
Related SDK PR: OpenHands/software-agent-sdk#2841
Co-authored-by: openhands <openhands@all-hands.dev>
* docs: sync fork guide with real-LLM example code
Update the ready-to-run example to match the real-LLM version from
the SDK repo, and add the RunExampleCode shared snippet.
Co-authored-by: openhands <openhands@all-hands.dev>
* docs(sdk): add remote fork example
Co-authored-by: openhands <openhands@all-hands.dev>
* docs: sync fork guide with final merged code
- Update embedded remote example (02_remote_agent_server/11_conversation_fork.py)
to match bug-fixed version: relaxed event count assertions for remote
forks since WebSocket-only events aren't persisted server-side
- Fix 'What Gets Copied' table: removed incorrect entries for
confirmation_policy and security_analyzer (not copied in fork),
added accurate entries for agent_state, activated_knowledge_skills,
and tags
Co-authored-by: openhands <openhands@all-hands.dev>
* fix: adjust focus range to highlight fork call (lines 4-8)
Co-authored-by: openhands <openhands@all-hands.dev>
* chore: teach code-review skill to use APPROVE event for clean reviews
Add explicit guidance on GitHub review event values (APPROVE,
REQUEST_CHANGES, COMMENT) to the repo-specific code-review skill.
This mirrors the fix in OpenHands/extensions#185 that teaches the
general github-pr-review skill the same thing.
Previously the bot always submitted COMMENTED reviews even for clean
PRs because no skill told it to use the APPROVE event.
Co-authored-by: openhands <openhands@all-hands.dev>
---------
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
Why
Downstream apps that surface system-generated conversations (CI runs, scheduled pipelines, agent-on-agent tasks) need a way to let humans branch off a run for follow-up exploration without contaminating the original audit trail. The only workaround today is to start a fresh conversation and replay the original transcript, which loses event-level fidelity (tool_call_id pairing, LLM token/cost metrics, condenser state, thinking blocks, hook execution records).
Summary
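- Add fork() as an abstract method on BaseConversation and implement it on LocalConversation and RemoteConversation
- Deep-copy the event log, agent config, workspace metadata, and runtime state (agent_state, activated_knowledge_skills) into a new conversation with its own ID and persistence directory
- Metrics start fresh on the fork by default (reset_metrics=True); set reset_metrics=False to carry them over
- Expose the primitive through the agent-server REST API as POST /api/conversations/{id}/fork with an optional ForkConversationRequest body (id, title, tags, reset_metrics)
Issue Number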
Closes #2840
Docs PR: OpenHands/docs#459
How to Test
All 12 tests pass:
Type
Notes
- Agent deep-copy uses a model_validate(model_dump(...)) serialization round-trip instead of model_copy(deep=True) to avoid TypeError: cannot pickle '_thread.lock' object from Pydantic's deepcopy of private attributes containing thread locks.
- … _start_event_service, keeping the approach consistent with how conversations are normally created and resumed.
- RemoteConversation.fork() delegates to the server's POST .../fork endpoint and returns a new RemoteConversation pointing at the fork.
This PR was created by an AI assistant (OpenHands) on behalf of the user.
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
- eclipse-temurin:17-jdk
- nikolaik/python-nodejs:python3.13-nodejs22-slim
- golang:1.21-bookworm
Pull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:80bebdc-python
Run
All tags pushed for this build
About Multi-Architecture Support
- Each variant tag (e.g. 80bebdc-python) is a multi-arch manifest supporting both amd64 and arm64
- Architecture-specific tags (e.g. 80bebdc-python-amd64) are also available if needed