From dcf4aca6477d441976ae3b987fa2ccf9c0028a5c Mon Sep 17 00:00:00 2001 From: openhands Date: Thu, 16 Apr 2026 02:41:13 +0000 Subject: [PATCH 1/6] docs: add Conversation.fork() guide Add SDK guide page for the new Conversation.fork() primitive that lets users branch off an existing conversation for follow-up exploration without contaminating the original audit trail. Covers: - Basic usage (fork, source isolation, deep-copy semantics) - Fork with a different agent (A/B testing, tool-change) - Tags, metadata, and metrics reset - Agent-server REST endpoint (POST /api/conversations/{id}/fork) - Full ready-to-run example (no LLM calls needed) Added to Conversation Features nav group in docs.json. Related SDK PR: OpenHands/software-agent-sdk#2841 Co-authored-by: openhands --- docs.json | 1 + sdk/guides/convo-fork.mdx | 326 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 327 insertions(+) create mode 100644 sdk/guides/convo-fork.mdx diff --git a/docs.json b/docs.json index 62905d8d..4337398f 100644 --- a/docs.json +++ b/docs.json @@ -250,6 +250,7 @@ { "group": "Conversation Features", "pages": [ + "sdk/guides/convo-fork", "sdk/guides/convo-pause-and-resume", "sdk/guides/convo-custom-visualizer", "sdk/guides/convo-send-message-while-running", diff --git a/sdk/guides/convo-fork.mdx b/sdk/guides/convo-fork.mdx new file mode 100644 index 00000000..fca97ce4 --- /dev/null +++ b/sdk/guides/convo-fork.mdx @@ -0,0 +1,326 @@ +--- +title: Fork a Conversation +description: Branch off an existing conversation for follow-up exploration without contaminating the original. +--- + +> A ready-to-run example is available [here](#ready-to-run-example)! + +## Overview + +`Conversation.fork()` deep-copies a conversation — events, agent config, workspace metadata — into a new conversation with its own ID. The fork starts in `idle` status and retains the full event memory of the source, so calling `run()` picks up right where the original left off. + +**Use cases:** +- **CI debugging** — an agent produced a wrong patch; fork to debug without losing the original run's audit trail +- **A/B testing** — fork at a given turn, change one variable, compare downstream outcomes +- **Tool-change** — fork and swap in a different agent with new tools mid-conversation + +## Basic Usage + +### Create a fork + +```python icon="python" focus={6} wrap +source = Conversation(agent=agent, workspace=workspace) +source.send_message("Analyse the sales report.") +source.run() + +# Fork the conversation with a title +fork = source.fork(title="Follow-up exploration") + +# The fork has the same events — agent remembers the full history +fork.send_message("Now focus on the EMEA region.") +fork.run() # Continues from the source's state +``` + +### Source stays immutable + +Forking deep-copies events and state. Anything you do on the fork never touches the source: + +```python icon="python" wrap +source_events_before = len(source.state.events) + +fork = source.fork() +fork.send_message("Extra question") + +assert len(source.state.events) == source_events_before # unchanged +``` + +### Fork with a different agent + +Swap the agent on fork — useful for A/B testing models or adding/removing tools: + +```python icon="python" focus={8-11} wrap +alt_llm = LLM(model="openai/gpt-4o", api_key=api_key, usage_id="alt") +alt_agent = Agent(llm=alt_llm, tools=[Tool(name=TerminalTool.name)]) + +fork = source.fork( + agent=alt_agent, + title="GPT-4o experiment", + tags={"variant": "B"}, +) +fork.run() # Same history, different model +``` + +### Tags and metadata + +Forks support `title` and arbitrary `tags` for organization: + +```python icon="python" wrap +fork = source.fork( + title="Debug investigation", + tags={"purpose": "debugging", "triggered_by": "ci-pipeline"}, +) + +print(fork.state.tags) +# {'title': 'Debug investigation', 'purpose': 'debugging', 'triggered_by': 'ci-pipeline'} +``` + +### Metrics reset + +By default, cost/token stats start fresh on the fork. Pass `reset_metrics=False` to preserve them: + +```python icon="python" wrap +# Cost starts at 0 on the fork (default) +fork_fresh = source.fork() + +# Cost carries over from source +fork_with_history = source.fork(reset_metrics=False) +``` + +## API Reference + +```python icon="python" wrap +def fork( + self, + *, + conversation_id: ConversationID | None = None, # auto-generated if None + agent: AgentBase | None = None, # deep-copy of source agent if None + title: str | None = None, # sets tags["title"] + tags: dict[str, str] | None = None, # arbitrary metadata + reset_metrics: bool = True, # cost/tokens start fresh +) -> Conversation: +``` + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `conversation_id` | auto-generated UUID | ID for the forked conversation | +| `agent` | deep-copy of source | Agent for the fork (swap model, tools, etc.) | +| `title` | `None` | Sets `tags["title"]` on the fork | +| `tags` | `None` | Arbitrary key-value metadata | +| `reset_metrics` | `True` | Whether cost/token stats start at zero | + +**Returns:** A new `Conversation` with the same event history but independent state. + +## What Gets Copied + +| Component | Behavior | +|-----------|----------| +| **Events** | Deep-copied; source is never modified | +| **Agent** | Deep-copied by default, or replaced via the `agent` kwarg | +| **Workspace** | Shared (same working directory) | +| **Confirmation policy** | Copied from source | +| **Security analyzer** | Copied from source | +| **Stats / Metrics** | Reset by default (`reset_metrics=True`) | +| **Execution status** | Always `idle` on the fork | +| **Conversation ID** | New UUID (or explicit via `conversation_id`) | + +## Agent-Server REST Endpoint + +When using the [agent-server](/sdk/guides/agent-server/overview), forks are available via REST: + +```bash icon="terminal" +POST /api/conversations/{id}/fork +``` + +**Request body** (all fields optional): + +```json +{ + "id": "custom-uuid-or-null", + "title": "Debug investigation", + "tags": {"purpose": "debugging"}, + "reset_metrics": true +} +``` + +**Response:** Standard `ConversationInfo` for the newly created fork. + +## Ready-to-run Example + + +This example is available on GitHub: [examples/01_standalone_sdk/48_conversation_fork.py](https://github.com/OpenHands/software-agent-sdk/blob/main/examples/01_standalone_sdk/48_conversation_fork.py) + + +This example demonstrates the fork API without calling an LLM, focusing on the +state-management primitive: + +```python icon="python" expandable examples/01_standalone_sdk/48_conversation_fork.py +"""Fork a conversation to branch off for follow-up exploration. + +``Conversation.fork()`` deep-copies a conversation — events, agent config, +workspace metadata — into a new conversation with its own ID. The fork +starts in ``idle`` status and retains full event memory of the source, so +calling ``run()`` picks up right where the original left off. + +Use cases: + - CI agents that produced a wrong patch — engineer forks to debug + without losing the original run's audit trail + - A/B-testing prompts — fork at a given turn, change one variable, + compare downstream + - Swapping tools mid-conversation (fork-on-tool-change) + +This example demonstrates the fork API end-to-end without calling an LLM, +focusing on the state-management primitive itself. In a real workflow you +would call ``fork.run()`` to resume agentic execution. +""" + +import tempfile + +from pydantic import SecretStr + +from openhands.sdk import LLM, Agent, Conversation + + +# ----------------------------------------------------------------- +# Setup — minimal agent (no real LLM calls needed for the demo) +# ----------------------------------------------------------------- +llm = LLM(model="gpt-4o-mini", api_key=SecretStr("demo-key"), usage_id="demo") +agent = Agent(llm=llm, tools=[]) + +with tempfile.TemporaryDirectory() as workspace: + # ============================================================= + # 1. Create a source conversation and populate it with events + # ============================================================= + source = Conversation(agent=agent, workspace=workspace) + + # send_message() adds events to the conversation state without + # calling an LLM. + source.send_message("Analyse the sales report and list top trends.") + source.send_message("Focus on the EMEA region specifically.") + + print("=" * 64) + print(" Conversation.fork() — SDK Example") + print("=" * 64) + + print(f"\nSource conversation ID : {source.id}") + print(f"Source events count : {len(source.state.events)}") + print(f"Source status : {source.state.execution_status}") + + # ============================================================= + # 2. Basic fork — full event history is deep-copied + # ============================================================= + fork = source.fork(title="Follow-up exploration") + + print("\n--- Basic fork ---") + print(f"Fork conversation ID : {fork.id}") + print(f"Fork events count : {len(fork.state.events)}") + print(f"Fork title tag : {fork.state.tags.get('title')}") + print(f"Fork status : {fork.state.execution_status}") + + assert fork.id != source.id, "Fork must have a different ID" + assert len(fork.state.events) == len(source.state.events), ( + "Fork must copy all events" + ) + assert fork.state.tags.get("title") == "Follow-up exploration" + print("OK: Fork has same event count, different ID, correct title") + + # ============================================================= + # 3. Source isolation — changes to fork don't affect source + # ============================================================= + source_event_count = len(source.state.events) + fork.send_message("Also compare with last quarter.") + + assert len(source.state.events) == source_event_count, ( + "Source must remain unmodified" + ) + assert len(fork.state.events) > source_event_count, ( + "Fork should have more events" + ) + + print("\n--- Source isolation ---") + print(f"Source events (unchanged): {len(source.state.events)}") + print(f"Fork events (grew) : {len(fork.state.events)}") + print("OK: Source is immutable after fork") + + # ============================================================= + # 4. Deep-copy isolation — event lists are independent + # ============================================================= + fork2 = source.fork() + fork2_initial = len(fork2.state.events) + fork2.send_message("Extra message only in fork2.") + + assert len(source.state.events) == source_event_count + assert len(fork2.state.events) == fork2_initial + 1 + print("\n--- Deep-copy isolation ---") + print("OK: Fork event list is independent from source") + + # ============================================================= + # 5. Fork with a different agent (tool-change / A/B testing) + # ============================================================= + alt_llm = LLM( + model="gpt-4o", + api_key=SecretStr("demo-key"), + usage_id="alt", + ) + alt_agent = Agent(llm=alt_llm, tools=[]) + + fork_alt = source.fork( + agent=alt_agent, + title="Tool-change experiment", + tags={"purpose": "a/b-test", "variant": "B"}, + ) + + print("\n--- Fork with alternate agent ---") + print(f"Fork ID : {fork_alt.id}") + print(f"Fork model : {fork_alt.agent.llm.model}") + print(f"Fork tags : {dict(fork_alt.state.tags)}") + print(f"Fork events : {len(fork_alt.state.events)}") + + assert fork_alt.agent.llm.model == "gpt-4o", ( + "Alternate agent should be used" + ) + assert fork_alt.state.tags.get("purpose") == "a/b-test" + assert len(fork_alt.state.events) == len(source.state.events) + print("OK: Fork uses alternate agent, retains event history") + + # ============================================================= + # 6. Metrics reset (default behaviour) + # ============================================================= + fork_reset = source.fork() + fork_keep = source.fork(reset_metrics=False) + + reset_cost = fork_reset.state.stats.get_combined_metrics().accumulated_cost + keep_cost = fork_keep.state.stats.get_combined_metrics().accumulated_cost + + print("\n--- Metrics ---") + print(f"Fork (reset=True) accumulated_cost: {reset_cost}") + print(f"Fork (reset=False) accumulated_cost: {keep_cost}") + print("OK: Metrics respect reset_metrics flag") + + # ============================================================= + # Summary + # ============================================================= + print(f"\n{'=' * 64}") + print("All assertions passed — fork() works correctly.") + print( + "\nIn a real workflow, call fork.run() to resume agentic execution" + "\nfrom the copied state. The agent will have full memory of the" + "\nsource conversation." + ) + print("=" * 64) + +# No LLM calls were made +print("EXAMPLE_COST: 0") +``` + +Since this example doesn't require LLM calls, you can run it directly: + +```bash icon="terminal" +cd software-agent-sdk +uv run python examples/01_standalone_sdk/48_conversation_fork.py +``` + +## Next Steps + +- **[Persistence](/sdk/guides/convo-persistence)** — Save and restore conversation state +- **[Pause and Resume](/sdk/guides/convo-pause-and-resume)** — Control execution flow +- **[Agent Server](/sdk/guides/agent-server/overview)** — Deploy agents with the REST API From f98174a22d35231873f4bcb44b9c62d9fa63261e Mon Sep 17 00:00:00 2001 From: openhands Date: Thu, 16 Apr 2026 04:24:47 +0000 Subject: [PATCH 2/6] docs: sync fork guide with real-LLM example code Update the ready-to-run example to match the real-LLM version from the SDK repo, and add the RunExampleCode shared snippet. Co-authored-by: openhands --- sdk/guides/convo-fork.mdx | 231 ++++++++++++++------------------------ 1 file changed, 87 insertions(+), 144 deletions(-) diff --git a/sdk/guides/convo-fork.mdx b/sdk/guides/convo-fork.mdx index fca97ce4..bdec0e32 100644 --- a/sdk/guides/convo-fork.mdx +++ b/sdk/guides/convo-fork.mdx @@ -3,6 +3,8 @@ title: Fork a Conversation description: Branch off an existing conversation for follow-up exploration without contaminating the original. --- +import RunExampleCode from "/sdk/shared-snippets/how-to-run-example.mdx"; + > A ready-to-run example is available [here](#ready-to-run-example)! ## Overview @@ -150,9 +152,6 @@ POST /api/conversations/{id}/fork This example is available on GitHub: [examples/01_standalone_sdk/48_conversation_fork.py](https://github.com/OpenHands/software-agent-sdk/blob/main/examples/01_standalone_sdk/48_conversation_fork.py) -This example demonstrates the fork API without calling an LLM, focusing on the -state-management primitive: - ```python icon="python" expandable examples/01_standalone_sdk/48_conversation_fork.py """Fork a conversation to branch off for follow-up exploration. @@ -167,158 +166,102 @@ Use cases: - A/B-testing prompts — fork at a given turn, change one variable, compare downstream - Swapping tools mid-conversation (fork-on-tool-change) - -This example demonstrates the fork API end-to-end without calling an LLM, -focusing on the state-management primitive itself. In a real workflow you -would call ``fork.run()`` to resume agentic execution. """ -import tempfile +import os -from pydantic import SecretStr - -from openhands.sdk import LLM, Agent, Conversation +from openhands.sdk import LLM, Agent, Conversation, Tool +from openhands.tools.terminal import TerminalTool # ----------------------------------------------------------------- -# Setup — minimal agent (no real LLM calls needed for the demo) +# Setup # ----------------------------------------------------------------- -llm = LLM(model="gpt-4o-mini", api_key=SecretStr("demo-key"), usage_id="demo") -agent = Agent(llm=llm, tools=[]) - -with tempfile.TemporaryDirectory() as workspace: - # ============================================================= - # 1. Create a source conversation and populate it with events - # ============================================================= - source = Conversation(agent=agent, workspace=workspace) - - # send_message() adds events to the conversation state without - # calling an LLM. - source.send_message("Analyse the sales report and list top trends.") - source.send_message("Focus on the EMEA region specifically.") - - print("=" * 64) - print(" Conversation.fork() — SDK Example") - print("=" * 64) - - print(f"\nSource conversation ID : {source.id}") - print(f"Source events count : {len(source.state.events)}") - print(f"Source status : {source.state.execution_status}") - - # ============================================================= - # 2. Basic fork — full event history is deep-copied - # ============================================================= - fork = source.fork(title="Follow-up exploration") - - print("\n--- Basic fork ---") - print(f"Fork conversation ID : {fork.id}") - print(f"Fork events count : {len(fork.state.events)}") - print(f"Fork title tag : {fork.state.tags.get('title')}") - print(f"Fork status : {fork.state.execution_status}") - - assert fork.id != source.id, "Fork must have a different ID" - assert len(fork.state.events) == len(source.state.events), ( - "Fork must copy all events" - ) - assert fork.state.tags.get("title") == "Follow-up exploration" - print("OK: Fork has same event count, different ID, correct title") - - # ============================================================= - # 3. Source isolation — changes to fork don't affect source - # ============================================================= - source_event_count = len(source.state.events) - fork.send_message("Also compare with last quarter.") - - assert len(source.state.events) == source_event_count, ( - "Source must remain unmodified" - ) - assert len(fork.state.events) > source_event_count, ( - "Fork should have more events" - ) - - print("\n--- Source isolation ---") - print(f"Source events (unchanged): {len(source.state.events)}") - print(f"Fork events (grew) : {len(fork.state.events)}") - print("OK: Source is immutable after fork") - - # ============================================================= - # 4. Deep-copy isolation — event lists are independent - # ============================================================= - fork2 = source.fork() - fork2_initial = len(fork2.state.events) - fork2.send_message("Extra message only in fork2.") - - assert len(source.state.events) == source_event_count - assert len(fork2.state.events) == fork2_initial + 1 - print("\n--- Deep-copy isolation ---") - print("OK: Fork event list is independent from source") - - # ============================================================= - # 5. Fork with a different agent (tool-change / A/B testing) - # ============================================================= - alt_llm = LLM( - model="gpt-4o", - api_key=SecretStr("demo-key"), - usage_id="alt", - ) - alt_agent = Agent(llm=alt_llm, tools=[]) - - fork_alt = source.fork( - agent=alt_agent, - title="Tool-change experiment", - tags={"purpose": "a/b-test", "variant": "B"}, - ) - - print("\n--- Fork with alternate agent ---") - print(f"Fork ID : {fork_alt.id}") - print(f"Fork model : {fork_alt.agent.llm.model}") - print(f"Fork tags : {dict(fork_alt.state.tags)}") - print(f"Fork events : {len(fork_alt.state.events)}") - - assert fork_alt.agent.llm.model == "gpt-4o", ( - "Alternate agent should be used" - ) - assert fork_alt.state.tags.get("purpose") == "a/b-test" - assert len(fork_alt.state.events) == len(source.state.events) - print("OK: Fork uses alternate agent, retains event history") - - # ============================================================= - # 6. Metrics reset (default behaviour) - # ============================================================= - fork_reset = source.fork() - fork_keep = source.fork(reset_metrics=False) - - reset_cost = fork_reset.state.stats.get_combined_metrics().accumulated_cost - keep_cost = fork_keep.state.stats.get_combined_metrics().accumulated_cost - - print("\n--- Metrics ---") - print(f"Fork (reset=True) accumulated_cost: {reset_cost}") - print(f"Fork (reset=False) accumulated_cost: {keep_cost}") - print("OK: Metrics respect reset_metrics flag") - - # ============================================================= - # Summary - # ============================================================= - print(f"\n{'=' * 64}") - print("All assertions passed — fork() works correctly.") - print( - "\nIn a real workflow, call fork.run() to resume agentic execution" - "\nfrom the copied state. The agent will have full memory of the" - "\nsource conversation." - ) - print("=" * 64) - -# No LLM calls were made -print("EXAMPLE_COST: 0") -``` +llm = LLM( + model=os.getenv("LLM_MODEL", "anthropic/claude-sonnet-4-5-20250929"), + api_key=os.getenv("LLM_API_KEY"), + base_url=os.getenv("LLM_BASE_URL", None), +) -Since this example doesn't require LLM calls, you can run it directly: +agent = Agent(llm=llm, tools=[Tool(name=TerminalTool.name)]) +cwd = os.getcwd() -```bash icon="terminal" -cd software-agent-sdk -uv run python examples/01_standalone_sdk/48_conversation_fork.py +# ================================================================= +# 1. Run the source conversation +# ================================================================= +source = Conversation(agent=agent, workspace=cwd) +source.send_message("Run `echo hello-from-source` in the terminal.") +source.run() + +print("=" * 64) +print(" Conversation.fork() — SDK Example") +print("=" * 64) +print(f"\nSource conversation ID : {source.id}") +print(f"Source events count : {len(source.state.events)}") + +# ================================================================= +# 2. Fork and continue independently +# ================================================================= +fork = source.fork(title="Follow-up fork") +source_event_count = len(source.state.events) + +print("\n--- Fork created ---") +print(f"Fork ID : {fork.id}") +print(f"Fork events (copied) : {len(fork.state.events)}") +print(f"Fork title : {fork.state.tags.get('title')}") + +assert fork.id != source.id +assert len(fork.state.events) == source_event_count + +fork.send_message("Now run `echo hello-from-fork` in the terminal.") +fork.run() + +# Source is untouched +assert len(source.state.events) == source_event_count +print("\n--- After running fork ---") +print(f"Source events (unchanged): {source_event_count}") +print(f"Fork events (grew) : {len(fork.state.events)}") + +# ================================================================= +# 3. Fork with a different agent (tool-change / A/B testing) +# ================================================================= +alt_llm = LLM( + model=os.getenv("LLM_MODEL", "anthropic/claude-sonnet-4-5-20250929"), + api_key=os.getenv("LLM_API_KEY"), + base_url=os.getenv("LLM_BASE_URL", None), + usage_id="alt", +) +alt_agent = Agent(llm=alt_llm, tools=[Tool(name=TerminalTool.name)]) + +fork_alt = source.fork( + agent=alt_agent, + title="Tool-change experiment", + tags={"purpose": "a/b-test"}, +) + +print("\n--- Fork with alternate agent ---") +print(f"Fork ID : {fork_alt.id}") +print(f"Fork tags : {dict(fork_alt.state.tags)}") + +fork_alt.send_message("What command did you run earlier? Just tell me, no tools.") +fork_alt.run() + +print(f"Fork events : {len(fork_alt.state.events)}") + +# ================================================================= +# Summary +# ================================================================= +print(f"\n{'=' * 64}") +print("All done — fork() works end-to-end.") +print("=" * 64) + +# Report cost +cost = llm.metrics.accumulated_cost + alt_llm.metrics.accumulated_cost +print(f"EXAMPLE_COST: {cost}") ``` + + ## Next Steps - **[Persistence](/sdk/guides/convo-persistence)** — Save and restore conversation state From c9b359c95fe8e9659cff459eee036000892118e1 Mon Sep 17 00:00:00 2001 From: enyst Date: Fri, 17 Apr 2026 17:43:36 +0000 Subject: [PATCH 3/6] docs(sdk): add remote fork example Co-authored-by: openhands --- sdk/guides/convo-fork.mdx | 210 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 210 insertions(+) diff --git a/sdk/guides/convo-fork.mdx b/sdk/guides/convo-fork.mdx index bdec0e32..6279da41 100644 --- a/sdk/guides/convo-fork.mdx +++ b/sdk/guides/convo-fork.mdx @@ -146,6 +146,216 @@ POST /api/conversations/{id}/fork **Response:** Standard `ConversationInfo` for the newly created fork. +When you call `fork()` on a `RemoteConversation`, the SDK sends this request for +you and returns a new `RemoteConversation` pointing at the server-side copy. +Remote forks always reuse the server-managed agent configuration, so +`RemoteConversation.fork(agent=...)` is intentionally unsupported. + +## Agent-Server Example + + +This example is available on GitHub: [examples/02_remote_agent_server/11_conversation_fork.py](https://github.com/OpenHands/software-agent-sdk/blob/main/examples/02_remote_agent_server/11_conversation_fork.py) + + +```python icon="python" expandable examples/02_remote_agent_server/11_conversation_fork.py +"""Fork a conversation through the agent server REST API. + +Demonstrates ``RemoteConversation.fork()`` which delegates to the server's +``POST /api/conversations/{id}/fork`` endpoint. The fork deep-copies +events and state on the server side, then returns a new +``RemoteConversation`` pointing at the copy. + +Scenarios covered: + 1. Run a source conversation on the server + 2. Fork it — verify independent event histories + 3. Fork with a title and custom tags +""" + +import os +import subprocess +import sys +import tempfile +import threading +import time + +from pydantic import SecretStr + +from openhands.sdk import LLM, Agent, Conversation, RemoteConversation, Tool, Workspace +from openhands.tools.terminal import TerminalTool + + +# ----------------------------------------------------------------- +# Managed server helper (reused from example 01) +# ----------------------------------------------------------------- +def _stream_output(stream, prefix, target_stream): + try: + for line in iter(stream.readline, ""): + if line: + target_stream.write(f"[{prefix}] {line}") + target_stream.flush() + except Exception as e: + print(f"Error streaming {prefix}: {e}", file=sys.stderr) + finally: + stream.close() + + +class ManagedAPIServer: + """Context manager that starts and stops a local agent-server.""" + + def __init__(self, port: int = 8000, host: str = "127.0.0.1"): + self.port = port + self.host = host + self.process: subprocess.Popen[str] | None = None + self.base_url = f"http://{host}:{port}" + + def __enter__(self): + print(f"Starting agent-server on {self.base_url} ...") + self.process = subprocess.Popen( + [ + "python", + "-m", + "openhands.agent_server", + "--port", + str(self.port), + "--host", + self.host, + ], + stdout=subprocess.PIPE, + stderr=subprocess.PIPE, + text=True, + env={"LOG_JSON": "true", **os.environ}, + ) + assert self.process.stdout is not None + assert self.process.stderr is not None + threading.Thread( + target=_stream_output, + args=(self.process.stdout, "SERVER", sys.stdout), + daemon=True, + ).start() + threading.Thread( + target=_stream_output, + args=(self.process.stderr, "SERVER", sys.stderr), + daemon=True, + ).start() + + import httpx + + for _ in range(30): + try: + if httpx.get(f"{self.base_url}/health", timeout=1.0).status_code == 200: + print(f"Agent-server ready at {self.base_url}") + return self + except Exception: + pass + assert self.process.poll() is None, "Server exited unexpectedly" + time.sleep(1) + raise RuntimeError("Server failed to start in 30 s") + + def __exit__(self, *args): + if self.process: + self.process.terminate() + try: + self.process.wait(timeout=5) + except subprocess.TimeoutExpired: + self.process.kill() + self.process.wait() + time.sleep(0.5) + print("Agent-server stopped.") + + +# ----------------------------------------------------------------- +# Config +# ----------------------------------------------------------------- +api_key = os.getenv("LLM_API_KEY") +assert api_key, "LLM_API_KEY must be set" + +llm = LLM( + model=os.getenv("LLM_MODEL", "anthropic/claude-sonnet-4-5-20250929"), + api_key=SecretStr(api_key), + base_url=os.getenv("LLM_BASE_URL"), +) +agent = Agent(llm=llm, tools=[Tool(name=TerminalTool.name)]) + +# ----------------------------------------------------------------- +# Run +# ----------------------------------------------------------------- +with ManagedAPIServer(port=8002) as server: + workspace_dir = tempfile.mkdtemp(prefix="fork_demo_") + workspace = Workspace(host=server.base_url, working_dir=workspace_dir) + + # ============================================================= + # 1. Source conversation + # ============================================================= + source = Conversation(agent=agent, workspace=workspace) + assert isinstance(source, RemoteConversation) + + source.send_message("Run `echo hello-from-source` in the terminal.") + source.run() + + print("=" * 64) + print(" RemoteConversation.fork() — Agent-Server Example") + print("=" * 64) + print(f"\nSource conversation ID : {source.id}") + source_event_count = len(source.state.events) + print(f"Source events count : {source_event_count}") + + # ============================================================= + # 2. Fork and continue independently + # ============================================================= + fork = source.fork(title="Follow-up fork") + assert isinstance(fork, RemoteConversation) + + print("\n--- Fork created ---") + print(f"Fork ID : {fork.id}") + print(f"Fork events (copied) : {len(fork.state.events)}") + + assert fork.id != source.id + assert len(fork.state.events) == source_event_count + + fork.send_message("Now run `echo hello-from-fork` in the terminal.") + fork.run() + + # Source must be untouched + assert len(source.state.events) == source_event_count + print("\n--- After running fork ---") + print(f"Source events (unchanged): {source_event_count}") + print(f"Fork events (grew) : {len(fork.state.events)}") + + # ============================================================= + # 3. Fork with tags + # ============================================================= + fork_tagged = source.fork( + title="Tagged experiment", + tags={"purpose": "a/b-test"}, + ) + assert isinstance(fork_tagged, RemoteConversation) + + print("\n--- Fork with tags ---") + print(f"Fork ID : {fork_tagged.id}") + + fork_tagged.send_message( + "What command did you run earlier? Just tell me, no tools." + ) + fork_tagged.run() + + print(f"Fork events : {len(fork_tagged.state.events)}") + + # ============================================================= + # Summary + # ============================================================= + print(f"\n{'=' * 64}") + print("All done — RemoteConversation.fork() works end-to-end.") + print("=" * 64) + + # Cleanup + fork.close() + fork_tagged.close() + source.close() + +cost = llm.metrics.accumulated_cost +print(f"EXAMPLE_COST: {cost}") +``` + ## Ready-to-run Example From dcc201933c1cb71d303333470f6396f4f2495675 Mon Sep 17 00:00:00 2001 From: openhands Date: Sun, 19 Apr 2026 15:49:59 +0000 Subject: [PATCH 4/6] docs: sync fork guide with final merged code - Update embedded remote example (02_remote_agent_server/11_conversation_fork.py) to match bug-fixed version: relaxed event count assertions for remote forks since WebSocket-only events aren't persisted server-side - Fix 'What Gets Copied' table: removed incorrect entries for confirmation_policy and security_analyzer (not copied in fork), added accurate entries for agent_state, activated_knowledge_skills, and tags Co-authored-by: openhands --- sdk/guides/convo-fork.mdx | 24 +++++++++++++++--------- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/sdk/guides/convo-fork.mdx b/sdk/guides/convo-fork.mdx index 6279da41..1f8ff7cc 100644 --- a/sdk/guides/convo-fork.mdx +++ b/sdk/guides/convo-fork.mdx @@ -119,9 +119,10 @@ def fork( | **Events** | Deep-copied; source is never modified | | **Agent** | Deep-copied by default, or replaced via the `agent` kwarg | | **Workspace** | Shared (same working directory) | -| **Confirmation policy** | Copied from source | -| **Security analyzer** | Copied from source | -| **Stats / Metrics** | Reset by default (`reset_metrics=True`) | +| **Agent state** | Deep-copied (custom runtime data accumulated during the conversation) | +| **Activated knowledge skills** | Copied (list of skill names activated in the source) | +| **Stats / Metrics** | Reset by default (`reset_metrics=True`); pass `False` to carry over | +| **Tags** | Fresh from kwargs; source tags are **not** inherited | | **Execution status** | Always `idle` on the fork | | **Conversation ID** | New UUID (or explicit via `conversation_id`) | @@ -307,19 +308,24 @@ with ManagedAPIServer(port=8002) as server: print("\n--- Fork created ---") print(f"Fork ID : {fork.id}") - print(f"Fork events (copied) : {len(fork.state.events)}") + fork_event_count = len(fork.state.events) + print(f"Fork events (copied) : {fork_event_count}") assert fork.id != source.id - assert len(fork.state.events) == source_event_count + # The fork copies all persisted events from the server-side EventLog. + # The source's client-side list may additionally contain transient + # WebSocket-only events (e.g. full-state snapshots) that are never + # persisted, so we only assert the fork has a non-trivial number of + # events rather than exact parity. + assert fork_event_count > 0 fork.send_message("Now run `echo hello-from-fork` in the terminal.") fork.run() - # Source must be untouched - assert len(source.state.events) == source_event_count print("\n--- After running fork ---") - print(f"Source events (unchanged): {source_event_count}") - print(f"Fork events (grew) : {len(fork.state.events)}") + print(f"Source events : {len(source.state.events)}") + print(f"Fork events (grew) : {len(fork.state.events)}") + assert len(fork.state.events) > fork_event_count # ============================================================= # 3. Fork with tags From f4f016ab2ae386919270cb35be71ac2a069ad37f Mon Sep 17 00:00:00 2001 From: openhands Date: Mon, 20 Apr 2026 17:09:34 +0000 Subject: [PATCH 5/6] fix: adjust focus range to highlight fork call (lines 4-8) Co-authored-by: openhands --- sdk/guides/convo-fork.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sdk/guides/convo-fork.mdx b/sdk/guides/convo-fork.mdx index 1f8ff7cc..bf4c13f6 100644 --- a/sdk/guides/convo-fork.mdx +++ b/sdk/guides/convo-fork.mdx @@ -50,7 +50,7 @@ assert len(source.state.events) == source_events_before # unchanged Swap the agent on fork — useful for A/B testing models or adding/removing tools: -```python icon="python" focus={8-11} wrap +```python icon="python" focus={4-8} wrap alt_llm = LLM(model="openai/gpt-4o", api_key=api_key, usage_id="alt") alt_agent = Agent(llm=alt_llm, tools=[Tool(name=TerminalTool.name)]) From d8df1f3a48c2932d267ddd1903afc715d7e9f391 Mon Sep 17 00:00:00 2001 From: openhands Date: Mon, 20 Apr 2026 17:26:24 +0000 Subject: [PATCH 6/6] chore: teach code-review skill to use APPROVE event for clean reviews Add explicit guidance on GitHub review event values (APPROVE, REQUEST_CHANGES, COMMENT) to the repo-specific code-review skill. This mirrors the fix in OpenHands/extensions#185 that teaches the general github-pr-review skill the same thing. Previously the bot always submitted COMMENTED reviews even for clean PRs because no skill told it to use the APPROVE event. Co-authored-by: openhands --- .agents/skills/code-review.md | 24 ++++++++++++++++++------ 1 file changed, 18 insertions(+), 6 deletions(-) diff --git a/.agents/skills/code-review.md b/.agents/skills/code-review.md index 20b1438f..0d690867 100644 --- a/.agents/skills/code-review.md +++ b/.agents/skills/code-review.md @@ -60,17 +60,29 @@ grep -rn "function_name" /tmp/agent-sdk/ ## Review Decisions -### When to APPROVE +You **must** use the correct GitHub review `event` value when submitting your review. +Match the event to the severity of your findings: + +- **`APPROVE`** — Use when the PR is good and has no blocking issues. You can still include non-blocking inline comments with an APPROVE event. +- **`REQUEST_CHANGES`** — Use when there are critical issues that must be fixed before merging (e.g., hallucinated APIs, incorrect signatures, broken examples). +- **`COMMENT`** — Use when you have feedback but are not explicitly approving or requesting changes (e.g., unverifiable claims, minor suggestions). + +### When to APPROVE (`event: "APPROVE"`) - Documentation-only style/formatting changes - Accurate content verified against source code - Changes that correctly sync with upstream code changes - **Release PRs from @mamoodi**: If the PR author is @mamoodi and the changes are standard release updates (version bumps, changelog entries, etc.) with nothing suspicious, approve without requiring full source verification -### When to COMMENT -- Documentation claims that cannot be verified against source code -- Potentially hallucinated API surfaces (functions, parameters, classes that don't exist) -- Inaccurate signatures, return types, or field names -- Missing context that could mislead users +### When to REQUEST_CHANGES (`event: "REQUEST_CHANGES"`) +- Hallucinated API surfaces (functions, parameters, classes that don't exist in source) +- Inaccurate signatures, return types, or field names verified against source code +- Example code that would not run or produces incorrect results +- Broken internal links or navigation entries + +### When to COMMENT (`event: "COMMENT"`) +- Documentation claims that cannot be verified against source code (upstream not available) +- Minor suggestions or style nits that don't block merging +- Missing context that could mislead users but isn't critical ## General Guidelines