From dcf4aca6477d441976ae3b987fa2ccf9c0028a5c Mon Sep 17 00:00:00 2001 From: openhands Date: Thu, 16 Apr 2026 02:41:13 +0000 Subject: [PATCH 1/2] docs: add Conversation.fork() guide Add SDK guide page for the new Conversation.fork() primitive that lets users branch off an existing conversation for follow-up exploration without contaminating the original audit trail. Covers: - Basic usage (fork, source isolation, deep-copy semantics) - Fork with a different agent (A/B testing, tool-change) - Tags, metadata, and metrics reset - Agent-server REST endpoint (POST /api/conversations/{id}/fork) - Full ready-to-run example (no LLM calls needed) Added to Conversation Features nav group in docs.json. Related SDK PR: OpenHands/software-agent-sdk#2841 Co-authored-by: openhands --- docs.json | 1 + sdk/guides/convo-fork.mdx | 326 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 327 insertions(+) create mode 100644 sdk/guides/convo-fork.mdx diff --git a/docs.json b/docs.json index 62905d8d..4337398f 100644 --- a/docs.json +++ b/docs.json @@ -250,6 +250,7 @@ { "group": "Conversation Features", "pages": [ + "sdk/guides/convo-fork", "sdk/guides/convo-pause-and-resume", "sdk/guides/convo-custom-visualizer", "sdk/guides/convo-send-message-while-running", diff --git a/sdk/guides/convo-fork.mdx b/sdk/guides/convo-fork.mdx new file mode 100644 index 00000000..fca97ce4 --- /dev/null +++ b/sdk/guides/convo-fork.mdx @@ -0,0 +1,326 @@ +--- +title: Fork a Conversation +description: Branch off an existing conversation for follow-up exploration without contaminating the original. +--- + +> A ready-to-run example is available [here](#ready-to-run-example)! + +## Overview + +`Conversation.fork()` deep-copies a conversation — events, agent config, workspace metadata — into a new conversation with its own ID. The fork starts in `idle` status and retains the full event memory of the source, so calling `run()` picks up right where the original left off. + +**Use cases:** +- **CI debugging** — an agent produced a wrong patch; fork to debug without losing the original run's audit trail +- **A/B testing** — fork at a given turn, change one variable, compare downstream outcomes +- **Tool-change** — fork and swap in a different agent with new tools mid-conversation + +## Basic Usage + +### Create a fork + +```python icon="python" focus={6} wrap +source = Conversation(agent=agent, workspace=workspace) +source.send_message("Analyse the sales report.") +source.run() + +# Fork the conversation with a title +fork = source.fork(title="Follow-up exploration") + +# The fork has the same events — agent remembers the full history +fork.send_message("Now focus on the EMEA region.") +fork.run() # Continues from the source's state +``` + +### Source stays immutable + +Forking deep-copies events and state. Anything you do on the fork never touches the source: + +```python icon="python" wrap +source_events_before = len(source.state.events) + +fork = source.fork() +fork.send_message("Extra question") + +assert len(source.state.events) == source_events_before # unchanged +``` + +### Fork with a different agent + +Swap the agent on fork — useful for A/B testing models or adding/removing tools: + +```python icon="python" focus={8-11} wrap +alt_llm = LLM(model="openai/gpt-4o", api_key=api_key, usage_id="alt") +alt_agent = Agent(llm=alt_llm, tools=[Tool(name=TerminalTool.name)]) + +fork = source.fork( + agent=alt_agent, + title="GPT-4o experiment", + tags={"variant": "B"}, +) +fork.run() # Same history, different model +``` + +### Tags and metadata + +Forks support `title` and arbitrary `tags` for organization: + +```python icon="python" wrap +fork = source.fork( + title="Debug investigation", + tags={"purpose": "debugging", "triggered_by": "ci-pipeline"}, +) + +print(fork.state.tags) +# {'title': 'Debug investigation', 'purpose': 'debugging', 'triggered_by': 'ci-pipeline'} +``` + +### Metrics reset + +By default, cost/token stats start fresh on the fork. Pass `reset_metrics=False` to preserve them: + +```python icon="python" wrap +# Cost starts at 0 on the fork (default) +fork_fresh = source.fork() + +# Cost carries over from source +fork_with_history = source.fork(reset_metrics=False) +``` + +## API Reference + +```python icon="python" wrap +def fork( + self, + *, + conversation_id: ConversationID | None = None, # auto-generated if None + agent: AgentBase | None = None, # deep-copy of source agent if None + title: str | None = None, # sets tags["title"] + tags: dict[str, str] | None = None, # arbitrary metadata + reset_metrics: bool = True, # cost/tokens start fresh +) -> Conversation: +``` + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `conversation_id` | auto-generated UUID | ID for the forked conversation | +| `agent` | deep-copy of source | Agent for the fork (swap model, tools, etc.) | +| `title` | `None` | Sets `tags["title"]` on the fork | +| `tags` | `None` | Arbitrary key-value metadata | +| `reset_metrics` | `True` | Whether cost/token stats start at zero | + +**Returns:** A new `Conversation` with the same event history but independent state. + +## What Gets Copied + +| Component | Behavior | +|-----------|----------| +| **Events** | Deep-copied; source is never modified | +| **Agent** | Deep-copied by default, or replaced via the `agent` kwarg | +| **Workspace** | Shared (same working directory) | +| **Confirmation policy** | Copied from source | +| **Security analyzer** | Copied from source | +| **Stats / Metrics** | Reset by default (`reset_metrics=True`) | +| **Execution status** | Always `idle` on the fork | +| **Conversation ID** | New UUID (or explicit via `conversation_id`) | + +## Agent-Server REST Endpoint + +When using the [agent-server](/sdk/guides/agent-server/overview), forks are available via REST: + +```bash icon="terminal" +POST /api/conversations/{id}/fork +``` + +**Request body** (all fields optional): + +```json +{ + "id": "custom-uuid-or-null", + "title": "Debug investigation", + "tags": {"purpose": "debugging"}, + "reset_metrics": true +} +``` + +**Response:** Standard `ConversationInfo` for the newly created fork. + +## Ready-to-run Example + + +This example is available on GitHub: [examples/01_standalone_sdk/48_conversation_fork.py](https://github.com/OpenHands/software-agent-sdk/blob/main/examples/01_standalone_sdk/48_conversation_fork.py) + + +This example demonstrates the fork API without calling an LLM, focusing on the +state-management primitive: + +```python icon="python" expandable examples/01_standalone_sdk/48_conversation_fork.py +"""Fork a conversation to branch off for follow-up exploration. + +``Conversation.fork()`` deep-copies a conversation — events, agent config, +workspace metadata — into a new conversation with its own ID. The fork +starts in ``idle`` status and retains full event memory of the source, so +calling ``run()`` picks up right where the original left off. + +Use cases: + - CI agents that produced a wrong patch — engineer forks to debug + without losing the original run's audit trail + - A/B-testing prompts — fork at a given turn, change one variable, + compare downstream + - Swapping tools mid-conversation (fork-on-tool-change) + +This example demonstrates the fork API end-to-end without calling an LLM, +focusing on the state-management primitive itself. In a real workflow you +would call ``fork.run()`` to resume agentic execution. +""" + +import tempfile + +from pydantic import SecretStr + +from openhands.sdk import LLM, Agent, Conversation + + +# ----------------------------------------------------------------- +# Setup — minimal agent (no real LLM calls needed for the demo) +# ----------------------------------------------------------------- +llm = LLM(model="gpt-4o-mini", api_key=SecretStr("demo-key"), usage_id="demo") +agent = Agent(llm=llm, tools=[]) + +with tempfile.TemporaryDirectory() as workspace: + # ============================================================= + # 1. Create a source conversation and populate it with events + # ============================================================= + source = Conversation(agent=agent, workspace=workspace) + + # send_message() adds events to the conversation state without + # calling an LLM. + source.send_message("Analyse the sales report and list top trends.") + source.send_message("Focus on the EMEA region specifically.") + + print("=" * 64) + print(" Conversation.fork() — SDK Example") + print("=" * 64) + + print(f"\nSource conversation ID : {source.id}") + print(f"Source events count : {len(source.state.events)}") + print(f"Source status : {source.state.execution_status}") + + # ============================================================= + # 2. Basic fork — full event history is deep-copied + # ============================================================= + fork = source.fork(title="Follow-up exploration") + + print("\n--- Basic fork ---") + print(f"Fork conversation ID : {fork.id}") + print(f"Fork events count : {len(fork.state.events)}") + print(f"Fork title tag : {fork.state.tags.get('title')}") + print(f"Fork status : {fork.state.execution_status}") + + assert fork.id != source.id, "Fork must have a different ID" + assert len(fork.state.events) == len(source.state.events), ( + "Fork must copy all events" + ) + assert fork.state.tags.get("title") == "Follow-up exploration" + print("OK: Fork has same event count, different ID, correct title") + + # ============================================================= + # 3. Source isolation — changes to fork don't affect source + # ============================================================= + source_event_count = len(source.state.events) + fork.send_message("Also compare with last quarter.") + + assert len(source.state.events) == source_event_count, ( + "Source must remain unmodified" + ) + assert len(fork.state.events) > source_event_count, ( + "Fork should have more events" + ) + + print("\n--- Source isolation ---") + print(f"Source events (unchanged): {len(source.state.events)}") + print(f"Fork events (grew) : {len(fork.state.events)}") + print("OK: Source is immutable after fork") + + # ============================================================= + # 4. Deep-copy isolation — event lists are independent + # ============================================================= + fork2 = source.fork() + fork2_initial = len(fork2.state.events) + fork2.send_message("Extra message only in fork2.") + + assert len(source.state.events) == source_event_count + assert len(fork2.state.events) == fork2_initial + 1 + print("\n--- Deep-copy isolation ---") + print("OK: Fork event list is independent from source") + + # ============================================================= + # 5. Fork with a different agent (tool-change / A/B testing) + # ============================================================= + alt_llm = LLM( + model="gpt-4o", + api_key=SecretStr("demo-key"), + usage_id="alt", + ) + alt_agent = Agent(llm=alt_llm, tools=[]) + + fork_alt = source.fork( + agent=alt_agent, + title="Tool-change experiment", + tags={"purpose": "a/b-test", "variant": "B"}, + ) + + print("\n--- Fork with alternate agent ---") + print(f"Fork ID : {fork_alt.id}") + print(f"Fork model : {fork_alt.agent.llm.model}") + print(f"Fork tags : {dict(fork_alt.state.tags)}") + print(f"Fork events : {len(fork_alt.state.events)}") + + assert fork_alt.agent.llm.model == "gpt-4o", ( + "Alternate agent should be used" + ) + assert fork_alt.state.tags.get("purpose") == "a/b-test" + assert len(fork_alt.state.events) == len(source.state.events) + print("OK: Fork uses alternate agent, retains event history") + + # ============================================================= + # 6. Metrics reset (default behaviour) + # ============================================================= + fork_reset = source.fork() + fork_keep = source.fork(reset_metrics=False) + + reset_cost = fork_reset.state.stats.get_combined_metrics().accumulated_cost + keep_cost = fork_keep.state.stats.get_combined_metrics().accumulated_cost + + print("\n--- Metrics ---") + print(f"Fork (reset=True) accumulated_cost: {reset_cost}") + print(f"Fork (reset=False) accumulated_cost: {keep_cost}") + print("OK: Metrics respect reset_metrics flag") + + # ============================================================= + # Summary + # ============================================================= + print(f"\n{'=' * 64}") + print("All assertions passed — fork() works correctly.") + print( + "\nIn a real workflow, call fork.run() to resume agentic execution" + "\nfrom the copied state. The agent will have full memory of the" + "\nsource conversation." + ) + print("=" * 64) + +# No LLM calls were made +print("EXAMPLE_COST: 0") +``` + +Since this example doesn't require LLM calls, you can run it directly: + +```bash icon="terminal" +cd software-agent-sdk +uv run python examples/01_standalone_sdk/48_conversation_fork.py +``` + +## Next Steps + +- **[Persistence](/sdk/guides/convo-persistence)** — Save and restore conversation state +- **[Pause and Resume](/sdk/guides/convo-pause-and-resume)** — Control execution flow +- **[Agent Server](/sdk/guides/agent-server/overview)** — Deploy agents with the REST API From f98174a22d35231873f4bcb44b9c62d9fa63261e Mon Sep 17 00:00:00 2001 From: openhands Date: Thu, 16 Apr 2026 04:24:47 +0000 Subject: [PATCH 2/2] docs: sync fork guide with real-LLM example code Update the ready-to-run example to match the real-LLM version from the SDK repo, and add the RunExampleCode shared snippet. Co-authored-by: openhands --- sdk/guides/convo-fork.mdx | 231 ++++++++++++++------------------------ 1 file changed, 87 insertions(+), 144 deletions(-) diff --git a/sdk/guides/convo-fork.mdx b/sdk/guides/convo-fork.mdx index fca97ce4..bdec0e32 100644 --- a/sdk/guides/convo-fork.mdx +++ b/sdk/guides/convo-fork.mdx @@ -3,6 +3,8 @@ title: Fork a Conversation description: Branch off an existing conversation for follow-up exploration without contaminating the original. --- +import RunExampleCode from "/sdk/shared-snippets/how-to-run-example.mdx"; + > A ready-to-run example is available [here](#ready-to-run-example)! ## Overview @@ -150,9 +152,6 @@ POST /api/conversations/{id}/fork This example is available on GitHub: [examples/01_standalone_sdk/48_conversation_fork.py](https://github.com/OpenHands/software-agent-sdk/blob/main/examples/01_standalone_sdk/48_conversation_fork.py) -This example demonstrates the fork API without calling an LLM, focusing on the -state-management primitive: - ```python icon="python" expandable examples/01_standalone_sdk/48_conversation_fork.py """Fork a conversation to branch off for follow-up exploration. @@ -167,158 +166,102 @@ Use cases: - A/B-testing prompts — fork at a given turn, change one variable, compare downstream - Swapping tools mid-conversation (fork-on-tool-change) - -This example demonstrates the fork API end-to-end without calling an LLM, -focusing on the state-management primitive itself. In a real workflow you -would call ``fork.run()`` to resume agentic execution. """ -import tempfile +import os -from pydantic import SecretStr - -from openhands.sdk import LLM, Agent, Conversation +from openhands.sdk import LLM, Agent, Conversation, Tool +from openhands.tools.terminal import TerminalTool # ----------------------------------------------------------------- -# Setup — minimal agent (no real LLM calls needed for the demo) +# Setup # ----------------------------------------------------------------- -llm = LLM(model="gpt-4o-mini", api_key=SecretStr("demo-key"), usage_id="demo") -agent = Agent(llm=llm, tools=[]) - -with tempfile.TemporaryDirectory() as workspace: - # ============================================================= - # 1. Create a source conversation and populate it with events - # ============================================================= - source = Conversation(agent=agent, workspace=workspace) - - # send_message() adds events to the conversation state without - # calling an LLM. - source.send_message("Analyse the sales report and list top trends.") - source.send_message("Focus on the EMEA region specifically.") - - print("=" * 64) - print(" Conversation.fork() — SDK Example") - print("=" * 64) - - print(f"\nSource conversation ID : {source.id}") - print(f"Source events count : {len(source.state.events)}") - print(f"Source status : {source.state.execution_status}") - - # ============================================================= - # 2. Basic fork — full event history is deep-copied - # ============================================================= - fork = source.fork(title="Follow-up exploration") - - print("\n--- Basic fork ---") - print(f"Fork conversation ID : {fork.id}") - print(f"Fork events count : {len(fork.state.events)}") - print(f"Fork title tag : {fork.state.tags.get('title')}") - print(f"Fork status : {fork.state.execution_status}") - - assert fork.id != source.id, "Fork must have a different ID" - assert len(fork.state.events) == len(source.state.events), ( - "Fork must copy all events" - ) - assert fork.state.tags.get("title") == "Follow-up exploration" - print("OK: Fork has same event count, different ID, correct title") - - # ============================================================= - # 3. Source isolation — changes to fork don't affect source - # ============================================================= - source_event_count = len(source.state.events) - fork.send_message("Also compare with last quarter.") - - assert len(source.state.events) == source_event_count, ( - "Source must remain unmodified" - ) - assert len(fork.state.events) > source_event_count, ( - "Fork should have more events" - ) - - print("\n--- Source isolation ---") - print(f"Source events (unchanged): {len(source.state.events)}") - print(f"Fork events (grew) : {len(fork.state.events)}") - print("OK: Source is immutable after fork") - - # ============================================================= - # 4. Deep-copy isolation — event lists are independent - # ============================================================= - fork2 = source.fork() - fork2_initial = len(fork2.state.events) - fork2.send_message("Extra message only in fork2.") - - assert len(source.state.events) == source_event_count - assert len(fork2.state.events) == fork2_initial + 1 - print("\n--- Deep-copy isolation ---") - print("OK: Fork event list is independent from source") - - # ============================================================= - # 5. Fork with a different agent (tool-change / A/B testing) - # ============================================================= - alt_llm = LLM( - model="gpt-4o", - api_key=SecretStr("demo-key"), - usage_id="alt", - ) - alt_agent = Agent(llm=alt_llm, tools=[]) - - fork_alt = source.fork( - agent=alt_agent, - title="Tool-change experiment", - tags={"purpose": "a/b-test", "variant": "B"}, - ) - - print("\n--- Fork with alternate agent ---") - print(f"Fork ID : {fork_alt.id}") - print(f"Fork model : {fork_alt.agent.llm.model}") - print(f"Fork tags : {dict(fork_alt.state.tags)}") - print(f"Fork events : {len(fork_alt.state.events)}") - - assert fork_alt.agent.llm.model == "gpt-4o", ( - "Alternate agent should be used" - ) - assert fork_alt.state.tags.get("purpose") == "a/b-test" - assert len(fork_alt.state.events) == len(source.state.events) - print("OK: Fork uses alternate agent, retains event history") - - # ============================================================= - # 6. Metrics reset (default behaviour) - # ============================================================= - fork_reset = source.fork() - fork_keep = source.fork(reset_metrics=False) - - reset_cost = fork_reset.state.stats.get_combined_metrics().accumulated_cost - keep_cost = fork_keep.state.stats.get_combined_metrics().accumulated_cost - - print("\n--- Metrics ---") - print(f"Fork (reset=True) accumulated_cost: {reset_cost}") - print(f"Fork (reset=False) accumulated_cost: {keep_cost}") - print("OK: Metrics respect reset_metrics flag") - - # ============================================================= - # Summary - # ============================================================= - print(f"\n{'=' * 64}") - print("All assertions passed — fork() works correctly.") - print( - "\nIn a real workflow, call fork.run() to resume agentic execution" - "\nfrom the copied state. The agent will have full memory of the" - "\nsource conversation." - ) - print("=" * 64) - -# No LLM calls were made -print("EXAMPLE_COST: 0") -``` +llm = LLM( + model=os.getenv("LLM_MODEL", "anthropic/claude-sonnet-4-5-20250929"), + api_key=os.getenv("LLM_API_KEY"), + base_url=os.getenv("LLM_BASE_URL", None), +) -Since this example doesn't require LLM calls, you can run it directly: +agent = Agent(llm=llm, tools=[Tool(name=TerminalTool.name)]) +cwd = os.getcwd() -```bash icon="terminal" -cd software-agent-sdk -uv run python examples/01_standalone_sdk/48_conversation_fork.py +# ================================================================= +# 1. Run the source conversation +# ================================================================= +source = Conversation(agent=agent, workspace=cwd) +source.send_message("Run `echo hello-from-source` in the terminal.") +source.run() + +print("=" * 64) +print(" Conversation.fork() — SDK Example") +print("=" * 64) +print(f"\nSource conversation ID : {source.id}") +print(f"Source events count : {len(source.state.events)}") + +# ================================================================= +# 2. Fork and continue independently +# ================================================================= +fork = source.fork(title="Follow-up fork") +source_event_count = len(source.state.events) + +print("\n--- Fork created ---") +print(f"Fork ID : {fork.id}") +print(f"Fork events (copied) : {len(fork.state.events)}") +print(f"Fork title : {fork.state.tags.get('title')}") + +assert fork.id != source.id +assert len(fork.state.events) == source_event_count + +fork.send_message("Now run `echo hello-from-fork` in the terminal.") +fork.run() + +# Source is untouched +assert len(source.state.events) == source_event_count +print("\n--- After running fork ---") +print(f"Source events (unchanged): {source_event_count}") +print(f"Fork events (grew) : {len(fork.state.events)}") + +# ================================================================= +# 3. Fork with a different agent (tool-change / A/B testing) +# ================================================================= +alt_llm = LLM( + model=os.getenv("LLM_MODEL", "anthropic/claude-sonnet-4-5-20250929"), + api_key=os.getenv("LLM_API_KEY"), + base_url=os.getenv("LLM_BASE_URL", None), + usage_id="alt", +) +alt_agent = Agent(llm=alt_llm, tools=[Tool(name=TerminalTool.name)]) + +fork_alt = source.fork( + agent=alt_agent, + title="Tool-change experiment", + tags={"purpose": "a/b-test"}, +) + +print("\n--- Fork with alternate agent ---") +print(f"Fork ID : {fork_alt.id}") +print(f"Fork tags : {dict(fork_alt.state.tags)}") + +fork_alt.send_message("What command did you run earlier? Just tell me, no tools.") +fork_alt.run() + +print(f"Fork events : {len(fork_alt.state.events)}") + +# ================================================================= +# Summary +# ================================================================= +print(f"\n{'=' * 64}") +print("All done — fork() works end-to-end.") +print("=" * 64) + +# Report cost +cost = llm.metrics.accumulated_cost + alt_llm.metrics.accumulated_cost +print(f"EXAMPLE_COST: {cost}") ``` + + ## Next Steps - **[Persistence](/sdk/guides/convo-persistence)** — Save and restore conversation state