[Refactor] Introduce deep_copy() and replace() methods for Agent and LLM #2935

@VascoSch92

Problem

Agent and LLM copying/updating is scattered across the codebase using raw Pydantic methods (model_copy, model_dump/model_validate). This leads to:

  1. Inconsistent patterns - Some places use model_copy(update={...}), others use model_dump()/model_validate() roundtrip
  2. Hidden footguns - PR #2923 (fix(conversation): prevent fork(agent=...) from clobbering source prompt_cache_key) fixed a bug where aliased agents could clobber each other's _prompt_cache_key because the deep-copy requirement was not obvious
  3. No single place for documentation - The reason for the model_dump()/model_validate() roundtrip (threading.Lock cannot be pickled or deep-copied) is explained only in scattered comments
  4. Violates "agents are immutable" principle - Raw model_copy makes mutation patterns non-obvious
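
The lock constraint behind items 2 and 3 can be reproduced with the standard library alone. The sketch below is not SDK code: `Worker`, `dump()` and `load()` are hypothetical stand-ins for an agent/LLM and its model_dump()/model_validate() pair. It shows that a deep copy fails on the lock, a shallow copy silently aliases private state, and a dump/rebuild roundtrip yields an independent object.

```python
import copy
import threading


class Worker:
    """Hypothetical stand-in for an object that keeps a lock in a private attribute."""

    def __init__(self, name: str, tags: list) -> None:
        self.name = name
        self.tags = tags
        self._lock = threading.Lock()  # private state, not part of the public data

    def dump(self) -> dict:
        # Export public data only; copy the list so the result shares nothing.
        return {"name": self.name, "tags": list(self.tags)}

    @classmethod
    def load(cls, data: dict) -> "Worker":
        # Rebuilding through __init__ creates a fresh lock.
        return cls(**data)


original = Worker("main", ["a"])

# Deep copy fails because lock objects cannot be pickled or copied.
try:
    copy.deepcopy(original)
    deepcopy_failed = False
except TypeError:
    deepcopy_failed = True

# Shallow copy succeeds but aliases both the list and the lock.
shallow = copy.copy(original)

# The roundtrip produces an independent object graph.
clone = Worker.load(original.dump())
clone.tags.append("b")
```

After this runs, `original.tags` is still `["a"]`, while `shallow` shares the same list and lock as `original`, which is the kind of aliasing the proposal wants to make hard to hit by accident.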

Solution

Add 4 methods to provide a clean, consistent API:

# On AgentBase
def deep_copy(self) -> "AgentBase": ...
def replace(self, *, llm=UNSET, agent_context=UNSET, mcp_config=UNSET) -> "AgentBase": ...

# On LLM
def deep_copy(self) -> "LLM": ...
def replace(self, *, stream=UNSET, usage_id=UNSET) -> "LLM": ...

Implementation Sketch

# In openhands/sdk/utils/copy.py (or inline in each class)

class _Unset:
    """Sentinel for distinguishing 'not provided' from None."""
    __slots__ = ()
    def __repr__(self) -> str:
        return "UNSET"

UNSET = _Unset()

# In openhands/sdk/agent/base.py

class AgentBase(DiscriminatedUnionMixin, ABC):
    # ... existing code ...

    def deep_copy(self) -> "AgentBase":
        """Create an independent deep-copy of this agent.
        
        Returns a new agent with its own object graph, including a new LLM
        instance. Safe for use in independent conversations (e.g., forks).
        
        Uses a model_dump()/model_validate() roundtrip because agent/LLM
        private attributes hold threading.Lock objects, which
        copy.deepcopy() and model_copy(deep=True) cannot copy
        (deep-copying a lock raises TypeError).
        """
        cls = type(self)
        return cls.model_validate(
            self.model_dump(context={"expose_secrets": True}),
        )

    def replace(
        self,
        *,
        llm: LLM | _Unset = UNSET,
        agent_context: AgentContext | None | _Unset = UNSET,
        mcp_config: dict[str, Any] | _Unset = UNSET,
    ) -> "AgentBase":
        """Return a new agent with the specified field(s) replaced.

        The original agent is unchanged (agents are immutable). If no
        field is given, the same instance is returned.
        """
        updates: dict[str, Any] = {}
        if not isinstance(llm, _Unset):
            updates["llm"] = llm
        if not isinstance(agent_context, _Unset):
            updates["agent_context"] = agent_context
        if not isinstance(mcp_config, _Unset):
            updates["mcp_config"] = mcp_config
        
        if not updates:
            return self
        return self.model_copy(update=updates)

# In openhands/sdk/llm/llm.py

class LLM(BaseModel, RetryMixin, NonNativeToolCallingMixin):
    # ... existing code ...

    def deep_copy(self) -> "LLM":
        """Create an independent deep-copy of this LLM.
        
        Returns a new LLM with its own metrics, tokenizer, and other
        internal state.
        """
        cls = type(self)
        return cls.model_validate(
            self.model_dump(context={"expose_secrets": True}),
        )

    def replace(
        self,
        *,
        stream: bool | _Unset = UNSET,
        usage_id: str | _Unset = UNSET,
    ) -> "LLM":
        """Return a new LLM with the specified field(s) replaced.

        The original LLM is unchanged. If no field is given, the same
        instance is returned.
        """
        updates: dict[str, Any] = {}
        if not isinstance(stream, _Unset):
            updates["stream"] = stream
        if not isinstance(usage_id, _Unset):
            updates["usage_id"] = usage_id
        
        if not updates:
            return self
        return self.model_copy(update=updates)
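
The role of the UNSET sentinel in replace() can be checked without Pydantic. The following is a minimal sketch under stated assumptions: `FakeAgent` is a hypothetical frozen dataclass standing in for AgentBase, and `dataclasses.replace` stands in for model_copy(update=...). It shows that passing None explicitly clears a field, while omitting the argument leaves it untouched.

```python
from dataclasses import dataclass, replace as dc_replace
from typing import Any, Dict, Optional, Union


class _Unset:
    """Sentinel for distinguishing 'not provided' from None."""

    __slots__ = ()

    def __repr__(self) -> str:
        return "UNSET"


UNSET = _Unset()


@dataclass(frozen=True)
class FakeAgent:
    """Hypothetical stand-in for AgentBase (not SDK code)."""

    llm: str
    agent_context: Optional[str] = None

    def replace(
        self,
        *,
        llm: Union[str, _Unset] = UNSET,
        agent_context: Union[str, None, _Unset] = UNSET,
    ) -> "FakeAgent":
        # Collect only the fields the caller actually passed.
        updates: Dict[str, Any] = {}
        if not isinstance(llm, _Unset):
            updates["llm"] = llm
        if not isinstance(agent_context, _Unset):
            updates["agent_context"] = agent_context
        # No updates: return the same (immutable) instance.
        return dc_replace(self, **updates) if updates else self


agent = FakeAgent(llm="model-a", agent_context="ctx")
cleared = agent.replace(agent_context=None)  # explicit None clears the field
same = agent.replace()                       # nothing provided, nothing changes
```

With a plain `None` default instead of UNSET, `agent.replace(agent_context=None)` would be indistinguishable from `agent.replace()`, so the context could never be cleared through this method.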

Call Sites to Refactor

Agent deep_copy() (3 sites)

File Line Current
local_conversation.py 349-350 agent_cls.model_validate(self.agent.model_dump(...))
remote_conversation.py 1365-1366 agent_cls.model_validate(self.agent.model_dump(...))
event_service.py 473-474 agent_cls.model_validate(self.stored.agent.model_dump(...))

Agent replace() (5 sites)

File Line Fields
local_conversation.py 476-481 agent_context, mcp_config
local_conversation.py 627 llm
plugin/loader.py 104-109 agent_context, mcp_config
task/manager.py 314-316 llm
delegate/impl.py 173-177 llm

LLM deep_copy() (1 site)

File Line Current
delegate/impl.py 164 parent_llm.model_copy()

LLM replace() (10 sites; 9 listed below)

File Line Fields
local_conversation.py 624 usage_id
local_conversation.py 1015-1020 usage_id (currently uses deep=True, change to deep_copy().replace())
task/manager.py 306 stream
task/manager.py 315 stream
delegate/impl.py 175 stream
preset/default.py 85 usage_id
preset/gpt5.py 72 usage_id
preset/planning.py 175 usage_id
preset/gemini.py 101 usage_id

Total: 19 call sites across 10 files

Example Transformations

# Deep copy
# Before:
agent_cls = type(self.agent)
fork_agent = agent_cls.model_validate(
    self.agent.model_dump(context={"expose_secrets": True}),
)
# After:
fork_agent = self.agent.deep_copy()

# Replace single field
# Before:
self.agent = self.agent.model_copy(update={"llm": new_llm})
# After:
self.agent = self.agent.replace(llm=new_llm)

# Replace multiple fields
# Before:
self.agent = self.agent.model_copy(
    update={"agent_context": ctx, "mcp_config": mcp}
)
# After:
self.agent = self.agent.replace(agent_context=ctx, mcp_config=mcp)

# LLM with deep copy + replace (line 1015)
# Before:
question_llm = self.agent.llm.model_copy(
    update={"usage_id": "ask-agent-llm"},
    deep=True,
)
# After:
question_llm = self.agent.llm.deep_copy().replace(usage_id="ask-agent-llm")

Notes

  • deep_copy() uses a model_dump()/model_validate() roundtrip because threading.Lock objects in private attrs cannot be deep-copied
  • replace() uses model_copy(update={...}) internally (a shallow copy is fine when replacing with new objects)
  • UNSET sentinel allows distinguishing "not provided" from None
  • Consider adding a ruff/pre-commit rule to flag raw model_copy() on Agent/LLM
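
As a starting point for the lint idea in the last note, a pre-commit hook could run a small script built on the standard `ast` module. This is only a sketch: `find_model_copy_calls` is a hypothetical helper, and the check is purely syntactic, so it flags every `.model_copy(...)` call regardless of whether the receiver is an Agent or LLM. A real rule would need an allowlist or type information.

```python
import ast
from typing import List, Tuple


def find_model_copy_calls(source: str) -> List[Tuple[int, int]]:
    """Return (line, column) for every `<expr>.model_copy(...)` call in source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "model_copy"
        ):
            hits.append((node.lineno, node.col_offset))
    return sorted(hits)


sample = (
    "agent = agent.model_copy(update={'llm': new_llm})\n"
    "agent = agent.replace(llm=new_llm)\n"
)
hits = find_model_copy_calls(sample)
flagged_lines = [line for line, _ in hits]
```

Here only the first line of `sample` is flagged; the `replace()` call on the second line passes.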
