
refactor: make agent loop support parallel and update docs#2503

Merged
yinwm merged 1 commit into sipeed:main from cytown:loop
Apr 16, 2026

Conversation

Contributor

@cytown cytown commented Apr 13, 2026

📝 Description

Based on PR#2481; please review after that PR has been merged.

Refactored the AgentLoop in pkg/agent/loop.go to support parallel processing of user messages with correct response routing. Here's what was implemented:

Key Changes

 1. Worker Pool Architecture (loop.go):
    - Added workerPoolSize and workerSem (semaphore) fields to AgentLoop struct
    - Added activeWorkers counter for monitoring
    - Workers are dispatched via goroutines with semaphore-based concurrency limiting

 2. Enhanced `Run()` Method (loop.go:468-543):
    - Replaced sequential message processing with worker pool dispatcher
    - Messages with different session keys are processed in parallel (up to MaxParallelTurns)
    - Messages with same session key are enqueued to steering queue (preserves conversation integrity)
    - System messages are processed immediately in synchronous mode

 3. New Helper Methods:
    - processMessageSync(): Handles non-routable messages synchronously
    - runTurnWithSteering(): Runs a complete turn and drains steering queue in worker context

 4. Configuration Support (config.go:262):
    - Added MaxParallelTurns field to AgentDefaults struct
    - Default value: 1 (preserves backward compatibility - sequential processing)
    - Environment variable: PICOCLAW_AGENTS_DEFAULTS_MAX_PARALLEL_TURNS

 5. Removed Obsolete Code:
    - Removed drainBusToSteering() - no longer needed with direct steering queue enrollment
    - Removed associated test TestDrainBusToSteering_RequeuesDifferentScopeMessage

 6. Comprehensive Tests (loop_test.go):
    - TestParallelMessageProcessing_DifferentSessionsProcessedConcurrently: Verifies parallel execution (achieved 3
      concurrent turns)
    - TestParallelMessageProcessing_SameSessionProcessedSequentially: Verifies same-session messages are queued

How It Works

Inbound Message Flow:
┌─────────────────────────────────────────┐
│  bus.InboundChan()                      │
└──────────────┬──────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────┐
│  resolveSteeringTarget(msg)             │
│  Returns: sessionKey, agentID, ok       │
└──────────────┬──────────────────────────┘
               │
        ┌──────┴──────┐
        │             │
        ▼             ▼
   Not routable   Has session
   (system msg)   key
        │             │
        │             ├─► activeTurnStates.Load(sessionKey)?
        │             │         │
        │             │    Yes  │  No
        │             │         │     │
        ▼             │         ▼     ▼
 processMessageSync   │   Enqueue   Acquire workerSem
                      │   to        slot
                      │   steering  │
                      │   queue     ▼
                      │         runTurnWithSteering()
                      │         (processes message +
                      │          drains steering queue)

🗣️ Type of Change

  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 📖 Documentation update
  • ⚡ Code refactoring (no functional changes, no API changes)

🤖 AI Code Generation

  • 🤖 Fully AI-generated (100% AI, 0% Human)
  • 🛠️ Mostly AI-generated (AI draft, Human verified/modified)
  • 👨‍💻 Mostly Human-written (Human lead, AI assisted or none)

🔗 Related Issue

📚 Technical Context (Skip for Docs)

  • Reference URL:
  • Reasoning:

🧪 Test Environment

  • Hardware:
  • OS:
  • Model/Provider:
  • Channels:

📸 Evidence (Optional)

Click to view Logs/Screenshots

☑️ Checklist

  • My code/docs follow the style of this project.
  • I have performed a self-review of my own changes.
  • I have updated the documentation accordingly.

Collaborator

@yinwm yinwm left a comment


Thanks for the thorough refactor! The worker pool architecture is clean and well-documented. However, I found two critical issues with the shared placeholderTurnState singleton that need to be addressed before merging:

1. Worker placeholder cleanup can delete another worker's placeholder (breaks session serialization)

In loop.go:546-553, the safety-net defer checks ts.turnID == "pending" to detect a stale placeholder. Since placeholderTurnState is a package-level singleton shared across all sessions, Worker A's cleanup defer can accidentally delete Worker B's freshly-stored placeholder for the same session:

Worker A finishes → clearActiveTurn deletes session key
Run() main loop → LoadOrStore stores new placeholder → spawns Worker B
Worker A's cleanup defer → Load finds placeholder (Worker B's) → deletes it
Run() main loop → LoadOrStore succeeds again → spawns Worker C
→ Worker B and Worker C run concurrently for the same session ❌

Fix: Create a unique placeholder instance per LoadOrStore call (e.g., &turnState{turnID: "pending-" + sessionKey}), or use a unique ID to distinguish "my placeholder" from "someone else's placeholder" in the cleanup defer.

2. HardAbort/InterruptHard can mutate the shared placeholder singleton

HardAbort() (steering.go:489) does a type assertion tsInterface.(*turnState) which succeeds for placeholderTurnState since it's also *turnState. This calls ts.Finish(true) which permanently marks the global singleton as finished, affecting all future sessions that use it. Similarly, InterruptHard() via getAnyActiveTurnState() can call requestHardAbort() on the placeholder.

Fix: Add a guard in HardAbort and InterruptHard:

if ts.turnID == "pending" {
    return fmt.Errorf("turn is still initializing for session %s", sessionKey)
}

Additional non-blocking suggestions

  • Continue() placeholder handling: In steering.go:355-357, GetActiveTurnBySession returns non-nil for placeholder (turnID="pending"), causing Continue() to return an error instead of gracefully yielding. This can prematurely break the steering drain loop in runTurnWithSteering.
  • sentTargets memory leak: in message.go:68, ResetSentInRound truncates the slice but never deletes the map key. Over time with many unique sessions, the map grows unbounded.
  • Dead code: drainBusToSteering is no longer called from Run() (only referenced in steering_test.go).

These issues don't manifest when max_parallel_turns=1 (default), but will break session serialization when parallelism is enabled.
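The sentTargets point comes down to slice truncation versus map deletion. A minimal illustration (the sentTargets name is from message.go; everything else here is hypothetical):

```go
package main

import "fmt"

// resetLeaky mimics the reported bug: truncating the slice empties it but
// keeps the map key (and the slice's backing array) alive forever.
func resetLeaky(sentTargets map[string][]string, sessionKey string) {
	sentTargets[sessionKey] = sentTargets[sessionKey][:0]
}

// resetFixed releases the entry entirely, so the map cannot grow
// unbounded as unique sessions come and go.
func resetFixed(sentTargets map[string][]string, sessionKey string) {
	delete(sentTargets, sessionKey)
}

func main() {
	leaky := map[string][]string{"s1": {"a"}, "s2": {"b"}}
	fixed := map[string][]string{"s1": {"a"}, "s2": {"b"}}
	resetLeaky(leaky, "s1")
	resetFixed(fixed, "s1")
	fmt.Println(len(leaky), len(fixed)) // 2 1
}
```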

Contributor Author

cytown commented Apr 16, 2026


all done.

Collaborator

@yinwm yinwm left a comment


LGTM. Both CRITICAL issues from the first round are properly fixed:

  • Placeholder is now a unique per-claim instance (makePendingTurnID with sequence number)
  • HardAbort guards against pending turns with prefix check
  • sentTargets memory leak fixed with delete()
  • Dead code (drainBusToSteering, requeueInboundMessage) cleaned up

The extra cleanup (vision retry inline, code reorganization) is a nice bonus.

@yinwm yinwm merged commit eb24269 into sipeed:main Apr 16, 2026
4 checks passed
@cytown cytown deleted the loop branch April 16, 2026 16:46
