Skip to content

feat: user-to-CEO message channel#194

Merged
akashgit merged 7 commits into
akashgit:mainfrom
RobotSail:feat/message-channel
May 8, 2026
Merged

feat: user-to-CEO message channel#194
akashgit merged 7 commits into
akashgit:mainfrom
RobotSail:feat/message-channel

Conversation

@RobotSail
Copy link
Copy Markdown
Contributor

Summary

Adds factory message <path> "text" — a communication channel from user to CEO. Messages are injected into the CEO's task string as a ## User Messages section at the start of each cycle, then marked as read.

How it works

  1. User runs: factory message /path/to/project "fix quality gates first"
  2. Message is written to .factory/messages/<timestamp-uuid>.md
  3. At the start of the next cycle, _build_ceo_task() reads pending messages and prepends them to the CEO's task as HIGH PRIORITY directives
  4. After injection, messages are moved to .factory/messages/read/

Files

File Change
factory/messages.py New — Message dataclass, write_message, read_pending, mark_read
factory/cli.py cmd_message handler + _build_ceo_task injection + parser/handler registration
tests/test_messages.py 13 tests — lifecycle, CLI, and injection

Test plan

  • 13 tests pass
  • ruff clean

@codecov
Copy link
Copy Markdown

codecov Bot commented May 7, 2026

Codecov Report

❌ Patch coverage is 88.09524% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.97%. Comparing base (4969e3c) to head (3204dac).

Files with missing lines Patch % Lines
factory/cli.py 82.22% 8 Missing ⚠️
factory/messages.py 91.35% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #194      +/-   ##
==========================================
+ Coverage   86.93%   86.97%   +0.03%     
==========================================
  Files          50       51       +1     
  Lines        7151     7276     +125     
==========================================
+ Hits         6217     6328     +111     
- Misses        934      948      +14     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@akashgit
Copy link
Copy Markdown
Owner

akashgit commented May 7, 2026

This is for when the factory is running in run mode and not ceo I guess?

Copy link
Copy Markdown
Owner

@akashgit akashgit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: feat: user-to-CEO message channel

Clean feature — the message queue + injection approach is well-designed and the tests are thorough. Four issues to fix:

1. Side effect in _build_ceo_task (correctness)

mark_read is called inside a function whose name and docstring say it builds a task string. If the CEO subprocess fails to start after task construction, messages are silently lost from the pending queue (moved to read/ but never re-injected).

Fix: return the pending message IDs from _build_ceo_task (or read them at the call site) and call mark_read after the CEO subprocess is successfully spawned.

2. Double datetime.now() in write_message (messages.py:42-43)

msg_id = datetime.now(timezone.utc).strftime(...)  # call 1
ts = datetime.now(timezone.utc)                     # call 2

Two separate datetime.now() calls. If they straddle a microsecond boundary, the ID prefix won't match the stored timestamp. Use one:

ts = datetime.now(timezone.utc)
msg_id = ts.strftime("%Y%m%dT%H%M%S%f") + "-" + uuid.uuid4().hex[:8]

3. Dead code guard in read_pending (messages.py:62)

if path.parent.name == "read":
    continue

glob("*.md") only matches files directly in msg_dir, never in subdirectories. The read/ subdir files are never yielded, so this check is unreachable. Remove it.

4. No project validation in cmd_message

write_message calls msg_dir.mkdir(parents=True, exist_ok=True), which means factory message /nonexistent "hello" happily creates /nonexistent/.factory/messages/ from nothing. Should validate the project path exists (and ideally has .factory/) before writing.

Copy link
Copy Markdown
Owner

@akashgit akashgit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: feat: user-to-CEO message channel (post-fixup)

The fixup commit cleanly addressed all 4 issues from the prior review. Core module is solid — clean dataclass, filesystem-based queue, good test coverage (15 tests, all pass, ruff clean). Two issues remain in the CLI integration:

1. Messages consumed on CEO failure (correctness)

In both cmd_ceo headless path (line 1364) and _run_single_cycle (line 1977), mark_read fires regardless of exit code. If the CEO crashes mid-run (code != 0), the messages were injected into the task string but may never have been processed. They're now gone from the pending queue.

Fix: only mark read on success:

if code == 0:
    if pending_ids:
        mark_read(project_path, pending_ids)
    clear_checkpoint(project_path)

2. Interactive mode marks read before execution (correctness)

cmd_ceo line 1379 calls mark_read before interactive_exec. The headless path correctly marks read after, but interactive_exec never returns (it takes over the terminal), so there's no post-execution hook.

This means if the interactive session fails to launch or the user kills it immediately, messages are lost.

Options:

  • Accept as a known limitation and add a comment explaining why
  • Skip mark_read in interactive mode entirely and let the CEO's next cycle re-inject them (safe but noisy)
  • Have interactive_exec return a status (larger change)

I'd go with option 1 — the interactive path is inherently best-effort since the user is at the terminal and can re-send.

Minor nits (non-blocking)

  • Double read_pending per cycle: _build_ceo_task reads pending internally, then the call site reads again for IDs. Could pass messages through as a parameter to avoid the redundant disk read.
  • shutil.movePath.rename: Source and dest are always on the same filesystem (within .factory/). rename is simpler and atomic on Unix.

@RobotSail
Copy link
Copy Markdown
Contributor Author

@akashgit These are very good catches, fixing now

@RobotSail
Copy link
Copy Markdown
Contributor Author

Fixed

Copy link
Copy Markdown
Owner

@akashgit akashgit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review (post second fixup)

Previous issues are all resolved. One new issue found:

TOCTOU between _build_ceo_task and pending_ids capture (message loss)

Both cmd_ceo (lines 1336→1344) and _run_single_cycle (lines 1957→1969) follow this pattern:

task = _build_ceo_task(...)          # internally calls read_pending(), injects messages
pending_ids = [m.id for m in read_pending(project_path)]  # reads again
# ... CEO runs ...
mark_read(project_path, pending_ids)  # marks the second set read

A message written between step 1 and step 2 (user runs factory message in another terminal) gets captured in pending_ids but was never injected into the task string. It then gets marked as read on CEO success. That message is silently lost — not in this cycle's task, not available for the next cycle.

Fix: read once, pass through. Have _build_ceo_task accept an optional messages parameter, and capture IDs from the same list:

pending = read_pending(project_path)
pending_ids = [m.id for m in pending]
task = _build_ceo_task(project_path, mode, ..., messages=pending)

This also eliminates the double read_pending nit from the prior review.


Everything else looks clean. Tests pass (15/15), ruff and mypy clean. The mark-read-on-success logic and the interactive-mode comment are both correct.

Copy link
Copy Markdown
Contributor Author

@RobotSail RobotSail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d8b4515. Messages are now read once at the call site and passed to _build_ceo_task via a new messages parameter. Both cmd_ceo and _run_single_cycle follow the same pattern:

pending = read_pending(project_path)
pending_ids = [m.id for m in pending]
task = _build_ceo_task(..., messages=pending)
# ... CEO runs ...
if code == 0:
    mark_read(project_path, pending_ids)

This eliminates both the TOCTOU race and the double read_pending nit.

Copy link
Copy Markdown
Owner

@akashgit akashgit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review (post TOCTOU fix)

All issues from 3 prior rounds are resolved. Clean and ready to merge with one minor fix:

Type hint listlist[Message] in _build_ceo_task (type safety)

cli.py line ~1758:

messages: list | None = None,

Should be:

from __future__ import annotations
from factory.messages import Message
# ...
messages: list[Message] | None = None,

Bare list erases type safety — msg.timestamp and msg.text accesses inside the function are unchecked by mypy. Since the project runs mypy factory/, this should be typed.

Use TYPE_CHECKING guard if you want to avoid the runtime import:

from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from factory.messages import Message

Everything else is solid. Fix the type hint and this is good to merge. Also update the PR description test count (says 13, actually 15).

Copy link
Copy Markdown
Contributor Author

@RobotSail RobotSail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in c7248af. Added TYPE_CHECKING guard and typed the parameter as list[Message] | None. Also updated the PR description test count to 15.

RobotSail and others added 5 commits May 8, 2026 12:58
Adds `factory message <path> "text"` to queue directives for the CEO.
Messages are injected into the CEO's task string as a high-priority
`## User Messages` section at the start of each cycle, then marked read.

- New module: factory/messages.py (write_message, read_pending, mark_read)
- CLI: cmd_message + _build_ceo_task injection
- 13 tests covering lifecycle, CLI, and injection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ation

1. Move mark_read out of _build_ceo_task to call sites (_run_single_cycle,
   cmd_ceo) so messages aren't lost if CEO fails to spawn
2. Use single datetime.now() in write_message
3. Remove unreachable path.parent.name == "read" guard in read_pending
4. Add project path validation in cmd_message (reject nonexistent/non-factory)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…itation

1. mark_read only fires when code == 0 (headless + _run_single_cycle)
2. Interactive mode: mark before exec with comment explaining why
3. shutil.move → Path.rename (atomic on Unix, same filesystem)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…o_task

Read pending messages once at the call site, capture IDs from the same
list, and pass messages as a parameter to _build_ceo_task. Prevents a
message written between task construction and ID capture from being
marked read without injection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@RobotSail RobotSail force-pushed the feat/message-channel branch from c7248af to 5a9ebcd Compare May 8, 2026 17:02
Copy link
Copy Markdown
Contributor Author

@RobotSail RobotSail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebased on latest main (5a9ebcd). All merge conflicts with #195 (finalize gate) and #137 (sprint resume) resolved. 15 tests pass, ruff clean.

Copy link
Copy Markdown
Owner

@akashgit akashgit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review

Clean, well-scoped feature. The file-based queue with inject-then-mark-read lifecycle is the right design for CLI-to-subprocess communication. Test coverage is thorough (13 tests across write, read, mark-read, CLI, injection, validation).

Must fix

No message size/count limitsread_pending reads all *.md files with no cap. A runaway script or accidental glob could create thousands of message files, all concatenated into the CEO task string. Add a cap (e.g., 20 messages or a total character limit) and warn when messages are dropped.

Looks good

  • TYPE_CHECKING import is safe — from __future__ import annotations is at the top of cli.py
  • Message lifecycle is correct: read before task build, mark-read only on success (code == 0), re-injected on failure
  • Interactive path (os.execvp) marks read before exec with a clear comment explaining the tradeoff
  • Heartbeat loop picks up messages per-cycle via _run_single_cycle — new messages sent mid-run are caught on the next cycle
  • Messages arriving between read_pending() and mark_read() aren't lost — they stay pending for the next cycle

Warns when messages are dropped due to caps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor Author

@RobotSail RobotSail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d77ca71. read_pending now caps at 20 messages / 50k total characters. Warns via structlog when messages are dropped.

akashgit
akashgit previously approved these changes May 8, 2026
Copy link
Copy Markdown
Owner

@akashgit akashgit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — LGTM

Cap fix addresses the previous concern cleanly. read_pending now caps at 20 messages / 50K chars, warns via structlog when capped, and the and messages guard ensures at least one message is always returned even if it exceeds the char limit. Capped messages stay pending for the next cycle — correct behavior.

Nit (non-blocking)

Trailing whitespace on the standup line in cli.py — the + line has a trailing space after the closing ". Ruff W291 will likely catch this.

task += f"\n\n## Sprint Standup\n\n{standup}" 
                                               ^ trailing space

…verage

Round 1 findings from 3 independent reviewers addressed:

- Frozen dataclass (Message is now immutable)
- Input validation: empty text rejection, MAX_MESSAGE_CHARS=10k, rate limiting
- Path containment in mark_read (resolve + is_relative_to)
- Symlink checks in read_pending and mark_read (before resolve)
- Timestamp parse failure logging
- cmd_message: docstring, try/except ValueError, stderr for interactive IDs
- PEP 8: removed double blank line in _run_single_cycle
- 10 new tests: caps, validation, malformed files, edge cases (25 total)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor Author

@RobotSail RobotSail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial review fixes (3204dac)

Ran 3 independent reviewers (code quality, Python idioms, security) in parallel for 2 rounds. All findings from round 1 addressed, round 2 all approved.

Security hardening

  • Path traversal protection in mark_read: resolve() + is_relative_to() rejects crafted message IDs
  • Symlink checks in both read_pending and mark_read: skip symlinked files before reading/renaming
  • Input validation: MAX_MESSAGE_CHARS=10,000 per message, empty/whitespace rejection
  • Rate limiting: write_message rejects when >= MAX_PENDING_MESSAGES (20) files exist

Correctness

  • Frozen dataclass: @dataclass(frozen=True) for immutability and correct equality
  • ValueError handling: cmd_message wraps write_message() in try/except for clean error output
  • Timestamp parse warning: logs via structlog when timestamp header is malformed
  • Interactive mode: prints consumed message IDs to stderr before mark_read

Test coverage

  • 25 tests total (up from 15): caps, validation, rate limiting, malformed files, edge cases
  • PEP 8 fix: removed double blank line in _run_single_cycle
  • Docstring added to cmd_message

@akashgit akashgit merged commit f72f5e6 into akashgit:main May 8, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants