Merged
1 change: 1 addition & 0 deletions .devswarm/telemetry-batch-fix.json
@@ -0,0 +1 @@
{"swarm_id":"4c16deee-ccaa-9a52-27e7-5f3c349ca7af","repo":"git@github.com:justrach/devswarm.git","task":"Fix these open issues in the justrach/devswarm repo. Each agent should tackle one issue, read the issue details, find the relevant code, and implement the fix with tests.\n\nIssues to fix:\n\n1. #379 — devswarm binds to wrong repo when launched as global MCP server. Find where the repo path is resolved on startup and fix it to use the correct working directory or explicit config.\n\n2. #376 — config.loadDefault called N+2 times per swarm — uncached disk reads. Cache the config after first load so subsequent calls return the cached value instead of hitting disk.\n\n3. #375 — readLine 1-byte syscall loop in agent_sdk.zig — 10K syscalls per response. Replace the 1-byte read loop with a buffered reader to reduce syscall overhead.\n\n4. #223 — hot_cache LRU insert leaks pool.create() on map.put() OOM. Fix the OOM handling so the pool allocation is freed if the map insertion fails.\n\n5. #309 — run_swarm is Codex-only + batched run_agent calls execute sequentially. 
Make run_swarm work with other providers and fix batched run_agent to actually parallelize.\n\nFor each fix: read the issue on GitHub first, understand the bug, find the relevant source files, implement the fix, and add or update tests.","grids":[{"name":"worker","workers":[{"id":0,"role":"fixer","model":"claude-sonnet-4-6","tool_calls":0,"tokens_in":1335629,"tokens_out":20065,"wall_ms":404995,"errors":0,"success":true},{"id":1,"role":"zig_specialist","model":"claude-sonnet-4-6","tool_calls":0,"tokens_in":432786,"tokens_out":12687,"wall_ms":286487,"errors":0,"success":true},{"id":2,"role":"zig_specialist","model":"claude-sonnet-4-6","tool_calls":0,"tokens_in":463826,"tokens_out":7793,"wall_ms":144736,"errors":0,"success":true},{"id":3,"role":"zig_specialist","model":"claude-sonnet-4-6","tool_calls":0,"tokens_in":689016,"tokens_out":9545,"wall_ms":231072,"errors":0,"success":true},{"id":4,"role":"fixer","model":"claude-sonnet-4-6","tool_calls":0,"tokens_in":4374798,"tokens_out":26999,"wall_ms":604384,"errors":0,"success":true}],"synthesis_ms":0,"total_tokens":7373144,"total_tool_calls":0}],"total_cost_usd":23.044500,"total_wall_ms":635588,"parallelism_achieved":2.77,"parallelism_theoretical":5}
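Reviewer note: the aggregate fields in this record look derivable from the per-worker values. The sketch below copies the numbers from the JSON above and checks two relationships that are inferred from the data, not confirmed against devswarm's telemetry code: `total_tokens` equals the sum of `tokens_in + tokens_out`, and `parallelism_achieved` equals the summed `wall_ms` divided by the longest single worker's `wall_ms`. The function names are illustrative only.

```python
# Per-worker (tokens_in, tokens_out, wall_ms), copied from telemetry-batch-fix.json.
WORKERS = [
    (1335629, 20065, 404995),
    (432786, 12687, 286487),
    (463826, 7793, 144736),
    (689016, 9545, 231072),
    (4374798, 26999, 604384),
]


def total_tokens(workers):
    """Sum of input and output tokens across all workers."""
    return sum(t_in + t_out for t_in, t_out, _ in workers)


def parallelism_achieved(workers):
    """Assumed definition: total worker time over the slowest worker's time."""
    walls = [wall for _, _, wall in workers]
    return sum(walls) / max(walls)


if __name__ == "__main__":
    print(total_tokens(WORKERS))                     # 7373144, matches "total_tokens"
    print(round(parallelism_achieved(WORKERS), 2))   # 2.77, matches "parallelism_achieved"
```

The same two relationships also hold for the other three telemetry files in this change (2.54, 1.65, and 2.14 respectively), which supports the reading but does not prove it is how the values are computed.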
1 change: 1 addition & 0 deletions .devswarm/telemetry-bugs-387-390.json
@@ -0,0 +1 @@
{"swarm_id":"d461a770-e545-4806-94b6-9a203a76e7cb","repo":"git@github.com:justrach/devswarm.git","task":"Fix these 4 new bugs in justrach/devswarm. Each agent tackles one issue.\n\n1. #387 — MCP server ignores shutdown/exit and stays alive after client shutdown. Find where the MCP server handles lifecycle messages (shutdown, exit, notifications) in main.zig and ensure the server exits cleanly when the client sends shutdown or exit.\n\n2. #388 — run_swarm workers ignore orchestrator role and always run with fixer prompt. In swarm.zig, look at how the orchestrator's role field from the JSON decomposition is passed to worker threads. The role from the orchestrator output should flow through to the worker's AgentRequest, not be hardcoded.\n\n3. #389 — Writable run_swarm embeds a second full agency/tools preamble into worker task text. In swarm.zig, look at buildPreamble() and how it's prepended to worker prompts. When workers already get a system prompt via the runtime resolver, the preamble is redundant and bloats the prompt. Fix so the preamble is only added when needed or deduplicated.\n\n4. #390 — Nested codex app-server spawn inherits parent CODEX_* desktop session variables. 
In codex_appserver.zig or agent_sdk.zig, find where child processes are spawned and ensure CODEX_* environment variables from the parent desktop session are stripped before spawning nested agents.\n\nFor each: read the issue, find the code, implement the fix, add tests.","grids":[{"name":"worker","workers":[{"id":0,"role":"zig_specialist","model":"claude-sonnet-4-6","tool_calls":0,"tokens_in":321927,"tokens_out":3972,"wall_ms":144925,"errors":0,"success":true},{"id":1,"role":"zig_specialist","model":"claude-sonnet-4-6","tool_calls":0,"tokens_in":657774,"tokens_out":9757,"wall_ms":260315,"errors":0,"success":true},{"id":2,"role":"zig_specialist","model":"claude-sonnet-4-6","tool_calls":0,"tokens_in":736853,"tokens_out":11065,"wall_ms":255206,"errors":0,"success":true}],"synthesis_ms":0,"total_tokens":1741348,"total_tool_calls":0}],"total_cost_usd":5.521572,"total_wall_ms":336996,"parallelism_achieved":2.54,"parallelism_theoretical":4}
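Reviewer note: item 4 in the task above (#390) asks that `CODEX_*` variables be stripped before nested agents are spawned. The following is a minimal Python sketch of that idea, not the Zig fix itself. `scrubbed_env` and `spawn_nested` are hypothetical helper names, and `CODEX_SESSION_ID` in the usage note is a made-up variable name used only for illustration.

```python
import os
import subprocess


def scrubbed_env(env, prefix="CODEX_"):
    """Return a copy of env without any variable whose name starts with prefix."""
    return {key: value for key, value in env.items() if not key.startswith(prefix)}


def spawn_nested(argv, env=None):
    """Run a child process with the parent's prefixed session variables removed.

    Passing env= explicitly replaces the inherited environment, so the child
    sees only the variables that survive the filter.
    """
    base = dict(os.environ) if env is None else dict(env)
    return subprocess.run(
        argv,
        env=scrubbed_env(base),
        capture_output=True,
        text=True,
        check=False,
    )
```

For example, `scrubbed_env({"CODEX_SESSION_ID": "abc", "PATH": "/usr/bin"})` keeps only `PATH`. Filtering by prefix rather than by an explicit list means newly introduced `CODEX_*` variables are dropped too, at the cost of also dropping any that a nested agent might legitimately need.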
1 change: 1 addition & 0 deletions .devswarm/telemetry-evolver.json
@@ -0,0 +1 @@
{"swarm_id":"4f778bbe-22fd-e909-a99d-769f735f7cbf","repo":"git@github.com:justrach/devswarm.git","task":"Build the evolutionary prompt optimization foundation for devswarm. This ties together issues #148 (evolver epic), #149 (core types), #353 (QD prompt optimization), #354 (eval framework), and #274 (grid tuning).\n\nThe goal: create `src/evolver.zig` with core types, archive management, and a weighted prompt selection system inspired by MAP-Elites QD and weighted frames (where weighted distributions over prompt variants preserve robustness).\n\nIMPORTANT CONTEXT from existing code:\n- `src/runtime/roles.zig` has 12 static RoleSpec prompts (finder, reviewer, fixer, explorer, architect, safety_auditor, zig_specialist, api_reviewer, test_writer, monitor, orchestrator, synthesizer)\n- `src/runtime/types.zig` has `RoleSpec = struct { name, writable, system_prompt }`\n- `src/grid.zig` has `Role = struct { name, model, max_tool_calls, tool_allowlist }`\n- `src/telemetry.zig` has `WorkerMetrics = struct { worker_id, role, model, tool_calls, tokens_in, tokens_out, wall_ms, errors, success }`\n- `Worker.allocated_prompt` in swarm.zig is an unused injection slot for prompt variants\n\nBuild the following in `src/evolver.zig`:\n\n1. **Core types** (issue #149):\n```zig\nPromptVariant = struct {\n id: u64,\n role: []const u8, // which role this is a variant for\n prompt: []const u8, // the actual system prompt text\n parent_id: ?u64, // lineage tracking\n fitness: f64, // composite score [0.0, 1.0]\n generation: u32,\n eval_count: u32, // how many times evaluated\n behavior: BehaviorDescriptor, // for MAP-Elites grid placement\n};\n\nBehaviorDescriptor = struct {\n token_efficiency: f32, // normalized tokens_out / task_complexity\n thoroughness: f32, // normalized tool_calls\n};\n\nEvaluationResult = struct {\n success: bool,\n tokens_in: u64,\n tokens_out: u64,\n wall_ms: u64,\n tool_calls: u32,\n errors: u32,\n};\n```\n\n2. 
**Archive** (issue #157) — MAP-Elites style grid per role:\n```zig\nArchive = struct {\n // 8x8 grid per role, keyed by BehaviorDescriptor quantized to cells\n cells: [GRID_SIZE][GRID_SIZE]?PromptVariant per role\n \n fn insert(variant: PromptVariant) — inserts if cell empty or variant has higher fitness\n fn sample(role, rng) — weighted selection, higher fitness = higher probability\n fn bestForRole(role) — returns highest fitness variant for a role\n fn toJson / fromJson — persistence to .devswarm/archive.json\n};\n```\n\n3. **Fitness computation** from WorkerMetrics:\n```zig\nfn computeFitness(metrics: WorkerMetrics) f64\n// 0.5 * success + 0.2 * cost_efficiency + 0.15 * speed + 0.15 * (1 - error_rate)\n```\n\n4. **Weighted selection** (the key insight from the weighted frames paper):\n```zig\nfn weightedSelect(archive: *Archive, role: []const u8, rng: *Random) ?*PromptVariant\n// Softmax over fitness values with temperature — NOT argmax\n// This preserves \"continuity\" — small fitness differences → small probability differences\n// Analogous to robust weighted frames vs discontinuous canonicalization\n```\n\n5. **Integration helper** — bridge between evolver and swarm:\n```zig\nfn resolvePromptForRole(archive: *Archive, role: []const u8, rng: *Random) ?[]const u8\n// Returns evolved prompt if archive has one for this role, null otherwise\n// Caller falls back to built-in RoleSpec.system_prompt when null\n```\n\n6. 
**Comprehensive tests** for all of the above — at least 15 tests covering:\n- PromptVariant JSON round-trip\n- Archive insert (empty cell, fitness improvement, fitness regression rejected)\n- Archive persistence (toJson → fromJson round-trip)\n- Fitness computation from WorkerMetrics\n- Weighted selection distribution (higher fitness → higher selection probability)\n- Weighted selection with single variant (deterministic)\n- resolvePromptForRole with empty archive (returns null)\n- resolvePromptForRole with populated archive\n- BehaviorDescriptor quantization to grid cells\n- Archive sample across all roles\n\nDo NOT modify any existing files. Create only `src/evolver.zig`. This is the foundation that future work (mutation operators, meta-loop, MCP tools) will build on.","grids":[{"name":"worker","workers":[{"id":0,"role":"finder","model":"claude-sonnet-4-6","tool_calls":0,"tokens_in":425012,"tokens_out":4648,"wall_ms":145911,"errors":0,"success":true},{"id":1,"role":"architect","model":"claude-opus-4-6","tool_calls":0,"tokens_in":246525,"tokens_out":7514,"wall_ms":268314,"errors":0,"success":true},{"id":2,"role":"api_reviewer","model":"claude-sonnet-4-6","tool_calls":0,"tokens_in":358026,"tokens_out":7678,"wall_ms":208012,"errors":0,"success":true},{"id":3,"role":"reviewer","model":"claude-sonnet-4-6","tool_calls":0,"tokens_in":41628,"tokens_out":5298,"wall_ms":235734,"errors":0,"success":true},{"id":4,"role":"zig_specialist","model":"claude-sonnet-4-6","tool_calls":0,"tokens_in":2677642,"tokens_out":85937,"wall_ms":1320776,"errors":0,"success":true}],"synthesis_ms":0,"total_tokens":3859908,"total_tool_calls":0}],"total_cost_usd":16.321764,"total_wall_ms":1422916,"parallelism_achieved":1.65,"parallelism_theoretical":5}
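Reviewer note: the task text above specifies a fitness formula and softmax-based selection "NOT argmax". The Python sketch below illustrates both under stated assumptions; it is not the contents of `src/evolver.zig`. The task does not say how `cost_efficiency`, `speed`, or `error_rate` are normalized, so they are taken here as inputs already scaled to [0, 1], and the default temperature of 0.1 is an arbitrary choice.

```python
import math
import random


def compute_fitness(success, cost_efficiency, speed, error_rate):
    """Weighted sum from the task text; all non-boolean inputs assumed in [0, 1]."""
    return (0.5 * float(success)
            + 0.2 * cost_efficiency
            + 0.15 * speed
            + 0.15 * (1.0 - error_rate))


def softmax_weights(fitnesses, temperature=0.1):
    """Turn fitness values into selection probabilities.

    Subtracting the maximum before exponentiating keeps exp() from
    overflowing and does not change the resulting distribution.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    peak = max(fitnesses)
    exps = [math.exp((f - peak) / temperature) for f in fitnesses]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_select(variants, rng, temperature=0.1):
    """Pick one (prompt, fitness) pair; returns None for an empty archive."""
    if not variants:
        return None
    weights = softmax_weights([fitness for _, fitness in variants], temperature)
    return rng.choices(variants, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    archive = [("terse prompt", 0.62), ("thorough prompt", 0.71)]
    print(softmax_weights([f for _, f in archive]))
    print(weighted_select(archive, rng))
```

Lowering the temperature pushes the distribution toward argmax, while raising it flattens the distribution toward uniform sampling. Returning `None` for an empty archive mirrors the fallback to the built-in `RoleSpec.system_prompt` that the task describes for `resolvePromptForRole`.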
1 change: 1 addition & 0 deletions .devswarm/telemetry-zig-bugs.json
@@ -0,0 +1 @@
{"swarm_id":"0316f3c6-349c-3e05-eed6-7cd3a6626958","repo":"git@github.com:justrach/devswarm.git","task":"Fix these 4 Zig-specific bugs in justrach/devswarm. Each agent tackles one issue. These are all memory safety, correctness, or resource management bugs in core Zig code.\n\n1. #226 — WAL replay @enumFromInt on untrusted bytes can trap on corrupt data\nThe WAL (write-ahead log) replay code uses @enumFromInt on bytes read from disk. If the WAL file is corrupt, this traps (undefined behavior in Zig). Find the WAL replay code (likely in src/graph/storage.zig or similar), and replace all @enumFromInt calls on untrusted data with safe validation — check if the integer value is a valid enum member before converting, and skip/error on invalid entries. Add tests with corrupt WAL data.\n\n2. #225 — IPC server deinit removes fixed SOCKET_PATH instead of actual bound path\nThe IPC server cleanup code deletes a hardcoded socket path constant instead of the path that was actually bound to. If the server bound to a different path (e.g. due to config or fallback), the wrong file gets deleted and the actual socket leaks. Find the IPC server code, store the actual bound path, and delete that on deinit. Add tests.\n\n3. #224 — Watcher misclassifies new files as .modified instead of .created\nThe file watcher reports newly created files as modifications instead of creations. Find the watcher code (likely src/watcher.zig or similar), check the logic that classifies file events, and fix the classification so new files are correctly identified as .created. The distinction matters for incremental indexing. Add tests.\n\n4. #166 — Incremental PPR edge update formulas are not faithful to push rule\nThe PersonalizedPageRank implementation has incremental edge update formulas that don't match the mathematical push rule. Find the PPR code (likely in src/graph/), compare the update formulas to the standard push-based PPR algorithm, and fix any deviations. 
Add tests comparing incremental updates to full recomputation.\n\nFor each: read the GitHub issue first for details, find the relevant source, implement the fix, add/update tests. These are all in core Zig infrastructure code — be careful with memory safety.","grids":[{"name":"worker","workers":[{"id":0,"role":"zig_specialist","model":"claude-sonnet-4-6","tool_calls":0,"tokens_in":1097841,"tokens_out":8897,"wall_ms":191010,"errors":0,"success":true},{"id":1,"role":"zig_specialist","model":"claude-sonnet-4-6","tool_calls":0,"tokens_in":664480,"tokens_out":6543,"wall_ms":171847,"errors":0,"success":true},{"id":2,"role":"zig_specialist","model":"claude-sonnet-4-6","tool_calls":0,"tokens_in":1113449,"tokens_out":9723,"wall_ms":282272,"errors":0,"success":true},{"id":3,"role":"zig_specialist","model":"claude-sonnet-4-6","tool_calls":0,"tokens_in":1528475,"tokens_out":33707,"wall_ms":564717,"errors":0,"success":true}],"synthesis_ms":0,"total_tokens":4463115,"total_tool_calls":0}],"total_cost_usd":14.095785,"total_wall_ms":588504,"parallelism_achieved":2.14,"parallelism_theoretical":4}
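Reviewer note: item 1 in the task above (#226) asks for validation of untrusted bytes before they are converted to an enum. The Python sketch below shows the validate-then-convert pattern in general terms. It is not the Zig fix, and the `WalOp` opcodes are invented for illustration since the actual WAL record format is not shown in this diff.

```python
from enum import IntEnum


class WalOp(IntEnum):
    """Hypothetical opcode set; the real WAL tags are not shown in this change."""
    PUT = 1
    DELETE = 2
    COMMIT = 3


def parse_op(byte):
    """Return the matching WalOp, or None if the byte is not a valid member."""
    try:
        return WalOp(byte)
    except ValueError:
        return None


def replay_ops(raw):
    """Decode a byte string into ops, counting invalid bytes instead of failing."""
    ops = []
    skipped = 0
    for byte in raw:
        op = parse_op(byte)
        if op is None:
            skipped += 1
        else:
            ops.append(op)
    return ops, skipped
```

With this sketch, `replay_ops(bytes([1, 255, 3, 0]))` yields the `PUT` and `COMMIT` ops and reports two skipped bytes. Whether a corrupt entry should be skipped or should abort the replay is a policy decision; the task text allows either ("skip/error on invalid entries").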
2 changes: 2 additions & 0 deletions .github/CODEOWNERS
@@ -0,0 +1,2 @@
# Auto-request @justrach for review on all repository changes.
* @justrach
Binary file added codedb.snapshot
Binary file not shown.