fix: address all review findings — wire memory strategy, fix dead code, add tests by yairfalse · Pull Request #34 · false-systems/nopea

yairfalse · 2026-03-04T01:14:11Z

Summary

Wire memory-based strategy auto-selection — the core "deployment tool with memory" feature now actually uses memory for decisions. Services with failure patterns above threshold auto-select :canary
Remove verify_deploy rescue — let OTP handle crashes instead of masking programming errors as "unverified"
Fix SYKLI target routing — route through ServiceAgent instead of bypassing queue/serialization
Fix CLI — remove rollback vaporware, fix serve command, use Logger instead of IO.puts
Add missing MCP tool tests — cover all 5 tools including error paths
Fix flaky distributed registry tests — replace Process.sleep sync barriers with assert_eventually polling
Configure Logger metadata — structured logging with deploy context (service, deploy_id, namespace, etc.)
Populate concurrent_deploys — query Nopea.Registry for currently-deploying services, create :deployed_together graph edges
Update CLAUDE.md — match current codebase reality (modules, deps, supervision tree, API)

Commits

feat: wire memory-based strategy auto-selection
fix: remove verify_deploy rescue, pass k8s_module to Drift
fix: route SYKLI target through ServiceAgent
fix: clean up CLI — remove rollback vaporware, fix serve command
test: add missing MCP tool tests
fix: replace Process.sleep sync barriers with polling in distributed tests
chore: configure Logger metadata keys
feat: populate concurrent_deploys from ServiceAgent registry
docs: update CLAUDE.md to match current codebase

Test plan

mix compile --warnings-as-errors passes
mix test passes (280 tests, 0 failures)
mix format --check-formatted passes
mix credo passes
mix escript.build succeeds
Verify memory-based strategy: service with 2+ failures and strategy: nil → auto-selects :canary
Verify verify_deploy crash propagation on malformed manifests
Verify SYKLI deploys update ServiceAgent state
Verify nopea serve doesn't crash

🤖 Generated with Claude Code

The core "deployment tool with memory" feature was dead code — select_strategy/2 always returned :direct when strategy was nil, ignoring the memory context entirely. Add a new clause that checks known services for failure patterns above the configurable canary_threshold (default 0.15). Services with recurring failures now auto-select canary strategy for safer rollouts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Remove the overbroad rescue in verify_deploy/2 that silently converted programming errors into "unverified" status. Pass k8s_module through to Drift.verify_manifest so verification uses the configured K8s module. Tests now stub get_resource properly instead of relying on the rescue to mask connection failures. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

SYKLI target called Deploy.run() directly, bypassing the ServiceAgent's queue/serialization/backpressure. Change to Deploy.deploy() which routes through ServiceAgent when available. Add test verifying ServiceAgent state is updated after SYKLI deploy. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…tests Replace flaky Process.sleep barriers in distributed_registry_test with assert_eventually polling helper. Horde.Registry operations are async and can't be reliably synced with fixed sleep durations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Remove rollback from moduledoc (not implemented) - Add require Logger, replace IO.puts with Logger - Fix serve command to use Application.ensure_all_started - Route deploy through Deploy.deploy instead of Deploy.run Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- nopea_deploy: missing service, empty service errors - nopea_history: cache unavailable, known/unknown service - nopea_explain: memory unavailable, unknown service, failure patterns - nopea_context: known service with memory - tools/list: verify all 5 tools listed Add mcp_with_memory_test.exs (async: false) for tests needing Memory. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add structured logging metadata to default formatter config. Fixes credo warnings about metadata keys not in Logger config and ensures service/deploy context appears in log output. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Query Nopea.Registry for currently-deploying services instead of hardcoding empty list in record_outcome - Fix ingestor to create :deployed_together edges (not just orphan nodes) so the graph captures temporal deployment correlations - Add :deployed_together to valid relation types Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- OTP tree: Replace Deploy.Supervisor with ServiceAgent.Supervisor, add Cluster, fix Registry description - KEY MODULES: Remove deleted Worker/Supervisor/Canary/BlueGreen, add ServiceAgent, Kulta.RolloutBuilder, Graph modules - FILE LOCATIONS: Remove deleted files, add new ones - DEPENDENCIES: Replace kerto (inlined) with false_protocol - MCP SERVER: Add nopea_health tool - STRATEGY AUTO-SELECTION: Update to actual memory-aware implementation - VERIFICATION: Update test count to 280 - MEMORY: Fix graph stats API, update relationship types - K8s Mock Pattern: Document get_resource stub requirement Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

This PR fixes and completes the “deploy tool with memory” integration by routing deploys through per-service agents, using memory context to auto-select strategies, improving logging, and expanding/strengthening test coverage.

Changes:

Route deploy execution through ServiceAgent (Deploy.deploy/1) and update SYKLI/CLI call sites accordingly.
Add memory-aware strategy auto-selection and record concurrent deploy relationships into the memory graph.
Add/extend tests for MCP tools, strategy selection, SYKLI routing, and reduce flakiness in distributed registry tests.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
`lib/nopea/sykli/target.ex`	Switch SYKLI deploy execution to `Deploy.deploy/1` to route through `ServiceAgent`.
`lib/nopea/memory/ingestor.ex`	Ingest `concurrent_deploys` into the graph via `:deployed_together` relationships.
`lib/nopea/graph/relation_type.ex`	Register the new `:deployed_together` relationship type.
`lib/nopea/deploy.ex`	Add memory-based strategy selection, pass `k8s_module` into drift verification, record concurrent deploys.
`lib/nopea/cli.ex`	Update CLI deploy entry point and rework `serve` behavior/logging.
`config/config.exs`	Configure Logger default formatter/metadata for structured deploy context.
`CLAUDE.md`	Update repo documentation to reflect current architecture and conventions.
`test/nopea/sykli/target_test.exs`	Add coverage ensuring SYKLI routing goes through `ServiceAgent`.
`test/nopea/memory/ingestor_test.exs`	Add test coverage for `:deployed_together` edges creation.
`test/nopea/mcp_with_memory_test.exs`	Add MCP tool tests covering memory-backed explain/context/history behavior.
`test/nopea/mcp_test.exs`	Tighten tools/list assertions and add error-path coverage for MCP tools.
`test/nopea/distributed_registry_test.exs`	Replace sleep-based barriers with polling assertions to reduce flakiness.
`test/nopea/deploy_test.exs`	Add strategy-selection tests (memory-aware) and crash-propagation test for verification.
`test/nopea/deploy_integration_test.exs`	Stub K8s `get_resource` for cluster-less test environment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-04T01:30:19Z

+  defp get_concurrent_services(current_service) do
+    if Process.whereis(Nopea.Registry) do
+      Registry.select(Nopea.Registry, [
+        {{:"$1", :"$2", :_}, [], [{{:"$1", :"$2"}}]}
+      ])
+      |> Enum.filter(fn
+        {{:service, name}, _pid} when name != current_service -> true
+        _ -> false
+      end)
+      |> Enum.map(fn {{:service, name}, _pid} -> name end)
+    else


get_concurrent_services/1 currently returns all services registered in Nopea.Registry (i.e., all ServiceAgents), not the services that are actually deploying concurrently. This will over-report concurrent_deploys and can create misleading :deployed_together edges. Consider filtering to agents with status == :deploying (e.g., via Nopea.ServiceAgent.status/1/a registry value) or tracking deploying services separately.

Copilot · 2026-03-04T01:30:20Z

+      # Give casts time to process
+      Process.sleep(50)
+


Using Process.sleep/1 as a synchronization barrier for the async Nopea.Memory.record_deploy/1 cast can make this test flaky under load/CI. Prefer a deterministic barrier (e.g., issue any GenServer.call to Nopea.Memory after the casts, since mailbox order guarantees the prior casts have been processed) instead of sleeping.

Suggested change

# Give casts time to process

Process.sleep(50)

Copilot · 2026-03-04T01:30:20Z

+      Process.sleep(50)
+


This Process.sleep/1 is acting as a timing-based sync barrier for memory ingestion. To avoid flakiness, replace it with a deterministic barrier (e.g., a GenServer.call to Nopea.Memory after the cast(s), or an assert_eventually helper) so the test doesn't depend on scheduler timing.

Copilot · 2026-03-04T01:30:20Z

+      Process.sleep(50)
+


Timing-based Process.sleep/1 here can introduce intermittent failures on slower machines. Since record_deploy/1 is a cast, consider using a deterministic synchronization method (e.g., follow the cast with a GenServer.call to Nopea.Memory) instead of sleeping.

Copilot · 2026-03-04T01:30:20Z

+        })
+      end
+
+      Process.sleep(50)


This test relies on Process.sleep/1 to wait for async Nopea.Memory.record_deploy/1 processing, which can be flaky. Prefer a deterministic barrier (for example, call Nopea.Memory.node_count/0 or another GenServer.call after the casts) to ensure ingestion has completed before asserting.

Suggested change

Process.sleep(50)

# Use a deterministic GenServer.call barrier instead of sleeping

_ = Nopea.Memory.node_count()

Copilot · 2026-03-04T01:30:21Z

+        concurrent_deploys: []
+      })
+
+      Process.sleep(50)


Same concern as above: Process.sleep/1 is used as a sync barrier for an async cast. Replacing it with a deterministic GenServer.call barrier (or a polling helper) will make this test more reliable.

Suggested change

Process.sleep(50)

Nopea.Memory

|> Process.whereis()

|> :sys.get_state()

Copilot · 2026-03-04T01:30:21Z

+    Logger.info("Starting Nopea daemon...")
+    Application.put_env(:nopea, :enable_api, true)

-    case Supervisor.start_child(Nopea.AppSupervisor, Nopea.API.Router) do
-      {:ok, _pid} ->
+    case Application.ensure_all_started(:nopea) do
+      {:ok, _apps} ->


serve/1 sets :enable_api, but the application supervision tree gates the HTTP server on :enable_router (see Nopea.Application.add_router_child/1). As written, nopea serve may never start Nopea.API.Router unless enable_router is already true in config. Use the same config key here (or update the application to read :enable_api consistently).

…t barriers, correct config key - get_concurrent_services: filter to agents with status == :deploying instead of returning all registered services (avoids misleading :deployed_together edges) - Replace Process.sleep(50) sync barriers with Memory.node_count() call — BEAM mailbox ordering guarantees prior casts complete first - CLI serve: use :enable_router (matching Application) not :enable_api Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Addresses Credo warning about inefficient filter chain in get_concurrent_services/1 — single pass instead of three. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Remove large module/file tables (discoverable via ls). Add config feature flags, ServiceAgent constraints, mailbox flush test pattern, single-test-by-line command, test factory paths. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

yairfalse and others added 9 commits March 3, 2026 23:11

chore: configure Logger metadata keys

72c2b2b

Add structured logging metadata to default formatter config. Fixes credo warnings about metadata keys not in Logger config and ensures service/deploy context appears in log output. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

yairfalse requested a review from Copilot March 4, 2026 01:23

Copilot started reviewing on behalf of yairfalse March 4, 2026 01:24 View session

Copilot AI reviewed Mar 4, 2026

View reviewed changes

yairfalse and others added 3 commits March 5, 2026 00:06

refactor: collapse chained Enum.filter into single Enum.flat_map

0c2fdd1

Addresses Credo warning about inefficient filter chain in get_concurrent_services/1 — single pass instead of three. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

yairfalse merged commit 03eb41f into main Mar 5, 2026
0 of 2 checks passed

yairfalse deleted the fix/review-findings-v2 branch March 5, 2026 23:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: address all review findings — wire memory strategy, fix dead code, add tests#34

fix: address all review findings — wire memory strategy, fix dead code, add tests#34
yairfalse merged 12 commits into
mainfrom
fix/review-findings-v2

yairfalse commented Mar 4, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	Process.sleep(50)
	# Use a deterministic GenServer.call barrier instead of sleeping
	_ = Nopea.Memory.node_count()

-      Process.sleep(50)
+      Nopea.Memory
+      |> Process.whereis()
+      |> :sys.get_state()

Conversation

yairfalse commented Mar 4, 2026

Summary

Commits

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants