Conversation
Co-Authored-By: Joel Klabo <joel.klabo@microsoft.com>
Signed-off-by: Peter Jausovec <peter.jausovec@solo.io>
```go
// WaitForReady polls the daemon API until it responds or the timeout expires
WaitForReady(baseURL string) error
```
We already have IsRunning and Start methods defined on this interface. Wondering whether it makes sense to fold this method into Start(), since we already pass --wait to docker compose. The one benefit is letting consumers of this interface call Start and then WaitForReady separately, but the value seems low.
Overall, fine with extending this for now. I think there's some other cruft in this interface that's worth addressing first.
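A minimal sketch of what folding the readiness wait into Start() might look like, assuming a compose-backed manager; all names here (composeManager, ping) are illustrative assumptions, not the real implementation:

```go
package main

import (
	"errors"
	"fmt"
)

// DaemonManager mirrors the interface under review, minus WaitForReady:
// Start itself blocks until the daemon answers.
type DaemonManager interface {
	IsRunning() bool
	Start() error
}

type composeManager struct {
	running bool
	ping    func() error // stands in for an HTTP GET of the daemon API
}

func (m *composeManager) IsRunning() bool { return m.running }

// Start launches the daemon (docker compose up -d --wait would run here)
// and then blocks until the API answers, so callers never need a
// separate WaitForReady call.
func (m *composeManager) Start() error {
	for i := 0; i < 5; i++ { // bounded retry in place of a real backoff loop
		if err := m.ping(); err == nil {
			m.running = true
			return nil
		}
	}
	return errors.New("daemon did not become ready")
}

func main() {
	calls := 0
	m := &composeManager{ping: func() error {
		calls++
		if calls < 3 {
			return errors.New("not ready yet")
		}
		return nil
	}}
	fmt.Println(m.Start(), m.IsRunning()) // → <nil> true
}
```

The trade-off is exactly the one the comment names: a fused Start() is simpler for consumers, at the cost of not being able to re-check readiness of an already-started daemon.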
```go
if !isDockerComposeAvailable() {
	fmt.Println("Docker compose is not available. Please install docker compose and try again.")
	fmt.Println("See https://docs.docker.com/compose/install/ for installation instructions.")
	fmt.Println("agent registry uses docker compose to start the server and the agent gateway.")
	return fmt.Errorf("docker compose is not available")
}
if dm.IsRunning() {
	return nil
}
```
Should this logic be flipped, so we check dm.IsRunning() first before validating whether docker compose is available in the environment?
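The flipped ordering the comment suggests could look roughly like this sketch (ensureDaemon and its parameters are hypothetical stand-ins for the real code):

```go
package main

import (
	"errors"
	"fmt"
)

// ensureDaemon returns early when the daemon is already running, and only
// requires docker compose when it actually needs to start something.
func ensureDaemon(isRunning func() bool, composeAvailable func() bool) error {
	if isRunning() {
		return nil // already up: docker compose is not needed at all
	}
	if !composeAvailable() {
		return errors.New("docker compose is not available")
	}
	// ... start the daemon via docker compose ...
	return nil
}

func main() {
	// Daemon already running, compose missing: still succeeds.
	fmt.Println(ensureDaemon(func() bool { return true }, func() bool { return false }))
	// Daemon down and compose missing: fails.
	fmt.Println(ensureDaemon(func() bool { return false }, func() bool { return false }))
}
```

With this ordering, a user whose daemon was started some other way is not blocked by a missing docker compose binary.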
```go
// Package-level var so tests can stub out the real docker-compose check.
var isDockerComposeAvailable = utils.IsDockerComposeAvailable
```
Fine for now - I think this indicates a clear smell with the daemon manager component though. The interface should abstract whether docker compose is available, or we define something like a no-op daemon manager and call the right constructor in the CLI run logic, etc.
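One way the "no-op daemon manager" idea could be sketched, with the environment check done once in a constructor so the rest of the CLI never asks about docker compose (all names here are hypothetical):

```go
package main

import "fmt"

// DaemonManager is the abstraction the CLI consumes; whether docker
// compose exists is an implementation detail hidden behind it.
type DaemonManager interface {
	Start() error
}

type composeDaemon struct{}

func (composeDaemon) Start() error {
	// would shell out to `docker compose up -d --wait`
	return nil
}

// noopDaemon is selected when docker compose is absent; every operation
// fails with an actionable message instead of the CLI checking inline.
type noopDaemon struct{}

func (noopDaemon) Start() error {
	return fmt.Errorf("docker compose is not available; see https://docs.docker.com/compose/install/")
}

// newDaemonManager is the single place that inspects the environment.
func newDaemonManager(composeAvailable bool) DaemonManager {
	if composeAvailable {
		return composeDaemon{}
	}
	return noopDaemon{}
}

func main() {
	fmt.Println(newDaemonManager(true).Start())
	fmt.Println(newDaemonManager(false).Start())
}
```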
Description
Fixes #48 — `arctl list -A` (and other commands) failed with "failed to reach API after 3 attempts" when the daemon was auto-started, because the API server wasn't fully ready when the client tried to connect.

Change Type
Changelog
Root cause
After `docker compose up -d --wait` returns, the container is "healthy" per Docker's health check, but the API server inside may still be initializing. The previous retry logic (3 attempts with 1s/2s/3s linear backoff, 6 seconds total) was insufficient for slow networks or cold starts.

Changes
- `pkg/types/types.go`: Added `WaitForReady()` to the `DaemonManager` interface
- `pkg/daemon/daemon.go`: Implemented `WaitForReady()` — polls `/v0/ping` with exponential backoff (500ms → 4s) for up to 30 seconds
- `pkg/cli/root.go`: Calls `dm.WaitForReady()` after `dm.Start()` so the API is confirmed responsive before creating the client
- `internal/client/client.go`: Increased `pingWithRetry` from 3 to 5 attempts with exponential backoff as a secondary safety net

Tests
- `pkg/daemon/daemon_test.go`: interface compliance, already-ready, becomes-ready-after-delay, `isServerResponding`
- `internal/client/client_test.go`: immediate success, success after failures, all fail
- `go test ./...` runs clean

Test plan
- `arctl list -A` succeeds without retry errors
- `arctl list -A` without the daemon running (the command didn't wait for the daemon to be fully ready before returning, #48)
- `WaitForReady()` returns immediately if the daemon is already running
- `go test ./...` — all pass

🤖 Generated with Claude Code