fix: wait for daemon readiness after auto-start #141

Closed
joelklabo wants to merge 1 commit into agentregistry-dev:main from joelklabo:fix/daemon-readiness-wait

@joelklabo joelklabo commented Feb 8, 2026

Description

Fixes #48. arctl list -A (and other commands) failed with "failed to reach API after 3 attempts" when the daemon was auto-started, because the API server wasn't fully ready when the client tried to connect.

Change Type

/kind fix

Changelog

wait for daemon readiness after auto-start

Root cause

After docker compose up -d --wait returns, the container is "healthy" per Docker's health check, but the API server inside may still be initializing. The previous retry logic (3 attempts, 1s/2s/3s linear backoff = 6 seconds total) was insufficient for slow networks or cold starts.

Changes

  • pkg/types/types.go: Added WaitForReady() to DaemonManager interface
  • pkg/daemon/daemon.go: Implemented WaitForReady() — polls /v0/ping with exponential backoff (500ms → 4s) for up to 30 seconds
  • pkg/cli/root.go: Calls dm.WaitForReady() after dm.Start() so the API is confirmed responsive before creating the client
  • internal/client/client.go: Increased pingWithRetry from 3 to 5 attempts with exponential backoff as a secondary safety net

Tests

  • 4 new daemon tests (pkg/daemon/daemon_test.go): interface compliance, already-ready, becomes-ready-after-delay, isServerResponding
  • 3 new client tests (internal/client/client_test.go): immediate success, success after failures, all fail
  • All existing tests pass: go test ./... clean

Test plan

🤖 Generated with Claude Code

When arctl auto-starts the daemon via docker compose, the API server
inside the container may not be ready to accept requests immediately.
The previous 3-attempt ping with linear 1s/2s/3s backoff (6s total)
was insufficient for slow networks or cold starts.

Changes:
- Add WaitForReady() to DaemonManager interface
- Implement 30-second exponential backoff polling in DefaultDaemonManager
- Call WaitForReady() after Start() in PersistentPreRunE
- Increase pingWithRetry to 5 attempts with exponential backoff

Fixes agentregistry-dev#48

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@joelklabo joelklabo marked this pull request as ready for review February 9, 2026 17:21
@github-actions

This pull request has been marked as stale due to no activity in the last 14 days. It will be closed in 3 days unless it is tagged "no stalebot" or other activity occurs.

@github-actions github-actions bot added the stale label Feb 27, 2026

github-actions bot commented Mar 2, 2026

This pull request has been closed due to inactivity.

peterj commented Mar 5, 2026

closing and using #263 instead.

@peterj peterj closed this Mar 5, 2026

Development

Successfully merging this pull request may close these issues.

When I ran arctl list -A without the daemon running the command didn't wait for the daemon to be fully ready before returning