Add process discovery and global server management#105

Merged
sdairs merged 5 commits into main from issue-102-process-discovery-global-servers
Apr 14, 2026

Conversation

@sdairs
Collaborator

@sdairs sdairs commented Apr 10, 2026

Summary

Closes #102, #89

  • Process discovery module (src/local/discovery.rs): Finds running ClickHouse processes via pgrep, resolves their cwd (macOS: lsof, Linux: /proc/<pid>/cwd), and parses command-line args to extract ports and version. Only processes whose cwd matches .clickhouse/servers/<name>/data/ are recognized as CLI-managed.
  • Orphaned server recovery: server list, server start, and other server commands automatically run process discovery on the current project. If a running ClickHouse process has no metadata file, a ServerInfo is recovered and saved so it appears in listings and can be managed normally.
  • --global flag on server list: Shows all running ClickHouse servers across all projects with a Project column indicating which directory each belongs to.
  • --global flag on server stop: Stops a server from any project by name. If the name is ambiguous across projects, --project can disambiguate.
  • --global flag on server stop-all: Stops every CLI-managed ClickHouse process system-wide.
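
To illustrate the cwd check described in the first bullet, here is a minimal sketch of how a working directory could be matched against `.clickhouse/servers/<name>/data/`. The function name `parse_server_cwd` appears in the test plan, but its signature and the `ManagedServer` return type used here are assumptions made for illustration, not the PR's actual code.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical result type: which project a process belongs to and its server name.
#[derive(Debug, PartialEq)]
struct ManagedServer {
    project_root: PathBuf,
    name: String,
}

/// Returns Some(..) only when `cwd` ends in `.clickhouse/servers/<name>/data`.
/// Assumed signature; the real function in src/local/discovery.rs may differ.
fn parse_server_cwd(cwd: &Path) -> Option<ManagedServer> {
    // Walk the path from its last component backwards.
    let mut tail = cwd.components().rev();
    if tail.next()?.as_os_str().to_str() != Some("data") {
        return None;
    }
    let name = tail.next()?.as_os_str().to_str()?.to_owned();
    if tail.next()?.as_os_str().to_str() != Some("servers") {
        return None;
    }
    if tail.next()?.as_os_str().to_str() != Some(".clickhouse") {
        return None;
    }
    // Whatever is left (in forward order) is the project directory.
    let project_root: PathBuf = tail.rev().collect();
    Some(ManagedServer { project_root, name })
}
```

Any process whose cwd does not end in exactly this four-component suffix yields `None`, which is how a ClickHouse process started outside the CLI would be ignored in this sketch.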

Test plan

  • All 267 existing tests pass
  • cargo clippy clean
  • 16 new unit tests for parsing logic (parse_server_cwd, parse_port_flag, parse_version_from_cmdline)
  • Manual: start a server, delete its .json metadata file, run server list → server should be recovered
  • Manual: start servers in two different project directories, run server list --global from either → both shown
  • Manual: server stop-all --global kills all servers across projects
  • Manual: server stop <name> --global --project /path targets the correct server when names collide
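
For readers who want a feel for the parsing logic covered by the new unit tests, the following is a rough sketch of a port-flag parser. The name `parse_port_flag` comes from the test plan; the signature, and the assumption that ports appear as either `--flag=value` or `--flag value` on the command line, are illustrative guesses rather than the PR's confirmed behavior.

```rust
/// Looks for `flag` in a process's argument list and parses its value as a port.
/// Accepts both `--http_port=8123` and `--http_port 8123` (assumed formats).
fn parse_port_flag(args: &[&str], flag: &str) -> Option<u16> {
    let prefix = format!("{flag}=");
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        // Form 1: flag and value joined by '='.
        if let Some(value) = arg.strip_prefix(prefix.as_str()) {
            return value.parse().ok();
        }
        // Form 2: value is the next argument.
        if *arg == flag {
            return iter.next()?.parse().ok();
        }
    }
    None
}
```

Parsing into `u16` means a non-numeric or out-of-range value is reported as "no port found" instead of being recorded as a bogus port.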

🤖 Generated with Claude Code

…management

Implements #102: ClickHouse processes started by the CLI can now be recovered
via OS-level process inspection when their metadata files are lost. Adds --global
flag to server list/stop/stop-all for cross-project server management.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sdairs sdairs requested a review from iskakaushik as a code owner April 10, 2026 15:25
@sdairs sdairs temporarily deployed to cloud-integration April 10, 2026 15:25 — with GitHub Actions Inactive
Recovery only ran inside list_all_servers(), so start/stop/remove still
relied on metadata-only checks. An orphaned server (live process, missing
JSON) could have its data directory deleted by remove, be launched over
by start, or be unreachable by stop unless server list ran first.

Now each command entry point calls recover_current_project_servers()
before any metadata-dependent logic.
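
A toy in-memory model of why the ordering matters is sketched below. Only the name `recover_current_project_servers` comes from this commit; the `Project` struct, its fields, and the rule that `remove` refuses a running server are assumptions used to make the point concrete.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for one project's state. Both maps go from server name to PID.
#[derive(Default)]
struct Project {
    /// Servers the JSON metadata files say are running (simplified).
    metadata: BTreeMap<String, u32>,
    /// Servers that OS-level process discovery actually finds.
    discovered: BTreeMap<String, u32>,
}

impl Project {
    /// Re-creates metadata for any live process that has none.
    fn recover_current_project_servers(&mut self) {
        for (name, pid) in &self.discovered {
            self.metadata.entry(name.clone()).or_insert(*pid);
        }
    }

    /// Old behavior in this model: trusts metadata alone.
    fn remove_without_recovery(&self, name: &str) -> Result<(), String> {
        if self.metadata.contains_key(name) {
            return Err(format!("server '{name}' is running; stop it first"));
        }
        Ok(()) // the real command would delete the data directory here
    }

    /// New behavior in this model: recover first, then run the same check.
    fn remove(&mut self, name: &str) -> Result<(), String> {
        self.recover_current_project_servers();
        self.remove_without_recovery(name)
    }
}
```

With an orphaned server present only in `discovered`, `remove_without_recovery` would proceed, while `remove` sees the recovered entry and refuses.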

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sdairs sdairs temporarily deployed to cloud-integration April 10, 2026 17:05 — with GitHub Actions Inactive
Both functions sent SIGTERM/SIGKILL but returned Ok(()) without checking
whether the process actually exited. Permission failures (EPERM) and
stubborn processes would be silently reported as stopped.

Extract shared kill_process() that checks the return value of each
libc::kill call and verifies the process is no longer alive after
SIGKILL before returning Ok.
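
The shape of such a helper might look like the sketch below. It is not the PR's `kill_process`: the signature, the grace period, the polling interval, and the liveness probe via signal 0 are all assumptions, and `kill` is declared directly instead of going through the `libc` crate so the example has no dependencies. The numeric constants are the Linux and macOS values.

```rust
use std::io;
use std::thread::sleep;
use std::time::{Duration, Instant};

// Declared by hand for a dependency-free sketch; the PR itself calls libc::kill.
unsafe extern "C" {
    fn kill(pid: i32, sig: i32) -> i32;
}

const SIGKILL: i32 = 9;
const SIGTERM: i32 = 15;
const ESRCH: i32 = 3; // "no such process"

/// Sends `sig` to `pid` and surfaces errno (for example EPERM) as an io::Error.
fn send_signal(pid: i32, sig: i32) -> io::Result<()> {
    // SAFETY: kill(2) takes two integers and has no memory-safety preconditions.
    if unsafe { kill(pid, sig) } == 0 {
        Ok(())
    } else {
        Err(io::Error::last_os_error())
    }
}

/// Signal 0 performs the existence and permission checks without delivering anything.
/// EPERM still means the process exists, so only ESRCH counts as "gone".
fn is_alive(pid: i32) -> bool {
    match send_signal(pid, 0) {
        Ok(()) => true,
        Err(e) => e.raw_os_error() != Some(ESRCH),
    }
}

/// Polls until the process disappears or `timeout` elapses.
fn wait_for_exit(pid: i32, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    while is_alive(pid) {
        if Instant::now() >= deadline {
            return false;
        }
        sleep(Duration::from_millis(50));
    }
    true
}

/// SIGTERM, then SIGKILL, returning Ok only once the process is really gone.
fn kill_process(pid: i32, grace: Duration) -> io::Result<()> {
    if pid <= 0 {
        // Non-positive PIDs address process groups; refuse them outright.
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "invalid pid"));
    }
    for sig in [SIGTERM, SIGKILL] {
        match send_signal(pid, sig) {
            Ok(()) => {}
            // Already gone counts as stopped.
            Err(e) if e.raw_os_error() == Some(ESRCH) => return Ok(()),
            // EPERM and anything else is reported, not swallowed.
            Err(e) => return Err(e),
        }
        if wait_for_exit(pid, grace) {
            return Ok(());
        }
    }
    Err(io::Error::new(
        io::ErrorKind::Other,
        format!("process {pid} still alive after SIGKILL"),
    ))
}
```

One caveat of the signal-0 probe: an exited child that its parent has not yet waited on still counts as alive, so this check suits processes the caller did not spawn itself.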

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sdairs sdairs temporarily deployed to cloud-integration April 10, 2026 19:44 — with GitHub Actions Inactive
lsof (used on macOS) ORs its selection options by default. Without -a,
`lsof -d cwd -p <pid>` means "the cwd entry of every process OR every
entry for <pid>", dumping thousands of entries. The parser grabbed the
first n-prefixed line (n/ from init), which never matched the server cwd
pattern, so no servers were ever discovered.

Adding -a ANDs the conditions so only the cwd of the target PID is
returned.
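
The failure mode can be reproduced with a small sketch of the lookup. The helper names and the `-Fn` field-output flag are assumptions (the commit only mentions `-d cwd -p <pid>`, `-a`, and an n-prefixed line); the parser mirrors the "first n-prefixed line" behavior described above.

```rust
use std::path::PathBuf;
use std::process::Command;

/// Arguments for asking lsof for exactly one process's cwd.
/// `-a` ANDs `-d cwd` with `-p <pid>`; `-Fn` (assumed) prints the name field as `n<path>`.
fn lsof_cwd_args(pid: u32) -> Vec<String> {
    vec![
        "-a".to_string(),
        "-d".to_string(),
        "cwd".to_string(),
        "-p".to_string(),
        pid.to_string(),
        "-Fn".to_string(),
    ]
}

/// Takes the first line starting with 'n' and treats the rest as a path.
fn parse_lsof_cwd(output: &str) -> Option<PathBuf> {
    output
        .lines()
        .find_map(|line| line.strip_prefix('n'))
        .map(PathBuf::from)
}

/// Runs lsof and parses its output; returns None if lsof is missing or fails.
fn process_cwd(pid: u32) -> Option<PathBuf> {
    let out = Command::new("lsof").args(lsof_cwd_args(pid)).output().ok()?;
    if !out.status.success() {
        return None;
    }
    parse_lsof_cwd(&String::from_utf8_lossy(&out.stdout))
}
```

Fed an unfiltered dump whose first entry is `n/`, `parse_lsof_cwd` returns `/`, which can never match the server cwd pattern. With `-a` the output holds only the target PID's entry, so the first n-prefixed line is the right one.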

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sdairs sdairs temporarily deployed to cloud-integration April 10, 2026 20:06 — with GitHub Actions Inactive
--global is for system-wide maintenance, not normal workflow. Agent
context sections should not mention it (per CLAUDE.md convention), and
the flag descriptions now explicitly steer agents toward the default
project-scoped behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sdairs sdairs temporarily deployed to cloud-integration April 10, 2026 20:15 — with GitHub Actions Inactive
@sdairs sdairs merged commit 6f6f2ed into main Apr 14, 2026
1 check passed
Development

Successfully merging this pull request may close these issues.

Recover orphaned servers via process discovery + global server management

2 participants