32 changes: 13 additions & 19 deletions KNOWN_ISSUES.md
@@ -13,7 +13,7 @@ Technical debt identified during codebase analysis. Address these before adding
**Discovered**: 2026-03-24 during v0.13.0 pre-release audit

**Files exceeding 400-line limit:**
- `backends/openshift/sync.py` — 595 lines
- `backends/openshift/sync.py` — dead code for OpenShift (kept for Podman base class)
- `cli/commands.py` — 580 lines
- `backends/podman/backend.py` — 504 lines
- `backends/openshift/proxy.py` — 484 lines
@@ -23,7 +23,7 @@ Technical debt identified during codebase analysis. Address these before adding
**Methods exceeding 50-line limit:**
- `workflow.py` — `harvest_session()` (~102 lines), `status_sessions()` (~84), `reset_session()` (~72)
- `cli/commands.py` — `session_cp()` (~75 lines)
- `backends/openshift/sync.py` — `_sync_agent_config()` (~90 lines)
- `backends/openshift/sync.py` — dead code (see above)
- `backends/podman/backend.py` — `create_session()` (~95 lines)

**Classes exceeding 20-method limit:**
@@ -83,14 +83,8 @@ Declarative resources (already applied as JSON via `oc apply`):
- 3 NetworkPolicies (agent egress, proxy egress, proxy ingress) — `backends/openshift/proxy.py`
- 1 StatefulSet with PVC template (agent pod, runs `tini -- sleep infinity`) — `backends/openshift/resources.py`

Post-apply imperative steps (the blocking problem):
- Poll `oc get pod` for readiness
- `oc exec mkdir -p /credentials`
- `oc cp` stub GCP ADC, gitconfig, gitignore, sandbox config script
- `oc rsync` agent config directory (~/.claude/)
- `oc exec` jq to rewrite plugin install paths
- `oc exec touch /credentials/.ready` (signals entrypoint to proceed)
- `oc exec entrypoint-session.sh` (starts agent headless)
Post-apply imperative steps (resolved — config now mounted via ConfigMap):
- Poll `oc get pod` for readiness (still present, but standard K8s usage)

Build resources (shared, coupled to create):
- BuildConfig + ImageStream — `backends/openshift/build.py` (binary build from local dir)
@@ -99,13 +93,13 @@ Build resources (shared, coupled to create):

**Gap 1 — No manifest export layer.** Each resource builder calls `oc apply -f -` inline. There is no way to collect all resource specs and write them to disk as YAML. Fix: add a `ManifestCollector` that accumulates resource dicts and can either apply them or write to a directory. Resource builders return dicts instead of applying directly.
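
A minimal sketch of the collector idea described in Gap 1, assuming resource builders return plain dicts. The class name `ManifestCollector` comes from the text above; the method names, the `apply_fn` callback, and the file naming scheme are hypothetical and not taken from the paude codebase. It writes JSON rather than YAML to stay within the standard library, which matches the note above that resources are already applied as JSON.

```python
import json
from pathlib import Path
from typing import Callable


class ManifestCollector:
    """Accumulates resource dicts, then either applies them or writes them to disk."""

    def __init__(self) -> None:
        self._resources: list[dict] = []

    def add(self, resource: dict) -> None:
        # Builders hand over a dict instead of applying it themselves.
        self._resources.append(resource)

    def apply(self, apply_fn: Callable[[str], None]) -> None:
        # apply_fn is a hypothetical hook that receives one JSON document
        # per resource; a real caller might pipe it to `oc apply -f -`.
        for resource in self._resources:
            apply_fn(json.dumps(resource))

    def write(self, directory: Path) -> list[Path]:
        # One file per resource, ordered by insertion, for committing to Git.
        directory.mkdir(parents=True, exist_ok=True)
        written: list[Path] = []
        for index, resource in enumerate(self._resources):
            kind = resource.get("kind", "resource").lower()
            name = resource.get("metadata", {}).get("name", "unnamed")
            path = directory / f"{index:02d}-{kind}-{name}.json"
            path.write_text(json.dumps(resource, indent=2) + "\n")
            written.append(path)
        return written
```

Passing the apply step in as a callback keeps the collector free of subprocess calls, so the same object could serve both the imperative path and a manifest export path.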

**Gap 2 — Config injected into running pods via `oc cp`/`oc exec`.** `sync.py:ConfigSyncer.sync_full_config()` pushes files into a `/credentials/` tmpfs mount after the pod starts. The entrypoint polls `/credentials/.ready` for 300 seconds. Fix: prepare the config directory locally before container start, then mount it as a volume (ConfigMap in K8s, bind mount in Podman). The entrypoint runs directly with config already present — same code path for both backends, no conditional branching needed.
**Gap 2 — Config injected into running pods via `oc cp`/`oc exec`.** (Resolved) All config files (stub GCP ADC, gitconfig user.name/email, sandbox config script) are now packaged into a ConfigMap and mounted at `/credentials` before the container starts. No `oc cp`/`oc exec` is needed. Cursor auth and global gitignore syncing were removed entirely.
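
As an illustration of the resolved approach in Gap 2, the sketch below packages config files into a ConfigMap dict and adds the `.ready` marker mentioned under Phase 4. The function name and the example key names are assumptions made for illustration; the actual builder in `backends/openshift/` may be structured differently.

```python
def build_credentials_configmap(name: str, files: dict[str, str]) -> dict:
    """Return a ConfigMap dict whose keys become files under the mount path.

    `files` maps a file name (no slashes, as ConfigMap keys require) to its
    text content. Hypothetical helper, not the paude implementation.
    """
    data = dict(files)
    # An empty `.ready` key lets an entrypoint that waits for
    # /credentials/.ready proceed immediately.
    data.setdefault(".ready", "")
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": data,
    }
```

A caller might pass entries such as `gitconfig` and `agent-sandbox-config.sh`, then mount the resulting ConfigMap at `/credentials` in the pod spec.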

**Gap 3 — Secrets created inline during `paude create`.** CA cert is generated via openssl and credentials are gathered from the host environment, both stored as K8s Secrets during `paude create`. Fix: users pre-create secrets out-of-band (`oc create secret`, sealed-secrets, ESO, vault) and pass names via `--ca-secret` / `--creds-secret` flags. CA generation becomes a helper command (`paude setup-proxy-ca`). Paude manifests just reference secret names, never contain secret data.
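
To make "reference secret names, never contain secret data" concrete, here is a small sketch of how a manifest builder could turn a user-supplied secret name (for example, the value of `--ca-secret`) into a volume and volume mount. The helper name, volume name, and mount path are hypothetical; only the `secret.secretName` field is standard Kubernetes pod spec.

```python
def secret_volume(volume_name: str, secret_name: str, mount_path: str) -> tuple[dict, dict]:
    """Build a (volume, volumeMount) pair that references an existing Secret.

    Only the Secret's name appears in the manifest; the Secret itself is
    assumed to have been created out-of-band.
    """
    volume = {"name": volume_name, "secret": {"secretName": secret_name}}
    mount = {"name": volume_name, "mountPath": mount_path, "readOnly": True}
    return volume, mount
```

Because the output holds no key material, the resulting manifest could be committed to a repository without leaking credentials.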

**Gap 4 — Image builds coupled to session creation.** `build.py` creates BuildConfig/ImageStream and runs `oc start-build --from-dir=...` which uploads local files. Fix: separate `paude build` from `paude create`. Emitted YAML references a pre-built image by tag or digest.

**Gap 5 — Container starts with `sleep infinity`, agent launched via `oc exec`.** The StatefulSet command is `tini -- sleep infinity` because the entrypoint can't run until config is pushed. Fix: once config is mounted as volumes (Gap 2), the StatefulSet command becomes `entrypoint-session.sh` directly. No `sleep infinity` + `oc exec` dance.
**Gap 5 — Container starts with `sleep infinity`, agent launched via `oc exec`.** (Resolved) The StatefulSet command is now `tini -- entrypoint-session.sh` with `PAUDE_HEADLESS=1`. Config is pre-mounted via ConfigMap (Gap 2), so the entrypoint runs directly. No `sleep infinity` + `oc exec` pattern.
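
The resolved container definition from Gap 5 might look roughly like the sketch below. The command and the `PAUDE_HEADLESS=1` variable come from the text above; the function name, container name, and volume name are assumptions, and the real spec in `backends/openshift/resources.py` likely carries more fields.

```python
def agent_container(image: str) -> dict:
    """Return a container spec that runs the entrypoint directly under tini.

    Hypothetical helper illustrating the shape of the spec only.
    """
    return {
        "name": "agent",  # assumed container name
        "image": image,
        # The entrypoint is the container command; no `sleep infinity`.
        "command": ["tini", "--", "entrypoint-session.sh"],
        "env": [{"name": "PAUDE_HEADLESS", "value": "1"}],
        # Config is expected to be present at start via the ConfigMap mount.
        "volumeMounts": [{"name": "credentials", "mountPath": "/credentials"}],
    }
```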

**Gap 6 — Interactive operations (`oc exec`, `oc port-forward`, connect).** No fix needed. These are operational commands that work against running resources. They are orthogonal to GitOps — declarative manages the desired state, interactive commands are for human access.

@@ -128,13 +122,13 @@ Phase 3 — Externalize secrets (low-medium effort, high value):
- Paude manifests reference secret names, never generate secret data inline
- Existing inline secret creation remains as default for backward compatibility

Phase 4 — Config as mounted volumes (high effort, high value):
- Prepare config directory locally before container start
- Package as ConfigMap (K8s) or bind mount (Podman) — same entrypoint for both
- Move plugin path rewriting from jq/oc-exec to pure Python at prep time
- Remove `sleep infinity` + `oc exec` pattern; entrypoint runs directly as container command
- Remove `/credentials/.ready` polling from entrypoint (config always present at start)
- Files: `sync.py`, `resources.py`, `entrypoint-session.sh`, Podman backend
Phase 4 — Config as mounted volumes (done for OpenShift):
- Agent config sync and plugin path rewriting already removed (done)
- Remaining config (stub ADC, gitconfig user.name/email, sandbox script) packaged as ConfigMap (done)
- `sleep infinity` + `oc exec` pattern removed; entrypoint runs directly (done)
- ConfigMap includes `.ready` marker so entrypoint skips wait (done)
- Cursor auth and global gitignore syncing removed (done)
- Podman backend still uses old `podman cp`/`podman exec` pattern (future work)

## Security Hardening Backlog

49 changes: 1 addition & 48 deletions containers/paude/entrypoint-lib-config.sh
@@ -1,54 +1,7 @@
#!/bin/bash
# Agent config copy and PVC persistence utilities for the paude entrypoint.
# Agent config PVC persistence utilities for the paude entrypoint.
# Sourced by entrypoint-session.sh — not run standalone.

# Copy agent config from a source directory into $HOME.
# Handles: recursive copy, config file relocation, plugin permissions, OpenShift ownership.
# Skips runtime directories/files that are generated by the agent inside the
# container — these may be persisted on PVC and must not be overwritten by
# (stale or empty) host-side copies. This is defense-in-depth: the host-side
# rsync already excludes these, but we guard here too.
# Args: source_path (directory containing agent config files)
copy_agent_config() {
local source_path="$1"
local dest="$HOME/$AGENT_CONFIG_DIR"

mkdir -p "$dest"
chmod g+rwX "$dest" 2>/dev/null || true

# Copy items individually, skipping runtime state that the agent generates
# inside the container. Keep in sync with _CLAUDE_CONFIG_EXCLUDES in
# src/paude/agents/claude.py — this is defense-in-depth (host-side rsync
# already excludes these).
for item in "$source_path"/* "$source_path"/.*; do
[[ ! -e "$item" ]] && continue
local name
name="${item##*/}"
case "$name" in
.|..) continue ;;
backups|cache|debug|downloads|file-history|paste-cache|plans|\
session-env|sessions|shell-snapshots|statsig|tasks|todos|projects|.git)
[[ -d "$item" ]] && continue ;;
history.jsonl|stats-cache.json)
[[ -f "$item" ]] && continue ;;
esac
cp -dR --preserve=mode,timestamps "$item" "$dest/" 2>/dev/null || true
done

# Handle config file specially - goes to ~/.<config_file>
# Use cp instead of mv to write through symlinks (on re-execution,
# persist_agent_config may have already symlinked paths to /pvc).
if [[ -n "$AGENT_CONFIG_FILE" ]] && [[ -n "$AGENT_CONFIG_FILE_BASENAME" ]] && [[ -f "$HOME/$AGENT_CONFIG_DIR/$AGENT_CONFIG_FILE_BASENAME" ]]; then
cp -f "$HOME/$AGENT_CONFIG_DIR/$AGENT_CONFIG_FILE_BASENAME" "$HOME/$AGENT_CONFIG_FILE" 2>/dev/null || true
rm -f "$HOME/$AGENT_CONFIG_DIR/$AGENT_CONFIG_FILE_BASENAME" 2>/dev/null || true
chmod g+rw "$HOME/$AGENT_CONFIG_FILE" 2>/dev/null || true
fi

# g+rwX sets read/write and execute on directories (X = execute only if dir)
# This covers plugins/ and all other subdirectories in one pass
chmod -R g+rwX "$HOME/$AGENT_CONFIG_DIR" 2>/dev/null || true
}

# Persist agent config on the PVC volume so it survives container recreation.
# Creates symlinks: $HOME/$AGENT_CONFIG_DIR -> /pvc/$AGENT_CONFIG_DIR
# $HOME/$AGENT_CONFIG_FILE -> /pvc/$AGENT_CONFIG_FILE
5 changes: 0 additions & 5 deletions containers/paude/entrypoint-lib-credentials.sh
@@ -98,11 +98,6 @@ setup_credentials() {
ln -sf "$config_path/gcloud" "$HOME/.config/gcloud"
fi

# Copy agent config (need to be writable, so copy instead of symlink)
if [[ -d "$config_path/$AGENT_NAME" ]]; then
copy_agent_config "$config_path/$AGENT_NAME"
fi

# Set up gitconfig via symlink
if [[ -f "$config_path/gitconfig" ]]; then
rm -f "$HOME/.gitconfig" 2>/dev/null || true
35 changes: 11 additions & 24 deletions containers/paude/entrypoint-session.sh
@@ -13,10 +13,6 @@ AGENT_CONFIG_FILE="${PAUDE_AGENT_CONFIG_FILE:-.claude.json}"
AGENT_INSTALL_SCRIPT="${PAUDE_AGENT_INSTALL_SCRIPT:-curl -fsSL https://claude.ai/install.sh | bash}"
AGENT_SESSION_NAME="${PAUDE_AGENT_SESSION_NAME:-claude}"
AGENT_LAUNCH_CMD="${PAUDE_AGENT_LAUNCH_CMD:-claude}"
AGENT_SEED_DIR="${PAUDE_AGENT_SEED_DIR:-/tmp/claude.seed}"
AGENT_SEED_FILE="${PAUDE_AGENT_SEED_FILE:-/tmp/claude.json.seed}"
# Derive basename for config file (e.g., ".claude.json" -> "claude.json")
AGENT_CONFIG_FILE_BASENAME="${AGENT_CONFIG_FILE#.}"
# Backward compat: PAUDE_AGENT_ARGS > PAUDE_CLAUDE_ARGS > positional args
AGENT_ARGS="${PAUDE_AGENT_ARGS:-${PAUDE_CLAUDE_ARGS:-$*}}"

@@ -135,34 +131,25 @@ attach_to_session() {
exec tmux -u attach -t "$AGENT_SESSION_NAME"
}

# On reconnect (tmux session already exists), skip config copy and sandbox
# config — re-copying from host seed mounts would overwrite in-container
# state (prompt history, project data, conversation context).
# On reconnect (tmux session already exists), skip sandbox config —
# reapplying would overwrite in-container state.
if tmux -u has-session -t "$AGENT_SESSION_NAME" 2>/dev/null; then
exit_if_headless "already running"
attach_to_session reconnect
fi

# Legacy: Copy seed files if provided via Secret mount (Podman backend fallback)
if [[ -d "$AGENT_SEED_DIR" ]] && [[ ! -d /credentials ]]; then
copy_agent_config "$AGENT_SEED_DIR"
fi

# Also check for separate config file seed mount (Podman backend)
if [[ -n "$AGENT_SEED_FILE" ]] && { [[ -f "$AGENT_SEED_FILE" ]] || [[ -L "$AGENT_SEED_FILE" ]]; }; then
if [[ -n "$AGENT_CONFIG_FILE" ]]; then
cp -L "$AGENT_SEED_FILE" "$HOME/$AGENT_CONFIG_FILE" 2>/dev/null || true
chmod g+rw "$HOME/$AGENT_CONFIG_FILE" 2>/dev/null || true
# Apply agent sandbox config (generated by Python, mounted via ConfigMap or synced)
if [[ "${PAUDE_SUPPRESS_PROMPTS:-}" == "1" ]]; then
_SANDBOX_CFG=""
for _candidate in "$HOME/.paude/agent-sandbox-config.sh" "/credentials/agent-sandbox-config.sh"; do
[[ -f "$_candidate" ]] && _SANDBOX_CFG="$_candidate" && break
done
if [[ -n "$_SANDBOX_CFG" ]]; then
source "$_SANDBOX_CFG" 2>>/tmp/sandbox-config.log \
|| echo "agent-sandbox-config.sh failed: $?" >> /tmp/sandbox-config.log
fi
fi

# Apply agent sandbox config (generated by Python, synced before entrypoint runs)
_SANDBOX_CFG="$HOME/.paude/agent-sandbox-config.sh"
if [[ "${PAUDE_SUPPRESS_PROMPTS:-}" == "1" ]] && [[ -f "$_SANDBOX_CFG" ]]; then
source "$_SANDBOX_CFG" 2>>/tmp/sandbox-config.log \
|| echo "agent-sandbox-config.sh failed: $?" >> /tmp/sandbox-config.log
fi

# Session workspace setup
# For persistent sessions, workspace is at /workspace (mounted volume)
WORKSPACE="${PAUDE_WORKSPACE:-/workspace}"
37 changes: 12 additions & 25 deletions docs/OPENSHIFT.md
@@ -118,31 +118,19 @@ Credentials are stored in RAM-only storage for enhanced security:

**Configuration Sync:**

Configuration is synced via `oc cp` to tmpfs on session start and reconnect:
Minimal configuration is synced via `oc cp` to tmpfs on session start:

**Synced from host:**
- `~/.config/gcloud` → gcloud credentials for Vertex AI authentication
- Stub GCP ADC (sentinel values — real auth handled by proxy sidecar)
- `~/.gitconfig` → Git identity configuration
- `~/.claude/` → Agent config directory (for Claude Code), including:
- `settings.json`, `settings.local.json` - Core settings
- `plugins/` - Installed plugins and marketplace metadata
- `CLAUDE.md` - Global instructions
- `~/.claude.json` → Agent preferences
- `~/.config/git/ignore` → Global gitignore
- Agent sandbox config script (onboarding/trust prompt suppression)
- Cursor auth.json (for Cursor agent only)

**Excluded (session-specific):**
- `history.jsonl`, `tasks/`, `todos/`, `plans/` - Session state
- `sessions/`, `session-env/`, `projects/` - Session metadata
- `cache/`, `stats-cache.json`, `statsig/` - Caches
- `debug/`, `file-history/`, `shell-snapshots/` - Debug logs
- `backups/`, `downloads/`, `paste-cache/`, `.git/` - Misc

Plugin paths are automatically rewritten from host paths to container paths.
Agent config directories (`~/.claude/`, `~/.gemini/`, etc.) are **not** synced from the host. The agent starts with vanilla config — the sandbox config script generates the minimum viable settings. This improves security (no host credentials leak into containers) and simplifies the path to GitOps-compatible session creation.

**Credential Refresh:**
- **First connect** (after pod start): Full sync of gcloud, claude config, and gitconfig
- **Reconnect** (subsequent connects): Only gcloud credentials refreshed (fast)
- This ensures fresh OAuth tokens propagate if you re-authenticate locally
- Long-running pods stay current with local credential changes
- **First connect** (after pod start): Full sync of stub credentials, gitconfig, and sandbox config
- **Reconnect** (subsequent connects): Stub credentials and sandbox config refreshed (fast)

### Network Filtering

@@ -261,9 +249,9 @@ For merge conflicts, use normal git workflows (rebase, merge, etc.).
│ │ │ │ + tmux │ │ ┌───────────────────────┐ │ │
│ │ │ └────────────┘ │ │ tmpfs: /credentials │ │ │
│ │ │ │ │ (RAM-only, ephemeral) │ │ │
│ │ │ Mounts: │ │ - gcloud creds │ │ │
│ │ │ - /pvc (PVC) │ │ - ~/.claude/ dir │ │ │
│ │ │ - /credentials │ │ - gitconfig │ │ │
│ │ │ Mounts: │ │ - stub gcloud creds │ │ │
│ │ │ - /pvc (PVC) │ │ - gitconfig │ │ │
│ │ │ - /credentials │ │ - sandbox config │ │ │
│ │ │ (tmpfs) │ └───────────────────────┘ │ │
│ │ └──────────────────┘ │ │
│ │ ↑ │ │
@@ -275,7 +263,6 @@ For merge conflicts, use normal git workflows (rebase, merge, etc.).
┌────────┴────────┐
│ Local Machine │
│ - workspace │
│ - ~/.claude/ │
│ - credentials │
│ - paude CLI │
└─────────────────┘
@@ -288,7 +275,7 @@ For merge conflicts, use normal git workflows (rebase, merge, etc.).
| Session Persistence | Yes (named volumes) | Yes (tmux + PVC) |
| Network Disconnect | Session lost | Session preserved |
| Code Sync | git push/pull | git push/pull |
| Config Sync | Mounted at start | oc cp at connect |
| Config Sync | oc cp at connect | oc cp at connect |
| Multi-machine | No | Yes |
| Resource Isolation | Container | Pod + namespace |
| Setup Complexity | Low | Medium |
5 changes: 0 additions & 5 deletions src/paude/agents/base.py
@@ -26,9 +26,6 @@ class AgentConfig:
passthrough_env_prefixes: Host env var prefixes to forward.
config_dir_name: Config directory under HOME (e.g., ".claude").
config_file_name: Config file under HOME (e.g., ".claude.json"), or None.
config_excludes: Rsync excludes for config sync.
config_sync_files_only: When non-empty, only these files (relative to
config dir) are copied instead of rsyncing the entire directory.
activity_files: Paths (relative to config dir) for activity detection.
yolo_flag: CLI flag to skip permissions
(e.g., "--dangerously-skip-permissions").
@@ -53,8 +50,6 @@ class AgentConfig:
passthrough_env_prefixes: list[str] = field(default_factory=list)
config_dir_name: str = ".claude"
config_file_name: str | None = ".claude.json"
config_excludes: list[str] = field(default_factory=list)
config_sync_files_only: list[str] = field(default_factory=list)
activity_files: list[str] = field(default_factory=list)
yolo_flag: str | None = "--dangerously-skip-permissions"
clear_command: str | None = "/clear"
45 changes: 1 addition & 44 deletions src/paude/agents/claude.py
@@ -10,29 +10,6 @@
build_provider_credentials,
pipefail_install_lines,
)
from paude.mounts import resolve_path

# Keep in sync with the case statement in copy_agent_config() in
# containers/paude/entrypoint-session.sh (defense-in-depth).
_CLAUDE_CONFIG_EXCLUDES = [
"/backups",
"/cache",
"/debug",
"/downloads",
"/file-history",
"/history.jsonl",
"/paste-cache",
"/plans",
"/session-env",
"/sessions",
"/shell-snapshots",
"/stats-cache.json",
"/statsig",
"/tasks",
"/todos",
"/projects",
"/.git",
]

_CLAUDE_ACTIVITY_FILES = [
"history.jsonl",
@@ -60,7 +37,6 @@ def __init__(self, provider: str | None = None) -> None:
passthrough_env_prefixes=creds.passthrough_env_prefixes,
config_dir_name=".claude",
config_file_name=".claude.json",
config_excludes=list(_CLAUDE_CONFIG_EXCLUDES),
activity_files=list(_CLAUDE_ACTIVITY_FILES),
yolo_flag="--dangerously-skip-permissions",
clear_command="/clear",
@@ -136,26 +112,7 @@ def launch_command(self, args: str) -> str:
return "claude"

def host_config_mounts(self, home: Path) -> list[str]:
mounts: list[str] = []

# Claude seed directory (ro)
claude_dir = home / ".claude"
resolved_claude = resolve_path(claude_dir)
if resolved_claude and resolved_claude.is_dir():
mounts.extend(["-v", f"{resolved_claude}:/tmp/claude.seed:ro"])

# Plugins at original host path (ro)
plugins_dir = resolved_claude / "plugins"
if plugins_dir.is_dir():
mounts.extend(["-v", f"{plugins_dir}:{plugins_dir}:ro"])

# claude.json seed (ro)
claude_json = home / ".claude.json"
resolved_claude_json = resolve_path(claude_json)
if resolved_claude_json and resolved_claude_json.is_file():
mounts.extend(["-v", f"{resolved_claude_json}:/tmp/claude.json.seed:ro"])

return mounts
return []

def build_environment(self) -> dict[str, str]:
return build_environment_from_config(self._config)