[codex] Persist Kimi homes after provider failures #14

Open
Unluckyathecking wants to merge 1 commit into dmae97:main from Unluckyathecking:codex/persist-kimi-home-recovery

@Unluckyathecking

Summary

  • Persist isolated Kimi HOME under .omk/runs/<runId>/kimi-home/<node> when run/session metadata is available.
  • Retain that HOME on Kimi provider failures, including rate-limit and quota exits, and print the preserved session/subagent path for recovery.
  • Resume recorded Kimi session IDs when omk chat --run-id is used.
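The first bullet's path layout can be sketched as follows. This is a minimal illustration, not the PR's code: `RunMeta`, `KimiHomeLocation`, and `resolveKimiHome` are hypothetical names, and the fallback temp directory is passed in rather than created here.

```typescript
// Hypothetical metadata shape; the field names are assumptions, not the PR's types.
interface RunMeta {
  runId?: string;
  node?: string;
}

interface KimiHomeLocation {
  dir: string;
  persistent: boolean; // true when the HOME lives under the run directory
}

// Choose the isolated HOME location: under the run directory when both
// runId and node are known, otherwise the caller-supplied temp directory.
// Paths are joined with "/" for brevity; real code would use path.join.
function resolveKimiHome(
  projectRoot: string,
  tmpDir: string,
  meta: RunMeta,
): KimiHomeLocation {
  if (meta.runId && meta.node) {
    const dir = [projectRoot, ".omk", "runs", meta.runId, "kimi-home", meta.node].join("/");
    return { dir, persistent: true };
  }
  return { dir: tmpDir, persistent: false };
}
```

With run metadata present, the HOME resolves to `.omk/runs/<runId>/kimi-home/<node>`. Without it, the sketch falls back to the temp directory, which stays eligible for cleanup.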

Why

Rate-limit or quota failures can kill Kimi subagents before their in-flight sessions are inspectable or resumable. The previous temp HOME cleanup removed .kimi/sessions/.../subagents evidence immediately, which made recovery impossible after provider-side failures.

This keeps recovery artifacts scoped to the OMK run while preserving cleanup on successful exits, startup failures, user stops, and orchestrator/controller stops.
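The retention rule described above can be read as a small decision function. The sketch below is an assumption about how the outcomes might be classified; `ExitKind` and `shouldRetainKimiHome` are hypothetical names that do not come from the PR.

```typescript
// Hypothetical classification of how a Kimi run ended.
type ExitKind =
  | "success"
  | "provider-failure" // includes rate-limit and quota exits
  | "startup-failure"
  | "user-stop"
  | "controller-stop";

// Retain the isolated HOME only when it is persisted under the run
// directory AND the run ended with a provider-side failure.
// Every other outcome keeps the existing cleanup behavior.
function shouldRetainKimiHome(kind: ExitKind, persistent: boolean): boolean {
  return persistent && kind === "provider-failure";
}
```

Keeping the rule in one pure function makes the cleanup contract easy to unit test independently of process spawning.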

Validation

  • npm run build
  • npm run check
  • npm run lint
  • npm test (83 files passed)

Unluckyathecking marked this pull request as ready for review on May 24, 2026 17:42

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c9342905ed

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/kimi/runner.ts
Comment on lines +991 to +999
result = await runShellStreaming(kimiBin, args, {
  cwd: worktree,
  timeout: effectiveTimeout,
  env: mergedEnv,
  logPath,
  input: "",
  onStdout: thinkingHandler,
  signal,
});
[P1] Ensure isolated HOME cleanup runs on runner exceptions

Wrap the runShellStreaming call in a try/finally so cleanupIsolatedKimiHome(tmpHome) executes even when runShellStreaming rejects before producing a result (for example, log-path creation or spawn/setup errors). In the current flow, those exceptions bypass cleanup entirely, leaving the isolated HOME (including symlinked auth material) behind and violating the intended behavior of cleaning up on non-provider failures.
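A minimal sketch of the suggested try/finally shape, using injected callbacks in place of the real `runShellStreaming` and `cleanupIsolatedKimiHome`. The wrapper name `runWithHomeCleanup` and the `shouldRetain` callback are hypothetical and only illustrate the control flow.

```typescript
// Run a task and guarantee cleanup unless the result says the HOME must be kept.
// `run` stands in for the runShellStreaming call; `cleanup` stands in for
// cleanupIsolatedKimiHome(tmpHome). Both are assumptions for illustration.
async function runWithHomeCleanup<T>(
  run: () => Promise<T>,
  shouldRetain: (result: T) => boolean,
  cleanup: () => Promise<void>,
): Promise<T> {
  let retain = false;
  try {
    const result = await run();
    // Only a completed run can ask for retention (e.g. a provider failure).
    retain = shouldRetain(result);
    return result;
  } finally {
    // Reached on success, on non-retained failures, and when run() rejects
    // before producing a result (spawn or log-path setup errors).
    if (!retain) {
      await cleanup();
    }
  }
}
```

Because `retain` starts as `false`, a rejection from `run()` always falls through to cleanup, which is the case the review flags as currently bypassed.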

Useful? React with 👍 / 👎.
