fix(agent-core, kimi-code): handle stale closed sessions during resume#1110

Open
qyz7438 wants to merge 3 commits into
MoonshotAI:mainfrom
qyz7438:fix/session-not-found-on-resume

qyz7438 commented Jun 25, 2026

Problem

When a session is closed while initialization is still running (for example, because an MCP server fails to start), the closed Session object can remain in CoreRPCImpl.sessions while the underlying session directory still exists on disk. Resuming that session later then fails with [session.not_found] because requireSession rejects the closed object, even though the persisted state needed to rebuild the session is still available.

This matches the symptom reported by users who see [session.not_found] after an earlier initialization error (such as an MCP failure).

Changes

  • packages/agent-core/src/session/index.ts: track Session.closed state and guard close() / closeForReload() against double close.
  • packages/agent-core/src/rpc/core-impl.ts:
    • In resumeSessionWithOverrides, drop stale closed session objects so a fresh session can be reconstructed from persisted state.
    • In requireSession, return a clear SESSION_NOT_FOUND error when the requested session has been closed.
  • apps/kimi-code/src/tui/kimi-tui.ts: catch SESSION_NOT_FOUND when applying startup modes to a resumed session and show an actionable message instead of crashing.
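
The two agent-core changes can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `CoreRPCSketch`, `SessionNotFoundError`, `loadFromDisk`, `resumeSession`, `cleanupRuns`, the synchronous `close()`, and the `"session.not_found"` code string are all hypothetical stand-ins. Only `Session`, its `closed` state, `close()`, `requireSession`, and the `sessions` map are named in the description above.

```typescript
// Hypothetical, simplified stand-in for the real Session class.
class Session {
  readonly id: string;
  private isClosed = false;
  cleanupRuns = 0; // illustration only: counts how often cleanup actually ran

  constructor(id: string) {
    this.id = id;
  }

  get closed(): boolean {
    return this.isClosed;
  }

  // Synchronous here for brevity; the real close() may well be async.
  close(): void {
    if (this.isClosed) return; // guard against double close
    this.isClosed = true;
    this.cleanupRuns += 1; // real resource cleanup would happen here
  }
}

// Assumed error shape; the real error type and code constant may differ.
class SessionNotFoundError extends Error {
  readonly code = "session.not_found";

  constructor(sessionId: string) {
    super(`[session.not_found] session ${sessionId} is closed or unknown`);
    this.name = "SessionNotFoundError";
  }
}

class CoreRPCSketch {
  readonly sessions = new Map<string, Session>();
  // Assumed loader: stands in for rebuilding a session from its directory.
  private readonly loadFromDisk: (id: string) => Session;

  constructor(loadFromDisk: (id: string) => Session) {
    this.loadFromDisk = loadFromDisk;
  }

  // Reject both unknown and closed sessions with the same clear error.
  requireSession(id: string): Session {
    const session = this.sessions.get(id);
    if (!session || session.closed) throw new SessionNotFoundError(id);
    return session;
  }

  // Reuse a live session; drop a stale closed one and rebuild from disk.
  resumeSession(id: string): Session {
    const existing = this.sessions.get(id);
    if (existing && !existing.closed) return existing;
    if (existing) this.sessions.delete(id); // stale closed object
    const fresh = this.loadFromDisk(id);
    this.sessions.set(id, fresh);
    return fresh;
  }
}
```

In this sketch, a session closed during a failed initialization no longer blocks a later resume: the stale entry is replaced by a freshly loaded one, while any direct lookup of the closed object still fails with an explicit error.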

Verification

  • pnpm typecheck passes for @moonshot-ai/agent-core and @moonshot-ai/kimi-code.
  • Existing tests in packages/agent-core/test run; failures observed are unrelated to this change (skill scanner / provider mocks / environment-specific path issues).

Related

Fixes the [session.not_found] resume failure path described in user reports.

When a session is closed while initialization is still running (for
example, because an MCP server fails to start), the session object can
remain in CoreRPCImpl.sessions even though the underlying directory
exists on disk. Resuming that session later then fails with
[session.not_found] because requireSession rejects the closed object.

Changes:
- Track Session.closed state and guard close() against double close.
- In resumeSessionWithOverrides, drop stale closed session objects so
  a fresh session can be reconstructed from persisted state.
- In requireSession, return a clear SESSION_NOT_FOUND error when the
  requested session has been closed.
- In the TUI, catch SESSION_NOT_FOUND when applying startup modes to a
  resumed session and show an actionable message instead of crashing.

Fixes symptom reported in MoonshotAI/kimi-code where resuming a
session after an MCP initialization failure produced
[session.not_found].

changeset-bot Bot commented Jun 25, 2026

🦋 Changeset detected

Latest commit: 5afd3dc

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
Name Type
@moonshot-ai/agent-core Patch
@moonshot-ai/kimi-code Patch

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fb517a24f6

Comment thread apps/kimi-code/src/tui/kimi-tui.ts Outdated
`(for example, an MCP server failed to start). ` +
`Try running the command again, or start a fresh session.`,
);
return;

P2: Abort after the resumed session disappears

When setPermission, getStatus, or setPlanMode reports SESSION_NOT_FOUND, this handler only displays a message and then returns to its callers. The startup path still proceeds to setSession(session) and syncRuntimeState(session), so the same dead SDK session is installed and the next RPC fails again instead of cleanly aborting or starting over; the session-picker path likewise continues to hide the picker after a failed apply. This affects the exact stale/closed-session race this catch is trying to handle.

qyz7438 added 2 commits June 26, 2026 01:13
Address review feedback: when applyStartupModesToResumedSession detects
SESSION_NOT_FOUND, it now returns false instead of swallowing the error.

- Startup path throws so init() fails instead of installing a dead session.
- Session-picker path leaves the picker open so the user can choose again.
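
The control flow described in this follow-up commit might look something like the sketch below. It is an assumption-laden illustration, not the actual TUI code: `ResumedSession`, `isSessionNotFound`, `applyStartupModes`, `initWithResumedSession`, `onPickerSelect`, the `notify` and `install` callbacks, the mode arguments, and the `"session.not_found"` code value are hypothetical. Only `SESSION_NOT_FOUND`, `setPermission`, and `setPlanMode` are names taken from the thread.

```typescript
// Assumed value of the error code constant.
const SESSION_NOT_FOUND = "session.not_found";

// Hypothetical slice of the SDK session surface used at startup.
interface ResumedSession {
  setPermission(mode: string): Promise<void>;
  setPlanMode(enabled: boolean): Promise<void>;
}

// Narrow an unknown thrown value to "is this a session-not-found error?".
function isSessionNotFound(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === SESSION_NOT_FOUND
  );
}

// Returns true when the modes were applied, false when the session is gone.
// Any other error is rethrown unchanged.
async function applyStartupModes(
  session: ResumedSession,
  notify: (message: string) => void,
): Promise<boolean> {
  try {
    await session.setPermission("default"); // placeholder mode
    await session.setPlanMode(false);
    return true;
  } catch (err) {
    if (!isSessionNotFound(err)) throw err;
    notify(
      "This session is no longer available. " +
        "Try running the command again, or start a fresh session.",
    );
    return false;
  }
}

// Startup path: fail initialization rather than install a dead session.
async function initWithResumedSession(
  session: ResumedSession,
  notify: (message: string) => void,
  install: (session: ResumedSession) => void,
): Promise<void> {
  if (!(await applyStartupModes(session, notify))) {
    throw new Error("resumed session is no longer available");
  }
  install(session);
}

// Picker path: resolves to true when the picker should stay open.
async function onPickerSelect(
  session: ResumedSession,
  notify: (message: string) => void,
  install: (session: ResumedSession) => void,
): Promise<boolean> {
  if (!(await applyStartupModes(session, notify))) return true;
  install(session);
  return false;
}
```

Returning a boolean lets each caller choose its own recovery: the startup path aborts, while the picker path simply stays open so the user can choose another session.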