
LLM-1593 preserve user language after compaction #201

Open

linkas45 wants to merge 1 commit into master from fix/LLM-1593-non-english-pruning

Conversation

linkas45 (Contributor) commented May 10, 2026

Problem

During long tool-call chains, the context compactor's protected window (PROTECT_WINDOW = 30 messages / MAX_PROTECTED_CHARS = 100k) could push the user's last message out of view. With 30+ compacted [compacted: ...] tool results filling the window, the LLM loses the language and tone anchor from the user's input. The model then drifts into the user's locale, for example switching into Danish mid-session.
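
To make the failure concrete, here is a minimal sketch; the message shape is simplified for illustration (the real type is ContextEvent["messages"]):

  // Hypothetical simplified message shape, for illustration only.
  type Msg = { role: "user" | "assistant" | "tool"; content: string }

  const messages: Msg[] = [
    { role: "user", content: "..." },  // index 0: the only language/tone anchor
    ...Array.from({ length: 35 }, (): Msg => ({ role: "tool", content: "[compacted: ...]" })),
  ]

  // Naive window-based cutoff protects only the last PROTECT_WINDOW messages:
  const PROTECT_WINDOW = 30
  const baseCutoff = Math.max(0, messages.length - PROTECT_WINDOW)  // 6
  // Everything before index 6 is pruned, including the user message at index 0.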

Fix

Surgically extend the protected window backward so the most recent user message is always visible. The model retains whatever language, tone, and framing the user actually chose.

How it works

Before:
  baseCutoff = computeCutoff(...)  // purely window/char based
After:
  baseCutoff = computeCutoff(...)
  lastUserIndex = findLastUserMessageIndex(messages)  // latest role:"user"
  cutoff = min(baseCutoff, lastUserIndex)  // never prune past the user message
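
A fuller TypeScript sketch of the same logic, with the message type simplified (the real helper takes ContextEvent["messages"]); the -1 fallback for a context with no user message is an assumption, not confirmed by the diff:

  // Walk backward to find the most recent message with role "user".
  function findLastUserMessageIndex(messages: { role: string }[]): number {
    for (let i = messages.length - 1; i >= 0; i--) {
      if (messages[i].role === "user") return i
    }
    return -1  // assumed fallback: no user message in context
  }

  // Inside the compaction handler, after baseCutoff is computed as before:
  const lastUserIndex = findLastUserMessageIndex(messages)
  const cutoff = lastUserIndex >= 0 ? Math.min(baseCutoff, lastUserIndex) : baseCutoff
  // Messages before cutoff are compacted; the last user message stays visible.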

Why not other approaches?

| Approach | Problem | This fix |
| --- | --- | --- |
| Hard-code "always English" in the system prompt | Breaks legitimate Danish/German/Chinese sessions | Respects the user's language |
| Keep the last N messages | The 100k char budget can still eat the user message | User message always visible |
| Inject a language annotation during pruning | Complex, fragile, still strips user content | Simple, no annotations needed |

Changes

  • src/extensions/context-compactor.ts — floor pruning at the most recent user message
  • src/extensions/context-compactor.test.ts — coverage for the new floor behavior

Testing

All 18 context-compactor tests pass.


Kimchi Summary

What changed

Extends the context compactor's protected window to always retain the most recent user message, preventing it from being pruned during long tool-call chains.

Why

Without this guard, an extended sequence of tool results could push the user's original message past the pruning boundary, causing the LLM to lose their language, tone, and task framing.

Key changes

  • src/extensions/context-compactor.ts: Added findLastUserMessageIndex() helper to locate the latest user message in context.
  • src/extensions/context-compactor.ts: Updated cutoff calculation to Math.min(baseCutoff, lastUserIndex), expanding the protected window backward when necessary.
  • src/extensions/context-compactor.test.ts: Added a test verifying that a user message is preserved even when 30 subsequent tool results fill the protected window (sketched below).
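
Reusing the helper sketched earlier, that test could look roughly like this (the assertion style and exact setup are assumptions, not the PR's actual code):

  it("keeps the last user message when 30 tool results fill the window", () => {
    const messages = [
      { role: "user", content: "original request" },  // index 0
      ...Array.from({ length: 30 }, () => ({ role: "tool", content: "[compacted: ...]" })),
    ]
    const baseCutoff = Math.max(0, messages.length - 30)                     // window-based: 1
    const cutoff = Math.min(baseCutoff, findLastUserMessageIndex(messages))  // floored to 0
    expect(cutoff).toBe(0)
    expect(messages.slice(cutoff).some((m) => m.role === "user")).toBe(true)
  })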

Impact

Pruning becomes slightly less aggressive whenever the most recent user message falls outside the standard protected window. No breaking changes or migration steps are required.

During long tool-call chains the compactor's protected window could push
the user's last message out of view, stripping the LLM of language / tone
anchors. The model would then drift into the user's locale (e.g. Danish).

- Add findLastUserMessageIndex() to locate the latest user message.
- Floor the cutoff so it never exceeds that index: we keep everything from
  the last user message onward, and only compact tool results before it.
- Add test verifying the floor works when the protected window would
  otherwise bury the user message under 30 tool results.

Co-Authored-By: Kimchi <noreply@kimchi.dev>
kimchi-review Bot commented May 10, 2026

Kimchi Code Review

| Property | Value |
| --- | --- |
| Commit | aaffc07 |
| Author | @linkas45 |
| Files changed | 0 |
| Review status | Completed |
| Comments | 2 (1 info, 1 warning) |
| Duration | 95s |

Summary

📊 Review Score: 82/100 (overall code quality — 0 lowest, 100 highest)
⏱️ Estimated effort to review: 2/5 (1 = trivial, 5 = very complex)

🧪 Tests: yes — A new test case verifies that the compaction window is extended backward to preserve the most recent user message, asserting both message retention and the resulting cutoff value.

📝 Found 2 issue(s). See inline comments for details.


Interact with Kimchi
  • @kimchi review — re-trigger a full review on the latest commit
  • @kimchi summary — regenerate the PR summary
  • @kimchi ignore — skip this PR (no review will be posted)
  • Reply to any inline comment to ask follow-up questions or request clarification
Configuration

Reviews are configured by your organization admin.
Review instructions, excluded directories, and severity thresholds can be adjusted per repository in the Kimchi dashboard.


Powered by Kimchi — AI-powered code review by CAST AI

kimchi-review Bot left a comment

@@ -90,8 +105,15 @@ export default function contextCompactorExtension(pi: ExtensionAPI) {
if (lastInputTokens < PRUNE_THRESHOLD) return

const { messages } = event

⚠️🐛 Bug

After flooring cutoff to the last user message index, the code never checks whether the final cutoff is 0. If the most recent user message sits at index 0 and baseCutoff is greater than 0, the final cutoff becomes 0, which means no messages will actually be pruned. The original early return for cutoff === 0 is bypassed, risking unnecessary work and a spurious compaction telemetry entry.

💡 Suggestion: Add an early return immediately after computing the final cutoff: if (cutoff === 0) return.
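
A minimal sketch of where the guard would sit (variable names follow the PR description):

  const lastUserIndex = findLastUserMessageIndex(messages)
  const cutoff = Math.min(baseCutoff, lastUserIndex)
  if (cutoff === 0) return  // nothing to prune; skip compaction and telemetry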

* This ensures the LLM always retains the user's language, tone,
* and task framing even during long tool-call chains.
*/
function findLastUserMessageIndex(messages: ContextEvent["messages"]): number {

ℹ️🔧 Maintainability

The as { role?: string } cast inside findLastUserMessageIndex suppresses TypeScript type checking. If the message union type is later refactored, this cast will not trigger a compile-time error and could silently fail to identify user messages at runtime.

💡 Suggestion: Replace the type assertion with a runtime type guard, e.g. function isUserMessage(m: unknown): m is { role: 'user' }, and use it to narrow the message type safely.
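
A sketch of such a guard (the concrete message union type is assumed):

  // Runtime check instead of a blind cast: survives refactors of the union type.
  function isUserMessage(m: unknown): m is { role: "user" } {
    return (
      typeof m === "object" &&
      m !== null &&
      "role" in m &&
      (m as { role: unknown }).role === "user"
    )
  }

  function findLastUserMessageIndex(messages: unknown[]): number {
    for (let i = messages.length - 1; i >= 0; i--) {
      if (isUserMessage(messages[i])) return i
    }
    return -1
  }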

linkas45 changed the title from "LLM-1593" to "LLM-1593 preserve user language after compaction" on May 10, 2026