feat(nodes): add context window optimizer node for token budget management #527

nihalnihalani wants to merge 9 commits into rocketride-org:develop.
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the CodeRabbit settings.
📝 Walkthrough

Adds a Context Optimizer node: the global lifecycle loads config and conditionally instantiates a ContextOptimizer; the instance wrapper intercepts and deep-copies questions, runs token-budgeted optimization (system prompt, query, documents, history) when available, and forwards optimized questions. Package metadata, a dependency, and tests are added.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Startup as "Startup"
    participant IGlobal as "IGlobal"
    participant Optimizer as "ContextOptimizer"
    participant IInstance as "IInstance"
    participant Pipeline as "Pipeline"
    Startup->>IGlobal: beginGlobal()
    alt CONFIG mode
        IGlobal->>IGlobal: keep optimizer = None
    else NORMAL mode
        IGlobal->>IGlobal: import depends, load node config
        IGlobal->>Optimizer: instantiate ContextOptimizer(config)
        Optimizer->>Optimizer: prepare tokenizer & model limits
    end
    Pipeline->>IInstance: writeQuestions(question)
    alt Optimizer available
        IInstance->>IInstance: deep-copy & extract system/query/docs/history
        IInstance->>Optimizer: optimize(system_prompt, model, documents, history)
        Optimizer->>Optimizer: count tokens -> allocate budgets
        Optimizer->>Optimizer: truncate system/query, rank/select docs, summarize history
        Optimizer-->>IInstance: optimized components + metadata
        IInstance->>IInstance: rebuild Question, log metadata
    else No optimizer
        IInstance->>IInstance: emit warning & pass-through question
    end
    IInstance->>Pipeline: writeQuestions(optimized_question)
    Startup->>IGlobal: endGlobal()
    IGlobal->>IGlobal: clear optimizer & config
```
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches: 🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nodes/src/nodes/context_optimizer/IInstance.py`:
- Around line 69-71: Replace the explicit for-loop that builds the history list
in IInstance.py with a single list comprehension: instead of initializing
history = [] and looping over question.history to append {'role': h.role,
'content': h.content}, assign history to a comprehension that iterates over
question.history or [] and constructs the same dicts; update the code around the
variable named history and the use of question.history accordingly.
In `@nodes/src/nodes/context_optimizer/optimizer.py`:
- Around line 299-302: The code builds the final message list via concatenation
into an intermediate variable `result`; replace that with list unpacking to
simplify and improve clarity by returning `[first_msg, summary_placeholder,
*recent]` directly instead of `result = [first_msg, summary_placeholder] +
recent` and `return result`—apply this change in the function where
`summary_placeholder` is set (the block using `first_msg`,
`summary_placeholder`, and `recent`).
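As a sketch (the function wrapper and argument names here follow the prompt above, not the real signature):

```python
def rebuild_messages(first_msg, summary_placeholder, recent):
    # List unpacking replaces the intermediate
    # `result = [first_msg, summary_placeholder] + recent`.
    return [first_msg, summary_placeholder, *recent]
```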
In `@nodes/src/nodes/context_optimizer/services.json`:
- Around line 111-118: The "auto" profile currently claims to detect the model
from the pipeline but actually uses the configured default model_name ("gpt-5");
update the "auto" profile definition by either changing the "title" to
accurately reflect current behavior (e.g., "Auto - uses configured default model
(gpt-5)") or add a descriptive note in the profile (e.g., a "description" field)
stating that pipeline model detection is a planned future enhancement; modify
the "auto" object (keys: "title", "model_name") in the JSON to reflect this
clarification.
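As a sketch, the clarified profile might look like this; the `description` key is hypothetical and must match whatever keys the node's services.json schema actually supports:

```json
"auto": {
  "title": "Auto - uses configured default model (gpt-5)",
  "model_name": "gpt-5",
  "description": "Pipeline model detection is a planned future enhancement."
}
```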
In `@test/nodes/test_context_optimizer.py`:
- Around line 348-350: Replace the two separate endswith checks with a single
tuple form: update the assertion that currently does "assert
result.endswith('.') or result.endswith('sentence')" to use a single endswith
call with a tuple (e.g., result.endswith(('.', 'sentence'))) so the static
analyzer warning is resolved; this change should be applied where the test calls
optimizer.truncate_to_budget and asserts on the result variable.
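A minimal illustration of the tuple form:

```python
def ends_ok(result: str) -> bool:
    # str.endswith accepts a tuple of suffixes, replacing the
    # or-chain the static analyzer flagged.
    return result.endswith(('.', 'sentence'))
```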
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 57cc9afb-270e-4b0d-8a41-9510a6dda2e5
📒 Files selected for processing (7)
- nodes/src/nodes/context_optimizer/IGlobal.py
- nodes/src/nodes/context_optimizer/IInstance.py
- nodes/src/nodes/context_optimizer/__init__.py
- nodes/src/nodes/context_optimizer/optimizer.py
- nodes/src/nodes/context_optimizer/requirements.txt
- nodes/src/nodes/context_optimizer/services.json
- test/nodes/test_context_optimizer.py
Actionable comments posted: 8
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nodes/src/nodes/context_optimizer/IInstance.py`:
- Around line 97-101: Move the from ai.common.schema import QuestionHistory out
of the method and place it at module top-level so the QuestionHistory symbol is
imported once (improves readability and avoids repeated import overhead); update
the file-level imports and remove the inline import inside the block that
updates question.history (the code section using result['history'] and
constructing QuestionHistory(role=..., content=...)); if a circular-import
prevents a top-level import, use a typing.TYPE_CHECKING guard or keep the local
import but add a short comment explaining the circular dependency.
In `@nodes/src/nodes/context_optimizer/optimizer.py`:
- Around line 409-414: The history token calculation is inconsistent: in
optimizer.optimize (original_history) you add a flat "+ 4" for role overhead
while summarize_history._msg_tokens uses count_tokens(role) + 4; extract a
shared helper (e.g., msg_token_count(message: dict) or _msg_tokens(role,
content)) that returns count_tokens(role) + count_tokens(content) + 4 and use it
both in summarize_history._msg_tokens and in the optimize
original_history/doc/history counting so both places compute identical
per-message tokens and remove the hardcoded "+ 4" discrepancy.
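A sketch of the shared helper; `count_tokens` below is a naive whitespace-splitting stand-in for the node's tiktoken-based counter:

```python
def count_tokens(text):
    # Stand-in token counter (the real code uses tiktoken).
    return len(text.split())

def msg_token_count(message):
    # Shared helper: role tokens + content tokens + the fixed 4-token
    # per-message overhead, so optimize() and summarize_history()
    # compute identical per-message costs.
    return count_tokens(message.get('role', '')) + count_tokens(message.get('content', '')) + 4
```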
- Around line 209-223: The current logic splits text with _SENTENCE_RE and then
joins selected pieces with ' ', which collapses original inter-sentence
whitespace; change the approach to preserve original separators by using
_SENTENCE_RE with a capturing group (or iterate re.finditer over matches) so you
keep each sentence plus its following separator, measure tokens with
count_tokens(sentence_piece, encoding), append the exact sentence_piece to
result_parts, and return ''.join(result_parts) (instead of ' '.join) when
returning the truncated text; update the code paths that reference
sentences/result_parts and keep count_tokens, encoding and max_tokens usage
unchanged.
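A runnable sketch of separator-preserving truncation; the regex and token counter here are simplified stand-ins for the node's actual `_SENTENCE_RE` and tiktoken-based `count_tokens`:

```python
import re

# Hypothetical sentence pattern: text up to a terminator, the terminator,
# and any following whitespace, kept together as one piece.
_SENTENCE_PIECE_RE = re.compile(r'[^.!?]*[.!?]+\s*|[^.!?]+$')

def count_tokens(text):
    # Stand-in token counter (the real code uses tiktoken).
    return len(text.split())

def truncate_preserving_whitespace(text, max_tokens):
    result_parts = []
    used = 0
    for m in _SENTENCE_PIECE_RE.finditer(text):
        piece = m.group(0)
        cost = count_tokens(piece)
        if used + cost <= max_tokens:
            result_parts.append(piece)  # exact piece, separator included
            used += cost
        else:
            break
    # ''.join keeps the original inter-sentence whitespace intact.
    return ''.join(result_parts)
```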
- Around line 47-66: The MODEL_LIMITS class variable is mutable and triggers
Ruff RUF012; annotate it as a ClassVar to mark it read-only by importing
ClassVar from typing and changing the declaration to use ClassVar[Dict[str,
int]] for MODEL_LIMITS, or if you prefer to keep the current annotation add a
trailing "# noqa: RUF012" to the MODEL_LIMITS line to suppress the warning;
locate the MODEL_LIMITS symbol in optimizer.py and apply one of these two fixes.
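The `ClassVar` fix looks like this minimal sketch (the limit values shown are illustrative):

```python
from typing import ClassVar, Dict

class ContextOptimizerSketch:
    # ClassVar marks the mutable dict as a class-level constant,
    # which satisfies Ruff RUF012.
    MODEL_LIMITS: ClassVar[Dict[str, int]] = {
        'gpt-4o': 128_000,
        'gpt-3.5-turbo': 16_385,
    }
```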
- Around line 282-291: The loop building recent currently uses recent.insert(0,
msg) which causes O(n²) behavior; instead append messages to recent and then
reverse once at the end to preserve original order. Update the block that
iterates over reversed(messages[1:]) (uses _msg_tokens and budget_for_recent) to
call recent.append(msg) when within budget and after the loop call
recent.reverse() so the logic and variables (recent, recent_cost, _msg_tokens,
budget_for_recent, messages) remain the same but with O(n) performance.
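A standalone sketch of the append-then-reverse pattern; the names follow the prompt, and `msg_tokens` is passed in as a stand-in for the real `_msg_tokens`:

```python
def select_recent(messages, budget_for_recent, msg_tokens):
    # Append then reverse once: O(n) instead of O(n^2) front-insertions.
    recent = []
    recent_cost = 0
    for msg in reversed(messages[1:]):
        cost = msg_tokens(msg)
        if recent_cost + cost <= budget_for_recent:
            recent.append(msg)
            recent_cost += cost
        else:
            break
    recent.reverse()  # restore original chronological order
    return recent
```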
In `@nodes/src/nodes/context_optimizer/services.json`:
- Around line 88-101: Extend the "test" cases in services.json to cover token
truncation, document selection under budget, and history compression: add
separate entries in the "cases" array that (1) provide a very long question text
to assert truncation behavior (extend the existing "questions" object and set an
"expect" that checks truncated length or presence of a truncation marker), (2)
supply multiple documents plus a budget constraint to assert the chosen
documents (create an input field representing documents and an "expect"
asserting selected document IDs or count), and (3) include a long conversation
history to verify compression (add a "history" input and an "expect" that
compressed history length or summary is within expected bounds); reference the
existing test structure keys ("test", "cases", "questions", "expect") so new
cases match the same schema and validation style.
In `@test/nodes/test_context_optimizer.py`:
- Around line 764-792: Test currently only checks that 'question' was passed to
optimizer.optimize; tighten it by asserting the actual extracted values: fetch
call_args = mock_opt.optimize.call_args and read the question from
call_args.kwargs.get('question', call_args.args[0] if call_args.args else None)
and assert it equals the expected question text (from the Question instance or
q.get_text()/repr), and similarly assert that 'components' (or the expected
components key) is present and is a list containing the components extracted by
inst.writeQuestions; update test_optimizer_called_with_components to perform
these specific checks using the optimizer.optimize call_args instead of the
loose presence check.
- Around line 234-235: The test test_none_returns_zero calls count_tokens with
None but count_tokens is typed as text: str; update the count_tokens signature
to accept Optional[str] (import typing.Optional) and update its docstring to
state None is allowed, keeping the existing None-check (if not text: return 0)
so the test remains valid; alternatively change the test to pass an empty string
''—pick one consistent approach and update typing and imports accordingly in the
function definition for count_tokens.
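The Optional-accepting variant might look like this sketch, with whitespace splitting standing in for the real tokenizer:

```python
from typing import Optional

def count_tokens(text: Optional[str]) -> int:
    """Count tokens in text; None and '' both return 0."""
    if not text:
        return 0
    # Stand-in for the tiktoken encoding step in the real implementation.
    return len(text.split())
```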
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3dcd3061-e08b-4c22-8e64-17cc00d04a83
📒 Files selected for processing (4)
- nodes/src/nodes/context_optimizer/IInstance.py
- nodes/src/nodes/context_optimizer/optimizer.py
- nodes/src/nodes/context_optimizer/services.json
- test/nodes/test_context_optimizer.py
```python
# Split into sentences
sentences = self._SENTENCE_RE.split(text)
result_parts: List[str] = []
used_tokens = 0

for sentence in sentences:
    sentence_tokens = self.count_tokens(sentence, encoding)
    if used_tokens + sentence_tokens <= max_tokens:
        result_parts.append(sentence)
        used_tokens += sentence_tokens
    else:
        break

if result_parts:
    return ' '.join(result_parts)
```
🧹 Nitpick | 🔵 Trivial
Sentence joining normalizes whitespace.
The _SENTENCE_RE.split(text) preserves sentence content but the ' '.join(result_parts) normalizes all inter-sentence whitespace to single spaces. This may alter formatting if the original had double spaces, newlines, or other whitespace. This is likely acceptable for LLM context but worth documenting.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nodes/src/nodes/context_optimizer/optimizer.py` around lines 209 - 223, The
current logic splits text with _SENTENCE_RE and then joins selected pieces with
' ', which collapses original inter-sentence whitespace; change the approach to
preserve original separators by using _SENTENCE_RE with a capturing group (or
iterate re.finditer over matches) so you keep each sentence plus its following
separator, measure tokens with count_tokens(sentence_piece, encoding), append
the exact sentence_piece to result_parts, and return ''.join(result_parts)
(instead of ' '.join) when returning the truncated text; update the code paths
that reference sentences/result_parts and keep count_tokens, encoding and
max_tokens usage unchanged.
@coderabbitai Fair observation. The ' '.join(result_parts) does normalize inter-sentence whitespace to single spaces. As you noted, this is likely acceptable for LLM context where whitespace formatting is typically not semantically significant, but I'll add a brief comment documenting this behavior for future maintainers.
Leaving this behavior as-is. The sentence-aware truncation intentionally normalizes inter-sentence whitespace when it rejoins preserved sentences, and the current implementation now documents that directly above the join call. For this node’s LLM-context use case that tradeoff is acceptable, and the focused optimizer suite still passes on the current head (81 passed).
@nihalnihalani, understood — keeping the whitespace normalization with an in-code comment documenting the intentional behavior is a reasonable resolution. The LLM-context use case makes the tradeoff sound, and the added comment gives future maintainers the necessary context. Thanks for clarifying.
✏️ Learnings added
Learnt from: nihalnihalani
Repo: rocketride-org/rocketride-server PR: 527
File: nodes/src/nodes/context_optimizer/optimizer.py:249-264
Timestamp: 2026-04-09T10:23:39.742Z
Learning: In `nodes/src/nodes/context_optimizer/optimizer.py`, the `truncate_to_budget` method intentionally uses `' '.join(result_parts)` after splitting on sentence boundaries (`_SENTENCE_RE`), which normalizes all inter-sentence whitespace to single spaces. This behavior is intentional and documented in the code — for LLM context, whitespace formatting is not semantically significant. Do not flag this whitespace normalization as a bug or suggest preserving original separators in future reviews.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: ryan-t-christensen
Repo: rocketride-org/rocketride-server PR: 640
File: nodes/src/nodes/tool_github/IInstance.py:1032-1036
Timestamp: 2026-04-09T07:00:32.335Z
Learning: In `nodes/src/nodes/tool_github/IInstance.py`, `_require_str` always calls `.strip()` on the returned value, including when used for `file_create`/`file_edit` `content` arguments. This means a trailing `\n` in plain-text file content will be stripped before base64 encoding. This behavior is intentional and accepted — agents are not expected to supply files where trailing whitespace is semantically significant. Do not flag this as a bug in future reviews.
Learnt from: asclearuc
Repo: rocketride-org/rocketride-server PR: 417
File: nodes/src/nodes/llm_vision_ollama/__init__.py:28-32
Timestamp: 2026-03-24T21:28:07.433Z
Learning: For Python pipeline node files under `nodes/**/*.py`, follow the repo’s string/docstring quoting conventions: use single quotes for regular string literals, and require PEP 257-style docstrings to use triple double quotes (`"""..."""`). Do not recommend changing docstring quotes from `"""..."""` to `'''...'''`.
Learnt from: KaushikSiva
Repo: rocketride-org/rocketride-server PR: 386
File: nodes/src/nodes/search_exa/IInstance.py:31-32
Timestamp: 2026-03-25T16:11:50.206Z
Learning: In nodes/**/*.py, treat `self.instance.writeAnswers(...)` as accepting a single `Answer` object (not `List[Answer]`). In reviews, do not flag calls where a single `Answer` is passed to `writeAnswers` as type mismatches. If the abstract base class (e.g., `filters.py`) still declares `List[Answer]`, consider updating that type annotation to match the actual runtime/override usage, since concrete implementations (e.g., `response/IInstance.py`, `extract_data/IInstance.py`, `dictionary/IInstance.py`, `db_instance_base.py`) and all call sites pass a single `Answer`.
Learnt from: asclearuc
Repo: rocketride-org/rocketride-server PR: 456
File: nodes/src/nodes/db_neo4j/IInstance.py:98-100
Timestamp: 2026-03-27T23:31:51.042Z
Learning: In this project’s Python lifecycle stubs under `nodes/` (e.g., empty lifecycle methods like `endInstance`), it may be intentional to include an explicit `pass` statement even if the method already contains a docstring. When reviewing, do not flag this as redundant (Ruff `PIE790`) for these empty stub lifecycle methods when the `pass` is used to clearly mark the body as intentionally empty.
Learnt from: chinesepowered
Repo: rocketride-org/rocketride-server PR: 518
File: nodes/src/nodes/llm_openai_api/openai_client.py:25-32
Timestamp: 2026-03-30T21:57:11.296Z
Learning: In Python LLM node implementations under `nodes/**/*.py`, it’s an established pattern that `ChatBase.__init__` may not store all configuration on the instance. Subclasses may intentionally call `Config.getNodeConfig(provider, connConfig)` again inside their own `__init__` to read provider-specific fields (e.g., `apikey`, `base_url`). During code review, do not flag this as a redundant duplicate call when it’s used to obtain provider-specific configuration for the subclass.
Learnt from: chinesepowered
Repo: rocketride-org/rocketride-server PR: 518
File: nodes/src/nodes/llm_openai_api/IGlobal.py:11-81
Timestamp: 2026-03-30T22:18:48.072Z
Learning: In this repository’s Python LLM node implementation files under `nodes/**/*.py`, follow the established `validateConfig` style: use the existing “monolithic probe + error-formatting” approach for provider-first exceptions to keep behavior consistent. When linting flags require suppression, prefer the smallest/most targeted Ruff `# noqa` usage (e.g., `PLR0912`/`PLR0915`) and add specific justification comments for any targeted `BLE001`/similar codes rather than refactoring into multiple helper functions solely to satisfy linting. Only refactor the exception/validation pattern if you can update it consistently across all similar nodes.
Learnt from: dsapandora
Repo: rocketride-org/rocketride-server PR: 583
File: nodes/src/nodes/audio_tts/IGlobal.py:236-264
Timestamp: 2026-04-02T13:09:51.916Z
Learning: In RocketRide, filter node code (e.g., implementations under `nodes/**/*.py`) should assume `IGlobal.synthesize()` and filter node methods are executed sequentially per pipeline instance: one document is processed through the filter at a time. Concurrent access to the same `IGlobal` instance is not an expected runtime scenario. Therefore, do not raise review findings about missing synchronization/locks (e.g., missing mutex/atomic protections) for shared `IGlobal` state in filter node implementations unless there is explicit evidence of concurrent access beyond the framework’s pipeline execution model.
Learnt from: Rod-Christensen
Repo: rocketride-org/rocketride-server PR: 599
File: nodes/src/nodes/db_neo4j/IInstance.py:158-163
Timestamp: 2026-04-02T20:29:16.277Z
Learning: When reviewing Python code in this repository, do not treat Ruff rule RET505 ("Unnecessary elif after return statement") as a review finding. Maintainers consider this nitpick unnecessary, and it is acceptable to keep elif/else branching even immediately after a return.
Learnt from: Rod-Christensen
Repo: rocketride-org/rocketride-server PR: 599
File: nodes/src/nodes/tool_firecrawl/IInstance.py:103-103
Timestamp: 2026-04-03T04:00:08.250Z
Learning: When reviewing Python files under `nodes/**/*.py`, do not flag Ruff Q000 (single-quote style) violations for double-quoted string literals if the content contains a single quote/apostrophe (e.g., `"Map a website's structure..."`). In such cases, double quotes are intentional to avoid backslash escapes and are considered correct under Ruff’s Q000 behavior.
Learnt from: nihalnihalani
Repo: rocketride-org/rocketride-server PR: 525
File: nodes/src/nodes/router_llm/IInstance.py:0-0
Timestamp: 2026-04-03T05:20:16.219Z
Learning: When reviewing Python code under the `nodes/` package, treat the `IInstanceBase.open()` lifecycle method’s parameter named `obj: Entry` as the correct engine Python-bridge interface convention. Do not flag `obj` as a non-standard parameter name. This convention exists specifically because the earlier parameter name `object` (which shadowed the Python builtin) was replaced with `obj`.
Learnt from: Rod-Christensen
Repo: rocketride-org/rocketride-server PR: 599
File: nodes/src/nodes/memory_internal/IInstance.py:33-35
Timestamp: 2026-04-03T18:31:22.345Z
Learning: In RocketRide pipeline engine nodes (nodes/**/*.py), rely on the lifecycle invariant: `beginInstance()` is guaranteed to complete successfully before `open()` runs, and if `beginInstance()` throws, the engine aborts and `open()` is never called. Therefore, during code review, do not flag instance fields that are initialized in `beginInstance()` (e.g., `self._store`) as potentially `None`/uninitialized when they are referenced in `open()`. This is safe because the framework lifecycle enforces it.
Learnt from: Rod-Christensen
Repo: rocketride-org/rocketride-server PR: 599
File: nodes/src/nodes/llm_base/IInstance.py:36-50
Timestamp: 2026-04-03T18:31:20.227Z
Learning: In Python pipeline node files under `nodes/**/*.py` (including `nodes/src/nodes/llm_base/IInstance.py`), do not flag missing PEP 257 docstrings for trivial one-liner methods decorated with `invoke_function` (or similar decorator-based invocation). Skip the docstring requirement when the method’s name and return value/behavior make the contract self-evident (e.g., `getContextLength`, `getOutputLength`, `getTokenCounter`, `ask`), because the decorator plus the method name already communicate intent clearly.
Learnt from: Rod-Christensen
Repo: rocketride-org/rocketride-server PR: 599
File: nodes/src/nodes/tool_firecrawl/IInstance.py:67-69
Timestamp: 2026-04-03T18:31:34.956Z
Learning: For Python tools implemented with rocketlib’s `tool_function` decorator, the framework validates the declared `input_schema` at `tool.invoke` before entering the tool method body. Therefore, do not require (or flag) redundant runtime type checks inside the tool (e.g., `isinstance(url, str)` when the schema declares `url` as a string). Instead, only add lightweight value guards for schema-compatible edge cases (e.g., `if not url:` to handle the empty-string case) and avoid duplicating schema-based type validation logic.
Learnt from: charliegillet
Repo: rocketride-org/rocketride-server PR: 509
File: nodes/src/nodes/tool_exa_search/exa_driver.py:146-198
Timestamp: 2026-04-08T07:11:59.835Z
Learning: In RocketRide tool driver implementations under nodes/**/*.py that use ToolsBase, ensure _tool_validate normalizes inputs exactly the same way as _tool_invoke (via a shared _normalize_tool_input helper). This must allow inputs consistently when ToolsBase.handle_invoke triggers validation directly—specifically, JSON strings, Pydantic-like models, and wrapped payloads like {"input": ...} should be accepted and normalized prior to validation.
```json
"test": {
  "profiles": ["auto"],
  "cases": [
    {
      "questions": {
        "questions": [{"text": "What is the capital of France?"}]
      },
      "expect": {
        "questions": {
          "notEmpty": true
        }
      }
    }
  ]
}
```
🧹 Nitpick | 🔵 Trivial
Consider adding more comprehensive test cases.
The current test only verifies that questions are non-empty. Consider adding test cases that verify:
- Token truncation behavior with a long question
- Document selection when over budget
- History compression
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nodes/src/nodes/context_optimizer/services.json` around lines 88 - 101,
Extend the "test" cases in services.json to cover token truncation, document
selection under budget, and history compression: add separate entries in the
"cases" array that (1) provide a very long question text to assert truncation
behavior (extend the existing "questions" object and set an "expect" that checks
truncated length or presence of a truncation marker), (2) supply multiple
documents plus a budget constraint to assert the chosen documents (create an
input field representing documents and an "expect" asserting selected document
IDs or count), and (3) include a long conversation history to verify compression
(add a "history" input and an "expect" that compressed history length or summary
is within expected bounds); reference the existing test structure keys ("test",
"cases", "questions", "expect") so new cases match the same schema and
validation style.
@coderabbitai Agreed — the current test only verifies that questions are non-empty, which is minimal coverage. Adding test cases for token truncation with a long question, document selection when over budget, and history compression would make the service definition tests much more meaningful. I'll add these in the next commit.
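A sketch of what an added truncation case might look like; the `maxTokens` expectation key is hypothetical and would need to match the node test framework's actual expect vocabulary:

```json
"cases": [
  {
    "questions": {
      "questions": [{"text": "What is the capital of France?"}]
    },
    "expect": { "questions": { "notEmpty": true } }
  },
  {
    "questions": {
      "questions": [{"text": "<several thousand tokens of repeated text>"}]
    },
    "expect": { "questions": { "maxTokens": 4096 } }
  }
]
```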
asclearuc left a comment:
Thank you for this contribution — the token counting, budget allocation, and sentence-aware truncation are well implemented, and reusing tiktoken is the right choice. The overall design is solid.

A few issues need to be resolved before merge. Inline comments have been left on specific lines.
```diff
@@ -0,0 +1 @@
+tiktoken>=0.7.0,<1.0.0
```
Could it be used without a version restriction? Is `uv pip compile` used internally to calculate the version number? It is not critical, since the version range is already wide.
Fixed — removed the upper bound. Now tiktoken>=0.7.0 without the <1.0.0 cap.
```python
import re
from typing import Any, Dict, List, Optional

import tiktoken
```
`import tiktoken` at module level will fail if tiktoken is not yet installed when the module is first imported. Move the import inside the methods that use it, or at minimum inside `__init__`, to ensure `depends()` in IGlobal has run first.
Fixed — moved import tiktoken from module level into the _get_encoding() method for lazy loading. This ensures depends() in IGlobal has run first before tiktoken is imported.
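The lazy-import pattern described here can be sketched as follows; the class and method names mirror the discussion, but this is not the node's actual code:

```python
class ContextOptimizerSketch:
    # Lazy import: tiktoken is only imported when an encoding is first
    # requested, after IGlobal's depends() has installed the requirement.
    def __init__(self):
        self._encoding = None

    def _get_encoding(self):
        if self._encoding is None:
            import tiktoken  # deferred so module import never fails
            self._encoding = tiktoken.get_encoding('cl100k_base')
        return self._encoding
```

Constructing the object never touches tiktoken, so the module can be imported before the dependency is installed.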
```python
from depends import depends  # type: ignore

# Load the requirements (tiktoken)
requirements = os.path.dirname(os.path.realpath(__file__)) + '/requirements.txt'
```
Fixed — now using os.path.join(os.path.dirname(os.path.realpath(__file__)), 'requirements.txt') instead of string concatenation.
```python
opt_history_tokens = sum(self.count_tokens(m.get('content', '')) + self.count_tokens(m.get('role', '')) + 4 for m in opt_history)
if opt_history_tokens < original_history:
    components_truncated.append('history')
```
Budget is allocated as fixed percentages of the total limit with no redistribution of unused space. If the system prompt uses 100 tokens out of a 12,800-token allocation, the surplus 12,700 tokens are discarded — they do not flow to other components. This means history can be truncated even when the global token budget has plenty of room.

Example: system_prompt=100 tokens, history=40,000 tokens needed, total_limit=128K

- system_prompt budget: 12,800 (only 100 used, 12,700 wasted)
- history budget: 32,000 → truncated despite 40,900 tokens available globally

Consider a two-pass approach: first check whether all components fit within total_limit without per-component caps; only apply percentage budgets when the total exceeds the limit. This would make truncation the exception rather than the default.
Fixed — implemented two-pass approach. Pass 1 checks if all components fit within total_limit without per-component caps (no wasted budget). Pass 2 applies percentage-based budgets only when total exceeds the limit, making truncation the exception rather than the default.
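A sketch of the two-pass idea, under stated assumptions: `components` maps names to text, `percentages` maps names to budget fractions, word counting stands in for tokenization, and naive word dropping replaces the real sentence-aware truncation:

```python
def truncate(text, budget, count_tokens):
    # Naive word-level truncation as a stand-in for truncate_to_budget.
    words = text.split()
    while words and count_tokens(' '.join(words)) > budget:
        words.pop()
    return ' '.join(words)

def allocate(components, total_limit, percentages, count_tokens):
    # Pass 1: if everything already fits, skip per-component caps entirely.
    sizes = {name: count_tokens(text) for name, text in components.items()}
    if sum(sizes.values()) <= total_limit:
        return dict(components)  # no truncation needed
    # Pass 2: apply percentage budgets only when over the global limit.
    return {
        name: truncate(text, int(total_limit * percentages[name]), count_tokens)
        for name, text in components.items()
    }
```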
```python
scored.append((overlap, tokens, doc))

# Sort by overlap descending, then by token cost ascending (prefer smaller docs on tie)
scored.sort(key=lambda x: (-x[0], x[1]))
```
This re-ranks documents by keyword overlap, silently discarding the ordering established by the upstream vector DB node (which orders by embedding similarity score). If the vector DB already returned the best documents first, re-ranking by a weaker keyword heuristic may produce a worse selection.

Consider preserving the original document order and applying only greedy budget selection (without re-sorting), or using doc.score as the primary sort key with keyword overlap as a secondary tiebreaker only when scores are absent.
Fixed — now preserves vector DB ordering when documents carry a score field. Keyword overlap is only used as a lightweight fallback when no scores are present, with original index as secondary tiebreaker to avoid completely discarding upstream ordering.
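A sketch of score-first ordering with a keyword-overlap fallback; the dict keys `score` and `text` are assumptions about the document shape, not the node's real schema:

```python
def rank_documents(docs, query_terms):
    # Trust upstream vector-DB ordering when every document carries a score.
    if docs and all(d.get('score') is not None for d in docs):
        return sorted(docs, key=lambda d: -d['score'])
    # Fallback: keyword overlap, with the original index as a stable
    # tiebreaker so upstream ordering is not discarded entirely.
    def overlap(d):
        return len(query_terms & set(d.get('text', '').lower().split()))
    indexed = sorted(enumerate(docs), key=lambda p: (-overlap(p[1]), p[0]))
    return [d for _, d in indexed]
```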
@asclearuc Thank you for the thorough and deeply technical review! Your feedback is excellent and touches on real correctness issues. I'm working on addressing all 8 points. Will push these fixes shortly. Thanks again for the insightful review!
♻️ Duplicate comments (3)
nodes/src/nodes/context_optimizer/IInstance.py (1)
98-102: 🧹 Nitpick | 🔵 TrivialMove
QuestionHistoryimport to module level.The
QuestionHistoryimport at line 99 is still inside the method. As you acknowledged in a previous review, since there's no circular import concern, moving it to the top level alongside the existingQuestionimport improves readability.♻️ Move import to top of file
from rocketlib import IInstanceBase, Entry, debug, warning from .IGlobal import IGlobal -from ai.common.schema import Question +from ai.common.schema import Question, QuestionHistory ... # Update history - from ai.common.schema import QuestionHistory - if result['history']:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nodes/src/nodes/context_optimizer/IInstance.py` around lines 98 - 102, The import of QuestionHistory is currently inside a method in IInstance.py; move it to the module level alongside the existing Question import at the top of the file so the module-level imports read both Question and QuestionHistory, then remove the local "from ai.common.schema import QuestionHistory" inside the method where result['history'] is used (the code that sets question.history = [QuestionHistory(...)] should continue to reference QuestionHistory unchanged).nodes/src/nodes/context_optimizer/optimizer.py (1)
318-327: 🧹 Nitpick | 🔵 Trivial: Use `append` + `reverse` to avoid O(n²) list insertions.

Using `recent.insert(0, msg)` inside a loop causes O(n²) behavior due to list shifting. You agreed to refactor this in a previous review.

♻️ Suggested optimization

```diff
 # Greedily add recent messages from the end
 recent: List[Dict[str, str]] = []
 recent_cost = 0
 for msg in reversed(messages[1:]):
     cost = _msg_tokens(msg)
     if recent_cost + cost <= budget_for_recent:
-        recent.insert(0, msg)
+        recent.append(msg)
         recent_cost += cost
     else:
         break
+recent.reverse()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nodes/src/nodes/context_optimizer/optimizer.py` around lines 318-327: The loop building recent currently uses recent.insert(0, msg) causing O(n²) behavior; change it to append messages in iteration order (e.g., append to recent when iterating reversed(messages[1:])), then call recent.reverse() after the loop so the final order is preserved; update references in this block (variables recent, recent_cost, and the loop over reversed(messages[1:])) to use append + reverse to avoid repeated front-insertions.

test/nodes/test_context_optimizer.py (1)

789-793: 🧹 Nitpick | 🔵 Trivial: Tighten assertions on optimizer call arguments.

The assertion at line 793 is still loose—it only checks that `'question'` exists somewhere in the call. You agreed to verify the actual extracted values in a previous review.

♻️ More specific assertion

```diff
 mock_opt.optimize.assert_called_once()
 call_kwargs = mock_opt.optimize.call_args
-assert 'question' in call_kwargs.kwargs or len(call_kwargs.args) > 0
+# Verify extracted components
+assert call_kwargs.kwargs.get('question') == 'What is AI?'
+assert call_kwargs.kwargs.get('system_prompt') == 'You are helpful.'
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/nodes/test_context_optimizer.py` around lines 789 - 793, The optimizer call assertion is too loose—replace the generic check with a precise comparison to the questions you passed in: after inst.writeQuestions(q) and mock_opt.optimize.assert_called_once(), grab call_kwargs = mock_opt.optimize.call_args and assert that the question was passed exactly (either assert call_kwargs.kwargs.get('question') == q or assert call_kwargs.args and call_kwargs.args[0] == q), and if writeQuestions can pass a list/tuple ensure you compare to the correct element/structure (e.g., q[0] or q) so the test verifies the actual extracted values rather than just the presence of the key.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@nodes/src/nodes/context_optimizer/IInstance.py`:
- Around line 98-102: The import of QuestionHistory is currently inside a method
in IInstance.py; move it to the module level alongside the existing Question
import at the top of the file so the module-level imports read both Question and
QuestionHistory, then remove the local "from ai.common.schema import
QuestionHistory" inside the method where result['history'] is used (the code
that sets question.history = [QuestionHistory(...)] should continue to reference
QuestionHistory unchanged).
In `@nodes/src/nodes/context_optimizer/optimizer.py`:
- Around line 318-327: The loop building recent currently uses recent.insert(0,
msg) causing O(n²) behavior; change it to append messages in iteration order
(e.g., append to recent when iterating reversed(messages[1:])), then call
recent.reverse() after the loop so the final order is preserved; update
references in this block (variables recent, recent_cost, and the loop over
reversed(messages[1:])) to use append + reverse to avoid repeated
front-insertions.
In `@test/nodes/test_context_optimizer.py`:
- Around line 789-793: The optimizer call assertion is too loose—replace the
generic check with a precise comparison to the questions you passed in: after
inst.writeQuestions(q) and mock_opt.optimize.assert_called_once(), grab
call_kwargs = mock_opt.optimize.call_args and assert that the question was
passed exactly (either assert call_kwargs.kwargs.get('question') == q or assert
call_kwargs.args and call_kwargs.args[0] == q), and if writeQuestions can pass a
list/tuple ensure you compare to the correct element/structure (e.g., q[0] or q)
so the test verifies the actual extracted values rather than just the presence
of the key.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7bbc2114-9f39-4749-abe1-660f0d9e06c9
📒 Files selected for processing (4)
- nodes/src/nodes/context_optimizer/IGlobal.py
- nodes/src/nodes/context_optimizer/IInstance.py
- nodes/src/nodes/context_optimizer/optimizer.py
- test/nodes/test_context_optimizer.py
All review feedback addressed in latest commits

@asclearuc review items — all fixed:
CodeRabbit nitpicks — also addressed:
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nodes/src/nodes/context_optimizer/IInstance.py`:
- Around line 86-92: The current block updates question.questions[0].text but
silently discards result['question'] when question.questions is empty; add a
guard that checks if not question.questions and result['question'] and then emit
a debug log (including the optimized text and context) to surface this edge
case, and optionally append a new Question object with text=result['question']
(or preserve embedding info if available) to question.questions so the optimized
content isn’t lost; reference the question.questions list and result['question']
in your changes and use the module's existing logger to record the event.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b77db817-bf83-4e4f-8c16-177629dd3afd
📒 Files selected for processing (4)
- nodes/src/nodes/context_optimizer/IInstance.py
- nodes/src/nodes/context_optimizer/optimizer.py
- nodes/src/nodes/context_optimizer/requirements.txt
- test/nodes/test_context_optimizer.py
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nodes/src/nodes/context_optimizer/IInstance.py`:
- Around line 63-68: The current conversion in IInstance.py discards fields like
score by only appending {'content': content, '_original': doc}; change the
transformation so you preserve all dumped fields from doc_dict (including
'score') and add/override 'content' and '_original' instead of replacing the
entire dict; update the loop that builds docs (used by rank_documents()) to
merge doc_dict with {'content': content, '_original': doc} so score-first
ordering works correctly.
- Around line 84-103: The code treats empty optimizer outputs as "no update" by
using truthiness checks (e.g., if result['question'], if result['history'], if
question.questions) so empty strings/lists are ignored; change those checks to
explicit presence checks and assign/clear fields even when the optimizer returns
empty values: use "if 'question' in result:" to update
question.questions[0].text = result['question'] (and truncate extras) or set
question.questions = [] when result['question'] is an empty string and no
embeddings should remain; use "if 'documents' in result:" to set
question.documents = [d['_original'] for d in result['documents'] if '_original'
in d] (allowing empty list to clear documents); and use "if 'history' in
result:" to set question.history = [QuestionHistory(... ) for m in
result['history']] (allowing empty history to clear it); also replace the role
assignment to use result.get('system_prompt', '') or assign explicitly when
'system_prompt' in result so an empty string overwrites the previous role.
In `@test/nodes/test_context_optimizer.py`:
- Around line 32-140: The test's _install_stubs() and global _STUBS_INSTALLED
permanently mutate sys.modules (adding 'rocketlib', 'ai.common.schema', etc.) at
import time which leaks fake modules into other tests; change this to apply the
stubs only within a context/fixture (use unittest.mock.patch.dict(sys.modules,
{...}) or a pytest fixture that patches sys.modules) and import the modules
under test inside that scoped patch so the originals are restored after the
test; update where _install_stubs() is called (and the duplicate stubbing at the
other occurrence around the 177-178 area) to use the scoped patch/fixture
instead of setting sys.modules globally.
- Around line 147-175: The mock tiktoken setup is currently inside the except
block so it only runs if find_spec raises—move the mock creation to run whenever
_TIKTOKEN_AVAILABLE is False: keep the try/except that sets _TIKTOKEN_AVAILABLE
(using importlib.util.find_spec('tiktoken')), but remove the mock from the
except and add an if not _TIKTOKEN_AVAILABLE: block that defines tiktoken_mod,
_MockEncoding, _get_encoding and sets sys.modules['tiktoken']; reference
_TIKTOKEN_AVAILABLE, tiktoken_mod, _MockEncoding and _get_encoding so the
optimizer import that follows will see the mocked module when tiktoken is
absent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 688affea-3221-4a2e-90c6-8528165e9c45
📒 Files selected for processing (2)
- nodes/src/nodes/context_optimizer/IInstance.py
- test/nodes/test_context_optimizer.py
@nihalnihalani, please address CodeRabbit’s concerns as well.
Please address all CodeRabbit comments, not only for this PR but also for others.
All CodeRabbit review comments have been addressed across prior commits — verified with 81/81 tests passing, ruff lint clean, and ruff format clean. I've gone through this PR (and the others) to make sure no CodeRabbit feedback was left unaddressed. Will continue to check back after pushes for any new CodeRabbit comments that may appear, as you suggested. Thanks for the process guidance — applying this as a standard practice going forward.
All CodeRabbit review comments on this PR have been addressed in commit 270498d: Fixes applied:
All 81 tests pass, ruff lint and format checks are clean. I will continue to monitor the PR for any new CodeRabbit comments that may appear after this push.
Add a new pipeline node that manages LLM context window budgets for context engineering. The optimizer counts tokens using tiktoken, allocates budgets across system prompt, query, documents, and conversation history using configurable priority-based percentages, and truncates/summarizes content to fit within model-specific token limits.

Key capabilities:
- Token counting with tiktoken (handles unicode/emoji)
- Priority-based budget allocation (system > query > docs > history)
- Sentence-boundary-aware truncation
- Conversation history summarization (keeps first + last N, summarizes middle)
- Keyword-overlap document ranking with greedy budget selection
- Built-in limits for GPT-5, Claude, and Gemini model families
- Auto and manual configuration profiles

Includes 68 unit tests covering token counting, budget allocation, truncation, history summarization, document ranking, full pipeline, model limits, edge cases, and IGlobal/IInstance lifecycle.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
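The priority-based budget allocation this commit describes can be sketched roughly as follows. The percentage values and function name here are illustrative assumptions, not the node's actual configuration defaults:

```python
# Illustrative split; the real node reads these percentages from its config
BUDGET_PCT = {"system": 10, "query": 15, "documents": 50, "history": 25}

def allocate_budgets(max_context_tokens: int) -> dict:
    """Split the total context window into per-component token budgets.

    Integer division means the allocations can sum to slightly less than
    the total, which errs on the safe side of the model limit.
    """
    return {name: (max_context_tokens * pct) // 100
            for name, pct in BUDGET_PCT.items()}

print(allocate_budgets(8000))
# {'system': 800, 'query': 1200, 'documents': 4000, 'history': 2000}
```

A downstream truncation step would then shrink each component to its allocated budget, in priority order.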
- Replace for-loop history building with list comprehension in IInstance.py
- Use list unpacking [first_msg, summary_placeholder, *recent] in optimizer.py
- Update "auto" profile title to reflect configured default, not auto-detection
- Use tuple form for endswith checks in test_context_optimizer.py

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
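The `[first_msg, summary_placeholder, *recent]` unpacking above belongs to the keep-first-plus-recent history strategy. A minimal sketch, assuming a caller-supplied token counter and a hypothetical placeholder format (neither is taken from the actual node):

```python
def summarize_history(messages: list, budget: int, msg_tokens) -> list:
    """Keep the first message plus as many recent messages as fit the budget;
    a placeholder message stands in for the summarized middle."""
    if len(messages) <= 2:
        return list(messages)
    first_msg = messages[0]
    budget_for_recent = budget - msg_tokens(first_msg)
    recent, recent_cost = [], 0
    for msg in reversed(messages[1:]):      # walk backwards from the newest
        cost = msg_tokens(msg)
        if recent_cost + cost <= budget_for_recent:
            recent.append(msg)              # append now, reverse once later
            recent_cost += cost
        else:
            break
    recent.reverse()                        # restore chronological order
    dropped = len(messages) - 1 - len(recent)
    if dropped == 0:
        return [first_msg, *recent]
    summary_placeholder = {"role": "system",
                           "content": f"[{dropped} earlier messages summarized]"}
    return [first_msg, summary_placeholder, *recent]
```

The append-then-reverse loop is the O(n) form the review asked for, instead of repeated `insert(0, ...)`.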
- Move tiktoken import from module level to lazy import inside methods to prevent ImportError before depends() installs the package
- Use os.path.join instead of string concatenation for path construction
- Add warning() when optimizer is not initialized and question passes through
- Validate non-numeric config values with try/except, clamp percentages to [0, 100] and max_context_tokens to >= 0
- Warn when budget percentages sum to less than 90%
- Implement two-pass optimization: pass 1 returns content unchanged when everything fits within the total limit, pass 2 applies per-component budgets only when the total exceeds the limit
- Preserve vector DB document ordering by using doc.score as primary sort key when available; keyword overlap used only as tiebreaker when scores are absent
- Add tests for input validation, two-pass optimization, and score-preserving document ranking

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
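The two-pass behavior in this commit can be sketched as follows (a hypothetical shape; whitespace splitting stands in for real tiktoken counting, and the function names are illustrative):

```python
def word_truncate(text: str, budget: int) -> str:
    """Stand-in truncation: keep only the first `budget` words."""
    return " ".join(text.split()[:budget])

def optimize(components: dict, max_total: int, budgets: dict,
             truncate=word_truncate) -> dict:
    """Pass 1: if the combined content already fits, return it unchanged.
    Pass 2: otherwise apply per-component budgets."""
    total = sum(len(text.split()) for text in components.values())
    if total <= max_total:
        return dict(components)                      # pass 1: no-op
    return {name: truncate(text, budgets.get(name, 0))
            for name, text in components.items()}    # pass 2
```

The point of the first pass is that content which already fits is never mutated, so lossless inputs stay byte-identical through the node.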
- Relax tiktoken version restriction (remove upper bound) per reviewer question
- Move QuestionHistory import to module level for readability
- Add ClassVar annotation to MODEL_LIMITS to suppress RUF012
- Document whitespace normalization in sentence-boundary truncation
- Use append+reverse instead of O(n^2) insert(0) in history summarization
- Extract shared _message_tokens helper for consistent token overhead calculation
- Accept Optional[str] in count_tokens for explicit None handling
- Strengthen test assertion to verify actual extracted question/system_prompt values

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…pty questions list

When question.questions is empty but the optimizer produces a non-empty result, the optimized question text was silently discarded. Add a debug log to surface this edge case for troubleshooting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ptimizer

- Preserve all document fields (including score) during optimizer conversion so rank_documents() score-first ordering works correctly
- Treat empty optimizer outputs as valid truncation results instead of skipping updates (empty string/list now clears the field)
- Fix tiktoken mock: move setup outside except block since find_spec returns None without raising when the package is absent
- Scope test stubs via patch.dict + teardown_module so fake modules do not leak into subsequent test files in the pytest session

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
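The score-first ranking with keyword fallback described in this commit could look roughly like the sketch below. The `content` and `score` field names follow the commit notes; the exact tiebreaking rule is an assumption:

```python
def rank_documents(docs: list, query: str) -> list:
    """Order documents by vector-DB score when present; otherwise fall back
    to keyword overlap with the query, with original index as tiebreaker."""
    query_words = set(query.lower().split())

    def overlap(doc: dict) -> int:
        # Number of query words appearing in the document content
        return len(query_words & set(doc.get("content", "").lower().split()))

    indexed = list(enumerate(docs))
    if any("score" in d for d in docs):
        # Preserve upstream relevance ordering: higher score first
        indexed.sort(key=lambda p: (-p[1].get("score", 0.0), p[0]))
    else:
        # Lightweight fallback when no scores are present
        indexed.sort(key=lambda p: (-overlap(p[1]), p[0]))
    return [doc for _, doc in indexed]
```

Sorting `(score, original_index)` tuples keeps the fallback from completely discarding upstream ordering, which matches the concern raised earlier in this thread.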
270498d to 82f225a
logger = logging.getLogger(__name__)
Please replace stdlib logging with rocketlib's warning() / debug() throughout this file — the same functions already imported in IInstance.py.

```python
# remove
import logging
logger = logging.getLogger(__name__)

# replace all logger.warning(...) calls with
from rocketlib import debug, warning
```

This keeps log output consistent with the rest of the node pipeline.
Already fixed in current head via d629106. optimizer.py now uses rocketlib.warning() instead of stdlib logging on those warning paths, which keeps this node aligned with the rest of the pipeline logging style. I also reran python3 -m pytest -q test/nodes/test_context_optimizer.py and python3 -m ruff check nodes/src/nodes/context_optimizer/optimizer.py nodes/src/nodes/context_optimizer/IInstance.py nodes/src/nodes/context_optimizer/IGlobal.py test/nodes/test_context_optimizer.py; both are clean on the current branch.
"protocol": "context_optimizer://",
//
// Required:
// Class type of the node - what it does
Please mark this node as experimental in capabilities:

```json
"capabilities": ["experimental"],
```

Rationale: the node silently drops documents and truncates content, the MODEL_LIMITS map doesn't cover most real model IDs (so the 128K fallback fires silently), and tiktoken's cl100k_base encoding is OpenAI-specific and will produce inaccurate token counts for Claude/Gemini models. The experimental flag signals to users that this node requires careful validation before use in production pipelines — see agent_rocketride/services.json for precedent.
Already fixed in current head via d629106. services.json now marks the node as "experimental" in capabilities, so the service metadata reflects that this node still needs careful validation before production use.
…etride-org#527)

Clarify "auto" profile title in services.json to say "Use default model limits" since pipeline detection is not implemented — the implementation uses the configured model_name default to look up limits.

All other review feedback items (module-level import, _message_tokens helper, Optional[str] signature, list comprehension, ClassVar annotation, endswith tuple form) were addressed in prior commits.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Review Feedback Addressed

Fixes
Ready for re-review.
No description provided.
🚀 Merge Request

All review feedback addressed:
Ready for re-review and merge. @asclearuc
@asclearuc — All review feedback addressed and pushed (7 fixes, 63/64 tests passing — 1 pre-existing unrelated failure). Ready for re-review. Thanks! 🙏 |
#Hack-with-bay-2
Contribution Type
Feature — Context window optimizer for LLM token budget management
Problem / Use Case
RocketRide pipelines have no way to manage what fits in an LLM's context window. Users must manually estimate token counts and prune content. When pipelines chain documents + history + system prompts, context overflows cause silent truncation or API errors.
"Context engineering" is the #1 AI trend in 2026 — optimizing what information goes into LLM context.
Proposed Solution
A `context_optimizer` node with:
Why This Feature Fits This Codebase
The node processes Question objects from upstream nodes and optimizes their content before forwarding to LLM nodes. It uses `Config.getNodeConfig()` and the standard IGlobal/IInstance pattern. Token counting via tiktoken complements the existing token-aware features in `rocketlib/types.py` (IInvokeLLM.getTokenCounter).
Validation
68 tests passing. ruff check + format clean.
How This Could Be Extended
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests
Chores