docs_summarizer.iterate_with_tools method refactoring by blublinsky · Pull Request #2803 · openshift/lightspeed-service

blublinsky · 2026-03-09T15:26:38Z

Description

Refactoring of iterate_with_tools in docs_summarizer.py to prepare for tools approval support. This splits the existing monolithic method into several smaller, focused helpers while preserving all observable behavior.

Changes

_collect_round_llm_chunks – collects one round of LLM streaming chunks
_resolve_tool_call_definitions – maps LLM tool calls to executable StructuredTool instances with duplicate-name detection
_tool_result_chunk_for_message – converts a ToolMessage into a streamed tool_result chunk with metadata enrichment
_process_tool_calls_for_round – orchestrates tool execution, approval event forwarding, and result streaming for one round
_enrich_with_tool_metadata – static helper that adds server_name and tool_meta from MCP metadata to tool_call/tool_result events

Other changes in this PR

Introduce ChunkType enum and update StreamedChunk and streaming endpoint to use it

Type of change

Related Tickets & Documents

Related Issue #
Closes #

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

Please provide detailed steps to perform tests related to this code change.
How were the fix/results from this change verified? Please provide relevant screenshots or results.

openshift-ci · 2026-03-09T15:29:13Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign blublinsky for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

onmete · 2026-03-11T08:20:44Z

contains the same regression as per this comment

/hold

blublinsky · 2026-03-11T09:02:53Z

contains the same regression as per this comment
What exactly are you talking about?
Can you be more specific?

onmete · 2026-03-11T11:57:06Z

@blublinsky sorry, forgot to pass the link: #2783 (comment)

Upon further inspection, the enrichment regression point is partially invalid - my apologies. The PR replaces _enrich_tool_call and _build_tool_result_chunks with _enrich_with_tool_metadata and _tool_result_chunk_for_message. The core enrichment logic (server_name, tool_meta) seems to be preserved. It wasn't clear from the diff because both the old helpers and all their associated tests were removed at the same time. That said, the has_meta field is dropped from the tool result logging - minor, but worth noting.

Timeout - the old code wraps the entire iterate_with_tools body in timeout. The PR moves the timeout to _collect_round_llm_chunks only, leaving execute_tool_calls unprotected.

Deleted test coverage for metadata behavior.

I'd still suggest splitting this up for exactly the reasons above. A refactoring PR shouldn't change end behavior - it should be trivial to confirm that inputs and outputs stay the same. Mixing in timeout changes, new duplicate-tool detection, MIN_TOOL_EXECUTION_TOKENS budget skipping, the ChunkType enum, and the eager build_mcp_config in init makes that confirmation much harder than it needs to be.

blublinsky · 2026-03-11T14:46:19Z

@blublinsky sorry, forgot to pass the link: #2783 (comment)

Upon further inspection, the enrichment regression point is partially invalid - my apologies. The PR replaces _enrich_tool_call and _build_tool_result_chunks with _enrich_with_tool_metadata and _tool_result_chunk_for_message. The core enrichment logic (server_name, tool_meta) seems to be preserved. It wasn't clear from the diff because both the old helpers and all their associated tests were removed at the same time. That said, the has_meta field is dropped from the tool result logging - minor, but worth noting.

Timeout - the old code wraps the entire iterate_with_tools body in timeout. The PR moves the timeout to _collect_round_llm_chunks only, leaving execute_tool_calls unprotected.

Deleted test coverage for metadata behavior.

Fixed

onmete · 2026-03-11T15:21:51Z

Can you try to ask Cursor "split these changes so it will be easier to review"? And let it raise different PRs - we have a skill for that in this repo.
That would be really helpful and we would be able to bring these changes in faster.

blublinsky · 2026-03-11T15:48:26Z

Can you try to ask Cursor "split these changes so it will be easier to review"? And let it raise different PRs - we have a skill for that in this repo. That would be really helpful and we would be able to bring these changes in faster.

Sorry. I have wasted soooooo much time keeping everything in sync. Can we please finally merge this - after more than 2 weeks and move on
I was trying to be a good citizen, but there is a limit.
I reviewed much larger PRs recently

onmete · 2026-03-11T15:52:00Z

Then you are a better reviewer than I am. I can't confidentelly ack these changes.

blublinsky · 2026-03-13T13:11:01Z

/retest

blublinsky · 2026-03-13T16:18:00Z

/retest

blublinsky · 2026-03-13T16:19:10Z

/test e2e-ols-cluster

openshift-ci · 2026-03-13T18:11:27Z

@blublinsky: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

blublinsky · 2026-03-13T18:40:32Z

/retest

xrajesh · 2026-03-17T21:36:50Z

ols/src/query_helpers/docs_summarizer.py

+                # Phase 2: resolve and execute tool calls for this round.
+                tool_token_usage = ToolTokenUsage(used=tool_tokens_used)
+                try:
+                    async with asyncio.timeout(constants.TOOL_CALL_ROUND_TIMEOUT):


@blublinsky - tool_call_round_timeout can run twice as long now (worst case), because invoke_llm also using it . How do you plan to manage this ?

openshift-ci bot requested review from bparees and joshuawilson March 9, 2026 15:29

openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 11, 2026

blublinsky force-pushed the doc_summarizer-refactoring branch 2 times, most recently from 180a7ec to 670cb90 Compare March 11, 2026 14:43

docs_summarizer.iterate_with_tools method refactoring

7f98fab

blublinsky force-pushed the doc_summarizer-refactoring branch from 670cb90 to 7f98fab Compare March 13, 2026 13:01

xrajesh reviewed Mar 17, 2026

View reviewed changes

Conversation

blublinsky commented Mar 9, 2026

Description

Changes

Other changes in this PR

Type of change

Related Tickets & Documents

Checklist before requesting a review

Testing

Uh oh!

openshift-ci bot commented Mar 9, 2026

Uh oh!

onmete commented Mar 11, 2026

Uh oh!

blublinsky commented Mar 11, 2026

Uh oh!

onmete commented Mar 11, 2026

Uh oh!

blublinsky commented Mar 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

onmete commented Mar 11, 2026

Uh oh!

blublinsky commented Mar 11, 2026

Uh oh!

onmete commented Mar 11, 2026

Uh oh!

blublinsky commented Mar 13, 2026

Uh oh!

blublinsky commented Mar 13, 2026

Uh oh!

blublinsky commented Mar 13, 2026

Uh oh!

openshift-ci bot commented Mar 13, 2026

Uh oh!

blublinsky commented Mar 13, 2026

Uh oh!

xrajesh Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

blublinsky commented Mar 11, 2026 •

edited

Loading