fix: handle empty gremlin examples when building index #323

Open
Gardenia-zx wants to merge 1 commit into apache:main from Gardenia-zx:fix-gremlin-example-empty-list
@Gardenia-zx

Purpose

This PR handles the empty examples case in BuildGremlinExampleIndex.

Previously, when examples was empty, run() could access examples_embedding[0] and raise an IndexError.

Changes

  • Return early when examples is empty.
  • Return early when generated embeddings are empty.
  • Update the unit test to verify that no embedding or vector index operation is triggered for empty examples.
  • Adjust the mock target to avoid coroutine warnings in tests.
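
The two early returns listed above can be sketched roughly as follows. This is a minimal stand-in rather than the project's code: `FakeEmbedding`, `FakeVectorIndex`, `build_example_index`, and their method names are hypothetical, and setting `embed_dim` to 0 in the second guard is an assumption (the PR only confirms it for the empty-examples case).

```python
from typing import Any, Dict, List


class FakeEmbedding:
    """Hypothetical embedding backend that counts how often it is called."""

    def __init__(self, dim: int = 3) -> None:
        self.dim = dim
        self.calls = 0

    def embed_all(self, texts: List[str]) -> List[List[float]]:
        self.calls += 1
        return [[float(len(text))] * self.dim for text in texts]


class FakeVectorIndex:
    """Hypothetical in-memory vector index."""

    def __init__(self) -> None:
        self.vectors: List[List[float]] = []
        self.payloads: List[Dict[str, str]] = []

    def add(self, vectors: List[List[float]], payloads: List[Dict[str, str]]) -> None:
        self.vectors.extend(vectors)
        self.payloads.extend(payloads)


def build_example_index(
    examples: List[Dict[str, str]],
    embedding: FakeEmbedding,
    index: FakeVectorIndex,
    context: Dict[str, Any],
) -> Dict[str, Any]:
    # Guard 1: nothing to embed, so skip the embedding call entirely.
    if not examples:
        context["embed_dim"] = 0
        return context

    vectors = embedding.embed_all([example["query"] for example in examples])

    # Guard 2: the backend returned nothing, so vectors[0] would raise IndexError.
    # Assumption: embed_dim is also reported as 0 here.
    if not vectors:
        context["embed_dim"] = 0
        return context

    context["embed_dim"] = len(vectors[0])
    index.add(vectors, examples)
    return context
```

With an empty list, the function returns before touching either collaborator, which is the behavior the updated unit test checks for.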

Tests

uv run pytest src/tests/operators/index_op/test_build_gremlin_example_index.py -v
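
A test in the spirit of the one described above might look like the sketch below. It does not reproduce the PR's test file: `run_with_guard` is a hypothetical stand-in for the operator's `run()`, and the mocks are plain `unittest.mock.MagicMock` objects rather than the project's real patch targets.

```python
from typing import Any, Callable, Dict, List
from unittest.mock import MagicMock


def run_with_guard(
    examples: List[Dict[str, str]],
    embed_fn: Callable[[List[Dict[str, str]]], List[List[float]]],
    vector_index: Any,
    context: Dict[str, Any],
) -> Dict[str, Any]:
    """Hypothetical stand-in for run(): skip all work when examples is empty."""
    if not examples:
        context["embed_dim"] = 0
        return context
    vectors = embed_fn(examples)
    vector_index.add(vectors, examples)
    context["embed_dim"] = len(vectors[0])
    return context


def test_empty_examples_is_a_no_op() -> None:
    embed_fn = MagicMock(name="embed_fn")
    vector_index = MagicMock(name="vector_index")

    result = run_with_guard([], embed_fn, vector_index, {})

    # The guard reports a zero dimension and never touches its collaborators.
    assert result["embed_dim"] == 0
    embed_fn.assert_not_called()
    vector_index.add.assert_not_called()
```

Asserting on the mocks, rather than only on the returned context, is what distinguishes a true no-op from a path that still performs an embedding call and discards the result.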

@dosubot (bot) added the size:S label ("This PR changes 10-29 lines, ignoring generated files.") on May 3, 2026
@github-actions (bot) added the llm label on May 3, 2026
@dosubot (bot) added the bug label ("Something isn't working") on May 3, 2026
@imbajin requested a review from Copilot on May 5, 2026 03:40
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes an IndexError in BuildGremlinExampleIndex.run() by safely handling the case where no Gremlin examples (or no embeddings) are produced, and updates unit tests to assert that embedding/vector-index operations are skipped in those cases.

Changes:

  • Add an early return in BuildGremlinExampleIndex.run() when examples is empty (sets context["embed_dim"]=0).
  • Add an early return when the generated embeddings list is empty (avoids examples_embedding[0] access).
  • Update unit tests to validate “no-op” behavior for empty examples and adjust mocking to prevent coroutine warnings.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File: hugegraph-llm/src/hugegraph_llm/operators/index_op/build_gremlin_example_index.py
Description: Adds guards for empty examples / empty embeddings to avoid indexing into an empty list and to skip vector index operations.

File: hugegraph-llm/src/tests/operators/index_op/test_build_gremlin_example_index.py
Description: Updates mocks and assertions to ensure no embedding/index calls happen when examples are empty, and verifies embed_dim is set to 0.

def run(self, context: Dict[str, Any]) -> Dict[str, Any]:
    # !: We have assumed that self.example is not empty
    if not self.examples:
Member

‼️ BuildGremlinExampleIndexNode.node_init() still rejects [] via if not self.wk_input.examples (hugegraph-llm/src/hugegraph_llm/nodes/index_node/build_gremlin_example_index.py:37), so the main BuildExampleIndexFlow never reaches this branch for the empty-list case. As written, the unit test passes, but the end-to-end workflow still returns "examples is required". Could we align the node-level guard in the same PR or add flow-level coverage?
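
To illustrate the mismatch described in this comment: a truthiness check treats a missing value and an explicitly empty list the same way, while an `is None` check separates them. The sketch below uses two hypothetical helper functions (not the node's actual code) that return an error message or `None`. Whether the node should accept `[]` is the open question in this review, not something the sketch decides.

```python
from typing import Dict, List, Optional


def check_examples_truthy(examples: Optional[List[Dict[str, str]]]) -> Optional[str]:
    """Mirrors an `if not examples` guard: rejects both None and []."""
    if not examples:
        return "examples is required"
    return None


def check_examples_is_none(examples: Optional[List[Dict[str, str]]]) -> Optional[str]:
    """Rejects only a missing value, so [] can reach the operator's own guard."""
    if examples is None:
        return "examples is required"
    return None


# An explicitly empty list is rejected by the first check but passes the second.
print(check_examples_truthy([]))
print(check_examples_is_none([]))
```

If the second style were adopted at the node level, the operator's new early return would become reachable from the flow, which is what flow-level coverage would then need to verify.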

    # !: We have assumed that self.example is not empty
    if not self.examples:
        context["embed_dim"] = 0
        return context
Member

‼️ Returning early here leaves any previously persisted gremlin_examples index untouched. GremlinExampleIndexQuery reuses that index whenever exist("gremlin_examples") is true, so rebuilding with an empty example set will still serve stale examples from the previous run. If an empty list is meant to clear the few-shot examples, we should wipe the index before returning.

Suggested change
return context
    if not self.examples:
        self.vector_index.clean(self.vector_index_name)
        context["embed_dim"] = 0
        return context
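
The stale-index concern can be reproduced with a toy store. In the sketch below, `InMemoryIndexStore` and `rebuild` are hypothetical stand-ins; the `exist` and `clean` method names are borrowed from this comment, but their real signatures in the project are not confirmed here.

```python
from typing import Dict, List


class InMemoryIndexStore:
    """Hypothetical persistent index store, keyed by index name."""

    def __init__(self) -> None:
        self._indexes: Dict[str, List[str]] = {}

    def exist(self, name: str) -> bool:
        return name in self._indexes

    def clean(self, name: str) -> None:
        self._indexes.pop(name, None)

    def save(self, name: str, examples: List[str]) -> None:
        self._indexes[name] = list(examples)

    def load(self, name: str) -> List[str]:
        return list(self._indexes.get(name, []))


def rebuild(store: InMemoryIndexStore, name: str, examples: List[str], clear_on_empty: bool) -> None:
    """Rebuild the named index; optionally wipe it when examples is empty."""
    if not examples:
        if clear_on_empty:
            store.clean(name)
        return
    store.save(name, examples)


store = InMemoryIndexStore()
rebuild(store, "gremlin_examples", ["g.V().limit(1)"], clear_on_empty=False)

# A plain early return leaves the previous run's examples in place.
rebuild(store, "gremlin_examples", [], clear_on_empty=False)
stale_still_served = store.exist("gremlin_examples")

# Cleaning before returning removes them.
rebuild(store, "gremlin_examples", [], clear_on_empty=True)
cleared = not store.exist("gremlin_examples")
```

Which behavior is right depends on whether an empty list means "nothing to do" or "remove all few-shot examples"; the suggested change assumes the latter.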

Labels

bug Something isn't working llm size:S This PR changes 10-29 lines, ignoring generated files.

3 participants