fix: handle empty gremlin examples when building index #323
Gardenia-zx wants to merge 1 commit into apache:main
Pull request overview
This PR fixes an IndexError in BuildGremlinExampleIndex.run() by safely handling the case where no Gremlin examples (or no embeddings) are produced, and updates unit tests to assert that embedding/vector-index operations are skipped in those cases.
Changes:
- Add an early return in `BuildGremlinExampleIndex.run()` when `examples` is empty (sets `context["embed_dim"] = 0`).
- Add an early return when the generated embeddings list is empty (avoids `examples_embedding[0]` access).
- Update unit tests to validate “no-op” behavior for empty examples and adjust mocking to prevent coroutine warnings.
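The two guards listed above can be sketched roughly as follows. This is a simplified model rather than the project's actual operator: `FakeEmbedding`, `get_texts_embeddings`, and `BuildIndexSketch` are hypothetical names introduced only for illustration, and the real class also writes to a vector index.

```python
from typing import Any, Dict, List


class FakeEmbedding:
    """Hypothetical stand-in for the real embedding client."""

    def __init__(self) -> None:
        self.calls = 0

    def get_texts_embeddings(self, texts: List[str]) -> List[List[float]]:
        # Count calls so we can check that the empty case never embeds.
        self.calls += 1
        return [[float(len(t)), 1.0, 0.0] for t in texts]


class BuildIndexSketch:
    """Simplified model of the guarded run(); not the project's class."""

    def __init__(self, embedding: FakeEmbedding, examples: List[Dict[str, str]]) -> None:
        self.embedding = embedding
        self.examples = examples

    def run(self, context: Dict[str, Any]) -> Dict[str, Any]:
        # Guard 1: no examples, so skip embedding and indexing entirely.
        if not self.examples:
            context["embed_dim"] = 0
            return context
        queries = [e["query"] for e in self.examples]
        examples_embedding = self.embedding.get_texts_embeddings(queries)
        # Guard 2: an empty result would make examples_embedding[0] raise IndexError.
        if not examples_embedding:
            context["embed_dim"] = 0
            return context
        context["embed_dim"] = len(examples_embedding[0])
        return context


empty_embedder = FakeEmbedding()
empty_result = BuildIndexSketch(empty_embedder, []).run({})

one_example = [{"query": "find all persons", "gremlin": "g.V().hasLabel('person')"}]
full_result = BuildIndexSketch(FakeEmbedding(), one_example).run({})
```

In this sketch the empty case returns before the embedder is touched, which is the "no-op" behavior the updated tests are meant to verify.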
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `hugegraph-llm/src/hugegraph_llm/operators/index_op/build_gremlin_example_index.py` | Adds guards for empty examples / empty embeddings to avoid indexing into an empty list and to skip vector index operations. |
| `hugegraph-llm/src/tests/operators/index_op/test_build_gremlin_example_index.py` | Updates mocks and assertions to ensure no embedding/index calls happen when examples are empty, and verifies `embed_dim` is set to 0. |
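A test along the lines described in the table might look like the sketch below. It uses `unittest.mock.MagicMock` from the standard library; `build_index`, `get_texts_embeddings`, and `vector_index.add` are hypothetical stand-ins, not the names used in the actual test file.

```python
from typing import Any, Dict, List
from unittest.mock import MagicMock


def build_index(examples: List[Dict[str, str]], embedding: Any, vector_index: Any) -> Dict[str, Any]:
    """Hypothetical, simplified stand-in for the operator's run()."""
    context: Dict[str, Any] = {}
    if not examples:
        context["embed_dim"] = 0
        return context
    vectors = embedding.get_texts_embeddings([e["query"] for e in examples])
    if not vectors:
        context["embed_dim"] = 0
        return context
    vector_index.add(vectors, examples)
    context["embed_dim"] = len(vectors[0])
    return context


def check_empty_examples_is_noop() -> Dict[str, Any]:
    embedding, vector_index = MagicMock(), MagicMock()
    context = build_index([], embedding, vector_index)
    # Neither the embedder nor the index should be touched.
    embedding.get_texts_embeddings.assert_not_called()
    vector_index.add.assert_not_called()
    return context


def check_empty_embeddings_skips_index() -> Dict[str, Any]:
    embedding, vector_index = MagicMock(), MagicMock()
    embedding.get_texts_embeddings.return_value = []  # embedder yields nothing
    context = build_index([{"query": "q"}], embedding, vector_index)
    embedding.get_texts_embeddings.assert_called_once()
    vector_index.add.assert_not_called()
    return context


noop_context = check_empty_examples_is_noop()
skipped_context = check_empty_embeddings_skips_index()
```

Asserting on the mocks directly, rather than only on `embed_dim`, is what distinguishes a true no-op from a run that happens to produce a zero dimension.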
    def run(self, context: Dict[str, Any]) -> Dict[str, Any]:
        # !: We have assumed that self.example is not empty
        if not self.examples:
`BuildGremlinExampleIndexNode.node_init()` still rejects `[]` via `if not self.wk_input.examples` (`hugegraph-llm/src/hugegraph_llm/nodes/index_node/build_gremlin_example_index.py:37`), so the main `BuildExampleIndexFlow` never reaches this branch for the empty-list case. As written, the unit test passes, but the end-to-end workflow still returns `examples is required`. Could we align the node-level guard in the same PR or add flow-level coverage?
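The mismatch comes down to Python truthiness: `not []` is `True`, so a guard written as `if not examples` treats an empty list the same as a missing value. The sketch below contrasts that with one possible aligned guard. Both functions are hypothetical simplifications, and whether the node should accept `[]` at all is a design decision for the PR, not something this sketch settles.

```python
from typing import Dict, List, Optional

Example = Dict[str, str]


def node_init_current(examples: Optional[List[Example]]) -> str:
    """Models the guard described above: an empty list is rejected."""
    if not examples:
        return "examples is required"
    return "ok"


def node_init_aligned(examples: Optional[List[Example]]) -> str:
    """One possible alignment (an assumption): reject only a missing value."""
    if examples is None:
        return "examples is required"
    return "ok"


current_on_empty = node_init_current([])
aligned_on_empty = node_init_aligned([])
aligned_on_missing = node_init_aligned(None)
```

With the aligned variant, an empty list would pass the node and reach the operator-level early return that this PR adds.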
        # !: We have assumed that self.example is not empty
        if not self.examples:
            context["embed_dim"] = 0
            return context
This early return leaves the existing `gremlin_examples` index untouched. `GremlinExampleIndexQuery` reuses that index whenever `exist("gremlin_examples")` is true, so rebuilding with an empty example set will still serve stale examples from the previous run. If an empty list is meant to clear the few-shot examples, we should wipe the index before returning.
    - return context
    + if not self.examples:
    +     self.vector_index.clean(self.vector_index_name)
    +     context["embed_dim"] = 0
    +     return context
Purpose
This PR handles the empty examples case in `BuildGremlinExampleIndex`. Previously, when `examples` was empty, `run()` could access `examples_embedding[0]` and raise an `IndexError`.
Changes
`examples` is empty.
Tests