fix: fall back to catalog search when search_business_context returns empty (closes #61) #69
Merged
… empty (closes #61)

When a dataset exists in DataHub but has no docs, glossary terms, domains, or data products, all four business-context sub-searches return empty and the LLM was incorrectly telling the user the entity does not exist. _search_business_context_impl now detects the all-empty case and automatically runs a general catalog search, returning the results as `catalog_search` so the agent can confirm entity existence before drawing conclusions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…core

The assessor LLM could see catalog_search hits in a search_business_context result and score context as Fair when all governance searches (docs, glossary, domains, data products) were actually empty. Add an explicit rule to the assessment prompt clarifying that catalog_search is a last-resort existence check only and must not raise the score.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

A rich dataset description from catalog_search is genuinely useful, so treating it as "empty" was wrong. The correct rule: no governed definition (glossary, docs, domain, data product) → score cannot exceed 3 (Fair), but within 1-3 the assessor should still use get_entities results to judge how informative the context was.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
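The scoring rule described in the commits above lives in the assessment prompt, not in code. Purely as an illustration, the same cap could be written as a small Python check. The function name `cap_context_score`, the key names in `GOVERNED_KEYS`, and the shape of the `context` dict are all assumptions made for this sketch and are not taken from the PR.

```python
# Illustrative sketch only: the PR enforces this rule via the assessor prompt.
# GOVERNED_KEYS and cap_context_score are hypothetical names.
GOVERNED_KEYS = ("docs", "glossary", "domains", "data_products")
FAIR = 3  # highest score allowed without a governed definition


def cap_context_score(raw_score: int, context: dict) -> int:
    """Clamp a 1-5 context score, capping it at Fair when nothing is governed."""
    # A governed definition means at least one governance search returned hits.
    has_governed = any(context.get(key) for key in GOVERNED_KEYS)
    score = max(1, min(5, raw_score))
    # catalog_search hits alone never lift the score above Fair,
    # but the assessor may still pick anything from 1 to 3.
    return score if has_governed else min(score, FAIR)
```

With only `catalog_search` hits in `context`, a raw score of 5 comes back as 3, while a raw score of 2 is left unchanged.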
Summary
- When a dataset exists in DataHub but has no docs, glossary terms, domains, or data products, all four `search_business_context` sub-searches return empty and the agent was incorrectly telling the user the entity doesn't exist.
- `_search_business_context_impl` now detects the all-empty case and automatically runs a general catalog search, returning results as `catalog_search` plus a `note` explaining the gap, so the agent sees the entity exists before drawing conclusions.
- The assessment prompt caps the score at 3 (Fair) when only `catalog_search` was found (no governed definition), while still allowing the assessor to score 1–2 if the entity description was unhelpful. Scores of 4–5 continue to require a governed definition (glossary, doc, domain, or data product).

Test plan

- `tests/unit/test_search_business_context.py`: all passing (271 total)
- `datahub docker quickstart` instance: `_search_business_context_impl("SampleHiveDataset")` returns `catalog_search` with 3 hits and a `note`, rather than four empty results

🤖 Generated with Claude Code
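For readers without the diff open, here is a minimal sketch of the fallback flow described in the summary. It is not the PR's implementation: `search_with_fallback`, the `GOVERNANCE_KEYS` names, the callable-based search interface, and the note wording are all assumptions chosen to keep the example self-contained.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for the four business-context sub-searches.
GOVERNANCE_KEYS = ("docs", "glossary", "domains", "data_products")

SearchFn = Callable[[str], List[dict]]


def search_with_fallback(
    query: str,
    governance_searches: Dict[str, SearchFn],
    catalog_search: SearchFn,
) -> dict:
    """Run the governance searches; fall back to a catalog search if all are empty."""
    result: dict = {name: fn(query) for name, fn in governance_searches.items()}

    # At least one governance search found something: no fallback needed.
    if any(result.values()):
        return result

    # All-empty case: run a general catalog search so the caller can tell
    # "entity exists but is undocumented" apart from "entity does not exist".
    result["catalog_search"] = catalog_search(query)
    result["note"] = (
        "No docs, glossary terms, domains, or data products matched; "
        "catalog_search is an existence check only."
    )
    return result
```

Passing four searches that return empty lists and a catalog search that returns hits yields a result with `catalog_search` and `note` keys. If any governance search returns hits, neither key is added.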