skills: improve validate-search-filters with aggregation-first approach #98
Merged
- Prefer aggregation count queries over random document sampling to avoid false confidence from lucky samples.
- Add explicit grep step to discover all QueryResources callers before trusting the reference table.
- Document FiltersAll mechanism alongside Filters in the mechanism table.
- Add reference rows for search_members, get_membership_key_contacts, and search_b2b_orgs tools.
- Clarify that Name and date fields do not need index field verification.
- Restructure Step 4 to lead with count queries and relegate sample queries to a secondary debugging-only role.

Assisted-by: github-copilot:claude-sonnet-4.6
Signed-off-by: Eric Searcy <eric@linuxfoundation.org>
Pull request overview
This PR updates the validate-search-filters skill documentation/workflow to improve how filter correctness is validated against the live OpenSearch resources index and upstream indexer-contract docs, shifting primary evidence collection toward count-based queries and clarifying tool/filter mapping guidance.
Changes:
- Reworked Step 4 to prefer `size: 0` hit-count queries (with sampling as a secondary debug step).
- Added a grep-first discovery step to find all `QueryResources` call sites before trusting the reference table.
- Expanded the mechanism and tool/filter reference tables (including `FiltersAll` and additional tools), and clarified that `Name`/date operations don't require index-field verification.
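To make the count-first step concrete, here is a minimal sketch of what a count-only request body could look like. The helper `count_only_query` and the field name `data.project_uid` are hypothetical and not taken from the skill; the only parts assumed from OpenSearch itself are `size: 0` and the standard `exists` query.

```python
import json


def count_only_query(field: str) -> dict:
    """Build a search body that returns no documents, only a hit count
    for documents in which `field` is present."""
    return {
        "size": 0,  # suppress document hits; only hits.total matters
        "query": {"exists": {"field": field}},
    }


if __name__ == "__main__":
    # "data.project_uid" is an illustrative field name, not one from the PR.
    print(json.dumps(count_only_query("data.project_uid"), indent=2))
```

A zero count for a field is strong evidence that a filter on it can never match, which a handful of sampled documents cannot establish.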
- Rename 'aggregation count query' to 'count-only query' to accurately reflect the use of size:0 without an aggregation pipeline.
- Add explicit note that track_total_hits is intentionally omitted; approximate counts are sufficient evidence for field presence.
- Fix grep command to use -E flag for portable extended regex syntax.

Assisted-by: github-copilot:claude-sonnet-4.6
Signed-off-by: Eric Searcy <eric@linuxfoundation.org>
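The caller-discovery step can be approximated in Python, roughly in the spirit of a recursive `grep -nE`. This is only a sketch: the default pattern `QueryResources\(`, the function name `find_callers`, and the directory layout are assumptions, and the skill's actual grep command is not reproduced here.

```python
import re
from pathlib import Path


def find_callers(root: str, pattern: str = r"QueryResources\(") -> list[tuple[str, int, str]]:
    """Return (path, line number, stripped line) for every line under
    `root` that matches `pattern`. The pattern is an illustrative guess."""
    regex = re.compile(pattern)
    matches = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                matches.append((str(path), lineno, line.strip()))
    return matches
```

Comparing the returned call sites against the rows of the reference table would reveal any tool the table has not caught up with.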
…filters

- Fix total.value -> hits.total.value (correct OpenSearch response path).
- Replace the "exactly 10,000" heuristic with hits.total.relation: 'eq' means exact, 'gte' means lower bound.
- Fix remaining 'aggregation count queries' -> 'count-only queries' in Step 8.

Assisted-by: github-copilot:claude-sonnet-4.6
Signed-off-by: Eric Searcy <eric@linuxfoundation.org>
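A short sketch of how the `hits.total.relation` rule might be applied when reading a count-only response. The helper name `describe_total` and the sample response are illustrative assumptions; the `hits.total.value` and `hits.total.relation` paths are the ones named in the commit above.

```python
def describe_total(response: dict) -> str:
    """Summarize hits.total from a search response.

    relation == "eq"  -> value is the exact match count
    relation == "gte" -> value is only a lower bound
    """
    total = response["hits"]["total"]
    value, relation = total["value"], total["relation"]
    if relation == "eq":
        return f"exactly {value} matching documents"
    if relation == "gte":
        return f"at least {value} matching documents (lower bound)"
    raise ValueError(f"unexpected hits.total.relation: {relation!r}")


if __name__ == "__main__":
    # Hand-written sample response, not output from a real cluster.
    sample = {"hits": {"total": {"value": 10000, "relation": "gte"}, "hits": []}}
    print(describe_total(sample))
```

Checking `relation` rather than comparing `value` against 10,000 avoids misreading a result that happens to contain exactly 10,000 matches.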
Summary
Improves the `validate-search-filters` skill with several workflow and documentation enhancements:

- Prefer count-only queries (`"size": 0`) as the primary evidence step. A handful of random docs can give false confidence; counts show whether a field is populated across the corpus. Sample queries are now a secondary debugging tool only.
- Add an explicit grep step to discover all `QueryResources` callers before trusting the reference table, since the table may lag new tool additions.
- Document the `FiltersAll` payload field alongside `Filters` in the mechanism table.
- Add reference rows for `search_members`, `get_membership_key_contacts`, and `search_b2b_orgs` in the tool/filter reference table.
- Clarify that `Name` and date fields are query-time operations and do not need index field verification.

🤖 Generated with GitHub Copilot (via OpenCode)