Expand component search indexing fields #2423
Open
Mbeaulne wants to merge 1 commit into
This was referenced Jun 18, 2026

Description
Expands the component search index to include richer input/output details and a new `metadata` match field.

Previously, the `io` searchable field only contained input and output names. It now includes descriptions, types, and annotations for each input and output spec. A new `metadata` field has been added that indexes component-level metadata annotations (with a blocklist for noisy keys like `python_original_code`, editor state, and similar large/irrelevant blobs) as well as the source label and `published_by` value from the component reference.

The `MatchField` type and all related scoring, labeling, and UI display logic have been updated to include `metadata` alongside the existing fields. Annotation values longer than 500 characters are excluded from indexing to avoid polluting search with large blobs.

Related Issue and Pull requests
Type of Change
Checklist
Screenshots (if applicable)
Test Instructions
… `sklearn` or `lightgbm`).
… `metadata` listed as a matched field.
… `python_original_code` or other excluded annotation keys and confirm it does not return results.
… `parquet`, `artifact`) and confirm results appear with `io` as the matched field.

Additional Comments
The annotation exclusion list (`ANNOTATION_KEYS_EXCLUDED_FROM_SEARCH`) and the 500-character value length cap are the primary mechanisms for keeping the metadata index clean. These can be extended as new noisy annotation keys are identified.
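
For reviewers who want a concrete picture of the two mechanisms, here is a minimal sketch of how the blocklist and the length cap could combine when building the `metadata` text. It is not the code in this PR: only `ANNOTATION_KEYS_EXCLUDED_FROM_SEARCH`, `python_original_code`, and the 500-character cap come from the description above. The function names, the `Annotations` type, the `editor.position` key, the choice to index only string values, and the lowercasing are all assumptions made for illustration.

```typescript
// Hypothetical shape; the real annotation type in the codebase may differ.
type Annotations = Record<string, unknown>;

// Constant name is from the PR description; the entries are illustrative only.
const ANNOTATION_KEYS_EXCLUDED_FROM_SEARCH: ReadonlySet<string> = new Set([
  "python_original_code",
  "editor.position", // assumed example of an editor-state key
]);

// Cap from the PR description: longer values are skipped, not truncated.
const MAX_ANNOTATION_VALUE_LENGTH = 500;

// Returns the annotation values that are allowed into the search index.
function collectSearchableAnnotations(
  annotations: Annotations | undefined,
): string[] {
  if (!annotations) return [];
  const values: string[] = [];
  for (const [key, raw] of Object.entries(annotations)) {
    // Drop noisy keys outright.
    if (ANNOTATION_KEYS_EXCLUDED_FROM_SEARCH.has(key)) continue;
    // Assumption: only string values are indexed in this sketch.
    if (typeof raw !== "string") continue;
    // Drop empty values and large blobs.
    if (raw.length === 0 || raw.length > MAX_ANNOTATION_VALUE_LENGTH) continue;
    values.push(raw);
  }
  return values;
}

// Joins the surviving annotation values with the source label and the
// published_by value into one lowercase string for matching.
function buildMetadataText(
  annotations: Annotations | undefined,
  sourceLabel?: string,
  publishedBy?: string,
): string {
  return [...collectSearchableAnnotations(annotations), sourceLabel, publishedBy]
    .filter((part): part is string => typeof part === "string" && part.length > 0)
    .join(" ")
    .toLowerCase();
}
```

In this sketch an over-long value is dropped entirely rather than truncated, which matches the "excluded from indexing" wording above. Extending the blocklist is a one-line change to the set.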