Add synonym expansion to component lexical search #2425

Open
Mbeaulne wants to merge 1 commit into 06-18-normalize_component_search_tokens_for_better_matching from 06-18-add_synonym_groups

@Mbeaulne Mbeaulne commented Jun 18, 2026

Collaborator

Description

Adds a synonym expansion system to the component lexical search so that queries using common aliases resolve to the intended components. For example, searching gcs now surfaces storage-related components, fit surfaces training components, infer surfaces prediction components, and df surfaces dataframe/table components.

A new componentSearchSynonyms.ts module defines synonym groups (e.g. gcs ↔ storage ↔ bucket, train ↔ fit, predict ↔ infer, df ↔ dataframe ↔ table) and exposes expandSynonymTokens, which fans out any recognized token into all members of its group.
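A minimal sketch of how such a module could look. The group contents and the name expandSynonymTokens come from the description above; the SYNONYM_GROUPS and GROUP_BY_TOKEN names, the data layout, and the function body are assumptions for illustration, not the code in this PR.

```typescript
// Group contents are taken from the PR description; the layout is assumed.
const SYNONYM_GROUPS: readonly (readonly string[])[] = [
  ["gcs", "storage", "bucket"],
  ["train", "fit"],
  ["predict", "infer"],
  ["df", "dataframe", "table"],
];

// Hypothetical lookup table: maps every token to the group it belongs to.
const GROUP_BY_TOKEN = new Map<string, readonly string[]>();
for (const group of SYNONYM_GROUPS) {
  for (const token of group) {
    GROUP_BY_TOKEN.set(token, group);
  }
}

// Fans out each recognized token into every member of its group.
// Unrecognized tokens pass through unchanged, and duplicates are
// dropped while preserving first-seen order.
function expandSynonymTokens(tokens: readonly string[]): string[] {
  const expanded = new Set<string>();
  for (const token of tokens) {
    expanded.add(token);
    const group = GROUP_BY_TOKEN.get(token) ?? [];
    for (const synonym of group) {
      expanded.add(synonym);
    }
  }
  return Array.from(expanded);
}

// Usage: "gcs" fans out to its whole group, "reader" is kept as-is.
const expandedQuery = expandSynonymTokens(["gcs", "reader"]);
// expandedQuery is ["gcs", "storage", "bucket", "reader"]
```

Under this sketch, expansion is symmetric: a query for fit yields train and a query for train yields fit, because both tokens point at the same group.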

The search pipeline was also refactored to separate base tokenization (baseSearchTokens) from the full normalized text used for document indexing. Synonym expansion is applied to query tokens before scoring, and the phrase-match bonus now uses the original (pre-expansion) token sequence so multi-word phrase matching remains accurate.
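The split between expanded tokens and the original token sequence might be sketched roughly as follows. Only the names baseSearchTokens and expandSynonymTokens appear in this PR; their bodies here, the scoreDocument function, and both weights are hypothetical stand-ins, and the real scoring in componentSearchIndex.ts may differ.

```typescript
// Assumed tokenizer: lowercase, then split on runs of non-alphanumerics.
function baseSearchTokens(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// Stand-in for the synonym module, reduced to one group for brevity.
function expandSynonymTokens(tokens: readonly string[]): string[] {
  const group = ["train", "fit"];
  const expanded = new Set<string>();
  for (const token of tokens) {
    expanded.add(token);
    if (group.indexOf(token) !== -1) {
      group.forEach((synonym) => expanded.add(synonym));
    }
  }
  return Array.from(expanded);
}

// Hypothetical weights, chosen only for illustration.
const TOKEN_WEIGHT = 1;
const PHRASE_BONUS = 5;

function scoreDocument(query: string, documentName: string): number {
  const originalTokens = baseSearchTokens(query);
  const expandedTokens = expandSynonymTokens(originalTokens);
  const docTokens = baseSearchTokens(documentName);
  const docTokenSet = new Set(docTokens);

  // Token matching uses the expanded query so aliases can hit.
  let score = 0;
  for (const token of expandedTokens) {
    if (docTokenSet.has(token)) score += TOKEN_WEIGHT;
  }

  // The phrase bonus uses the original (pre-expansion) sequence, so
  // injected synonyms cannot create or break a multi-word phrase match.
  const phrase = " " + originalTokens.join(" ") + " ";
  const docText = " " + docTokens.join(" ") + " ";
  if (originalTokens.length > 1 && docText.indexOf(phrase) !== -1) {
    score += PHRASE_BONUS;
  }
  return score;
}

const exactMatch = scoreDocument("train test split", "Train Test Split"); // 8
const partialMatch = scoreDocument("train test split", "Split Train Data"); // 2
const aliasMatch = scoreDocument("fit", "Train Model"); // 1
```

If the phrase check ran on the expanded tokens instead, the query train test split would become train fit test split and no longer match the contiguous name, which is the failure the original-sequence rule avoids.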

Related Issue and Pull requests

Type of Change

  • Bug fix
  • New feature
  • Improvement
  • Cleanup/Refactor
  • Breaking change
  • Documentation update

Checklist

  • I have tested this does not break current pipelines / runs functionality
  • I have tested the changes on staging

Screenshots (if applicable)

Test Instructions

  1. Open the component search panel.
  2. Search for gcs and confirm storage/bucket components appear at the top.
  3. Search for fit and confirm model training components appear.
  4. Search for infer and confirm prediction components appear.
  5. Search for df and confirm dataframe/table components appear.
  6. Verify that multi-word phrase matching (e.g. train test split) still correctly ranks exact name matches above partial matches.

Additional Comments

The synonym groups are intentionally domain-neutral and kept in a single flat list in componentSearchSynonyms.ts so that additional aliases are easy to add later. The current list is not exhaustive.
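One thing to watch when extending a flat list of groups is a token that ends up in two groups, since a token-to-group lookup would then keep only one of them. The check below is a hypothetical helper, not part of this PR, and the s3 group is an invented example of a future addition.

```typescript
// Hypothetical helper: returns every token that appears in more than one group.
function findDuplicateSynonyms(
  groups: readonly (readonly string[])[],
): string[] {
  const groupIndexByToken = new Map<string, number>();
  const duplicates = new Set<string>();
  groups.forEach((group, index) => {
    for (const token of group) {
      const previousIndex = groupIndexByToken.get(token);
      if (previousIndex !== undefined && previousIndex !== index) {
        duplicates.add(token);
      }
      groupIndexByToken.set(token, index);
    }
  });
  return Array.from(duplicates);
}

const candidateGroups = [
  ["gcs", "storage", "bucket"],
  ["df", "dataframe", "table"],
  ["s3", "bucket"], // invented new group that reuses "bucket"
];

const overlappingTokens = findDuplicateSynonyms(candidateGroups);
// overlappingTokens is ["bucket"]
```

A check like this could run in a unit test so that new aliases are either merged into an existing group or rejected before they reach the index.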

@Mbeaulne Mbeaulne changed the title from "add synonym groups" to "Add synonym expansion to component lexical search" Jun 18, 2026

github-actions Bot commented Jun 18, 2026

🎩 Preview

A preview build has been created at: 06-18-add_synonym_groups/f5a29c0

Comment thread src/services/componentSearchIndex.ts Outdated
Comment thread src/services/componentSearchIndex.ts
Comment thread src/services/componentSearchSynonyms.ts Outdated
Comment thread src/services/componentSearchIndex.ts Outdated
@Mbeaulne Mbeaulne force-pushed the 06-18-add_synonym_groups branch from 2655160 to dce82a1 June 18, 2026 19:12
@Mbeaulne Mbeaulne force-pushed the 06-18-add_synonym_groups branch from dce82a1 to f5a29c0 June 18, 2026 20:28