Add source-diverse AI rerank candidate pool by Mbeaulne · Pull Request #2429 · TangleML/tangle-ui

Mbeaulne · 2026-06-18T17:40:42Z

Description

The AI rerank candidate pool now uses a source-diverse selection strategy instead of relying purely on lexical hits. Previously, the candidate pool was capped at 50 results drawn entirely from lexical matches, falling back to a plain alphabetical slice when no matches were found. This meant components from underrepresented sources (e.g. user uploads) could be crowded out entirely when a query produced many strong lexical hits from a single source.

The new approach builds the candidate pool in three layers:

Up to 60 of the strongest lexical hits for the query
An evenly sampled set of up to 8 candidates per source (source-diverse browse)
An evenly sampled alphabetical slice of the full index to fill remaining slots up to the new cap of 80

The rerank base is now always set to aiCandidateMatches rather than switching between lexical and AI candidate lists depending on whether lexical results existed.
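
The three layers above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the Candidate shape, LEXICAL_HIT_LIMIT, PER_SOURCE_BROWSE_LIMIT, and buildAiCandidatePool are hypothetical names, and the bodies of sampleEvenly and appendUniqueMatches are assumptions about how helpers with those names might behave. Only the limits (60, 8, 80) come from the description.

```typescript
// Limits quoted in the PR description. Only AI_CANDIDATE_LIMIT is a name
// that appears in the PR; the other two constant names are hypothetical.
const AI_CANDIDATE_LIMIT = 80;
const LEXICAL_HIT_LIMIT = 60;
const PER_SOURCE_BROWSE_LIMIT = 8;

// Hypothetical minimal candidate shape for illustration.
interface Candidate {
  digest: string;
  name: string;
  source: string;
}

// Assumed behavior: pick up to `limit` items at uniform intervals.
function sampleEvenly<T>(items: readonly T[], limit: number): T[] {
  if (limit <= 0 || items.length === 0) return [];
  if (items.length <= limit) return items.slice();
  const step = items.length / limit;
  const sampled: T[] = [];
  for (let i = 0; i < limit; i++) {
    sampled.push(items[Math.floor(i * step)] as T);
  }
  return sampled;
}

// Assumed behavior: append unseen matches until the overall cap is reached.
function appendUniqueMatches(
  candidates: Candidate[],
  seenDigests: Set<string>,
  matches: readonly Candidate[],
): void {
  for (const match of matches) {
    if (candidates.length >= AI_CANDIDATE_LIMIT) return;
    if (seenDigests.has(match.digest)) continue;
    seenDigests.add(match.digest);
    candidates.push(match);
  }
}

// lexicalHits is assumed to be ordered strongest first;
// sortedIndex is assumed to be the full index in alphabetical order.
function buildAiCandidatePool(
  lexicalHits: readonly Candidate[],
  sortedIndex: readonly Candidate[],
): Candidate[] {
  const candidates: Candidate[] = [];
  const seenDigests = new Set<string>();

  // Layer 1: the strongest lexical hits.
  appendUniqueMatches(candidates, seenDigests, lexicalHits.slice(0, LEXICAL_HIT_LIMIT));

  // Layer 2: an even sample from each source, so no source is crowded out.
  const bySource = new Map<string, Candidate[]>();
  for (const entry of sortedIndex) {
    const group = bySource.get(entry.source) ?? [];
    group.push(entry);
    bySource.set(entry.source, group);
  }
  bySource.forEach((group) => {
    appendUniqueMatches(candidates, seenDigests, sampleEvenly(group, PER_SOURCE_BROWSE_LIMIT));
  });

  // Layer 3: an even alphabetical sample of the full index fills what is left.
  appendUniqueMatches(candidates, seenDigests, sampleEvenly(sortedIndex, AI_CANDIDATE_LIMIT));

  return candidates;
}
```

Because every layer goes through the same digest check and cap, the order of the calls determines priority: lexical hits are never displaced by the browse or fill layers.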

Related Issue and Pull requests

Type of Change

Checklist

I have tested this does not break current pipelines / runs functionality
I have tested the changes on staging

Screenshots (if applicable)

Test Instructions

Open the component search panel in the editor.
Enter a query that matches many components from a single source (e.g. a library with 100+ entries).
Click the AI rerank button and verify that components from other sources (e.g. user-uploaded files) still appear in the ranked results.
Verify that the total candidate count sent to the reranker does not exceed 80.
Run the unit tests in componentSearchV2Logic.test.ts to confirm the new source-diversity test passes.

Additional Comments

The new sampleEvenly helper picks items at uniform intervals so that the browse sample is representative of the full sorted list rather than just the top entries.
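
A minimal sketch of what uniform-interval sampling could look like is shown below. The signature and body are assumptions for illustration; the sampleEvenly in this PR may handle edge cases differently.

```typescript
// Hypothetical sketch: return up to `limit` items spaced evenly across `items`.
function sampleEvenly<T>(items: readonly T[], limit: number): T[] {
  if (limit <= 0 || items.length === 0) return [];
  // Nothing to thin out when the list already fits.
  if (items.length <= limit) return items.slice();
  // step > 1 here, so consecutive floored indices are always distinct.
  const step = items.length / limit;
  const sampled: T[] = [];
  for (let i = 0; i < limit; i++) {
    sampled.push(items[Math.floor(i * step)] as T);
  }
  return sampled;
}

const letters = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"];
const picked = sampleEvenly(letters, 4);
console.log(picked.join(",")); // a,c,f,h
```

Compared with items.slice(0, limit), which would return only a through d, the sample spans the whole sorted list.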

github-actions · 2026-06-18T17:40:53Z

🎩 Preview

A preview build has been created at: 06-18-build_broader_ai_candidate_pools_for_component_search/88f3546

Mbeaulne · 2026-06-18T17:41:02Z

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite.

Mbeaulne · 2026-06-18T18:56:49Z

+  appendUniqueMatches(
+    candidates,
+    seenDigests,
+    sampleEvenly(sortedIndex, AI_CANDIDATE_LIMIT).map(indexEntryToLexicalMatch),


🤖 This is an AI-generated code review comment.

[MEDIUM] The source-diversity layer (buildSourceDiverseBrowseMatches) and this alphabetical-fill layer run unconditionally, padding the AI candidate pool toward the cap (AI_CANDIDATE_LIMIT = 80) with alphabetically-early, lexically-irrelevant components even when lexical search already returned a strong source-spanning set.

This is not a correctness bug — lexical hits are preserved first (appended before the fill layers), and RERANK_EXCLUSION_THRESHOLD keeps junk from being badged. But it sends more low-signal candidates to the billed reranker on every rerank (cost/latency), and can surface irrelevant items in the unbadged tail.

Optional fix: only run the fill layer when the pool is under a smaller floor, or skip the alphabetical fill when lexical + diversity already produced a source-diverse set. Worth confirming reranker cost at 80 vs 50 candidates is acceptable.
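
One way the optional fix could be expressed is a guard that decides whether the alphabetical fill should run at all. This is a hedged sketch only: FILL_FLOOR, MIN_DISTINCT_SOURCES, and all three function names are hypothetical, the threshold values are placeholders rather than tuned numbers, and combining the two suggested conditions with a logical OR is one possible reading of the suggestion.

```typescript
// Hypothetical minimal candidate shape for illustration.
interface PoolCandidate {
  digest: string;
  source: string;
}

// Placeholder thresholds; neither name nor value comes from the PR.
const FILL_FLOOR = 50;
const MIN_DISTINCT_SOURCES = 2;

// Option 1 from the review: the pool is still below a smaller floor.
function isUnderFillFloor(candidates: readonly PoolCandidate[]): boolean {
  return candidates.length < FILL_FLOOR;
}

// Option 2 from the review: lexical + diversity layers already span enough sources.
function isSourceDiverse(candidates: readonly PoolCandidate[]): boolean {
  const sources = new Set<string>();
  for (const candidate of candidates) {
    sources.add(candidate.source);
  }
  return sources.size >= MIN_DISTINCT_SOURCES;
}

// Assumed combination: fill only when the pool is small or lacks diversity.
function shouldRunAlphabeticalFill(candidates: readonly PoolCandidate[]): boolean {
  return isUnderFillFloor(candidates) || !isSourceDiverse(candidates);
}
```

A caller would check shouldRunAlphabeticalFill(candidates) after the lexical and source-diverse layers and skip the final appendUniqueMatches call when it returns false, which keeps low-signal entries out of the reranker request.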

Mbeaulne mentioned this pull request Jun 18, 2026

Add negative constraint parsing to lexical search #2428

Open

8 tasks

Mbeaulne changed the title ~~Build broader AI candidate pools for component search~~ Add source-diverse AI rerank candidate pool Jun 18, 2026

Mbeaulne mentioned this pull request Jun 18, 2026

Add deep AI search to rerank all components in selected sources #2430

Open

8 tasks

Mbeaulne marked this pull request as ready for review June 18, 2026 17:56

Mbeaulne requested a review from a team as a code owner June 18, 2026 17:56

This was referenced Jun 18, 2026

Add search quality expectation tests for lexical search #2431

Open

add client-side embeddings cached in IndexedDB #2432

Open

Debounce component search input #2433

Open

Mbeaulne force-pushed the 06-18-build_broader_ai_candidate_pools_for_component_search branch from d363ca7 to 60b076d Compare June 18, 2026 19:12

Mbeaulne force-pushed the 06-18-parse_negative_constraints_without_not_no_exclude_ branch from 97e37c0 to 4f20ff2 Compare June 18, 2026 19:12

Mbeaulne force-pushed the 06-18-build_broader_ai_candidate_pools_for_component_search branch from 60b076d to 455266e Compare June 18, 2026 20:28

Mbeaulne force-pushed the 06-18-parse_negative_constraints_without_not_no_exclude_ branch from 4f20ff2 to 638c7b7 Compare June 18, 2026 20:28

Mbeaulne force-pushed the 06-18-build_broader_ai_candidate_pools_for_component_search branch from 455266e to 4a246ee Compare June 18, 2026 20:49

Mbeaulne force-pushed the 06-18-parse_negative_constraints_without_not_no_exclude_ branch from 638c7b7 to 3f91762 Compare June 18, 2026 20:49

Mbeaulne force-pushed the 06-18-build_broader_ai_candidate_pools_for_component_search branch from 4a246ee to 8cc6222 Compare June 18, 2026 21:02

Mbeaulne force-pushed the 06-18-parse_negative_constraints_without_not_no_exclude_ branch from 3f91762 to 790c426 Compare June 18, 2026 21:02

Build broader AI candidate pools for component search

88f3546

Mbeaulne force-pushed the 06-18-build_broader_ai_candidate_pools_for_component_search branch from 8cc6222 to 88f3546 Compare June 18, 2026 21:16

Mbeaulne force-pushed the 06-18-parse_negative_constraints_without_not_no_exclude_ branch from 790c426 to 554c927 Compare June 18, 2026 21:16

Mbeaulne wants to merge 1 commit into 06-18-parse_negative_constraints_without_not_no_exclude_ from 06-18-build_broader_ai_candidate_pools_for_component_search
