Add search quality expectation tests for lexical search #2431
expectedDigests: ["text-embeddings"],
},
{
query: "upload a file but not to GCS",
🤖 This is an AI-generated code review comment.
This case passes even if negative-constraint parsing is removed entirely. The plain query "upload a file" already ranks local-upload #1 (upload_file matches "file"; upload_to_gcs does not), and the assertion only pins rank #1 via slice(0, 1), so the exclusion is never exercised. Either assert the exclusion directly, e.g. expect(results.map(r => r.digest)).not.toContain("gcs-upload"), or shape the fixture so that gcs-upload would outrank local-upload without the negative clause.
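To make the gap concrete, here is a minimal, self-contained sketch. The `Result` shape, the two result lists, and the helper names `topMatches` and `excludes` are all hypothetical and are not taken from the PR; only the digest strings come from the comment above.

```typescript
// Hypothetical result shape; only the digest strings come from the review comment.
type Result = { digest: string };

// Made-up outputs for "upload a file but not to GCS", for illustration only.
const withNegation: Result[] = [{ digest: "local-upload" }];
const withoutNegation: Result[] = [
  { digest: "local-upload" },
  { digest: "gcs-upload" },
];

// Rank-pinning check: compares only the first expected.length digests.
function topMatches(results: Result[], expected: string[]): boolean {
  const top = results.slice(0, expected.length).map((r) => r.digest);
  return top.length === expected.length && top.every((d, i) => d === expected[i]);
}

// Direct exclusion check: the digest must not appear anywhere in the results.
function excludes(results: Result[], digest: string): boolean {
  return !results.some((r) => r.digest === digest);
}

// The rank-1 check cannot tell the two lists apart; the exclusion check can.
console.log(topMatches(withNegation, ["local-upload"]), topMatches(withoutNegation, ["local-upload"]));
console.log(excludes(withNegation, "gcs-upload"), excludes(withoutNegation, "gcs-upload"));
```

In the real suite the second check would presumably be the `not.toContain` assertion suggested above; the plain boolean helpers are used here only to keep the sketch free of a test-runner dependency.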
expectedDigests: ["predict-labels"],
},
{
query: "make vector embeddings for text",
🤖 This is an AI-generated code review comment.
No query in this suite requires synonym expansion. Each query shares a literal token or stem with its target (this one matches on the literal tokens vector, embeddings, and text), so the synonym feature is never isolated and the suite would not catch a regression in it. Add one or two synonym-only cases, e.g. "vectorize text documents" → ["text-embeddings"] and "store a file in a bucket" → ["gcs-upload"].
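One way to check whether a candidate query really is synonym-only is to confirm that it shares no literal token with the target. The sketch below is an assumption-laden illustration: `tokenize` and `literalOverlap` are hypothetical helpers, the tokenizer is a plain lowercase split with no stemming, and the target name `upload_to_gcs` is the one mentioned in the earlier comment.

```typescript
// Hypothetical tokenizer: lowercase, split on non-alphanumerics, no stemming.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Returns the query tokens that also occur literally in the target text.
// An empty result means a match can only come from synonym expansion
// (under this simplified, stem-free notion of "literal").
function literalOverlap(query: string, target: string): string[] {
  const targetTokens = new Set(tokenize(target));
  return tokenize(query).filter((t) => targetTokens.has(t));
}

// Existing-style query: overlaps the target literally.
console.log(literalOverlap("upload a file but not to GCS", "upload_to_gcs"));
// Proposed synonym-only query: no literal overlap with the target name.
console.log(literalOverlap("store a file in a bucket", "upload_to_gcs"));
```

A real check would need to run against every indexed field of the component and apply the same stemming as the search itself, which this sketch does not attempt.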
(result) => result.digest,
);

expect(results.slice(0, expectedDigests.length)).toEqual(
🤖 This is an AI-generated code review comment.
These assertions pin only rank #1. They check neither the ordering of close competitors nor that irrelevant components stay out of the visible top 5. Optional: add a couple of cases with two-element expectedDigests where the secondary match is stable (the tie-break is deterministic, so the test is not flaky).
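A two-element expectation is only safe if equal scores are always ordered the same way. The following sketch shows one possible deterministic tie-break (score descending, then digest ascending). The `Scored` type, the `rank` function, and the scores are hypothetical; the PR's actual ranking and tie-break rule are not shown in this conversation.

```typescript
// Hypothetical scored hit; the scores below are made up for illustration.
type Scored = { digest: string; score: number };

// Sort by score descending, then by digest ascending so ties are stable
// regardless of input order. Plain string comparison avoids locale effects.
function rank(items: Scored[]): string[] {
  return [...items]
    .sort((a, b) => {
      if (b.score !== a.score) return b.score - a.score;
      if (a.digest < b.digest) return -1;
      if (a.digest > b.digest) return 1;
      return 0;
    })
    .map((item) => item.digest);
}

const hits: Scored[] = [
  { digest: "local-upload", score: 1 },
  { digest: "text-embeddings", score: 2 },
  { digest: "gcs-upload", score: 1 },
];

// Same ranking for any input order, so a two-element expectation is not flaky.
console.log(rank(hits));
console.log(rank([...hits].reverse()));
```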
}),
]);

it.each([
🤖 This is an AI-generated code review comment.
The suite has no ambiguous multi-match case and no empty or nonsense-result case. Optional: add a query that should return multiple relevant components and assert that both appear in the top N.
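A multi-match assertion can stay order-independent by checking membership in the top N. This is a rough sketch under stated assumptions: `topNContainsAll` is a hypothetical helper, and the digest list stands in for whatever the search would return for an ambiguous query.

```typescript
// Returns true when every expected digest appears within the first n results,
// in any order. An empty result list fails unless nothing is expected.
function topNContainsAll(digests: string[], expected: string[], n: number): boolean {
  const top = new Set(digests.slice(0, n));
  return expected.every((d) => top.has(d));
}

// Made-up ranked digests for an ambiguous query, for illustration only.
const ranked = ["gcs-upload", "predict-labels", "local-upload", "text-embeddings"];

// Both upload components are relevant; their relative order is not pinned.
console.log(topNContainsAll(ranked, ["local-upload", "gcs-upload"], 3));
// A nonsense query returning nothing contains none of the expected digests.
console.log(topNContainsAll([], ["local-upload"], 3));
```

For the empty or nonsense case, asserting that the result list itself is empty is likely the more direct check.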
