fix: replace hardcoded "images" dataset check with generic corruption handling by gordonmurray · Pull Request #40 · lance-format/lance-data-viewer

gordonmurray · 2026-04-07T19:11:07Z

Fixes #19

Problem

/datasets/{name}/rows had a hardcoded branch that forced a schema-only corrupted_but_readable_schema response whenever the dataset was named images, regardless of whether the data was actually corrupted. Two failure modes fell out of this:

Any healthy dataset named images was incorrectly surfaced as corrupted.
Any corrupted dataset with a different name got no special handling.

Change

Remove the name-based check and rely on the existing except around the read path. Any dataset that fails to read (corruption, format error, unreadable bytes) now falls back to the same informational single-row response that was already documented as the graceful-degradation path. Healthy datasets named images are read normally.

Also drop the fallback log level from error to warning, since graceful degradation is an expected path rather than an error condition.
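
As a rough illustration of the change described above, the read path can be sketched in plain Python. This is not the project's actual handler: `get_rows`, the injected `read_rows` callable, the logger name, the `"dataset_unreadable"` value, and the fallback's `total` / `limit` / `offset` fields are all hypothetical stand-ins. Only the `error` / `dataset` / `details` keys and the healthy response shape come from this PR.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("rows_endpoint_sketch")  # hypothetical logger name

Row = dict[str, Any]
# Hypothetical reader signature: (name, limit, offset) -> (rows, total)
Reader = Callable[[str, int, int], tuple[list[Row], int]]


def get_rows(name: str, read_rows: Reader, limit: int = 50, offset: int = 0) -> dict[str, Any]:
    """Return rows for a dataset, degrading gracefully on any read failure.

    There is no branch on the dataset name: a dataset called "images"
    takes exactly the same path as any other dataset.
    """
    try:
        rows, total = read_rows(name, limit, offset)
    except Exception as exc:
        # Graceful degradation is an expected path, so log at warning, not error.
        logger.warning("Read failed for dataset %r, returning fallback row: %s", name, exc)
        return {
            # Single informational row; the "error" value is an assumed placeholder.
            "rows": [{"error": "dataset_unreadable", "dataset": name, "details": str(exc)}],
            "total": 1,  # assumed; the real fallback may report this differently
            "limit": limit,
            "offset": offset,
        }
    return {"rows": rows, "total": total, "limit": limit, "offset": offset}
```

Passing the reader in as a parameter is purely a convenience for the sketch: it lets a healthy and a failing read be simulated without a real Lance dataset on disk.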

Verification

Smoke-tested locally against a temp LanceDB directory containing two tables:

normal (10 rows, with a vector column): read normally, returns rows and totals as expected.
images (5 rows): previously this would return the hardcoded corrupted_but_readable_schema single-row response. With this change it now returns the real data:

{
  "rows": [
    {"id": 0, "label": "img0"},
    {"id": 1, "label": "img1"},
    {"id": 2, "label": "img2"}
  ],
  "total": 5,
  "limit": 3,
  "offset": 0
}

The fallback path still triggers on any read failure and returns the existing error / dataset / details informational row, so the graceful-degradation contract is unchanged for actually-corrupted datasets.
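
For a caller that needs to tell the informational row apart from real data, one possible client-side check is sketched below. It assumes the fallback is a single row carrying the `error`, `dataset`, and `details` keys named above; `is_fallback_response` and `FALLBACK_KEYS` are hypothetical names, not part of the viewer's API.

```python
from typing import Any

# Keys of the informational row, as named in this PR description.
FALLBACK_KEYS = {"error", "dataset", "details"}


def is_fallback_response(payload: dict[str, Any]) -> bool:
    """Heuristically detect the graceful-degradation response.

    Returns True only when the payload has exactly one row and that
    row contains all of the informational keys.
    """
    rows = payload.get("rows", [])
    if len(rows) != 1 or not isinstance(rows[0], dict):
        return False
    return FALLBACK_KEYS <= set(rows[0])
```

This is only a heuristic: a healthy single-row dataset whose columns happen to be named `error`, `dataset`, and `details` would be misclassified, so a dedicated status field would be more robust if the API ever grows one.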

Notes

No test additions in this PR; a proper endpoint test suite is tracked separately in #28 ("test: add API endpoint tests").
No API shape changes for healthy datasets. The only observable behavior change is that healthy datasets named images now return their real rows instead of the synthetic schema-info row.

… handling

The `/datasets/{name}/rows` endpoint had a hardcoded branch that forced a schema-only "corrupted_but_readable_schema" response whenever the dataset was named `images`, regardless of whether the data was actually corrupted. Any healthy dataset sharing that name was incorrectly shown as corrupted, and any corrupted dataset with a different name got no special handling.

Remove the name-based check and rely on the existing exception handler around the read path. Any dataset that fails to read (corruption, format error, unreadable bytes) now falls back to the same informational single-row response, matching the graceful-degradation behavior already documented for the endpoint. Healthy datasets named `images` are read normally.

Also drop the log level for the fallback from `error` to `warning`, since graceful degradation is an expected path rather than an error condition.

Fixes lance-format#19

gordonmurray force-pushed the fix/generic-corruption-handling branch from 21231d9 to a516b7e Compare April 7, 2026 19:17

gordonmurray merged commit fb5debe into lance-format:main Apr 7, 2026
12 checks passed

gordonmurray deleted the fix/generic-corruption-handling branch April 7, 2026 20:22

gordonmurray merged 1 commit into lance-format:main from gordonmurray:fix/generic-corruption-handling
