Update DataHub (MàJ DataHub) #9

Open
NaicheD wants to merge 6500 commits into NaicheD:master from datahub-project:master
NaicheD commented Oct 9, 2023

No description provided.

v-tarasevich-blitz-brain and others added 23 commits May 28, 2026 12:55
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
)

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Max Margalith <max.margalith@datahub.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…er (#12435)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ries (#17641)

Co-authored-by: Cursor <cursoragent@cursor.com>
…wright results (#17654)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Adrian Machado <adrian.machado@datahub.com>
Co-authored-by: Adrian Machado <adrianm7151@gmail.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Chris Collins <chriscollins3456@gmail.com>
Co-authored-by: Devashish Chandra <devashish2203@users.noreply.github.com>
priyabratadas-dh and others added 30 commits June 17, 2026 15:30
…hema before fields

Follow-up to the schemaMetadata size-trimming fix. When a schemaMetadata
aspect exceeds the payload limit, the previous logic only budgeted the
fields list and ignored the rest of the aspect — chiefly the platformSchema
blob (an opaque, source-format schema dump that can be several MB on its own)
plus the aspect envelope. As a result the trimmed aspect could still overshoot
the limit and be rejected by GMS with a 400 "Cannot parse request entity",
which in batch mode also failed the unrelated aspects sharing the request.

Changes:
- Measure the whole aspect against the budget, not just the fields. A fast
  path no-ops when the aspect already fits.
- Shed the least-valuable content first: when oversized, drop the raw
  platformSchema blob before trimming fields, so the structured, queryable
  field metadata is preserved as far as possible. Fields are trimmed only if
  the aspect still doesn't fit.
- Dispatch over the platformSchema union explicitly per variant with direct
  attribute access, so the type checker verifies every field name and a future
  model change surfaces as a type error rather than a silent no-op. Small
  markers (e.g. KafkaSchema.documentSchemaType) are left intact, and an
  unrecognized variant is reported as a warning rather than dropped blindly.
- Report a single warning per entity with the count of dropped fields, instead
  of one warning per field, and track platform-schema drops in the processor
  report.
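
The shedding order described above can be sketched roughly as follows. This is a minimal illustration, not the actual ingestion-client code: `SchemaAspect`, `Report`, `aspect_size`, and `trim_to_budget` are hypothetical names, and JSON length stands in for whatever size measure the real client uses.

```python
import json
from dataclasses import dataclass, field, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SchemaAspect:
    """Hypothetical stand-in for a schemaMetadata aspect."""
    fields: Tuple[str, ...]
    platform_schema: Optional[str] = None  # raw, opaque schema dump


@dataclass
class Report:
    """Hypothetical stand-in for the processor report."""
    warnings: List[str] = field(default_factory=list)
    platform_schema_drops: int = 0


def aspect_size(aspect: SchemaAspect) -> int:
    # Measure the whole serialized aspect, not just the fields list.
    payload = {"fields": list(aspect.fields), "platformSchema": aspect.platform_schema}
    return len(json.dumps(payload).encode("utf-8"))


def trim_to_budget(urn: str, aspect: SchemaAspect, budget: int, report: Report) -> SchemaAspect:
    # Fast path: an aspect that already fits is returned untouched.
    if aspect_size(aspect) <= budget:
        return aspect

    parts: List[str] = []

    # Step 1: shed the raw platform schema before touching any fields.
    if aspect.platform_schema is not None:
        aspect = replace(aspect, platform_schema=None)
        report.platform_schema_drops += 1
        parts.append("dropped platformSchema")

    # Step 2: trim trailing fields only while the aspect still does not fit.
    kept = list(aspect.fields)
    dropped = 0
    while kept and aspect_size(replace(aspect, fields=tuple(kept))) > budget:
        kept.pop()
        dropped += 1
    if dropped:
        aspect = replace(aspect, fields=tuple(kept))
        parts.append(f"dropped {dropped} fields")

    # One warning per entity, carrying the count of dropped fields.
    if parts:
        report.warnings.append(f"{urn}: " + ", ".join(parts))
    return aspect
```

Which fields to drop first is an assumption of the sketch (it simply pops from the end); the commit message does not say how the real code orders them.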

This is graceful degradation: a genuinely oversized schema is ingested with a
trimmed representation and a clear warning, rather than failing the dataset and
poisoning its batch. The change is in the ingestion client, so the executor/CLI
must be upgraded to pick it up.

Tests cover the no-op fast path, platform-schema-before-fields priority, the
drop-then-trim cascade, per-entity warning reporting, and full coverage of every
platformSchema union variant.
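
For illustration, the explicit per-variant dispatch mentioned in the changes might look something like the sketch below. The classes are simplified, hypothetical stand-ins for the platformSchema union; apart from `documentSchemaType`, which the commit message names, the attribute names and the `drop_raw_schema` helper are assumptions.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class KafkaSchema:
    documentSchema: str               # large raw schema text (assumed name)
    documentSchemaType: str = "AVRO"  # small marker, left intact


@dataclass
class OtherSchema:
    rawSchema: str                    # large raw schema text (assumed name)


@dataclass
class UnknownSchema:
    """Stands in for a variant the dispatcher does not know about."""
    blob: str


PlatformSchema = Union[KafkaSchema, OtherSchema, UnknownSchema]


def drop_raw_schema(schema: PlatformSchema, warnings: List[str]) -> bool:
    """Blank the raw blob of a known variant; warn on an unrecognized one."""
    # Direct attribute access per variant lets a type checker verify each name.
    if isinstance(schema, KafkaSchema):
        schema.documentSchema = ""
        return True
    if isinstance(schema, OtherSchema):
        schema.rawSchema = ""
        return True
    # Unrecognized variant: report it rather than dropping content blindly.
    warnings.append(f"unrecognized platformSchema variant: {type(schema).__name__}")
    return False
```

Compared with a generic `setattr` loop over guessed attribute names, this form fails type checking if a model field is renamed, which is the property the commit message is after.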
…stance to charts, dashboards, and dataflows (#17292)

Co-authored-by: Peter Rosina <peter.rosina@tui.com>
…Query entity (#17904)

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Cursor <cursoragent@cursor.com>