Skip to content

fix: validate schema field IDs during unmarshal#1315

Merged
zeroshade merged 3 commits into
apache:mainfrom
fallintoplace:fix/schema-unmarshal-duplicate-field-ids
Jul 3, 2026
Merged

fix: validate schema field IDs during unmarshal#1315
zeroshade merged 3 commits into
apache:mainfrom
fallintoplace:fix/schema-unmarshal-duplicate-field-ids

Conversation

@fallintoplace

Copy link
Copy Markdown
Contributor

Summary

  • validate duplicate field IDs while unmarshalling schemas from JSON
  • add coverage for top-level and nested duplicate IDs returning ErrInvalidSchema

Why

Schema constructors already reject duplicate positive field IDs, but Schema.UnmarshalJSON assigned decoded fields without running the same check. Malformed schema JSON could enter the system and later corrupt ID-based lookups, projections, name mapping, or writes.

Testing

  • go test . -run '^TestUnmarshalSchemaRejectsDuplicateFieldIDs$' -count=1\n- go test . -count=1\n- go run github.com/golangci/golangci-lint/v2/cmd/golangci-lint@v2.11.4 run --timeout=10m .\n- git diff --check

@fallintoplace fallintoplace requested a review from zeroshade as a code owner June 24, 2026 21:42

@tanmayrauth tanmayrauth left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one nit on test coverage.

Comment thread schema_test.go
assert.True(t, tableSchemaSimple.Equals(&schema))
}

func TestUnmarshalSchemaRejectsDuplicateFieldIDs(t *testing.T) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice — the top-level and nested-struct cases are the important ones. Since checkDuplicateFieldIDs also walks list element IDs and map key/value IDs (schema.go:118-130), would you consider adding a list or map dup-ID case too? Those branches are only exercised indirectly right now, and they're the easiest to silently break later.

@fallintoplace fallintoplace force-pushed the fix/schema-unmarshal-duplicate-field-ids branch from b5da3aa to 30d2edd Compare June 25, 2026 07:30

@zeroshade zeroshade left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this. The duplicate-ID detection is correct, and I like the recursion into struct/list/map plus the negative-placeholder carve-out. Requesting changes on one gap relative to the PR title (validate schema field IDs) - noted inline.

Comment thread schema.go
return err
}

if err := checkDuplicateFieldIDs(nil, aux.Fields); err != nil {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This catches duplicate positive IDs, but missing list/map IDs still slip through. checkDuplicateFieldIDs only records f.ID > 0 (schema.go:101), and an omitted list element-id or map key-id/value-id decodes to 0 - so a schema that omits one of those still unmarshals cleanly (a lone 0 isn't a duplicate). If the goal is to validate field IDs on unmarshal, it'd be worth rejecting missing/zero element/key/value IDs too (e.g. decode them as *int and error when absent), while still allowing the negative placeholders used during construction. Is the broader validation in scope here, or is this intentionally limited to duplicates?

@laskoviymishka laskoviymishka left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there — the missing-key (nil) checks and the checkDuplicateFieldIDs reuse across the constructor and the decode path are exactly the right shape. One blocker before this lands.

The == 0 rejection on element-id/key-id/value-id is stricter than the spec asks for. The spec only requires these IDs to be unique in the table schema — it doesn't reserve 0 or require positivity, and Java, PyIceberg, and iceberg-rust all parse 0 without complaint. So this makes iceberg-go permanently unable to read otherwise-valid metadata those clients wrote, and it breaks our own round-trip: MarshalJSON(&ListType{}) emits element-id: 0, which this same UnmarshalJSON now refuses. Worth noting too that these methods fire on every list/map decode — partition fields, sort orders, REST responses — so the effect is a global tightening of list/map decoding, not just schema validation.

Before merge:

  • Drop the three *aux.ID == 0 / *aux.KeyID == 0 / *aux.ValueID == 0 checks; keep the nil checks and the duplicate-ID check.
  • Add a blank line before each consecutive guard block — nlreturn will fail CI otherwise.
  • Flip the two "zero …" test rows to expect success, add the missing zero-map-key-id / missing-map-value-id cases, and add one positive round-trip case asserting a valid list/map schema still unmarshals.

The nil checks and duplicate-ID validation are good as-is. Happy to take another pass once the zero-checks come out.

Comment thread types.go Outdated
Comment on lines +455 to +460
if aux.ID == nil {
return fmt.Errorf("%w: field is missing required 'element-id' key in JSON", ErrInvalidSchema)
}
if *aux.ID == 0 {
return fmt.Errorf("%w: field 'element-id' must not be 0", ErrInvalidSchema)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nil check here is right — it mirrors Java's JsonUtil.getInt existence check, so keep it. The == 0 rejection is the blocker, though, and I'd drop it.

The spec only requires field IDs to be unique in the table schema — it doesn't reserve 0 or require positivity, and Java, PyIceberg, and iceberg-rust all parse 0 without complaint. So this makes iceberg-go permanently unable to read otherwise-valid metadata any of them wrote with element-id: 0. It also breaks our own round-trip: ListType.ElementID is a plain int, so json.Marshal(&iceberg.ListType{}) emits element-id: 0 and this same method then refuses it.

Suggested change
if aux.ID == nil {
return fmt.Errorf("%w: field is missing required 'element-id' key in JSON", ErrInvalidSchema)
}
if *aux.ID == 0 {
return fmt.Errorf("%w: field 'element-id' must not be 0", ErrInvalidSchema)
}
if aux.ID == nil {
return fmt.Errorf("%w: field is missing required 'element-id' key in JSON", ErrInvalidSchema)
}

Reject 0 vs. accept it as the spec allows — wdyt?

Comment thread types.go Outdated
Comment on lines +544 to +555
if aux.KeyID == nil {
return fmt.Errorf("%w: field is missing required 'key-id' key in JSON", ErrInvalidSchema)
}
if *aux.KeyID == 0 {
return fmt.Errorf("%w: field 'key-id' must not be 0", ErrInvalidSchema)
}
if aux.ValueID == nil {
return fmt.Errorf("%w: field is missing required 'value-id' key in JSON", ErrInvalidSchema)
}
if *aux.ValueID == 0 {
return fmt.Errorf("%w: field 'value-id' must not be 0", ErrInvalidSchema)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same blocker on the map side — keep both nil checks, drop the two == 0 checks. Java's mapFromJson parses zero key-id/value-id fine, so rejecting them breaks reading Java/Py/Rust-written maps the same way. The suggestion also spaces the two guards out, which keeps nlreturn happy.

Suggested change
if aux.KeyID == nil {
return fmt.Errorf("%w: field is missing required 'key-id' key in JSON", ErrInvalidSchema)
}
if *aux.KeyID == 0 {
return fmt.Errorf("%w: field 'key-id' must not be 0", ErrInvalidSchema)
}
if aux.ValueID == nil {
return fmt.Errorf("%w: field is missing required 'value-id' key in JSON", ErrInvalidSchema)
}
if *aux.ValueID == 0 {
return fmt.Errorf("%w: field 'value-id' must not be 0", ErrInvalidSchema)
}
if aux.KeyID == nil {
return fmt.Errorf("%w: field is missing required 'key-id' key in JSON", ErrInvalidSchema)
}
if aux.ValueID == nil {
return fmt.Errorf("%w: field is missing required 'value-id' key in JSON", ErrInvalidSchema)
}

Comment thread schema_test.go
contains: "field is missing required 'element-id' key in JSON",
},
{
name: "zero list element id",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the == 0 checks come out, this row and "zero map value id" assert the wrong outcome — they'd need to flip to require.NoError. While here: the table never exercises zero map key-id or missing map value-id (both are real branches), and there's no positive case proving a valid non-zero list/map schema still unmarshals after the *int change — tableSchemaNested goes through the constructor, not JSON decode, so a regression there wouldn't be caught. I'd add a round-trip case asserting require.NoError.

@zeroshade zeroshade merged commit 4fd86d5 into apache:main Jul 3, 2026
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants