Skip to content

feat(validation): validate -config against Talos network configuration schema#6

Merged
Andrei Kvapil (kvaps) merged 4 commits into
mainfrom
feat/config-schema-validation
Jun 5, 2026
Merged

feat(validation): validate -config against Talos network configuration schema#6
Andrei Kvapil (kvaps) merged 4 commits into
mainfrom
feat/config-schema-validation

Conversation

@kvaps

@kvaps Andrei Kvapil (kvaps) commented Jun 4, 2026

Copy link
Copy Markdown
Member

What

The -config file is now validated against the actual Talos metal network configuration schema (network.PlatformConfigSpec from siderolabs/talos/pkg/machinery v1.13.3) instead of a generic YAML well-formedness check.

The validation strictly rejects:

  • unknown fields (catches typos like adress or upDelay instead of updelay)
  • malformed IP addresses and CIDRs
  • invalid enum values (e.g. family: inet5)
  • empty configuration and extra YAML documents

The original file bytes are now written to the META partition as-is, without re-marshaling through a generic YAML round-trip.

Documentation links updated to the Talos v1.13 docs (the old v1.8 URL is no longer available).

Why

A typo in a field name or value previously passed the YAML check silently and produced a configuration that Talos would partially or completely ignore at boot — with no feedback at write time. Validating against the same type Talos itself uses to parse META key 0xa catches these mistakes before anything is written to disk.

Breaking change

The tool no longer accepts arbitrary YAML: input must be a valid metal network configuration. Note that configs using the Talos ≤1.8 dhcp6.DUID field are rejected — it was renamed to dhcp6.clientIdentifier upstream.

Summary by CodeRabbit

  • Documentation
    • Updated metal network configuration docs to Talos v1.13 and documented schema validation behavior.
  • New Features
    • Added optional runtime flag to bypass configuration schema validation.
  • Bug Fixes
    • Configuration validation now rejects unknown fields, malformed addresses, invalid enum values, empty configs, and extra documents.

Andrei Kvapil (kvaps) and others added 2 commits June 4, 2026 19:31
Replace the generic YAML well-formedness check with a strict decode
into network.PlatformConfigSpec from the Talos machinery module — the
same type Talos uses to parse META key 0xa. Unknown fields, malformed
addresses, invalid enum values and extra YAML documents are now
rejected before anything is written to the META partition.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
@coderabbitai

coderabbitai Bot commented Jun 4, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

The pull request replaces YAML normalization validation with strict Talos v1.13 metal network schema validation. Dependencies are updated to include Talos machinery v1.13.3, documentation links to the v1.13 schema, a new validateConfig function validates input YAML strictly, and the main flow integrates this validation while writing raw configuration bytes.

Changes

Validation Schema Migration

Layer / File(s) Summary
Dependency and documentation update
go.mod, README.md
go.mod replaces the old YAML dependency with github.com/siderolabs/talos/pkg/machinery v1.13.3 and go.yaml.in/yaml/v4, and expands/pins indirect deps; README.md updates the Talos metal network docs link to v1.13 and documents pre-write validation and -skip-validation.
Validation implementation and test coverage
validate.go, validate_test.go
Adds validateConfig(data []byte) error which decodes with KnownFields(true), errors on empty input, and rejects extra YAML documents; validate_test.go adds a valid fixture and table-driven negative tests plus targeted error-message checks.
Main flow integration and test cleanup
main.go, main_test.go
main.go adds -skip-validation, conditionally calls validateConfig, writes original configData bytes to META (no YAML-normalized round-trip), and main_test.go removes the old YAML validation tests.

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • cozystack/talos-meta-tool#5: Refactors writeConfig to operate on the GPT-discovered META partition handle, which pairs with this PR's switch to write raw configData bytes.
  • cozystack/talos-meta-tool#4: Introduces the validateYAML-based YAML normalization that this PR replaces with the new strict validateConfig validation logic.

Poem

🐰 I hopped through YAML fields with care,
Sniffed unknown keys and malformed air.
I checked each enum and IP line,
Then wrote raw bytes to META fine.
Hooray—validation, neat and fair!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and specifically describes the main change: implementing validation of the -config parameter against the Talos network configuration schema, which is the primary objective of this PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/config-schema-validation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request replaces the generic YAML validation with strict validation of the metal network configuration against the Talos v1.13 network.PlatformConfigSpec schema. It introduces a new validate.go file using go.yaml.in/yaml/v4 to decode and validate fields strictly, rejecting unknown fields, malformed values, and extra YAML documents. Corresponding tests are added in validate_test.go, and dependencies in go.mod are updated. The reviewer suggested preserving the underlying parser error when checking for unexpected extra YAML documents to provide better diagnostic feedback to the user.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread validate.go Outdated
Comment on lines +31 to +33
if err := dec.Decode(new(any)); !errors.Is(err, io.EOF) {
return errors.New("unexpected extra YAML document")
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

When checking for unexpected extra YAML documents, if the trailing content is malformed, the current implementation discards the underlying parser error and returns a generic unexpected extra YAML document error.

Preserving the underlying error when it is not io.EOF provides much better diagnostic feedback to the user (e.g., if they have a syntax error in their trailing content).

	err := dec.Decode(new(any))
	if err == nil {
		return errors.New("unexpected extra YAML document")
	}
	if !errors.Is(err, io.EOF) {
		return errors.New("unexpected extra YAML document: " + err.Error())
	}

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
@kvaps Andrei Kvapil (kvaps) marked this pull request as ready for review June 4, 2026 18:59

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
README.md (1)

20-20: ⚡ Quick win

Document all breaking validation rejections explicitly.

Please include empty-config, extra-document rejection, and the legacy dhcp6.DUID incompatibility in this sentence so users understand the full breaking surface.

Proposed README wording
-The `-config` file is validated against the Talos v1.13 metal network configuration schema (`network.PlatformConfigSpec`) before writing: unknown fields, malformed addresses and invalid enum values are rejected.
+The `-config` file is validated against the Talos v1.13 metal network configuration schema (`network.PlatformConfigSpec`) before writing: unknown fields, malformed addresses, invalid enum values, empty configurations, and extra YAML documents are rejected.  
+Note: configs using legacy Talos ≤ v1.8 `dhcp6.DUID` are rejected; use `dhcp6.clientIdentifier`.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` at line 20, Update the sentence that describes validation against
the Talos v1.13 metal network configuration schema (network.PlatformConfigSpec)
to explicitly list all breaking validation rejections: mention that empty-config
is rejected, extra-document rejection will fail multi-document YAMLs, and that
legacy dhcp6.DUID values are incompatible (the dhcp6.DUID legacy format is
rejected); keep the reference to unknown fields, malformed addresses and invalid
enum values and ensure the wording names the three specific rejections
(empty-config, extra-document, legacy dhcp6.DUID) so users see the full breaking
surface.
validate_test.go (1)

73-87: ⚡ Quick win

Add an explicit regression case for legacy dhcp6.DUID.

The PR advertises this breaking change; adding it to the invalid table prevents accidental reintroduction.

Proposed test case addition
 	{
 		name   string
 		config string
 	}{
 		{"malformed YAML", "key: [\ninvalid"},
 		{"empty", ""},
 		{"unknown top-level field", "hostname: talos-test\n"},
 		{"unknown nested field", "addresses:\n  - adress: 1.2.3.4/32\n"},
+		{"legacy dhcp6 duid field", "operators:\n  - operator: dhcp6\n    linkName: eth0\n    dhcp6:\n      DUID: 00:11:22:33\n"},
 		{"wrong type", "addresses: notalist\n"},
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@validate_test.go` around lines 73 - 87, Add a regression entry to the invalid
cases in TestValidateConfigInvalid to ensure legacy dhcp6.DUID remains rejected:
inside the composite literal in TestValidateConfigInvalid add a new struct
element with name like "legacy dhcp6.DUID" and config containing a dhcp6 block
with a legacy DUID value (e.g. multiline string "dhcp6:\n  DUID:
00:01:00:01:...") so the validation routine that parses dhcp6.DUID (referenced
in TestValidateConfigInvalid) will be exercised and fail as expected.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@README.md`:
- Line 20: Update the sentence that describes validation against the Talos v1.13
metal network configuration schema (network.PlatformConfigSpec) to explicitly
list all breaking validation rejections: mention that empty-config is rejected,
extra-document rejection will fail multi-document YAMLs, and that legacy
dhcp6.DUID values are incompatible (the dhcp6.DUID legacy format is rejected);
keep the reference to unknown fields, malformed addresses and invalid enum
values and ensure the wording names the three specific rejections (empty-config,
extra-document, legacy dhcp6.DUID) so users see the full breaking surface.

In `@validate_test.go`:
- Around line 73-87: Add a regression entry to the invalid cases in
TestValidateConfigInvalid to ensure legacy dhcp6.DUID remains rejected: inside
the composite literal in TestValidateConfigInvalid add a new struct element with
name like "legacy dhcp6.DUID" and config containing a dhcp6 block with a legacy
DUID value (e.g. multiline string "dhcp6:\n  DUID: 00:01:00:01:...") so the
validation routine that parses dhcp6.DUID (referenced in
TestValidateConfigInvalid) will be exercised and fail as expected.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6f2c9b82-91da-42c4-bd1c-52255d8bbd36

📥 Commits

Reviewing files that changed from the base of the PR and between 5072cab and 0d10353.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (6)
  • README.md
  • go.mod
  • main.go
  • main_test.go
  • validate.go
  • validate_test.go
💤 Files with no reviewable changes (1)
  • main_test.go

@mcanevet

Copy link
Copy Markdown
Contributor

Hey Andrei Kvapil (@kvaps), great improvement — catching typos at write time rather than silently producing a broken config is exactly the right fix, and writing the original bytes instead of re-marshaling is a nice bonus bug fix.

A couple of things worth discussing before merge:

Schema version lag / future-proofing. The tool now rejects any field not known to Talos 1.13.3. A user running Talos 1.14+ with a new valid field will get a confusing rejection at write time. Worth adding a -skip-validation (or -validate=false) flag so users can bypass the schema check when they know their config is correct — it future-proofs the tool without requiring a release for every Talos schema bump.

RC dependency. go.yaml.in/yaml/v4 v4.0.0-rc.4 is a release candidate. Worth tracking an issue to upgrade once it stabilises, or checking if v4 is out by now.

Minor nits:

  • The strings.Contains(err.Error(), "field") assertion in TestValidateConfigUnknownFieldError is fragile against library error message changes.
  • The tagless switch for the extra-document check is correct but an if/else if reads more naturally there.

Allow bypassing the schema check so a user on a newer Talos version
(with a config that uses fields this tool's schema does not yet know
about) can still write META without waiting for the tool to bump its
machinery dependency.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
@kvaps

Copy link
Copy Markdown
Member Author

Thanks for the careful read, Mickaël.

  1. -skip-validation flag — added in 7e26fdc. Defaults to validating; users on a newer Talos with fields this tool's schema doesn't yet know can pass -skip-validation to bypass. Good call.

  2. go.yaml.in/yaml/v4 RC — fair point. The reason I pinned v4 was a parser-feature inconsistency I hit while testing the strict decode + extra-document detection on v3; v4 behaves the way the metal-network spec docs expect. I'll keep an eye on the release; happy to take a follow-up PR that bumps to a stable when it lands.

  3. String-matching the error in TestValidateConfigUnknownFieldError — the assertion is on the field name ("foo"), not on a phrase like "field", which is a fair bit more stable than the surrounding wording. The point of that test is "the message should at least name the offending field" — if a future yaml-library refactor stops surfacing the field name in unknown-key errors, that's exactly the kind of regression I'd want to catch.

  4. Tagless switch vs if/else if — kept it as-is. The pattern is "three outcomes of one Decode call" and grouping them in a switch reads to me as a single decision rather than two separate guards. Subjective, of course.

Thanks again — happy to merge once you're happy.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
main.go (1)

131-131: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make the success message conditional on whether validation ran.

Line 131 always says the config was validated, but that is false when -skip-validation is set. This can mislead operators and logs.

💡 Proposed fix
-	fmt.Println("Configuration successfully validated and written to META partition.")
+	if *skipValidation {
+		fmt.Println("Configuration successfully written to META partition (validation skipped).")
+	} else {
+		fmt.Println("Configuration successfully validated and written to META partition.")
+	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@main.go` at line 131, The success message "Configuration successfully
validated and written to META partition." is always printed but should only
appear when validation actually ran; update the code to print this message
conditionally after the validation path completes (i.e., only when the
skip-validation flag is false and validation succeeded). Locate the fmt.Println
call and move or wrap it so it executes only when the validation branch ran (use
the program's skipValidation / -skip-validation flag variable or the validation
function/result — e.g., after validateConfig() returns successfully) and do not
print it when skip-validation was set.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@main.go`:
- Line 131: The success message "Configuration successfully validated and
written to META partition." is always printed but should only appear when
validation actually ran; update the code to print this message conditionally
after the validation path completes (i.e., only when the skip-validation flag is
false and validation succeeded). Locate the fmt.Println call and move or wrap it
so it executes only when the validation branch ran (use the program's
skipValidation / -skip-validation flag variable or the validation
function/result — e.g., after validateConfig() returns successfully) and do not
print it when skip-validation was set.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fb49ea27-fef0-431f-b29a-6463d9e10e3b

📥 Commits

Reviewing files that changed from the base of the PR and between 0d10353 and 7e26fdc.

📒 Files selected for processing (2)
  • README.md
  • main.go
✅ Files skipped from review due to trivial changes (1)
  • README.md

@kvaps Andrei Kvapil (kvaps) merged commit 615c6eb into main Jun 5, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants