Skip to content

chore: bump Go to 1.25.8 and fix CI and gosec issues#378

Open
maxlambrecht wants to merge 4 commits into
spiffe:mainfrom
maxlambrecht:chore/update-golangci-lint-gosec
Open

chore: bump Go to 1.25.8 and fix CI and gosec issues#378
maxlambrecht wants to merge 4 commits into
spiffe:mainfrom
maxlambrecht:chore/update-golangci-lint-gosec

Conversation

@maxlambrecht
Copy link
Copy Markdown
Member

@maxlambrecht maxlambrecht commented Mar 25, 2026

What

  • Bump the minimum Go version to 1.25.8 in go.mod, add the unreleased changelog entry, and align the Makefile toolchain version with that minimum.
  • Update golangci-lint to v2.11.4 and fix the Makefile Go download URLs to use the full Go version so Windows CI can fetch the toolchain correctly.
  • Move SPIFFE TLS peer authorization from VerifyPeerCertificate to VerifyConnection in spiffetls/tlsconfig, and update related tests and smaller gosec-driven example/test cleanups.

Why

  • Keep the repository’s minimum supported Go version aligned with its support policy.
  • Make the lint/toolchain setup work with the newer Go target and restore Windows CI toolchain downloads.
  • Ensure custom SPIFFE TLS authorization also runs on resumed sessions, which VerifyPeerCertificate alone does not cover.

How tested

  • make lint
  • make test

@maxlambrecht maxlambrecht changed the title fix: update golangci-lint and address gosec findings chore: bump Go to 1.25.8 and fix CI and gosec issues Mar 25, 2026
Copy link
Copy Markdown
Member

@azdagron azdagron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, @maxlambrecht! I have some backcompat concerns that I'm not sure how to resolve right away...

}
}

func wrapVerifyConnection(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried about backwards compatibility. There are two situations:

  1. somebody passes a tls.Config where either VeriyPeerCertificate/VerifyConnection is set. This code handles that fine.
  2. somebody takes the returned tls.Config and wraps VerifyPeerCertificate. We'd break the latter here, because we no longer set VerifyPeerCertificate.

I need to think about how we handle this. Open to suggestions. It may be the case that we have to accept a small breaking change. In any case, whatever we pick, the nuances should be documented on the functions.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. The implementation already preserves the input-side case by capturing any existing config.VerifyPeerCertificate / config.VerifyConnection callbacks and invoking them after SPIFFE auth succeeds. The remaining change is the narrow returned-config case: callers that previously wrapped VerifyPeerCertificate on the returned config now need to extend VerifyConnection instead, since SPIFFE auth has to live on VerifyConnection to also run on resumed sessions. I updated the function docs to call out that nuance explicitly, and I added focused tests for the preserved input-callback behavior and chaining/error propagation. So this is a narrow breaking change, but I think it is the right tradeoff here.

Copy link
Copy Markdown
Contributor

@strideynet strideynet Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to throw my 2c into the mix:

About a month or two back, I had some interesting conversations with folks on the topic of TLS resumption. My original stance had been that it made sense to re-verify certificates upon resumption of a connection but I did hear a strong counter argument that has made me somewhat reconsider.

The argument goes that if you are not continually monitoring open TLS connections and dropping them if they are no longer "valid" (i.e CA has rotated, or cert has expired), then what makes a resumption event particularly remarkable that you need to re-perform this verification?

Curious on others thoughts on this since I haven't quite made my mind up either way.

Another element came to mind for me as well - which perhaps @maxlambrecht may be better suited to answer in regards to: In Go, the peer certificate is stored as part of the session state/ticket and allows us to re-verify upon resumption. Is this the case for Rust/other languages as well?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Rust, I think this is more about the TLS library than the language itself. In rustls, peer_certificates() is available for both full and resumed handshakes, so the peer certificate chain is still available after resumption. But at least on the TLS 1.3 client resumption path, rustls does not re-run the full certificate verification. It treats resumption as a continuation of the previous session and restores the cached server certificate chain into the connection state.

So for rustls, my understanding is: yes, peer cert material is available on resume, but no, that does not mean the chain is verified again from scratch on that path.

For Java and Python, I think the general idea is similar, but it may depend on the TLS implementation. I have not verified the exact behavior there in enough detail to say more confidently.

Signed-off-by: Max Lambrecht <maxlambrecht@gmail.com>
Signed-off-by: Max Lambrecht <maxlambrecht@gmail.com>
Signed-off-by: Max Lambrecht <maxlambrecht@gmail.com>
@maxlambrecht maxlambrecht force-pushed the chore/update-golangci-lint-gosec branch from f505442 to 6c7af92 Compare March 26, 2026 17:55
Signed-off-by: Max Lambrecht <maxlambrecht@gmail.com>
@azdagron
Copy link
Copy Markdown
Member

azdagron commented May 9, 2026

I went ahead and merged some stuff to unblock the builds. Sorry about the churn and the latency on this.

Picking up from my earlier "two situations" comment, I think case #1 (input-side) is handled cleanly now. I've been chewing on case #2 where the caller wraps VerifyPeerCertificate on the returned config. Where I've landed lines up with your view that a narrow breaking change is probably the right tradeoff. I'd like to soften the edge with a rescue path before we accept it outright.

The obvious "no breakage" option is to set both VerifyPeerCertificate and VerifyConnection. That keeps the wrap pattern working on full handshakes while VerifyConnection closes the resumption gap. The problem is doubled signature verification on every full handshake. For servers fielding lots of mTLS handshakes that's a real CPU cost I don't want to impose by default.

What I'd like to propose instead is borrowing a page from the Go team's GODEBUG playbook:

  1. Default to the new behavior. SPIFFE auth runs in VerifyConnection only. Single verification per handshake. This accepts the case-Initial check-in for go-spiffe library #2 break.
  2. Add an env var escape hatch. Something like GO_SPIFFE_VERIFY=peer for legacy or connection for the new default. Read once at package init. When legacy mode is honored, emit a one-time log line so it shows up in operational telemetry.
  3. Document a sunset. Remove the legacy mode in v3 or by a defined date. We don't want to maintain two paths forever.
  4. Independently, detect the input-side wrap at Hook* time. If config.VerifyPeerCertificate != nil on entry, log a one-time deprecation warning. That catches the easy case at runtime without users needing to know about the env var.

The env var doesn't fully eliminate silent breakage. We can't detect users mutating the returned config. It does cap the blast radius for anyone who ships and discovers the regression later. The env var, sunset, and input-side warning together feel like the right level of "here's what changed, here's the rescue path" messaging.

Re: @strideynet's broader question on whether resume re-verification matters. I lean yes, narrowly. It catches bundle rotation, revocation, and authorizer state changes between the original handshake and the resume. Session-cached state does not catch those. The bigger gap of continuously revalidating live connections is real but separate. I don't think we should let it stall this fix.

Open to alternatives if anyone sees a cleaner shape.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants