do not merge - vsphere client mgmt#10581
Conversation
|
Skipping CI for Draft Pull Request. |
WalkthroughThis PR introduces a unified vSphere Session abstraction that centralizes SOAP/REST client creation, caching, and cleanup, deprecates the legacy CreateVSphereClients wrapper, updates consumers to use NewSession, and adds comprehensive session tests. ChangesvSphere Session Abstraction
🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 12 | ❌ 3❌ Failed checks (1 warning, 2 inconclusive)
✅ Passed checks (12 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.12.2)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
|
@coderabbitat please review |
|
✅ Actions performedReview triggered.
|
|
/test e2e-vsphere-ovn |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/asset/installconfig/vsphere/session.go`:
- Around line 148-149: The log call s.logger.WithField("vcenter",
s.server).Debug("Closing vSphere session") (and the similar occurrences at the
other spots) is emitting raw vCenter hostnames; replace the raw s.server value
with a redacted or hashed representation before logging. Implement or call a
helper (e.g., redactVCenter or hashHostname) that deterministically masks the
hostname (or returns a one-way hash) and use that helper in all
logger.WithField("vcenter", ...) calls in session.go (refer to the
s.logger.WithField calls around the Closing vSphere session line and the other
occurrences noted) so logs never contain the plain hostname but still allow
correlating entries.
- Around line 105-109: The cache may return a previously closed session because
Close() doesn't remove entries from sessions; update the lookup logic (where
sessionsMu, sessions and sess are used — e.g., the NewSession/cache-hit paths at
the shown locations) to detect a closed/dead sess after retrieving it from
sessions, delete sessions[key] and continue to create a fresh session instead of
returning it; in practice, after reading sess from sessions, check its
closed/invalid state (use the session object's existing IsClosed/Valid/State
check or equivalent), if closed then delete(sessions, key) while holding
sessionsMu, unlock and proceed to create and store a new session; apply the same
eviction/check logic to the other similar cache-return sites referenced (the
blocks at the other indicated locations).
- Around line 123-126: The race path drops the newly created sess when the
second cache check finds an existing session, leaking authenticated clients;
update the cache-return branch so that when you detect an existing session (the
block checking sessions[key] -> s), you call sess.Close() (the just-created
session variable) before returning the cached s, while still unlocking
sessionsMu and returning func() { s.Close() }, nil; ensure you only close the
newly created sess (not the cached s) to avoid double-closing.
In `@pkg/destroy/vsphere/client.go`:
- Around line 70-73: When creating the CNS client, ensure the opened session is
closed if cns.NewClient fails: in the block where cns.NewClient(context.TODO(),
sess.Vim25Client()) is called, call sess.Close() (or the appropriate session
cleanup method on the sess object) before returning the error so the session is
not leaked; update the error path in the function that initializes cnsClient to
close sess on failure.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 04893483-3d6d-4abe-b149-b4222be98a12
📒 Files selected for processing (3)
pkg/asset/installconfig/vsphere/client.gopkg/asset/installconfig/vsphere/session.gopkg/destroy/vsphere/client.go
Redact vCenter hostnames from structured logs using a deterministic SHA-256 hash to avoid exposing raw hostnames. Fixes the session cache stale-entry return (closed sessions from cache are now evicted), the double-check race that leaked newly created sessions, and the CNS client error path that leaked its session.
|
@coderabbitat why do you keep not reviewing, please review! |
|
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
pkg/asset/installconfig/vsphere/session.go (1)
190-194:⚠️ Potential issue | 🔴 Critical | ⚡ Quick win
Error()callingonce.DopreventsClose()from running.If
Error()is called beforeClose(), the emptyonce.Do(func() {})at line 192 marks thesync.Onceas done, andClose()will never execute its cleanup logic. This can leak authenticated sessions.The comment "ensure once guard is initialized" is incorrect—
sync.Oncedoesn't need initialization; callingDowith an empty func consumes the one-time execution.Proposed fix — use a separate closed flag
type Session struct { server string govmomi *govmomi.Client vim25 *vim25.Client rest *rest.Client cns *cns.Client pbm *pbm.Client finder *find.Finder logger logrus.FieldLogger once sync.Once closeErr error + closed bool } // Close releases all resources held by the session. func (s *Session) Close() { s.once.Do(func() { // ... existing cleanup logic ... s.closeErr = firstErr + s.closed = true s.logger.WithField("vcenter", redactHostname(s.server)).Debug("vSphere session closed") }) } // Error returns any error that occurred during Close. func (s *Session) Error() error { - s.once.Do(func() {}) // ensure once guard is initialized + if !s.closed { + return nil // Not yet closed + } return s.closeErr }Or, if callers need to distinguish "not closed" from "closed with no error", consider returning
(error, bool)or checkings.closedexplicitly in the cache eviction logic.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/asset/installconfig/vsphere/session.go` around lines 190 - 194, The Error() method must not call s.once.Do(...) because that prematurely marks the once guard and prevents Close() from running; remove the empty once.Do call from Error() and instead track closed state explicitly (e.g., add a s.closed bool set to true inside Close() after running the once-protected cleanup, protected by the existing mutex or an atomic), then have Error() return s.closeErr (and/or provide a variant returning (error,bool) if callers need to know whether Close() ran). Ensure Close() still uses s.once.Do to perform cleanup and sets s.closed and s.closeErr inside that Do block so Error() can observe accurate state.
🧹 Nitpick comments (1)
pkg/destroy/vsphere/client.go (1)
70-74: 💤 Low valueConsider reusing the session's CNS client.
The
Sessionalready creates a CNS client internally (optional, logged at Debug on failure). Creating another here duplicates the work. If the session's CNS client suffices, usesess.CNSClient()with a nil check.However, since destroy requires a CNS client while the session treats it as optional, the current approach ensures the error surfaces. This is acceptable if intentional.
Optional: reuse session's CNS client
- cnsClient, err := cns.NewClient(context.TODO(), sess.Vim25Client()) - if err != nil { - sess.Close() - return nil, err + cnsClient := sess.CNSClient() + if cnsClient == nil { + // Session's optional CNS creation failed; retry explicitly. + var err error + cnsClient, err = cns.NewClient(context.TODO(), sess.Vim25Client()) + if err != nil { + sess.Close() + return nil, err + } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/destroy/vsphere/client.go` around lines 70 - 74, Replace the unconditional creation of a new CNS client with reuse of the session's CNS client: check sess.CNSClient() first and if non-nil assign it to cnsClient; otherwise fall back to calling cns.NewClient(context.TODO(), sess.Vim25Client()) and preserve the current error handling (sess.Close() and return nil, err) to surface failures. Use the existing variables cnsClient and sess and the methods sess.CNSClient() and cns.NewClient(...) to implement this conditional reuse.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/asset/installconfig/vsphere/session.go`:
- Around line 112-124: The mutex sessionsMu is being unlocked twice when a
cached session has sess.Error() != nil: the code calls sessionsMu.Unlock()
inside that branch and then falls through to the subsequent sessionsMu.Unlock(),
causing a panic. Fix by ensuring only one unlock path when evicting a closed
session — remove the inner Unlock after delete(sessions, key) and let the single
Unlock after the if-block run, or restructure to return immediately after
eviction; reference sessionsMu, sessions, sess.Error(), delete(sessions, key)
and the existing return path that returns sess with func() { sess.Close() } to
guide where to change behavior.
---
Duplicate comments:
In `@pkg/asset/installconfig/vsphere/session.go`:
- Around line 190-194: The Error() method must not call s.once.Do(...) because
that prematurely marks the once guard and prevents Close() from running; remove
the empty once.Do call from Error() and instead track closed state explicitly
(e.g., add a s.closed bool set to true inside Close() after running the
once-protected cleanup, protected by the existing mutex or an atomic), then have
Error() return s.closeErr (and/or provide a variant returning (error,bool) if
callers need to know whether Close() ran). Ensure Close() still uses s.once.Do
to perform cleanup and sets s.closed and s.closeErr inside that Do block so
Error() can observe accurate state.
---
Nitpick comments:
In `@pkg/destroy/vsphere/client.go`:
- Around line 70-74: Replace the unconditional creation of a new CNS client with
reuse of the session's CNS client: check sess.CNSClient() first and if non-nil
assign it to cnsClient; otherwise fall back to calling
cns.NewClient(context.TODO(), sess.Vim25Client()) and preserve the current error
handling (sess.Close() and return nil, err) to surface failures. Use the
existing variables cnsClient and sess and the methods sess.CNSClient() and
cns.NewClient(...) to implement this conditional reuse.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: fc98e816-70a8-43a8-87a1-78630e304011
📒 Files selected for processing (3)
pkg/asset/installconfig/vsphere/client.gopkg/asset/installconfig/vsphere/session.gopkg/destroy/vsphere/client.go
|
@coderabbitai please review |
|
✅ Action performedFull review finished. |
- Migrate validation.go:getVCenterClient() and vsphere.go:getClients() to use NewSession instead of deprecated CreateVSphereClients - Simplify Session.Error() by removing unnecessary sync.Once guard - Fix optional CNS/PBM client creation to use context.Background() instead of timeout context to prevent premature cancellation - Add 33 unit tests for Session covering caching, idempotent close, error propagation, logging, and cache key properties - Remove unused defaultTimeout constant from destroy/vsphere/client.go
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
@coderabbitai please review |
|
🧠 Learnings used✅ Action performedFull review finished. |
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (2)
pkg/asset/installconfig/vsphere/session.go (2)
114-124:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winRemove the extra unlock in the eviction path.
When
sess.Error() != nil, Line 117 unlockssessionsMuand control still falls through to Line 124, which unlocks it again. That panics withsync: unlock of unlocked mutex.Minimal fix
if sess, ok := sessions[key]; ok { if sess.Error() != nil { // Cached session is closed — evict it so a fresh one is created. delete(sessions, key) - sessionsMu.Unlock() // Fall through to create a new session. } else { sessionsMu.Unlock() return sess, func() { sess.Close() }, nil } } sessionsMu.Unlock()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/asset/installconfig/vsphere/session.go` around lines 114 - 124, The code double-unlocks the mutex sessionsMu when a cached session is evicted: inside the sess.Error() != nil branch you call sessionsMu.Unlock() then fall through to the outer sessionsMu.Unlock(), causing a panic; remove the inner unlock so that after delete(sessions, key) control falls through and the single outer sessionsMu.Unlock() releases the lock. Specifically edit the block around sess.Error() != nil (the delete(sessions, key) path) in session.go so only the outer sessionsMu.Unlock() is executed.
112-121:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftCache hits hand out a shared session with a real closer, so callers can invalidate each other.
Every
NewSessioncaller gets a cleanup function that callsClose(), including cache hits. If two call sites reuse the same cachedSession, the first cleanup logs out the shared SOAP/REST clients while the other caller still holds them. Thesess.Error()check does not make this safe either, because a cleanClose()leavescloseErr == nil. This needs lease/reference-count semantics, orClose()must evict/mark the session closed before any later cache hit can reuse it.Also applies to: 147-147
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/asset/installconfig/vsphere/session.go` around lines 112 - 121, The cached session is being handed out with a real closer so multiple callers can call sess.Close() and race to logout shared clients; update the cache to use lease/reference-count semantics instead of returning the raw Close: change the sessions map entries to a small struct (e.g., containing the Session, an int refCount, and a closed flag), increment refCount on cache hit in NewSession, and return a cleanup func that decrements refCount and only calls the underlying sess.Close() (and removes the entry from sessions) when refCount reaches zero; additionally ensure that when initiating an actual Close you mark the entry as closed/evicted before performing network logout so no later cache hit reuses a session (adjust logic around sessionsMu, NewSession, sess.Close(), and sess.Error() accordingly; same fix where cache hits are handled at the other occurrence).
🧹 Nitpick comments (3)
pkg/asset/installconfig/vsphere/session_test.go (2)
341-364: 💤 Low valueConsider verifying redacted hostname format.
The test checks that the logged
vcenterfield is not the plaintext hostname ("https://vcenter.example.com"), but it doesn't verify that it matches the expected redacted format (32-character hex string fromredactHostname). WhileTestSessionRedactHostnamevalidates the helper function separately, adding a format assertion here would make this test more robust.♻️ Optional enhancement to verify redacted format
// Check that the logged hostname is redacted found := false for _, entry := range hook.AllEntries() { if field, ok := entry.Data["vcenter"]; ok { vcenter, ok := field.(string) - if ok && vcenter != "https://vcenter.example.com" { + if ok && vcenter != "https://vcenter.example.com" && len(vcenter) == 32 { + // Also verify it's hex format + isHex := true + for _, c := range vcenter { + if !((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f')) { + isHex = false + break + } + } + if isHex { + found = true + break + } + } else if ok && vcenter != "https://vcenter.example.com" { found = true break } } } assert.True(t, found, "should have logged redacted vcenter hostname")Or more simply, just assert the length:
// Check that the logged hostname is redacted found := false for _, entry := range hook.AllEntries() { if field, ok := entry.Data["vcenter"]; ok { vcenter, ok := field.(string) - if ok && vcenter != "https://vcenter.example.com" { + if ok && vcenter != "https://vcenter.example.com" && len(vcenter) == 32 { found = true break } } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/asset/installconfig/vsphere/session_test.go` around lines 341 - 364, The test TestSessionCloseLogsRedactedHostname only ensures the logged vcenter is not the plaintext; update it to assert the logged value matches the expected redacted format by using the redactHostname helper or a 32‑char hex check: after retrieving entry.Data["vcenter"] in the loop, compute expected := redactHostname(sess.server) (or validate that the string length is 32 and contains only hex characters) and assert equality/format; reference Session.Close, TestSessionCloseLogsRedactedHostname, redactHostname and the hook.AllEntries loop to locate where to add the additional assertion.
200-223: ⚡ Quick winUse functional option pattern to test option functions.
The tests
TestSessionOptionsWithTimeout,TestSessionOptionsWithInsecure, andTestSessionOptionsWithLoggerdirectly mutate struct fields after callingapplyDefaults(), but they don't invoke the actual functional option functions (WithTimeout(),WithInsecure(),WithLogger()). This means these tests don't verify that the option functions correctly configure aSessionOptionsinstance.♻️ Proposed refactor to test functional options
Replace the three tests with versions that use the functional option pattern:
func TestSessionOptionsWithTimeout(t *testing.T) { - opts := &SessionOptions{} - opts.applyDefaults() - opts.Timeout = 30 * time.Second + opt := WithTimeout(30 * time.Second) + opts := &SessionOptions{} + opt(opts) assert.Equal(t, 30*time.Second, opts.Timeout) } func TestSessionOptionsWithInsecure(t *testing.T) { - opts := &SessionOptions{} - opts.applyDefaults() - opts.Insecure = true + opt := WithInsecure(true) + opts := &SessionOptions{} + opt(opts) assert.True(t, opts.Insecure) } func TestSessionOptionsWithLogger(t *testing.T) { logger := logrus.New() - opts := &SessionOptions{} - opts.applyDefaults() - opts.Logger = logger + opt := WithLogger(logger) + opts := &SessionOptions{} + opt(opts) assert.Equal(t, logger, opts.Logger) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/asset/installconfig/vsphere/session_test.go` around lines 200 - 223, The tests mutate SessionOptions fields directly instead of exercising the functional option helpers; update TestSessionOptionsWithTimeout, TestSessionOptionsWithInsecure, and TestSessionOptionsWithLogger to call the option functions (WithTimeout, WithInsecure, WithLogger) against a defaults-populated opts (e.g. opts := &SessionOptions{}; opts.applyDefaults(); WithTimeout(30*time.Second)(opts)) rather than assigning opts.Timeout/opts.Insecure/opts.Logger directly, then assert the values remain as expected.pkg/asset/installconfig/vsphere/client.go (1)
51-55: ⚡ Quick winUse public accessor methods instead of private fields.
Line 55 accesses private session fields directly, while other consumers (validation.go, vsphere.go) use the public
Vim25Client()andRestClient()methods. This breaks encapsulation.♻️ Use accessor methods for consistency
- return sess.vim25, sess.rest, cleanup, nil + return sess.Vim25Client(), sess.RestClient(), cleanup, nil🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/asset/installconfig/vsphere/client.go` around lines 51 - 55, Replace direct private field access on the session with its public accessors: instead of returning sess.vim25 and sess.rest from the NewSession result, call sess.Vim25Client() and sess.RestClient() (the public methods used elsewhere) and return those values along with cleanup and nil error so encapsulation is preserved.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/asset/installconfig/vsphere/session_test.go`:
- Around line 81-89: The test TestNewSessionTimeout mixes two timeout sources
(context.WithTimeout and the WithTimeout option) which confuses intent; update
the test to only exercise one behavior: either remove the context timeout and
assert NewSession with WithTimeout(1*time.Millisecond) returns nil session and
an error, or split into two tests—one that uses context.WithTimeout to trigger
context cancellation and another that passes WithTimeout to trigger the
option-based timeout—referencing TestNewSessionTimeout, NewSession, WithTimeout
and context.WithTimeout to locate and adjust the code accordingly.
In `@pkg/asset/installconfig/vsphere/session.go`:
- Around line 78-80: SessionCacheKey currently joins server, username, password
with "|" which is ambiguous; change it to hash length-prefixed fields instead:
build the input to sha256 by writing each field as a length (e.g. 4-byte
big-endian or varint) followed by the raw field bytes for server, then username,
then password, then compute sha256 and return the first 16 bytes hex as before.
Update the SessionCacheKey function to construct the length-prefixed byte
sequence (instead of fmt.Sprintf) before hashing to eliminate delimiter
collision.
- Around line 167-177: Session.Close currently calls
s.rest.Logout(context.Background()) and s.govmomi.Logout(context.Background())
which can block; change Close to use a bounded context derived from the session
timeout (SessionOptions.Timeout) or a stored timeout field on Session: create a
context.WithTimeout(ctx, timeout) before calling s.rest.Logout(...) and
s.govmomi.Logout(...), cancel the context when done, and fall back to a sensible
default if no timeout is set; update the Session struct to store the timeout
from SessionOptions if needed and ensure both Logout calls use the same bounded
context in Session.Close().
---
Duplicate comments:
In `@pkg/asset/installconfig/vsphere/session.go`:
- Around line 114-124: The code double-unlocks the mutex sessionsMu when a
cached session is evicted: inside the sess.Error() != nil branch you call
sessionsMu.Unlock() then fall through to the outer sessionsMu.Unlock(), causing
a panic; remove the inner unlock so that after delete(sessions, key) control
falls through and the single outer sessionsMu.Unlock() releases the lock.
Specifically edit the block around sess.Error() != nil (the delete(sessions,
key) path) in session.go so only the outer sessionsMu.Unlock() is executed.
- Around line 112-121: The cached session is being handed out with a real closer
so multiple callers can call sess.Close() and race to logout shared clients;
update the cache to use lease/reference-count semantics instead of returning the
raw Close: change the sessions map entries to a small struct (e.g., containing
the Session, an int refCount, and a closed flag), increment refCount on cache
hit in NewSession, and return a cleanup func that decrements refCount and only
calls the underlying sess.Close() (and removes the entry from sessions) when
refCount reaches zero; additionally ensure that when initiating an actual Close
you mark the entry as closed/evicted before performing network logout so no
later cache hit reuses a session (adjust logic around sessionsMu, NewSession,
sess.Close(), and sess.Error() accordingly; same fix where cache hits are
handled at the other occurrence).
---
Nitpick comments:
In `@pkg/asset/installconfig/vsphere/client.go`:
- Around line 51-55: Replace direct private field access on the session with its
public accessors: instead of returning sess.vim25 and sess.rest from the
NewSession result, call sess.Vim25Client() and sess.RestClient() (the public
methods used elsewhere) and return those values along with cleanup and nil error
so encapsulation is preserved.
In `@pkg/asset/installconfig/vsphere/session_test.go`:
- Around line 341-364: The test TestSessionCloseLogsRedactedHostname only
ensures the logged vcenter is not the plaintext; update it to assert the logged
value matches the expected redacted format by using the redactHostname helper or
a 32‑char hex check: after retrieving entry.Data["vcenter"] in the loop, compute
expected := redactHostname(sess.server) (or validate that the string length is
32 and contains only hex characters) and assert equality/format; reference
Session.Close, TestSessionCloseLogsRedactedHostname, redactHostname and the
hook.AllEntries loop to locate where to add the additional assertion.
- Around line 200-223: The tests mutate SessionOptions fields directly instead
of exercising the functional option helpers; update
TestSessionOptionsWithTimeout, TestSessionOptionsWithInsecure, and
TestSessionOptionsWithLogger to call the option functions (WithTimeout,
WithInsecure, WithLogger) against a defaults-populated opts (e.g. opts :=
&SessionOptions{}; opts.applyDefaults(); WithTimeout(30*time.Second)(opts))
rather than assigning opts.Timeout/opts.Insecure/opts.Logger directly, then
assert the values remain as expected.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 9998e465-4308-4009-8932-46368d310dee
📒 Files selected for processing (6)
pkg/asset/installconfig/vsphere/client.gopkg/asset/installconfig/vsphere/session.gopkg/asset/installconfig/vsphere/session_test.gopkg/asset/installconfig/vsphere/validation.gopkg/asset/installconfig/vsphere/vsphere.gopkg/destroy/vsphere/client.go
| func TestNewSessionTimeout(t *testing.T) { | ||
| logger, _ := test.NewNullLogger() | ||
| ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond) | ||
| defer cancel() | ||
|
|
||
| sess, _, err := NewSession(ctx, "https://nonexistent.example.com:80", "user", "pass", WithTimeout(1*time.Millisecond), WithLogger(logger)) | ||
| assert.Nil(t, sess, "session should be nil on timeout") | ||
| assert.Error(t, err, "should return error on timeout") | ||
| } |
There was a problem hiding this comment.
Clarify timeout test or split into separate cases.
The test uses both context.WithTimeout(context.Background(), 10*time.Millisecond) and WithTimeout(1*time.Millisecond), introducing two timeout sources. While the effective timeout will be the minimum (1ms), this makes the test harder to understand and doesn't clearly document which timeout behavior is being validated.
Consider either:
- Removing the context timeout and only testing the
WithTimeoutoption - Splitting into two separate tests: one for context cancellation and one for option timeout
♻️ Proposed refactor to test only option timeout
func TestNewSessionTimeout(t *testing.T) {
logger, _ := test.NewNullLogger()
- ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond)
- defer cancel()
- sess, _, err := NewSession(ctx, "https://nonexistent.example.com:80", "user", "pass", WithTimeout(1*time.Millisecond), WithLogger(logger))
+ sess, _, err := NewSession(context.Background(), "https://nonexistent.example.com:80", "user", "pass", WithTimeout(1*time.Millisecond), WithLogger(logger))
assert.Nil(t, sess, "session should be nil on timeout")
assert.Error(t, err, "should return error on timeout")
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| func TestNewSessionTimeout(t *testing.T) { | |
| logger, _ := test.NewNullLogger() | |
| ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond) | |
| defer cancel() | |
| sess, _, err := NewSession(ctx, "https://nonexistent.example.com:80", "user", "pass", WithTimeout(1*time.Millisecond), WithLogger(logger)) | |
| assert.Nil(t, sess, "session should be nil on timeout") | |
| assert.Error(t, err, "should return error on timeout") | |
| } | |
| func TestNewSessionTimeout(t *testing.T) { | |
| logger, _ := test.NewNullLogger() | |
| sess, _, err := NewSession(context.Background(), "https://nonexistent.example.com:80", "user", "pass", WithTimeout(1*time.Millisecond), WithLogger(logger)) | |
| assert.Nil(t, sess, "session should be nil on timeout") | |
| assert.Error(t, err, "should return error on timeout") | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/asset/installconfig/vsphere/session_test.go` around lines 81 - 89, The
test TestNewSessionTimeout mixes two timeout sources (context.WithTimeout and
the WithTimeout option) which confuses intent; update the test to only exercise
one behavior: either remove the context timeout and assert NewSession with
WithTimeout(1*time.Millisecond) returns nil session and an error, or split into
two tests—one that uses context.WithTimeout to trigger context cancellation and
another that passes WithTimeout to trigger the option-based timeout—referencing
TestNewSessionTimeout, NewSession, WithTimeout and context.WithTimeout to locate
and adjust the code accordingly.
| func SessionCacheKey(server, username, password string) string { | ||
| hash := sha256.Sum256([]byte(fmt.Sprintf("%s|%s|%s", server, username, password))) | ||
| return fmt.Sprintf("%x", hash[:16]) |
There was a problem hiding this comment.
Encode the cache-key inputs unambiguously.
fmt.Sprintf("%s|%s|%s", ...) is not injective: values containing | can produce the same preimage string for different (server, username, password) tuples. That can make unrelated credentials reuse the same cached session. Hash length-prefixed fields instead of a delimiter-joined string.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/asset/installconfig/vsphere/session.go` around lines 78 - 80,
SessionCacheKey currently joins server, username, password with "|" which is
ambiguous; change it to hash length-prefixed fields instead: build the input to
sha256 by writing each field as a length (e.g. 4-byte big-endian or varint)
followed by the raw field bytes for server, then username, then password, then
compute sha256 and return the first 16 bytes hex as before. Update the
SessionCacheKey function to construct the length-prefixed byte sequence (instead
of fmt.Sprintf) before hashing to eliminate delimiter collision.
| if s.rest != nil { | ||
| if err := s.rest.Logout(context.Background()); err != nil { | ||
| s.logger.WithError(err).Warn("Failed to logout REST client") | ||
| if firstErr == nil { | ||
| firstErr = err | ||
| } | ||
| } | ||
| } | ||
|
|
||
| if s.govmomi != nil { | ||
| if err := s.govmomi.Logout(context.Background()); err != nil { |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate and show the relevant code in session.go
FILE="pkg/asset/installconfig/vsphere/session.go"
if [ ! -f "$FILE" ]; then
echo "Missing file: $FILE"
exit 1
fi
nl -ba "$FILE" | sed -n '130,220p'
# Search for Close() and any existing timeout/context usage in the vsphere session code
rg -n "func \\(.*\\) Close\\(|Logout\\(context|WithTimeout\\(|Background\\(\\)" "$FILE"
# Show definitions of any session structs fields related to timeouts or contexts
rg -n "type .*Session|struct .*Session|timeout|ctx|context\\.Context" "$FILE"Repository: openshift/installer
Length of output: 107
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="pkg/asset/installconfig/vsphere/session.go"
[ -f "$FILE" ] || { echo "Missing file: $FILE"; exit 1; }
# Show the relevant region with line numbers
cat -n "$FILE" | sed -n '130,220p'
# Find Close() and Logout usage plus any existing timeouts/context usage nearby
rg -n "func \\(.*\\) Close\\(|Logout\\(context\\.|Logout\\(|WithTimeout\\(|context\\.Background\\(" "$FILE"
# Inspect the Logout method signatures/types referenced in this file (if any)
rg -n "Logout\\(" "$FILE"
# Check whether session struct already carries a timeout or context
rg -n "type .*Session|struct .*Session|timeout|ctx\\b|context\\.Context" "$FILE"Repository: openshift/installer
Length of output: 4817
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="pkg/asset/installconfig/vsphere/session.go"
# Print the top portion: Session/SessionOptions structs and timeout option definitions
cat -n "$FILE" | sed -n '1,140p'
# Print the rest around NewSession and any other context usage
cat -n "$FILE" | sed -n '140,280p'Repository: openshift/installer
Length of output: 9840
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Search for Logout signatures in vendor (govmomi/vim25 and vapi/rest)
rg -n "func \\(.*\\) Logout\\(ctx context\\.Context\\)" vendor/github.com/vmware/govmomi -S || true
rg -n "Logout\\(ctx context\\.Context\\)" vendor/github.com/vmware/govmomi -S | head -n 50
# Also check for rest client Logout signature
rg -n "Logout\\(ctx context\\.Context\\)" vendor/github.com/vmware/govmomi/vapi/rest -S || true
# Show session.go Session struct fields to see if timeout is stored
rg -n "type Session struct \\{|type SessionOptions struct \\{|Timeout" pkg/asset/installconfig/vsphere/session.goRepository: openshift/installer
Length of output: 1249
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show relevant parts around Logout implementations in vendored govmomi
for f in \
vendor/github.com/vmware/govmomi/client.go \
vendor/github.com/vmware/govmomi/vapi/rest/client.go \
vendor/github.com/vmware/govmomi/session/manager.go
do
echo "==== $f ===="
# print around the first Logout occurrence
start=$(rg -n "func \\(.*\\) Logout\\(ctx context\\.Context\\)" "$f" | head -n 1 | cut -d: -f1)
if [ -z "${start:-}" ]; then
echo "No Logout(ctx) found in $f"
continue
fi
# show ~80 lines around it
from=$((start-20))
to=$((start+120))
if [ $from -lt 1 ]; then from=1; fi
sed -n "${from},${to}p" "$f" | nl -ba | sed 's/^ *[0-9]*\t//'
doneRepository: openshift/installer
Length of output: 178
🏁 Script executed:
#!/bin/bash
set -euo pipefail
for f in \
vendor/github.com/vmware/govmomi/client.go \
vendor/github.com/vmware/govmomi/vapi/rest/client.go \
vendor/github.com/vmware/govmomi/session/manager.go
do
echo "==== $f ===="
# Find the first Logout(ctx ...) in the file
line=$(rg -n "func \\(.*\\) Logout\\(ctx context\\.Context\\)" "$f" | head -n 1 | cut -d: -f1 || true)
if [ -z "${line:-}" ]; then
echo "No Logout(ctx) found"
continue
fi
from=$((line-20))
to=$((line+160))
if [ $from -lt 1 ]; then from=1; fi
cat -n "$f" | sed -n "${from},${to}p"
echo
doneRepository: openshift/installer
Length of output: 8535
Bound vCenter logout RPCs with a timeout in Session.Close() (avoid context.Background()).
pkg/asset/installconfig/vsphere/session.go calls s.rest.Logout(context.Background()) and s.govmomi.Logout(context.Background()); govmomi’s REST/SOAP logout APIs accept a context.Context and use it for the underlying network RPC. SessionOptions.Timeout is currently only applied to the initial connection, so Close() can still block indefinitely on a broken connection—store a timeout on the Session (from SessionOptions) or derive a short context.WithTimeout locally and use it for both logout calls.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/asset/installconfig/vsphere/session.go` around lines 167 - 177,
Session.Close currently calls s.rest.Logout(context.Background()) and
s.govmomi.Logout(context.Background()) which can block; change Close to use a
bounded context derived from the session timeout (SessionOptions.Timeout) or a
stored timeout field on Session: create a context.WithTimeout(ctx, timeout)
before calling s.rest.Logout(...) and s.govmomi.Logout(...), cancel the context
when done, and fall back to a sensible default if no timeout is set; update the
Session struct to store the timeout from SessionOptions if needed and ensure
both Logout calls use the same bounded context in Session.Close().
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (4)
pkg/asset/installconfig/vsphere/session.go (4)
167-177:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAvoid unbounded
context.Background()on vSphere RPCs.Logout and optional CNS/PBM initialization calls bypass the session timeout and can block indefinitely on network issues. Use a bounded context (derived from session options or caller context) for these external calls.
#!/bin/bash # Verify unbounded context usage in session lifecycle paths rg -n "Logout\(context\.Background\(\)|cns\.NewClient\(context\.Background\(\)|pbm\.NewClient\(context\.Background\(\)" pkg/asset/installconfig/vsphere/session.goAs per coding guidelines:
context.Context for cancellation and timeouts.Also applies to: 220-220, 227-233
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/asset/installconfig/vsphere/session.go` around lines 167 - 177, The Logout and client initialization calls (s.rest.Logout, s.govmomi.Logout, cns.NewClient, pbm.NewClient) are using context.Background() which can block indefinitely; change these to use a bounded context (e.g., derive from a session-level context or session options timeout/cancel function) and pass that context into the calls so RPCs observe cancellation/timeouts. Locate the calls in session.go (the s.rest.Logout and s.govmomi.Logout blocks and any cns.NewClient/pbm.NewClient usages) and replace context.Background() with a context derived from the caller/session (with a timeout or from an existing ctx parameter) and propagate errors unchanged.
114-115:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDon’t use
Error()as the cache “closed” signal.A successful
Close()leavescloseErr == nil, so closed sessions remain cache-valid and can be returned as reusable sessions. Keep close errors separate from lifecycle state and track explicit closed state for eviction.Also applies to: 190-193
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/asset/installconfig/vsphere/session.go` around lines 114 - 115, Currently the code uses sess.Error() to decide whether a cached session is "closed" — change this to track explicit lifecycle state instead: add an explicit closed boolean (e.g., cachedSession.closed) on the session wrapper and set that flag when Close() is called (leave closeErr as a separate error value), then replace any checks of sess.Error() (references around sess.Error() at the existing checks and the other occurrence mentioned) with checks of the explicit closed flag so a successfully-closed session (closeErr == nil) is not treated as invalid and sessions with actual close failures can still expose their closeErr for logging or handling.
78-80:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMake
SessionCacheKeyencoding unambiguous.Delimiter-joining (
"%s|%s|%s") is not injective when inputs contain|, so different credential tuples can map to the same cache key and accidentally share a session.Proposed fix
+import ( + "bytes" + "encoding/binary" +) + func SessionCacheKey(server, username, password string) string { - hash := sha256.Sum256([]byte(fmt.Sprintf("%s|%s|%s", server, username, password))) + var buf bytes.Buffer + writeField := func(v string) { + var lenBuf [binary.MaxVarintLen64]byte + n := binary.PutUvarint(lenBuf[:], uint64(len(v))) + buf.Write(lenBuf[:n]) + buf.WriteString(v) + } + writeField(server) + writeField(username) + writeField(password) + hash := sha256.Sum256(buf.Bytes()) return fmt.Sprintf("%x", hash[:16]) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/asset/installconfig/vsphere/session.go` around lines 78 - 80, SessionCacheKey currently concatenates server, username, password with "|" which is ambiguous if fields contain "|"; change SessionCacheKey to serialize the three strings unambiguously before hashing (e.g., use a canonical serialization such as length-prefixed fields or json.Marshal of a struct {Server,Username,Password string}), then compute the sha256 and return the first 16 bytes hex as before; update the implementation in SessionCacheKey to use that serialization so different credential tuples cannot collide.
112-124:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winFix the double-unlock path in
NewSession.When a cached session exists and
sess.Error() != nil, the mutex is unlocked in-branch and then unlocked again at the end of the block, which can panic (sync: unlock of unlocked mutex).Proposed fix
sessionsMu.Lock() if sess, ok := sessions[key]; ok { if sess.Error() != nil { // Cached session is closed — evict it so a fresh one is created. delete(sessions, key) - sessionsMu.Unlock() // Fall through to create a new session. } else { sessionsMu.Unlock() return sess, func() { sess.Close() }, nil } } sessionsMu.Unlock()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/asset/installconfig/vsphere/session.go` around lines 112 - 124, The mutex is being unlocked twice when a cached session exists but sess.Error() != nil in NewSession: you delete(sessions, key) and call sessionsMu.Unlock() inside that branch, then reach the trailing sessionsMu.Unlock() again causing a panic. Fix by restructuring the conditional so the mutex is unlocked exactly once: when sess.Error() != nil, delete the entry and unlock then immediately proceed to create a new session (e.g., return to the outer create path or use an if/else so the final sessionsMu.Unlock() isn't executed for that branch). Reference symbols: NewSession, sessionsMu, sessions, sess.Error(), delete(sessions, key).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/asset/installconfig/vsphere/session.go`:
- Around line 121-122: The cleanup closure returned on cache hits always calls
sess.Close(), which allows one caller to close a shared cached session while
others still use it; change the cache to maintain a reference count per cached
session (e.g., a struct with sess and refCount) and update the acquire path to
increment refCount when returning a cached session and the cleanup closure to
decrement refCount and only call sess.Close() when refCount reaches zero; update
any places that return "sess, func() { sess.Close() }, nil" (the factory that
currently returns sess and the cleanup closure) to use the new acquire/release
semantics so both cache hits and misses follow the same ref-counted cleanup
behavior.
---
Duplicate comments:
In `@pkg/asset/installconfig/vsphere/session.go`:
- Around line 167-177: The Logout and client initialization calls
(s.rest.Logout, s.govmomi.Logout, cns.NewClient, pbm.NewClient) are using
context.Background() which can block indefinitely; change these to use a bounded
context (e.g., derive from a session-level context or session options
timeout/cancel function) and pass that context into the calls so RPCs observe
cancellation/timeouts. Locate the calls in session.go (the s.rest.Logout and
s.govmomi.Logout blocks and any cns.NewClient/pbm.NewClient usages) and replace
context.Background() with a context derived from the caller/session (with a
timeout or from an existing ctx parameter) and propagate errors unchanged.
- Around line 114-115: Currently the code uses sess.Error() to decide whether a
cached session is "closed" — change this to track explicit lifecycle state
instead: add an explicit closed boolean (e.g., cachedSession.closed) on the
session wrapper and set that flag when Close() is called (leave closeErr as a
separate error value), then replace any checks of sess.Error() (references
around sess.Error() at the existing checks and the other occurrence mentioned)
with checks of the explicit closed flag so a successfully-closed session
(closeErr == nil) is not treated as invalid and sessions with actual close
failures can still expose their closeErr for logging or handling.
- Around line 78-80: SessionCacheKey currently concatenates server, username,
password with "|" which is ambiguous if fields contain "|"; change
SessionCacheKey to serialize the three strings unambiguously before hashing
(e.g., use a canonical serialization such as length-prefixed fields or
json.Marshal of a struct {Server,Username,Password string}), then compute the
sha256 and return the first 16 bytes hex as before; update the implementation in
SessionCacheKey to use that serialization so different credential tuples cannot
collide.
- Around line 112-124: The mutex is being unlocked twice when a cached session
exists but sess.Error() != nil in NewSession: you delete(sessions, key) and call
sessionsMu.Unlock() inside that branch, then reach the trailing
sessionsMu.Unlock() again causing a panic. Fix by restructuring the conditional
so the mutex is unlocked exactly once: when sess.Error() != nil, delete the
entry and unlock then immediately proceed to create a new session (e.g., return
to the outer create path or use an if/else so the final sessionsMu.Unlock()
isn't executed for that branch). Reference symbols: NewSession, sessionsMu,
sessions, sess.Error(), delete(sessions, key).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: a35d0f12-227b-4e6d-b731-33adf1a50dab
📒 Files selected for processing (6)
pkg/asset/installconfig/vsphere/client.gopkg/asset/installconfig/vsphere/session.gopkg/asset/installconfig/vsphere/session_test.gopkg/asset/installconfig/vsphere/validation.gopkg/asset/installconfig/vsphere/vsphere.gopkg/destroy/vsphere/client.go
| return sess, func() { sess.Close() }, nil | ||
| } |
There was a problem hiding this comment.
Shared cached sessions need reference-counted cleanup.
Each caller gets a cleanup function that always calls Close(). On cache hits, one caller can close the shared session while another caller is still using it. This needs ref tracking (increment on acquire, decrement on cleanup, close on zero).
Also applies to: 142-147
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/asset/installconfig/vsphere/session.go` around lines 121 - 122, The
cleanup closure returned on cache hits always calls sess.Close(), which allows
one caller to close a shared cached session while others still use it; change
the cache to maintain a reference count per cached session (e.g., a struct with
sess and refCount) and update the acquire path to increment refCount when
returning a cached session and the cleanup closure to decrement refCount and
only call sess.Close() when refCount reaches zero; update any places that return
"sess, func() { sess.Close() }, nil" (the factory that currently returns sess
and the cleanup closure) to use the new acquire/release semantics so both cache
hits and misses follow the same ref-counted cleanup behavior.
Messing around with Hermes, vLLM and https://huggingface.co/RedHatAI/Qwen3.6-35B-A3B-NVFP4
Summary by CodeRabbit
New Features
Refactor
Tests