fix(auth): persist refresh status/message and reduce redundant refresh attempts#2052
fix(auth): persist refresh status/message and reduce redundant refresh attempts#2052xx299x wants to merge 15 commits intorouter-for-me:mainfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly improves the robustness and observability of Codex token refresh mechanisms. It ensures that permanent refresh failures are clearly communicated through the management API and UI by persisting detailed status and messages. Furthermore, it optimizes the refresh process by preventing repeated attempts for tokens that have already failed permanently, and introduces configurable controls for refresh concurrency and burst timing, leading to a more stable and efficient system. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request improves the token refresh mechanism by persisting failure states to avoid redundant retries for permanently failed tokens and exposing this status via the management API. It also introduces configuration for refresh concurrency and burst delays. My review focuses on the correctness and robustness of these changes. I've identified a critical issue with direct file persistence that bypasses error handling and a medium-severity issue related to brittle error message string matching. The changes to make refresh behavior configurable are well-implemented.
luispater
left a comment
There was a problem hiding this comment.
Summary
This change improves Codex refresh error classification, but the current implementation does not reliably persist the new refresh failure metadata and it also treats permanent refresh failures as successful refreshes in the manager flow.
Key findings
- Blocking:
refresh_status/refresh_messageare not persisted consistently across store backends.FileTokenStoreinjectsauth.Metadataintoauth.StoragebeforeSaveTokenToFile, butGitTokenStore,ObjectTokenStore, andPostgresStorestill writeauth.Storagedirectly. In those backends, Codex auths that still carryauth.Storagewill drop the new metadata instead of persisting it. - Blocking:
CodexExecutor.Refreshreturns(auth, nil)for non-retryable refresh failures and for already-failed auths.Manager.refreshAuthtreats anynilerror as a successful refresh, updatesLastRefreshedAt, clearsLastError, and persists the auth. That breaks the documented meaning ofLastRefreshedAtas the last successful refresh and also means expired permanent-failure auths will keep re-entering the refresh loop.
Test plan
- Reviewed the PR diff and the affected persistence / refresh paths.
- Ran
go test ./...on the PR head.
Code ReviewSolid improvement to Codex refresh error classification and persistence. Found two blocking issues. Blocking Issues1. Inconsistent metadata persistence across backends
auth.Metadata["refresh_status"] = status
auth.Metadata["refresh_message"] = msg
saveTokenToFile(auth.Storage)But other backends write
These backends will DROP the new metadata instead of persisting it. Fix: Ensure all backends include 2. Permanent failures treated as successful refreshes
if isNonRetryableRefreshError(err) || auth.RefreshStatus == RefreshStatusPermanentFailure {
return auth, nil // Returns nil error!
}
auth, err := executor.Refresh(ctx, cred.Auth)
if err == nil { // nil error = success
cred.Auth.LastRefreshedAt = now
cred.Auth.LastError = ""
saveCredential(...)
}Problems:
Fix: Return an error from SummaryThe direction is right, but these two issues break the durability and semantics of refresh failure tracking. Fix persistence consistency and error handling before merge. |
refresh_message(andrefresh_status) when Codex token refresh fails, somainexposes consistent, actionable failure context via the management API/UI.main.