fix(audit): populate missing actor and request metadata in audit logs#159
fix(audit): populate missing actor and request metadata in audit logs#159
Conversation
- Resolve ActorUsername via DB lookup for 11 audit entries in token exchange, refresh, and management paths where sessionless API context left the field empty - Add RequestContextMiddleware to propagate UserAgent, RequestPath, and RequestMethod through request context for all downstream audit logging - Enrich buildAuditLog with context fallback for request metadata fields - Drain logChan on shutdown to prevent silent audit event loss - Add Prometheus counter for audit events dropped due to buffer overflow - Consolidate duplicate GetUserByID call in ExchangeAuthorizationCode - Remove dead IPMiddleware wrapper and update stale references Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Pull request overview
This PR improves audit log completeness and observability by enriching audit entries with missing actor/request metadata, tightening shutdown behavior to avoid audit data loss, and adding dropped-event monitoring.
Changes:
- Add request metadata (User-Agent, path, method) propagation via context middleware + context helpers, and enrich audit logs from context.
- Populate missing
ActorUsernamefor token exchange/refresh/management audit events via store lookups, and reduce duplicate user fetch in auth-code exchange. - Drain the audit log channel during shutdown and add a Prometheus counter for dropped audit events.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/util/context.go | Adds context helpers to store/extract HTTP request metadata. |
| internal/util/context_test.go | Unit tests for request metadata context helpers and chaining. |
| internal/middleware/context.go | Replaces IP-only middleware with request context middleware (IP + UA/path/method). |
| internal/bootstrap/router.go | Wires the new request context middleware into the router. |
| internal/services/audit.go | Enriches audit logs from context, drains channel on shutdown, adds dropped-events counter. |
| internal/services/audit_test.go | Adds tests for enrichment and shutdown draining behavior. |
| internal/services/token.go | Adds helper to resolve usernames for audit entries. |
| internal/services/token_refresh.go | Populates ActorUsername in refresh-path audit logs. |
| internal/services/token_management.go | Populates ActorUsername in token management audit logs. |
| internal/services/token_exchange.go | Populates ActorUsername in exchange-path audit logs; consolidates user lookup for ID token claims + audit. |
| internal/middleware/auth.go | Updates stale comment reference to renamed middleware. |
| internal/handlers/oauth_handler.go | Updates stale comment reference to renamed middleware. |
| internal/handlers/auth.go | Updates stale comment reference to renamed middleware. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…metric registration - Make resolveUsername context-aware: check request context before falling back to DB lookup, avoiding unnecessary queries on authenticated routes - Lazily fetch user profile in ExchangeAuthorizationCode only when openid profile/email claims are needed - Register audit events_dropped_total counter via sync.Once singleton to prevent duplicate-registration panics in tests - Make TestShutdown_DrainsLogChan deterministic by constructing the service without auto-starting the worker and verifying persisted rows Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Reuse the user object fetched for ID-token profile/email claims to populate actorUsername, so the sessionless /oauth/token path never issues two GetUserByID calls for the same user. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- resolveUsername now checks that the context user ID matches the requested userID before using the context username, preventing misattribution when IDs differ - Shutdown drain uses a length snapshot instead of an unbounded loop, so late-arriving producers cannot prevent the worker from exiting Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Truncate UserAgent, RequestPath, RequestMethod to match model varchar limits before persisting, preventing write failures from long values - Gate audit dropped-events Prometheus counter behind an explicit registerer so deployments with metrics disabled do not register collectors from the services layer
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
… enqueue - Subtract 3 from varchar limits passed to TruncateString so the appended "..." does not exceed column size constraints - Add atomic stopped flag to reject events in Log() after Shutdown() begins, preventing events from being stuck in logChan
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…down, and fix method truncation - Resolve actorUsername via DB fallback before the ID token audit entry so openid-only requests (without profile/email) are not left empty - Replace len()-snapshot drain with non-blocking receive loop in shutdown worker to flush entries enqueued during the race window - Hard-truncate RequestMethod to varchar(10) limit without ellipsis so standard HTTP methods like PROPFIND (8 chars) are preserved
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…down race - Replace shutdownCh with closing logChan to signal the worker, so range-based drain is guaranteed to see all accepted events - Guard Log() sends with sendMu.RLock; Shutdown() takes sendMu.Lock after setting stopped, ensuring all in-flight sends complete before the channel is closed - Worker uses ok-check on receive to detect closed channel and flush
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Rename TestGetRequestMetadataFromContext_Empty to TestRequestMetadataContext_Empty since it tests individual getters, not a single GetRequestMetadataFromContext helper.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
… noop lookups - buildAuditLog only fills ActorUsername from context when entry ActorUserID is empty or matches the context user, preventing misattribution when call sites set only ActorUserID - Align audit_events_dropped_total naming with the rest of the metrics package (single Name field, no Namespace/Subsystem) - resolveUsername short-circuits when audit logging is the noop service, avoiding wasted DB round-trips on hot token paths Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Replace the sync.Once-based counter init with a mutex-protected pattern that allows SetAuditMetricsRegisterer to register a previously created counter. This eliminates the silent misconfiguration when AuditService is constructed before metrics are configured (common in tests). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Remove TokenService.resolveUsername and the explicit ActorUsername
settings from 11 audit entries across token exchange, refresh, and
management. The audit service is now the single source of truth for
ActorUsername resolution:
- buildAuditLog tries the request context (zero cost) when the entry's
ActorUserID matches the context user, then falls back to a DB lookup
when context did not help
- Synthetic machine identities ("client:<clientID>" from the
client_credentials grant) are detected by prefix and skip the DB
lookup, avoiding wasted queries on hot introspection paths
- ExchangeAuthorizationCode caches the user it fetches for OIDC
profile/email claims into context (via SetUserContext) so the audit
layer's context-match check picks it up without a duplicate query
Removes the leaky type assertion against *NoopAuditService and the
asymmetry where ActorUsername needed service-layer help while every
other auto-enriched field (IP, UA, path, method, ActorUserID) was
handled in buildAuditLog.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix audit logging completeness, observability, and shutdown safety
Background
Investigation revealed that audit records written to the database were missing several fields that should have been populated. Further review uncovered four design flaws in the audit subsystem affecting completeness, reliability, and observability.
Issues Fixed
1. Missing
ActorUsername(11 audit entries)Problem: The
/oauth/tokenendpoint is sessionless, so the request context carries no user. Token-related service methods set onlyActorUserIDfrom the token/record, leavingActorUsernameempty even though the user identity was knowable.Fix: Centralized all
ActorUsernameenrichment inbuildAuditLog(single source of truth):entry.ActorUserIDmatches the context user ID (prevents misattribution)client:<clientID>fromclient_credentialsgrant) by prefix check, avoiding wasted DB queries on hot introspection pathsNoopAuditService.Log()is a no-op, sobuildAuditLogis never called, so no DB query happensExchangeAuthorizationCodecaches the user object it fetches for OIDC profile/email claims into request context (models.SetUserContext), so the audit layer's context-match check picks it up without a duplicate query. Result: still only oneGetUserByIDper auth-code exchange for openid+profile/email.This eliminates the previous service-layer
resolveUsernamehelper and 11 explicitActorUsername:settings acrosstoken_exchange.go,token_refresh.go, andtoken_management.go.ActorUsernameis now treated symmetrically withActorIP,UserAgent,RequestPath,RequestMethod, andActorUserID— all auto-enriched bybuildAuditLog.2. Missing request metadata (
UserAgent,RequestPath,RequestMethod)Problem: 41 of 44 audit log call sites left these fields empty.
buildAuditLoghad context fallback forActorIP/ActorUsername/ActorUserIDbut not for request metadata.Fix:
util.SetRequestMetadataContextand three getter helpers ininternal/util/context.gomiddleware.RequestContextMiddleware(replacesIPMiddleware) propagates IP + UA + path + method into request contextbuildAuditLogenriches all three request metadata fields from contextUserAgent,RequestPath→ 497 chars +...;RequestMethod→ hard 10 chars) to fit DB column limits3. Audit event loss on shutdown
Problem: The worker goroutine was signaled by closing a separate
shutdownCh, leaving any unconsumed events inlogChanpermanently lost.Fix:
shutdownChwith closinglogChanitself, so the worker's range-based receive guarantees draining all accepted eventsatomic.Bool stoppedflag and anRWMutex sendMu:Log()takessendMu.RLock()and checksstoppedbefore sendingShutdown()setsstopped, then takessendMu.Lock()to wait for in-flight sends, then closeslogChan4. Silent dropping with no observability
Problem: When the buffer was full, events were dropped via
log.Printfonly — no metric for monitoring/alerting.Fix:
audit_events_dropped_totalPrometheus counterSetAuditMetricsRegisterer, so deployments with metrics disabled do not leak collectorsinternal/metrics(singleNamefield, noNamespace/Subsystem)Architecture Changes
IPMiddlewareRequestContextMiddlewareActorUsernameenrichmentbuildAuditLog(context match → DB fallback → skip machine identities)log.Printfonlyaudit_events_dropped_total*NoopAuditServiceActorUserIDand let audit handle the restFiles Changed
internal/services/audit.gointernal/services/audit_test.gointernal/services/token_exchange.goActorUsernamesettingsinternal/services/token_management.goActorUsernamesettingsinternal/services/token_refresh.goActorUsernamesettingsinternal/util/context.gointernal/util/context_test.gointernal/middleware/context.goRequestContextMiddlewarereplacesIPMiddlewareinternal/bootstrap/bootstrap.goSetAuditMetricsRegistererwhen metrics enabledinternal/bootstrap/router.goRequestContextMiddlewareinternal/middleware/auth.gointernal/handlers/auth.gointernal/handlers/oauth_handler.goTotal: 13 files changed, 440 insertions, 54 deletions
Test Plan
make generatepassesmake lint— 0 issuesmake test— all 14 packages pass, zero failuresbuildAuditLogenrichment behavior:client:<id>) skips DB lookupCompatibility Notes
IPMiddlewareremoved: callers must useRequestContextMiddleware. Internal package, no external consumers.audit_events_dropped_totalmetric: New counter; only registered when metrics are enabled (MetricsEnabled=true).UserAgent,RequestPath,RequestMethodcolumns will now be populated for previously empty rows.ActorUsernameis now reliably populated for all token operations. No schema changes.🤖 Generated with Claude Code