fix: route metadata server GCP token requests through hub client #184
Closed
scion-gteam[bot] wants to merge 2 commits into
38fa4df to 2640353
Colocated docker agents ran with --network=host, making the per-agent
metadata server (127.0.0.1:18380) and telemetry OTLP receiver (:4317)
host-global singletons. Only the first agent could bind them; concurrent
or resumed agents got 'address already in use' -> sciontool doctor 502.
Host networking also leaks GCP SA identity across agents.
Route colocated docker agents at the public Caddy domain so each runs in
its own netns under bridge networking:
- ResolveDockerNetworking: add SCION_FORCE_HOST_NETWORK escape hatch;
add DockerSupportsHostGateway capability probe (Engine >= 20.10).
- startRuntimeBroker: ContainerHubEndpoint autocompute prefers the public
domain for colocated docker; falls back to host.docker.internal (host
networking) when force-host is set, host-gateway is unsupported, or no
public domain is configured (warns in the latter two cases).
- applyContainerBridgeOverride: use a public-domain ContainerHubEndpoint
wholesale instead of grafting the localhost port (e.g. :8080) onto it.
- gce-start-hub.sh: export SCION_SERVER_BASE_URL=https://${HUB_DOMAIN}
so the broker dispatches agents to the domain.
Scope is confined to docker + colocated; kubernetes, cloud run, podman,
and remote-hub agents are unaffected. Reverting is a one-flag rollback
(SCION_FORCE_HOST_NETWORK=1) with no redeploy.
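
The endpoint fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `endpointInputs` struct, the `chooseHubEndpoint` function, the `hub.example.com` domain, and the `http://host.docker.internal:<port>` URL shape are all assumptions made for the example.

```go
package main

import "fmt"

// endpointInputs collects the conditions named in the commit message.
// The struct and its field names are hypothetical.
type endpointInputs struct {
	ForceHostNetwork    bool   // SCION_FORCE_HOST_NETWORK is set
	SupportsHostGateway bool   // result of the Engine >= 20.10 probe
	PublicDomain        string // empty when no public domain is configured
	HostPort            int    // local hub port used by the fallback (assumed)
}

// chooseHubEndpoint returns the endpoint agents should dial, whether host
// networking is used, and a warning for the two degraded fallback cases.
func chooseHubEndpoint(in endpointInputs) (endpoint string, hostNet bool, warning string) {
	fallback := fmt.Sprintf("http://host.docker.internal:%d", in.HostPort)
	switch {
	case in.ForceHostNetwork:
		// Explicit escape hatch: no warning, the operator asked for it.
		return fallback, true, ""
	case !in.SupportsHostGateway:
		return fallback, true, "docker host-gateway unsupported; using host networking"
	case in.PublicDomain == "":
		return fallback, true, "no public domain configured; using host networking"
	default:
		// Preferred path: bridge networking, agents reach the hub via the domain.
		return "https://" + in.PublicDomain, false, ""
	}
}

func main() {
	ep, hostNet, warn := chooseHubEndpoint(endpointInputs{
		SupportsHostGateway: true,
		PublicDomain:        "hub.example.com",
		HostPort:            8080,
	})
	fmt.Println(ep, hostNet, warn)
}
```

Keeping the decision in one pure function makes each fallback case easy to cover with a table-driven test.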
6c94700 to bfd9964
- DockerSupportsHostGateway: bound the 'docker version' probe with a 5s
  timeout so an unresponsive daemon can't hang server startup.
- parseDockerServerVersion: scan line-by-line and tolerate a leading v/V
  prefix so daemon warnings or prefixed versions don't defeat the probe.
- add tests for v/V prefix, surrounding whitespace, and warning-prefixed
  multi-line output.
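
A rough sketch of a time-bounded probe with a tolerant version parser is shown below. It is not the code from this PR: the function names `parseServerVersion`, `supportsHostGateway`, and `probeDocker` are hypothetical, and using `docker version --format '{{.Server.Version}}'` with combined output is an assumption about how the probe might be invoked.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"strconv"
	"strings"
	"time"
)

// parseServerVersion scans the output line by line and returns the major and
// minor numbers of the first line that parses as a version. It trims
// surrounding whitespace and tolerates a single leading v/V prefix, so
// warning lines before the version are skipped.
func parseServerVersion(out string) (major, minor int, ok bool) {
	for _, line := range strings.Split(out, "\n") {
		s := strings.TrimSpace(line)
		if len(s) > 0 && (s[0] == 'v' || s[0] == 'V') {
			s = s[1:]
		}
		parts := strings.SplitN(s, ".", 3)
		if len(parts) < 2 {
			continue
		}
		maj, err1 := strconv.Atoi(parts[0])
		mnr, err2 := strconv.Atoi(parts[1])
		if err1 != nil || err2 != nil || maj < 0 || mnr < 0 {
			continue
		}
		return maj, mnr, true
	}
	return 0, 0, false
}

// supportsHostGateway reports whether the version is at least 20.10.
func supportsHostGateway(major, minor int) bool {
	return major > 20 || (major == 20 && minor >= 10)
}

// probeDocker runs the docker CLI with a 5s deadline so an unresponsive
// daemon cannot block the caller indefinitely. Any failure yields false.
func probeDocker() bool {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	cmd := exec.CommandContext(ctx, "docker", "version", "--format", "{{.Server.Version}}")
	out, err := cmd.CombinedOutput()
	if err != nil {
		return false
	}
	major, minor, ok := parseServerVersion(string(out))
	return ok && supportsHostGateway(major, minor)
}

func main() {
	sample := "WARNING: No swap limit support\n  v20.10.7  \n"
	major, minor, ok := parseServerVersion(sample)
	fmt.Println(major, minor, ok, supportsHostGateway(major, minor))
}
```

Separating the parser from the process invocation lets the prefix, whitespace, and warning cases be tested without a running daemon.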
Owner: Merged upstream as PR GoogleCloudPlatform#371.
Summary
- The metadata server called the Hub's /api/v1/agent/gcp-token endpoint directly using Authorization: Bearer <app-token>, which conflicts with OIDC transport auth (IAP/Cloud Run) and uses a different header convention than the hub client (X-Scion-Agent-Token).
- sciontool doctor shows "metadata server returned 502: token generation failed" despite scion auth being valid.

Changes
- pkg/sciontool/metadata/server.go: Added FetchGCPToken and FetchGCPIdentityToken callbacks to Config; when set, these are used instead of direct HTTP calls. Exported the GCPAccessTokenResponse type.
- pkg/sciontool/hub/client.go: Added FetchGCPToken() and FetchGCPIdentityToken() methods that call the Hub using X-Scion-Agent-Token auth and the OIDC transport layer.
- cmd/sciontool/commands/init.go: Wired the metadata server's callbacks to the hub client using late-binding closures (the hub client is created after the metadata server starts).

Test plan
- go build ./... passes
- go test ./pkg/sciontool/metadata/... passes (direct HTTP fallback still works)
- go test ./pkg/sciontool/hub/... passes
- go test ./cmd/sciontool/... passes
- sciontool doctor shows "[ OK ] GCP access token retrievable" on a resumed agent
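
The late-binding closure wiring mentioned under Changes could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in this PR: the `AccessToken`, `Config`, `hubClient`, and `lateHub` types, the `newConfig` helper, and the callback signature are simplified stand-ins, and the hub call returns a canned token instead of making a request with the X-Scion-Agent-Token header.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// AccessToken is a hypothetical stand-in for GCPAccessTokenResponse.
type AccessToken struct {
	AccessToken string
	ExpiresIn   int
	TokenType   string
}

// Config is a stand-in for the metadata server Config; only one callback is
// shown. When the callback is nil, a real server would fall back to direct HTTP.
type Config struct {
	FetchGCPToken func() (*AccessToken, error)
}

// hubClient is a stand-in for the hub client (assumed shape).
type hubClient struct{ agentToken string }

// FetchGCPToken returns a canned token here; a real client would call the Hub.
func (c *hubClient) FetchGCPToken() (*AccessToken, error) {
	return &AccessToken{AccessToken: "example-token", ExpiresIn: 3600, TokenType: "Bearer"}, nil
}

// lateHub holds a hub client that is assigned after the metadata server starts.
type lateHub struct {
	mu     sync.RWMutex
	client *hubClient
}

func (l *lateHub) set(c *hubClient) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.client = c
}

func (l *lateHub) get() *hubClient {
	l.mu.RLock()
	defer l.mu.RUnlock()
	return l.client
}

// newConfig builds a Config whose callback looks up the hub client on every
// call, so the client may be set after the Config has been handed to the server.
func newConfig(hub *lateHub) Config {
	return Config{
		FetchGCPToken: func() (*AccessToken, error) {
			c := hub.get()
			if c == nil {
				return nil, errors.New("hub client not ready")
			}
			return c.FetchGCPToken()
		},
	}
}

func main() {
	hub := &lateHub{}
	cfg := newConfig(hub) // metadata server would start with this config

	_, err := cfg.FetchGCPToken()
	fmt.Println("before hub client:", err)

	hub.set(&hubClient{agentToken: "example"}) // hub client created later
	tok, err := cfg.FetchGCPToken()
	fmt.Println("after hub client:", tok.TokenType, err)
}
```

Because the closure resolves the client at call time rather than at construction time, the startup order (metadata server first, hub client second) does not need to change.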