refactor: unify upstream label to use config name#35
Conversation
Previously, two metric families used the upstream label with conflicting semantics: - Per-connection metrics (introduced in ustclug#30) used the backend address (IP:port from target.Addr): active_connections_by_module, connection_sent_bytes, connection_received_bytes, connection_connected_timestamp_seconds, connection_duration_seconds. - Queue metrics (introduced in ustclug#34) used the upstream config name (from upstreamConfig.Name): queued_connections, queue_active_max, queue_queued_max, queue_full_rejected_total, upstream_dial_errors_total. Same label name, different value space, so they could not be joined with on(upstream) in PromQL. The config name is also the more stable identifier - backend addresses can change when an upstream is reconfigured, but the config name is fixed. Switch the per-connection metrics to use the upstream config name so the upstream label has a single, consistent meaning across all metrics. This is technically a breaking change for any dashboard or alert that relied on the old IP:port values, but ustclug#30 only landed recently and has not seen widespread use yet, so it is the right time to fix this. Renames: - ConnInfo.UpstreamAddr -> ConnInfo.Upstream - ConnInfo.SetUpstreamAddr -> ConnInfo.SetUpstream - connInfoSnapshot.UpstreamAddr -> connInfoSnapshot.Upstream (JSON tag "upstream" preserved for /connections API compatibility) The /connections JSON value now contains the upstream name. The rsync-proxyctl connections table column header ("Upstream") already fits this semantics.
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR updates connection tracking and metrics/status output to use a logical upstream identifier (e.g., "u1") instead of the upstream address, and adjusts downstream consumers accordingly.
Changes:
- Renamed
ConnInfo.UpstreamAddr/SetUpstreamAddrtoConnInfo.Upstream/SetUpstream. - Updated Prometheus metrics labeling and grouping to use the new upstream value.
- Updated CLI JSON parsing/output and tests to expect the upstream identifier rather than the address.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| pkg/server/server.go | Renames stored upstream field and switches relay to record target.Upstream into connection info. |
| pkg/server/metrics.go | Emits Prometheus labels using the new upstream identifier instead of the address. |
| pkg/server/server_test.go | Updates assertions to expect "u1" and ensures upstream addresses aren’t leaked in metrics output. |
| cmd/cmd.go | Adjusts /connections JSON decoding and table output to the renamed upstream field. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| snapshot := conn.snapshot() | ||
| key := prometheusConnectionGroup{ | ||
| module: prometheusLabelValueOrUnknown(snapshot.Module), | ||
| upstream: prometheusLabelValueOrUnknown(snapshot.UpstreamAddr), | ||
| upstream: prometheusLabelValueOrUnknown(snapshot.Upstream), | ||
| } |
There was a problem hiding this comment.
Documented in the PR description: this is intentionally a breaking change to the per-connection metrics (active_connections_by_module, connection_*). The rationale is that the conflicting upstream semantics introduced in #30 vs #34 cannot be joined in PromQL, and #30 only landed in v0.1.5 with no broad public uptake yet, so cleaning it up now has a low blast radius. An additive migration with a parallel upstream_id label would freeze the inconsistency and double the metric footprint in every existing dashboard query — keeping the simpler unified label seems the better trade-off here. Happy to revisit if maintainers prefer the additive approach.
| RemoteAddr string | ||
| ConnectedAt time.Time | ||
| Module string | ||
| UpstreamAddr string | ||
| Upstream string | ||
| SentBytes atomic.Int64 |
There was a problem hiding this comment.
The field name Upstream is deliberately aligned with the existing Target.Upstream field on the same struct, the [upstreams.<NAME>] config table, and the upstream Prometheus label / JSON tag. Renaming the Go field to UpstreamID while keeping the JSON tag and metric label as upstream would create a Go-vs-wire mismatch, which seems worse than the current naming. The struct already exposes the resolved address separately as Target.Addr for callers that need it.
| return false | ||
| } | ||
| return infos[0].snapshot().UpstreamAddr == upstreamAddr | ||
| return infos[0].snapshot().Upstream == "u1" |
There was a problem hiding this comment.
The literal "u1" is the established convention across the existing test suite (see e.g. TestPerUpstreamQueueIsolation, TestQueueFullRejectsConnection, TestMetricsCountsQueueFullRejection); the new assertions just match it. Centralizing it in a single constant could make sense but is a separate, repo-wide refactor that would change a lot of unrelated tests, so leaving it consistent with the surrounding tests in this PR.
| return false | ||
| } | ||
| return infos[0].snapshot().UpstreamAddr == upstreamAddr | ||
| return infos[0].snapshot().Upstream == "u1" |
There was a problem hiding this comment.
Same as the comment on line 378 — sticking with the existing "u1" convention used across the existing test file.
| assert.Equal(t, http.StatusOK, resp.StatusCode) | ||
| assert.Contains(t, text, "rsync_proxy_active_connections 1\n") | ||
| assert.Contains(t, text, fmt.Sprintf("rsync_proxy_active_connections_by_module{module=\"fake\",upstream=%q} 1\n", upstreamAddr)) | ||
| assert.Contains(t, text, "rsync_proxy_active_connections_by_module{module=\"fake\",upstream=\"u1\"} 1\n") |
There was a problem hiding this comment.
Same as the comment on line 378 — sticking with the existing "u1" convention used across the existing test file.
| conn.RemoteAddr, | ||
| conn.Module, | ||
| conn.UpstreamAddr, | ||
| conn.Upstream, |
There was a problem hiding this comment.
这个看起来会改变 rsync-proxy connections 输出的表格内容,虽然我觉得问题不大
Background
When PR #34 was merged it added queue/upstream-level Prometheus metrics
that label connections by upstream config name (the
Nameof the[upstreams.<NAME>]table), while the older per-connection metricsadded in PR #30 already labeled connections by the resolved upstream
address (
<host>:<port>fromtarget.Addr, plus"unknown").Both sets of series share the same label key
upstreambut their valueshave completely different semantics, so dashboards cannot
on(upstream)join the two groups, and the sameupstreamvalue canmean different things in different panels.
PR #30 only landed in the v0.1.4 → v0.1.5 cycle (no broad public
adoption yet), so cleaning this up now still has a low blast radius.
Change
Switch the per-connection metrics to use the upstream's config name
as well, matching the queue/upstream metrics introduced in PR #34.
Affected metrics (label
upstreamvalue changes from<addr>/"unknown"to the upstream's config Name):rsync_proxy_active_connections_by_modulersync_proxy_connection_sent_bytesrsync_proxy_connection_received_bytesrsync_proxy_connection_connected_timestamp_secondsrsync_proxy_connection_duration_secondsThe JSON tag on
ConnInfo.Upstreamis still"upstream"so the/connectionsdebug endpoint output keeps the same key for clients.Why config name
is an implementation detail that can shift on each pick.
the natural question "how much traffic did this upstream handle?".
per-connection and queue metrics on
upstream.Compatibility
This is a breaking change for anyone querying the per-connection
metrics by upstream address in dashboards/alerts. Series with
addresses will stop receiving data and new series with the upstream's
config Name will appear instead.
Given PR #30 only shipped in v0.1.5 and the metrics are intended for
operator dashboards (not external integrations), changing them now is
safer than leaving the inconsistency in place.
Tests
go vet ./... && go build ./... && go test -race ./pkg/...— all green.TestStatusIncludesSelectedUpstreamandTestMetricsIncludesActiveConnectionsupdated to assert the configName; the latter also gained a
assert.NotContainsfor the old address.