Skip to content

refactor: unify upstream label to use config name#35

Merged
iBug merged 1 commit into
ustclug:masterfrom
yaoge123:upstream-pr/unify-upstream-label
Jun 2, 2026
Merged

refactor: unify upstream label to use config name#35
iBug merged 1 commit into
ustclug:masterfrom
yaoge123:upstream-pr/unify-upstream-label

Conversation

@yaoge123
Copy link
Copy Markdown
Contributor

@yaoge123 yaoge123 commented Jun 2, 2026

Background

When PR #34 was merged it added queue/upstream-level Prometheus metrics
that label connections by upstream config name (the Name of the
[upstreams.<NAME>] table), while the older per-connection metrics
added in PR #30 already labeled connections by the resolved upstream
address
(<host>:<port> from target.Addr, plus "unknown").

Both sets of series share the same label key upstream but their values
have completely different semantics, so dashboards cannot
on(upstream) join the two groups, and the same upstream value can
mean different things in different panels.

PR #30 only landed in the v0.1.4 → v0.1.5 cycle (no broad public
adoption yet), so cleaning this up now still has a low blast radius.

Change

Switch the per-connection metrics to use the upstream's config name
as well, matching the queue/upstream metrics introduced in PR #34.

Affected metrics (label upstream value changes from
<addr> / "unknown" to the upstream's config Name):

  • rsync_proxy_active_connections_by_module
  • rsync_proxy_connection_sent_bytes
  • rsync_proxy_connection_received_bytes
  • rsync_proxy_connection_connected_timestamp_seconds
  • rsync_proxy_connection_duration_seconds

The JSON tag on ConnInfo.Upstream is still "upstream" so the
/connections debug endpoint output keeps the same key for clients.

Why config name

  1. The config name is a stable logical identifier; the resolved address
    is an implementation detail that can shift on each pick.
  2. When an upstream has multiple backends, aggregating by Name answers
    the natural question "how much traffic did this upstream handle?".
  3. PR feat: add observability metrics for queue monitoring, failure tracking, and Go runtime #34 already chose the Name route. Unifying lets dashboards join
    per-connection and queue metrics on upstream.

Compatibility

This is a breaking change for anyone querying the per-connection
metrics by upstream address in dashboards/alerts. Series with
addresses will stop receiving data and new series with the upstream's
config Name will appear instead.

Given PR #30 only shipped in v0.1.5 and the metrics are intended for
operator dashboards (not external integrations), changing them now is
safer than leaving the inconsistency in place.

Tests

  • go vet ./... && go build ./... && go test -race ./pkg/... — all green.
  • TestStatusIncludesSelectedUpstream and
    TestMetricsIncludesActiveConnections updated to assert the config
    Name; the latter also gained a assert.NotContains for the old address.

Previously, two metric families used the upstream label with conflicting
semantics:

- Per-connection metrics (introduced in ustclug#30) used the backend address
  (IP:port from target.Addr): active_connections_by_module,
  connection_sent_bytes, connection_received_bytes,
  connection_connected_timestamp_seconds, connection_duration_seconds.

- Queue metrics (introduced in ustclug#34) used the upstream config name (from
  upstreamConfig.Name): queued_connections, queue_active_max,
  queue_queued_max, queue_full_rejected_total,
  upstream_dial_errors_total.

Same label name, different value space, so they could not be joined
with on(upstream) in PromQL. The config name is also the more stable
identifier - backend addresses can change when an upstream is
reconfigured, but the config name is fixed.

Switch the per-connection metrics to use the upstream config name so
the upstream label has a single, consistent meaning across all
metrics. This is technically a breaking change for any dashboard or
alert that relied on the old IP:port values, but ustclug#30 only landed
recently and has not seen widespread use yet, so it is the right time
to fix this.

Renames:
- ConnInfo.UpstreamAddr -> ConnInfo.Upstream
- ConnInfo.SetUpstreamAddr -> ConnInfo.SetUpstream
- connInfoSnapshot.UpstreamAddr -> connInfoSnapshot.Upstream
  (JSON tag "upstream" preserved for /connections API compatibility)

The /connections JSON value now contains the upstream name. The
rsync-proxyctl connections table column header ("Upstream") already
fits this semantics.
Copilot AI review requested due to automatic review settings June 2, 2026 00:53
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR updates connection tracking and metrics/status output to use a logical upstream identifier (e.g., "u1") instead of the upstream address, and adjusts downstream consumers accordingly.

Changes:

  • Renamed ConnInfo.UpstreamAddr/SetUpstreamAddr to ConnInfo.Upstream/SetUpstream.
  • Updated Prometheus metrics labeling and grouping to use the new upstream value.
  • Updated CLI JSON parsing/output and tests to expect the upstream identifier rather than the address.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
pkg/server/server.go Renames stored upstream field and switches relay to record target.Upstream into connection info.
pkg/server/metrics.go Emits Prometheus labels using the new upstream identifier instead of the address.
pkg/server/server_test.go Updates assertions to expect "u1" and ensures upstream addresses aren’t leaked in metrics output.
cmd/cmd.go Adjusts /connections JSON decoding and table output to the renamed upstream field.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pkg/server/metrics.go
Comment on lines 110 to 114
snapshot := conn.snapshot()
key := prometheusConnectionGroup{
module: prometheusLabelValueOrUnknown(snapshot.Module),
upstream: prometheusLabelValueOrUnknown(snapshot.UpstreamAddr),
upstream: prometheusLabelValueOrUnknown(snapshot.Upstream),
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documented in the PR description: this is intentionally a breaking change to the per-connection metrics (active_connections_by_module, connection_*). The rationale is that the conflicting upstream semantics introduced in #30 vs #34 cannot be joined in PromQL, and #30 only landed in v0.1.5 with no broad public uptake yet, so cleaning it up now has a low blast radius. An additive migration with a parallel upstream_id label would freeze the inconsistency and double the metric footprint in every existing dashboard query — keeping the simpler unified label seems the better trade-off here. Happy to revisit if maintainers prefer the additive approach.

Comment thread pkg/server/server.go
Comment on lines 59 to 63
RemoteAddr string
ConnectedAt time.Time
Module string
UpstreamAddr string
Upstream string
SentBytes atomic.Int64
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The field name Upstream is deliberately aligned with the existing Target.Upstream field on the same struct, the [upstreams.<NAME>] config table, and the upstream Prometheus label / JSON tag. Renaming the Go field to UpstreamID while keeping the JSON tag and metric label as upstream would create a Go-vs-wire mismatch, which seems worse than the current naming. The struct already exposes the resolved address separately as Target.Addr for callers that need it.

Comment thread pkg/server/server_test.go
return false
}
return infos[0].snapshot().UpstreamAddr == upstreamAddr
return infos[0].snapshot().Upstream == "u1"
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The literal "u1" is the established convention across the existing test suite (see e.g. TestPerUpstreamQueueIsolation, TestQueueFullRejectsConnection, TestMetricsCountsQueueFullRejection); the new assertions just match it. Centralizing it in a single constant could make sense but is a separate, repo-wide refactor that would change a lot of unrelated tests, so leaving it consistent with the surrounding tests in this PR.

Comment thread pkg/server/server_test.go
return false
}
return infos[0].snapshot().UpstreamAddr == upstreamAddr
return infos[0].snapshot().Upstream == "u1"
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the comment on line 378 — sticking with the existing "u1" convention used across the existing test file.

Comment thread pkg/server/server_test.go
assert.Equal(t, http.StatusOK, resp.StatusCode)
assert.Contains(t, text, "rsync_proxy_active_connections 1\n")
assert.Contains(t, text, fmt.Sprintf("rsync_proxy_active_connections_by_module{module=\"fake\",upstream=%q} 1\n", upstreamAddr))
assert.Contains(t, text, "rsync_proxy_active_connections_by_module{module=\"fake\",upstream=\"u1\"} 1\n")
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the comment on line 378 — sticking with the existing "u1" convention used across the existing test file.

Comment thread cmd/cmd.go
conn.RemoteAddr,
conn.Module,
conn.UpstreamAddr,
conn.Upstream,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个看起来会改变 rsync-proxy connections 输出的表格内容,虽然我觉得问题不大

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个其实挺好的,我一直都想这么做

@taoky taoky requested a review from iBug June 2, 2026 14:08
@iBug iBug merged commit 2bab952 into ustclug:master Jun 2, 2026
1 check passed
@yaoge123 yaoge123 deleted the upstream-pr/unify-upstream-label branch June 3, 2026 13:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants