Skip to content

feat: add connection lifecycle protections (idle/max-duration/throughput floor, per-IP cap, TCP keepalive, dial timeout)#37

Open
yaoge123 wants to merge 4 commits into
ustclug:masterfrom
yaoge123:upstream-pr/connection-protections
Open

feat: add connection lifecycle protections (idle/max-duration/throughput floor, per-IP cap, TCP keepalive, dial timeout)#37
yaoge123 wants to merge 4 commits into
ustclug:masterfrom
yaoge123:upstream-pr/connection-protections

Conversation

@yaoge123
Copy link
Copy Markdown
Contributor

@yaoge123 yaoge123 commented Jun 2, 2026

Summary

This PR adds a comprehensive set of connection-lifecycle protection mechanisms to defend rsync-proxy against slow-loris-style abuse, half-open / hung connections, dead upstreams, and single-IP exhaustion. All settings default to off (0), preserving existing behavior, and can be enabled per-deployment via [proxy] config.

The new metrics that go alongside these features make it possible to observe what's happening in production and tune the limits with evidence rather than guesses.

The commits are organized so each one can be reviewed independently:

  1. feat: add relay-phase idle timeout — mirrors rsyncd.conf timeout = semantics for the bidirectional relay phase.
  2. feat: add connection controls (TCP keepalive, per-IP cap, relay max duration) — three independent knobs.
  3. feat: add dial timeout, throughput floor, and termination-cause counters — defensive dial timeout, slow-throughput detection, and per-cause counters that distinguish watcher-killed from natural connection completions.

Motivation

Production experience operating this proxy in front of a busy public mirror surfaced several pain points that the existing max_active_connections / max_queued_connections knobs cannot address:

  • Connections that sit idle indefinitely (clients suspended, NAT entries expired, network half-open) hold a slot until the OS TCP keepalive kicks in (~2 hours by default).
  • Single-IP misuse — one client opening many parallel rsync connections to the same upstream — easily fills the active-conn pool and causes legitimate users to hit the queue.
  • "Slow loris" style abuse — clients that read at byte-per-minute rates — keep slots locked for hours without ever triggering an idle timeout.
  • A dead upstream causes every new dial to wait ~75 s on kernel SYN retries before failing, amplifying the impact during outages.
  • When a connection ends, there was no way to tell from metrics whether it ended naturally or was force-closed by the proxy, making it hard to tune the new timeouts safely.

What's added

[proxy] configuration knobs

Setting Type Default Behavior
relay_idle_timeout int (s) 0 Close the relay if neither direction has any I/O for the given duration. Mirrors rsyncd.conf timeout = semantics. 0 disables.
relay_max_duration int (s) 0 Hard ceiling on a single relay's lifetime. Long-running rsync clients reconnect transparently. 0 disables.
tcp_keepalive int (s) 0 TCP keepalive period applied to both accepted client connections and dialed upstream connections. 0 keeps OS defaults.
dial_timeout int (s) 0 net.Dialer.Timeout used when dialing upstreams. 0 keeps OS SYN-retry defaults (~75 s).
per_ip_max_active_connections int 0 Proxy-wide default cap on simultaneous relayed connections per (upstream, client IP). 0 disables. Per-upstream override available.
min_throughput_bytes int64 0 Together with min_throughput_window, defines a floor: if a connection moves fewer than this many bytes (sent + received) over any window after the grace period, it is closed. 0 disables.
min_throughput_window int (s) 60 Sliding window length for the throughput floor check. Inactive unless min_throughput_bytes > 0.
min_throughput_grace int (s) =window Grace period after relay start during which the throughput floor is not enforced. Defaults to the window value if unset. Useful for letting the rsync file-list phase complete on very large modules without triggering the floor.

[upstreams.<name>] override

Setting Type Default Behavior
per_ip_max_active_connections int inherits proxy-wide Per-upstream override of the cap. 0 falls back to the proxy-wide setting; if both are 0 there is no cap.

New Prometheus metrics

Metric Type Labels Increments when
rsync_proxy_per_ip_rejected_total counter upstream A connection is rejected because the (upstream, IP) cap was already at limit.
rsync_proxy_relay_idle_timeout_terminated_total counter upstream The relay watcher closed a connection due to relay_idle_timeout.
rsync_proxy_relay_max_duration_terminated_total counter upstream The relay watcher closed a connection due to relay_max_duration.
rsync_proxy_throughput_floor_terminated_total counter upstream The relay watcher closed a connection due to falling below min_throughput_bytes over a window.

These three *_terminated_total counters are deliberately separate from the existing completed_connections_total so operators can distinguish natural completions from enforced terminations and tune limits without flying blind.

Implementation notes

Watcher consolidation

The three relay-phase watchers (idle, max-duration, throughput floor) are implemented as a single goroutine with one time.Ticker. The tick interval is computed as min(idle/4, maxDuration/4, window/4) over the enabled checks, with a 1 s lower bound. The watcher is only started if at least one check is enabled, so connections with all checks disabled have zero overhead beyond what's already there.

Race-safety with config reload

All new timing fields are read through helpers that take reloadLock.RLock() (getRelayTimings, getTCPKeepAlive, getDialer, getPerIPLimitForUpstream). getRelayTimings returns a relayTimings struct snapshot to keep all six values mutually consistent within one tick. getDialer returns a copy of s.dialer so that the dialer's Timeout and KeepAlive can be updated by reload without racing with in-flight dials.

Per-IP counters

Stored in a sync.Map keyed by {upstream, ip} with *atomic.Int64 values. Lazy-initialized on first use; entries are not removed when the count returns to zero (kept simple — the working set is bounded by len(upstreams) × distinct active IPs, which is small in practice).

The check is CAS-style: increment first, decrement+reject if over limit, otherwise defer counter.Add(-1). This avoids any TOCTOU race between checking and reserving the slot.

TCP keepalive

Applied via a small helper applyTCPKeepAlive(conn, period) that no-ops on non-TCP connections (e.g., unix sockets) and on period <= 0. Called both on the accepted client connection in handleConn and on the dialed upstream connection after relay() connects.

Throughput floor

The watcher samples info.SentBytes + info.ReceivedBytes (already tracked atomically by the countingReader). On each tick after the grace period, it checks whether the per-window delta has fallen below the configured byte floor. The grace period is critical: it allows the rsync file-list / generator phase to complete on modules with millions of small files without false positives. The grace defaults to the window length, which is enough for most modules; operators with very large modules (e.g. debian-archive, opensuse-source-vault) should set it higher.

Why throughput floor and idle timeout are both needed

A "slow loris" style client that sends one byte per second will never trigger an idle timeout (there's always recent activity), but will trivially trip a sub-MiB-per-minute throughput floor. Conversely, a fully suspended client triggers idle quickly without needing the throughput check.

Backward compatibility

All new knobs default to 0 and the new behaviors are off unless explicitly enabled. Existing deployments see no behavioral change.

The new counters all have upstream labels matching the existing PR #34 convention (config name, not IP:port). They are pure additions — no existing series are renamed or removed.

Testing

Each commit ships with focused unit tests:

  • TestRelayIdleTimeoutClosesIdleConnection / TestRelayIdleTimeoutNotTriggeredWhenActive
  • TestPerIPCapRejectsConnectionsBeyondLimit
  • TestRelayMaxDurationClosesLongConnection
  • TestApplyTCPKeepAliveOnTCPConn
  • TestLoadConfigPropagatesTCPKeepAliveToDialer
  • TestLoadConfigRejectsNegativeTimings (table-driven)
  • TestThroughputFloorTerminatesSlowConnection
  • TestLoadConfigPropagatesDialTimeoutAndThroughputSettings (incl. grace-defaults-to-window)
  • TestLoadConfigRejectsNegativeNewSettings
  • Tests assert both the underlying counters (upstreamCounters.*Terminated) and the rendered /metrics output.

go vet, go build, go test -race ./..., and golangci-lint run all pass.

Production validation

This change set has been running on a busy public rsync mirror for several weeks. The new counters surfaced previously-invisible problems immediately — most notably a sustained rate of queue-full rejections during peaks and individual relays that had stayed open for multiple days at a time — and the protection knobs brought the situation under control.

yaoge123 added 3 commits June 2, 2026 08:55
Once a client connection passes the rsync handshake and enters the
bidirectional relay phase, the existing ReadTimeout/WriteTimeout no
longer apply: the proxy explicitly clears deadlines and just runs
io.Copy in both directions. A misbehaving or stuck client could then
hold a connection open indefinitely without transferring data.

Add an opt-in idle timeout for the relay phase, mirroring rsyncd's
"timeout" semantics (rsyncd.conf(5)):

* New config option [proxy].relay_idle_timeout (seconds). 0 (default)
  disables the timeout, matching the previous behavior.
* countingReader now optionally records a UnixNano timestamp of the
  last non-zero read. A single timestamp is shared between the two
  directions, so any traffic in either direction resets the clock.
* relay() starts a watcher goroutine when the timeout is enabled. It
  ticks at idle_timeout/4 (minimum 1s) and closes both endpoints once
  the configured duration has elapsed without activity, logging an
  access-log entry.
* The expected I/O errors that follow such a forced close are
  suppressed from the error log via an idleTimedOut atomic flag.

Tests cover both the timeout-fires path (idle upstream is torn down
and an access-log line is written) and the steady-traffic path (a
stream slower than the timeout but continuously active is not cut).
…uration)

Three new mechanisms to mitigate clients holding upstream slots
without making progress ("squatters"):

* tcp_keepalive (proxy-wide, seconds): enables TCP keepalive with
  the configured period on both accepted client connections and
  dialed upstream connections, so a half-open peer is detected
  within minutes rather than the OS default (~2 hours, or
  disabled). 0 keeps OS-default behavior.

* per_ip_max_active_connections (proxy-wide default with
  per-upstream override): caps the number of simultaneous active
  relay connections a single client IP may have to one upstream.
  Counted before queue admission so the cap also bounds queueing.
  Rejected requests get an @error line and are tracked via the new
  rsync_proxy_per_ip_rejected_total{upstream} counter.

* relay_max_duration (proxy-wide, seconds): hard wall-clock cap on
  the bidirectional relay phase. Connections still alive past this
  duration are closed regardless of activity, complementing the
  existing relay_idle_timeout (which only fires on no-I/O). The
  existing relay watcher goroutine now handles both checks.

The reload-mutated timing fields (RelayIdleTimeout,
RelayMaxDuration, TCPKeepAlive, dialer.KeepAlive) are now read
under the reload lock via getRelayTimings/getTCPKeepAlive/getDialer
helpers to avoid data races with config reloads.

Tests cover per-IP rejection (counter + metric + access log),
max-duration enforcement (with idle disabled), the keepalive
helper, config propagation to the dialer, and rejection of
negative values for the three new settings.
Three additions to the connection-controls family that close gaps the
existing idle/max-duration/per-IP/keepalive controls leave open and
make the existing teardown paths observable.

dial_timeout (proxy-wide)
  Bounds upstream Dial latency, sourced from net.Dialer.Timeout. The
  previous default behaviour relied on the kernel SYN-retry budget
  (~75s on Linux), which let unreachable backends pile up handshake-
  blocked goroutines and exhaust per-upstream queues.

min_throughput_bytes / min_throughput_window / min_throughput_grace
  Sliding-window throughput floor for the relay phase. Idle timeout
  cannot catch slow drips (a client that sips a few hundred bytes
  every second never goes idle). The watcher samples
  info.SentBytes+info.ReceivedBytes once per window after grace and
  tears the connection down if the delta falls below the configured
  byte budget. min_throughput_grace defaults to min_throughput_window
  when unset to avoid false positives during handshake.

Termination-cause counters
  rsync_proxy_relay_idle_timeout_terminated_total{upstream}
  rsync_proxy_relay_max_duration_terminated_total{upstream}
  rsync_proxy_throughput_floor_terminated_total{upstream}
  Until now an operator could not distinguish a watcher-killed
  connection from a normal close on the lifetime counter; these
  separate the three forced-shutdown paths.

Watcher restructured to share one ticker across all three checks; the
interval is min(idle/4, maxDuration/4, window/4) bounded at 1s. New
config values are validated for non-negative seconds. The race-safe
helper getRelayTimings() now returns a relayTimings struct so the
RLock-guarded snapshot covers all six fields atomically.

Tests cover counter increments and metrics output for idle, max-
duration and throughput-floor paths, propagation of dial_timeout and
min_throughput_* settings into the dialer and Server fields, the
grace-defaults-to-window fallback, and rejection of negative values
for all four new settings.
Copilot AI review requested due to automatic review settings June 2, 2026 01:03
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR introduces new connection-control features for the rsync proxy (per-IP concurrency caps, relay idle/max-duration timeouts, TCP keepalive, and throughput floor enforcement), along with Prometheus metrics and tests to validate behavior.

Changes:

  • Add new proxy/upstream config options (timeouts, keepalive, throughput floor, per-IP cap) and propagate them into server behavior.
  • Add relay-phase watcher logic and per-upstream counters, exposing new Prometheus metrics.
  • Add extensive end-to-end tests plus documentation updates in the example config.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
pkg/server/server.go Implements new relay controls (watchers, keepalive, per-IP cap), config propagation, and new counters.
pkg/server/metrics.go Exposes new per-upstream counters via Prometheus metrics.
pkg/server/config.go Adds TOML fields and documentation for new proxy/upstream settings.
pkg/server/server_test.go Adds integration tests for idle timeout, max duration, per-IP cap, keepalive helper, config parsing, and throughput floor.
assets/config.example.toml Documents new configuration knobs with examples and explanations.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pkg/server/server.go
Comment on lines +824 to +834
if perIPLimit := s.getPerIPLimitForUpstream(target.Upstream); perIPLimit > 0 {
counter := s.getPerIPCounter(target.Upstream, ip)
if n := counter.Add(1); int(n) > perIPLimit {
counter.Add(-1)
s.getUpstreamCounters(target.Upstream).perIPRejected.Add(1)
s.accessLog.F("client %s rejected for upstream %s module %s: per-IP cap of %d reached", ip, target.Upstream, moduleName, perIPLimit)
_, _ = writeWithTimeout(downConn, fmt.Appendf(nil, "@ERROR: per-IP connection limit of %d for upstream %s reached, retry later\n", perIPLimit, target.Upstream), writeTimeout)
return nil
}
defer counter.Add(-1)
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in af5c335 — comparing as n > int64(perIPLimit) directly, dropping the conversion. In practice cap values are tiny (single-digit defaults are common) so overflow was theoretical, but the int64 form is cleaner.

Comment thread pkg/server/server.go
Comment on lines +218 to +221
// Per-(upstream, client IP) active-connection counters.
// Lazy-initialized via getPerIPCounter.
// map key is perIPCountKey. Value is *atomic.Int64.
perIPCounts sync.Map
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair concern — tracked as a follow-up rather than rolling into this PR. The clean fix needs a CAS-style decrement-and-delete to avoid a race with a concurrent Add(1) that just observed the entry, which felt like material worth its own commit and review. As a rough bound, the working set is Σ(distinct active client IPs per upstream) and entries are tiny (*atomic.Int64), so the practical leak is small even on busy public mirrors. Will open a follow-up issue once these PRs land.

Comment thread pkg/server/server_test.go
Comment on lines +244 to +248
srv.modules = map[string][]Target{
"fake": {{Upstream: "u1", Addr: fakeRsync.Listener.Addr().String()}},
}
srv.upstreams = []upstreamConfig{{Name: "u1"}}
srv.upstreamQueues = map[string]*queue.Queue{"u1": queue.New(0, 0)}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pattern (mutating srv.modules / srv.upstreamQueues after startServer without taking reloadLock) is the established convention in the existing test file (see e.g. TestClientReadTimeout, TestRelayBetweenClientAndUpstream). The mutations land before any net.Dial against the listener, so the listener loop has not observed any traffic at that point and -race has not flagged it across many runs. Locking these for consistency is reasonable but feels like a separate cleanup of the existing tests rather than something to ship in this PR.

Comment thread pkg/server/server.go Outdated
Comment on lines +975 to +978
// Wake at most a few times per the smallest of the
// configured windows so timeouts are detected within roughly
// 1.25x the configured value in the worst case, while keeping
// wakeup overhead negligible.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in af5c335 — the comment now explicitly documents the 1s floor and the resulting degradation for sub-4s settings (only relevant in tests, since production-meaningful values are well above 4 seconds).

Comment thread pkg/server/server.go
Comment on lines +997 to +999
if interval < time.Second {
interval = time.Second
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same fix as the comment on line 978 — see af5c335.

Comment thread pkg/server/server.go
Comment on lines +537 to +550
// applyTCPKeepAlive enables TCP keepalive on the given connection if
// it is a *net.TCPConn and a positive period is provided. Other
// connection types (e.g. unix sockets) are silently ignored.
func applyTCPKeepAlive(conn net.Conn, period time.Duration) {
if period <= 0 {
return
}
tc, ok := conn.(*net.TCPConn)
if !ok {
return
}
_ = tc.SetKeepAlive(true)
_ = tc.SetKeepAlivePeriod(period)
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intentional. applyTCPKeepAlive is a best-effort optimization: if the platform / connection type does not support the syscall (e.g. unix sockets, exotic listeners) the proxy still works correctly, just falling back to the OS default. Logging a warning per accepted connection on such platforms would be more noise than signal. Open to adding a one-shot startup probe if reviewers feel observability there is important.

- Use int64 comparison for per-IP cap. perIPLimit values from config are
  always small (typical caps are single-digit to low-double-digit), so the
  practical risk of overflow on 32-bit GOARCH was zero. Still, comparing
  the int64 counter against int64(perIPLimit) is more correct and removes
  the conversion entirely from review noise.

- Clarify the watcher interval comment. The 1.25x worst-case detection
  claim only holds when all enabled timeouts are above 4 seconds, because
  of the explicit 1s floor on the wakeup interval. Document the trade-off
  explicitly so future maintainers know that sub-second timeouts (used
  only in tests) trade detection latency for kernel timer overhead.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants