vpuhoff/ZeroSock

ZeroSock

ZeroSock is not a tool for bypassing internet restrictions. It is a high-performance L4 SOCKS5 router and load balancer built for server infrastructure, microservices, and large-scale data collection (e.g. scraping pipelines). It works like a scalpel: no bloat, strict destination control (whitelist-only routing), and maximum speed via zero-copy (splice on Linux).

📖 Overview & documentation

Key Use Cases

Resilient Egress Gateway and client-side load balancing

Problem: The app talks to external APIs or internal upstreams via DNS round-robin. When an upstream fails, DNS keeps returning its IP (TTL caching). The app hits timeouts and errors.

Solution with ZeroSock: Run ZeroSock locally next to the app (sidecar container or daemon). Point all outbound traffic to the local SOCKS5 port. ZeroSock performs active TCP health checks and removes dead upstreams from rotation immediately, bypassing DNS. The app sees a stable, healthy pool of backends without being aware of upstream failures.
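
As a minimal sketch of the client side, a Go application can send its HTTP traffic through a local SOCKS5 port using only the standard library, since net/http accepts socks5 as a proxy URL scheme. The listen address 127.0.0.1:1080 and the helper name newProxiedClient are assumptions for illustration, not documented defaults; api.example.com is taken from the config example and must be present in routes.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// newProxiedClient returns an HTTP client whose connections are dialed
// through the SOCKS5 proxy at proxyAddr (for example a local ZeroSock).
// The helper name is hypothetical.
func newProxiedClient(proxyAddr string) (*http.Client, error) {
	proxyURL, err := url.Parse("socks5://" + proxyAddr)
	if err != nil {
		return nil, err
	}
	transport := &http.Transport{
		// net/http supports "socks5" as a proxy URL scheme.
		Proxy: http.ProxyURL(proxyURL),
	}
	return &http.Client{Transport: transport, Timeout: 10 * time.Second}, nil
}

func main() {
	// 127.0.0.1:1080 is an assumed listen address, not a documented default.
	client, err := newProxiedClient("127.0.0.1:1080")
	if err != nil {
		fmt.Println("bad proxy address:", err)
		return
	}
	// The host must be listed in routes, otherwise the proxy refuses it.
	resp, err := client.Get("http://api.example.com/")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```

Because the proxy is set on the transport rather than taken from the environment, a missing or stopped proxy produces a dial error instead of silently falling back to a direct connection.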

Features

  • Custom minimal SOCKS5 parser (no third-party SOCKS library)
  • Supported auth methods: NO AUTH only
  • Supported commands: CONNECT only
  • Supported ATYP: IPv4 (0x01) and FQDN (0x03)
  • Unsupported by design: IPv6 (0x04), BIND, UDP ASSOCIATE
  • Host-based routing by FQDN from local YAML config
  • Round-robin load balancing across backend IP pools
  • TCP (L4) or optional HTTP (L7) healthchecks with Alive/Dead backend rotation
  • Strict zero-copy-compatible relay (io.Copy between raw *net.TCPConn, no bufio.Reader in data plane)
  • Built-in Prometheus metrics exporter (/metrics)
  • Graceful shutdown (SIGINT, SIGTERM) with configurable grace period
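
The supported subset above (CONNECT only, IPv4 and FQDN only) fits in a very small parser. The following is a sketch based on the request layout in RFC 1928, not the project's actual code; every function and error name in it is hypothetical.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
	"net"
)

const (
	socksVersion = 0x05
	cmdConnect   = 0x01
	atypIPv4     = 0x01
	atypFQDN     = 0x03
)

var (
	errVersion = errors.New("unsupported SOCKS version")
	errCommand = errors.New("unsupported command (CONNECT only)")
	errAtyp    = errors.New("unsupported address type (IPv4 and FQDN only)")
)

// parseRequest reads one SOCKS5 request (VER, CMD, RSV, ATYP, DST.ADDR,
// DST.PORT) and returns the destination host and port.
func parseRequest(r io.Reader) (host string, port uint16, err error) {
	var hdr [4]byte // VER, CMD, RSV, ATYP
	if _, err = io.ReadFull(r, hdr[:]); err != nil {
		return "", 0, err
	}
	if hdr[0] != socksVersion {
		return "", 0, errVersion
	}
	if hdr[1] != cmdConnect {
		return "", 0, errCommand // BIND and UDP ASSOCIATE are rejected
	}
	switch hdr[3] {
	case atypIPv4:
		var ip [4]byte
		if _, err = io.ReadFull(r, ip[:]); err != nil {
			return "", 0, err
		}
		host = net.IP(ip[:]).String()
	case atypFQDN:
		var n [1]byte // one length byte precedes the name
		if _, err = io.ReadFull(r, n[:]); err != nil {
			return "", 0, err
		}
		name := make([]byte, n[0])
		if _, err = io.ReadFull(r, name); err != nil {
			return "", 0, err
		}
		host = string(name)
	default:
		return "", 0, errAtyp // includes IPv6 (0x04)
	}
	var p [2]byte // port is big-endian
	if _, err = io.ReadFull(r, p[:]); err != nil {
		return "", 0, err
	}
	return host, binary.BigEndian.Uint16(p[:]), nil
}

// parseRequestBytes is a convenience wrapper for in-memory buffers.
func parseRequestBytes(b []byte) (string, uint16, error) {
	return parseRequest(bytes.NewReader(b))
}

func main() {
	// CONNECT api.internal:8080 encoded with ATYP 0x03.
	req := append([]byte{0x05, 0x01, 0x00, 0x03, 12}, "api.internal"...)
	req = append(req, 0x1f, 0x90)
	host, port, err := parseRequestBytes(req)
	fmt.Println(host, port, err)
}
```

Reading fixed-size fields with io.ReadFull straight from the connection, rather than through a buffered reader, means no payload bytes are left behind in a user-space buffer when the relay takes over.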

Why SOCKS5, not a transparent TCP balancer?

  • Explicit routing: The application must be explicitly configured to use SOCKS5. There is no accidental direct traffic if the OS or firewall is misconfigured — no proxy, no connection.
  • Strict whitelisting: Routing follows “deny all, allow listed”. If a compromised app tries to reach an unknown host, the connection is rejected at the SOCKS5 handshake (no extra network load).
  • Low overhead: On Linux, zero-copy via splice() moves data between sockets in kernel space. That keeps CPU and memory use low at multi-gigabit throughput (about 34 MB of RAM at 2000 concurrent connections in the project's stress tests).
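
A relay of the kind described (io.Copy between raw *net.TCPConn values) could look roughly like the sketch below. It is an illustration, not the project's code: relay and echoOnce are hypothetical names, and echoOnce exists only to exercise the relay against a loopback echo server. On Linux, the standard library can use splice(2) behind io.Copy when both ends are TCP connections.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sync"
)

// relay copies bytes in both directions between two TCP connections and
// returns the byte counts once both directions have finished.
func relay(client, backend *net.TCPConn) (up, down int64) {
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		up, _ = io.Copy(backend, client)
		backend.CloseWrite() // pass the client's EOF on to the backend
	}()
	go func() {
		defer wg.Done()
		down, _ = io.Copy(client, backend)
		client.CloseWrite() // pass the backend's EOF on to the client
	}()
	wg.Wait()
	return up, down
}

// echoOnce sends msg through a relay to a loopback echo server and returns
// what comes back.
func echoOnce(msg string) (string, error) {
	backendLn, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer backendLn.Close()
	go func() {
		c, err := backendLn.Accept()
		if err != nil {
			return
		}
		defer c.Close()
		io.Copy(c, c) // echo until the peer half-closes
	}()

	proxyLn, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer proxyLn.Close()
	go func() {
		c, err := proxyLn.Accept()
		if err != nil {
			return
		}
		defer c.Close()
		b, err := net.Dial("tcp", backendLn.Addr().String())
		if err != nil {
			return
		}
		defer b.Close()
		relay(c.(*net.TCPConn), b.(*net.TCPConn))
	}()

	conn, err := net.Dial("tcp", proxyLn.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := conn.Write([]byte(msg)); err != nil {
		return "", err
	}
	conn.(*net.TCPConn).CloseWrite() // signal end of request
	out, err := io.ReadAll(conn)
	return string(out), err
}

func main() {
	out, err := echoOnce("ping")
	fmt.Println(out, err)
}
```

Wrapping either connection in a bufio.Reader or another io.Reader type would hide the *net.TCPConn from io.Copy and force an ordinary user-space copy loop, which is presumably why the feature list rules out bufio in the data plane.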

Requirements

  • Go 1.21+
  • Linux for kernel-level splice optimization behind io.Copy

Quick start

  1. Copy config template:

    • cp config.example.yaml config.yaml
  2. Define backends (named groups with addresses and optional per-group healthcheck) and routes (host → group name).

  3. Run:

    • go run ./cmd/zerosock -config config.yaml
    • Validate config only: go run ./cmd/zerosock -c config.yaml
  4. Scrape metrics:

    • curl http://127.0.0.1:9090/metrics

Auto-Discovery

When auto_discovery.enabled: true, unknown hosts (FQDN or IP) are resolved, added to config, and saved to disk. The first request to an unknown host triggers DNS lookup (for FQDN) or direct add (for IP), appends a new backend group and route, and reloads the config. Subsequent requests use the new route.

auto_discovery:
  enabled: true

  • FQDN: Resolves host to IPv4 addresses and creates auto-<host> backend group.
  • IPv4: Adds the ip:port as a single-address backend.
  • New routes are written to the config file and applied via hot-reload.
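
The FQDN case can be sketched as follows: resolve the host, keep only IPv4 results, and build a group named auto-<host>. The group type and the buildGroup and discover functions are hypothetical names for illustration; how ZeroSock persists the result to disk is not shown.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"strconv"
)

// group mirrors the shape of one entry under backends: a name plus a list
// of ip:port addresses. It is a hypothetical type for this sketch.
type group struct {
	Name      string
	Addresses []string
}

// buildGroup turns resolved addresses into an auto-<host> group, keeping
// IPv4 addresses only and skipping anything else.
func buildGroup(host string, port uint16, ips []string) (group, error) {
	g := group{Name: "auto-" + host}
	for _, s := range ips {
		v4 := net.ParseIP(s).To4() // nil for IPv6 or unparsable input
		if v4 == nil {
			continue
		}
		addr := net.JoinHostPort(v4.String(), strconv.Itoa(int(port)))
		g.Addresses = append(g.Addresses, addr)
	}
	if len(g.Addresses) == 0 {
		return group{}, fmt.Errorf("no IPv4 addresses for %q", host)
	}
	return g, nil
}

// discover resolves host with the system resolver and builds its group.
func discover(ctx context.Context, host string, port uint16) (group, error) {
	ips, err := net.DefaultResolver.LookupHost(ctx, host)
	if err != nil {
		return group{}, err
	}
	return buildGroup(host, port, ips)
}

func main() {
	// Sample addresses stand in for a DNS answer.
	g, err := buildGroup("api.example.com", 443, []string{"10.0.1.10", "2001:db8::1"})
	fmt.Println(g.Name, g.Addresses, err)
}
```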

Hot Reload

Reload the config without restarting the process:

  • kill -HUP <zerosock-pid>

What is applied on SIGHUP:

  • routes
  • backends
  • healthcheck settings

What is not applied until full restart:

  • server.listen_addr
  • metrics.listen_addr
  • metrics.enabled
  • connection and timeout tuning such as max_connections, max_inflight_dials, dial_ms, read_ms, write_ms, idle_ms, keepalive_ms

If a new config is invalid, ZeroSock logs the reload error and keeps the previous working configuration.
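
The keep-the-old-config-on-failure behavior can be sketched with an atomic pointer swap. This is an assumption about one reasonable way to implement it, not the project's code; the config type is a reduced, hypothetical view of the reloadable settings, and validate, reload, and watchSIGHUP are illustrative names.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/signal"
	"sync/atomic"
	"syscall"
)

// config is a hypothetical, reduced view of the reloadable settings.
type config struct {
	Routes   map[string]string   // host -> group name
	Backends map[string][]string // group name -> ip:port list
}

// validate rejects routes that point at an undefined or empty group.
func validate(c *config) error {
	if c == nil {
		return errors.New("nil config")
	}
	for host, grp := range c.Routes {
		if len(c.Backends[grp]) == 0 {
			return fmt.Errorf("route %q: group %q has no addresses", host, grp)
		}
	}
	return nil
}

// current holds the active config; readers call current.Load().
var current atomic.Pointer[config]

// reload swaps in a new config only if loading and validation both succeed.
// On any error the previous config stays active.
func reload(load func() (*config, error)) error {
	next, err := load()
	if err != nil {
		return err
	}
	if err := validate(next); err != nil {
		return err
	}
	current.Store(next)
	return nil
}

// watchSIGHUP blocks and reloads on every SIGHUP.
func watchSIGHUP(load func() (*config, error)) {
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, syscall.SIGHUP)
	for range ch {
		if err := reload(load); err != nil {
			fmt.Fprintln(os.Stderr, "reload failed, keeping previous config:", err)
		}
	}
}

func main() {
	good := &config{
		Routes:   map[string]string{"api.internal": "api-pool"},
		Backends: map[string][]string{"api-pool": {"10.0.1.10:8080"}},
	}
	bad := &config{Routes: map[string]string{"api.internal": "missing"}}

	fmt.Println(reload(func() (*config, error) { return good, nil }))
	fmt.Println(reload(func() (*config, error) { return bad, nil }))
	fmt.Println("still on good config:", current.Load() == good)
}
```

In a real process, watchSIGHUP would run in its own goroutine with a loader that reads the YAML file; the sketch calls reload directly so it terminates.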

Config

Example structure (full template in config.example.yaml):

  • backends: named groups. Each has addresses (list of ip:port) and optional healthcheck (interval_ms, timeout_ms, path). If path is set, L7 HTTP GET is used; otherwise L4 TCP. Unset fields fall back to the global healthcheck block.
  • routes: host → group name (string). Several hosts can reference the same group (shared backends, one healthcheck per group).

healthcheck:
  interval_ms: 5000
  timeout_ms: 2000
  # path: ""   # optional global default; per-group path overrides

backends:
  api-pool:
    addresses:
      - "10.0.1.10:8080"
      - "10.0.1.11:8080"
    healthcheck:
      interval_ms: 5000
      timeout_ms: 2000
      path: "/healthz"

routes:
  "api.internal": "api-pool"
  "api.example.com": "api-pool"

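The choice between L4 and L7 checks (TCP connect when path is empty, HTTP GET when it is set) can be sketched as below. Treating any 2xx status as healthy is an assumption; the README does not say which responses count as alive. The check and startDemoBackend functions are hypothetical, and startDemoBackend exists only so the sketch has something to probe.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"time"
)

// check probes one backend address. With an empty path it only opens a TCP
// connection (L4). With a path it sends an HTTP GET and, as an assumption
// of this sketch, requires a 2xx status (L7).
func check(addr, path string, timeoutMs int) error {
	timeout := time.Duration(timeoutMs) * time.Millisecond
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	if path == "" {
		var d net.Dialer
		conn, err := d.DialContext(ctx, "tcp", addr)
		if err != nil {
			return err
		}
		return conn.Close()
	}

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "http://"+addr+path, nil)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("unhealthy status %d", resp.StatusCode)
	}
	return nil
}

// startDemoBackend serves 200 on /healthz on a random loopback port.
func startDemoBackend() (addr string, stop func()) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	go http.Serve(ln, mux)
	return ln.Addr().String(), func() { ln.Close() }
}

func main() {
	addr, stop := startDemoBackend()
	defer stop()
	fmt.Println("L4:", check(addr, "", 2000))
	fmt.Println("L7:", check(addr, "/healthz", 2000))
}
```

A periodic checker would call check for each address every interval_ms and flip that backend between Alive and Dead based on the result.
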
Behavior

  • FQDN: If the destination host from the SOCKS5 request exists in routes, ZeroSock picks an Alive backend via round robin and dials it.
  • IPv4 (Local vs Remote DNS): When the client sends an IP address (ATYP 0x01), ZeroSock looks up which route has that ip:port in its backend list and uses that route’s pool (same logic as FQDN). This lets the app use local DNS or cached IPs while still getting health checks and load balancing. If the IP is not in any route’s backends, the request is denied (whitelist).
  • If the host (or IP) does not match any route, the request is denied.
  • If all backends for a host are dead, requests are denied until a healthcheck marks at least one backend alive.
  • server.max_connections limits simultaneously handled client sessions.
  • server.max_inflight_dials limits concurrent backend dial attempts.
  • timeouts.read_ms, timeouts.write_ms, and timeouts.idle_ms control socket deadlines.
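
The routing rules above can be sketched as a deny-by-default lookup followed by a round-robin pick that skips dead backends. All type and function names here are hypothetical, and the sketch leaves out dialing, connection limits, and deadlines.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

var (
	errNoRoute = errors.New("destination not in routes")
	errNoAlive = errors.New("no alive backend")
)

// backend is one ip:port with a flag maintained by the health checker.
type backend struct {
	addr  string
	alive atomic.Bool
}

// pool is a backend group with a shared round-robin counter.
type pool struct {
	backends []*backend
	next     atomic.Uint64
}

// newPool creates a pool whose backends all start out alive.
func newPool(addrs ...string) *pool {
	p := &pool{}
	for _, a := range addrs {
		b := &backend{addr: a}
		b.alive.Store(true)
		p.backends = append(p.backends, b)
	}
	return p
}

// pick returns the next alive backend in round-robin order, trying each
// backend at most once per call.
func (p *pool) pick() (*backend, error) {
	n := uint64(len(p.backends))
	for i := uint64(0); i < n; i++ {
		b := p.backends[(p.next.Add(1)-1)%n]
		if b.alive.Load() {
			return b, nil
		}
	}
	return nil, errNoAlive
}

// router maps FQDNs to pools; anything not listed is denied.
type router struct {
	routes map[string]*pool
}

// lookup finds the pool for an FQDN, or for an ip:port that appears in
// some pool's address list (the ATYP 0x01 case).
func (r *router) lookup(host, hostPort string) (*pool, error) {
	if p, ok := r.routes[host]; ok {
		return p, nil
	}
	for _, p := range r.routes {
		for _, b := range p.backends {
			if b.addr == hostPort {
				return p, nil
			}
		}
	}
	return nil, errNoRoute
}

func main() {
	p := newPool("10.0.1.10:8080", "10.0.1.11:8080")
	r := &router{routes: map[string]*pool{"api.internal": p}}

	if pl, err := r.lookup("api.internal", "api.internal:8080"); err == nil {
		b, _ := pl.pick()
		fmt.Println("dial", b.addr)
	}
	_, err := r.lookup("unknown.example", "203.0.113.5:80")
	fmt.Println(err)
}
```

The linear scan in lookup is fine for a sketch; a real router would more likely keep a second map keyed by ip:port.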

Metrics (Prometheus)

Exported base metrics include:

  • zerosock_connections_total, zerosock_connections_active
  • zerosock_tcp_state_total{state=...} — TCP lifecycle: syn (accepted), established (relay started), fin (graceful close), rst (reset/error)
  • zerosock_handshake_latency_seconds
  • zerosock_requests_total{atyp=...}
  • zerosock_requests_backend_total{host,backend,result}
  • zerosock_route_failures_total{host,reason}
  • zerosock_backend_dial_latency_seconds, zerosock_backend_dial_failures_total{host,reason}
  • zerosock_relay_bytes_total{direction=...}, zerosock_relay_session_bytes{direction=...}
  • zerosock_session_duration_seconds
  • zerosock_healthchecks_total{host,backend,result}, zerosock_backend_alive{host,backend}
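
For quick checks without a Prometheus server, the /metrics payload can be read directly. The sketch below sums all samples of one metric from the text exposition format. It assumes samples carry no timestamps and that label values contain no closing brace. The sample payload, its numbers, and the direction label values "up" and "down" are made up for illustration; sumSamples is a hypothetical helper.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// sumSamples adds up every sample of the metric called name in a
// Prometheus text-format payload, across all label sets.
func sumSamples(payload, name string) (float64, error) {
	var total float64
	sc := bufio.NewScanner(strings.NewReader(payload))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blanks, HELP and TYPE lines
		}
		rest, ok := strings.CutPrefix(line, name)
		if !ok || rest == "" || (rest[0] != '{' && rest[0] != ' ') {
			continue // a different metric that merely shares the prefix
		}
		if rest[0] == '{' {
			end := strings.LastIndex(rest, "}")
			if end < 0 {
				return 0, fmt.Errorf("unterminated labels: %q", line)
			}
			rest = rest[end+1:]
		}
		fields := strings.Fields(rest)
		if len(fields) == 0 {
			return 0, fmt.Errorf("missing value: %q", line)
		}
		v, err := strconv.ParseFloat(fields[0], 64)
		if err != nil {
			return 0, err
		}
		total += v
	}
	return total, sc.Err()
}

// samplePayload is invented data in the exposition format, for the sketch only.
const samplePayload = `# TYPE zerosock_connections_active gauge
zerosock_connections_active 3
# TYPE zerosock_relay_bytes_total counter
zerosock_relay_bytes_total{direction="up"} 1024
zerosock_relay_bytes_total{direction="down"} 4096
`

func main() {
	v, err := sumSamples(samplePayload, "zerosock_relay_bytes_total")
	fmt.Println(v, err)
}
```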

Performance & Benchmarks

ZeroSock has been rigorously stress-tested to validate its throughput, memory stability, and concurrency limits. Built for maximum efficiency, it operates near the physical limits of the OS network stack, offering L4 performance comparable to industry standards like HAProxy.

Key Highlights

  • Zero-Copy Routing: On Linux, ZeroSock uses the splice() system call, which transfers data directly between sockets inside the kernel, so payload bytes are never copied into user-space buffers.
  • High Throughput: In loopback stress tests (10 vCPUs, k6, Nginx backend), ZeroSock successfully processed ~1 GB/s (8 Gbps) of payload data without becoming the CPU bottleneck.
  • Ultra-Low Memory Footprint: Memory consumption scales linearly and predictably. Under a sustained load of 2000 concurrent connections, the proxy consumed only ~34 MB of RAM (averaging a mere ~17 KB per connection), with zero memory leaks after the connections were closed.
  • Minimal Latency: Functional tests (e.g., pulling Docker images via Skopeo) showed that ZeroSock adds an almost imperceptible +1.9% (+135 ms) overhead compared to direct, unproxied connections.

Concurrency Limits & Scaling

  • Up to 500 Concurrent Users: Flawless stability. Achieved a 100% success rate with 500 VUs downloading 100 MB payloads simultaneously over 2 minutes.
  • Extreme Load (2000+ Concurrent Users): Maintained ~1 GB/s overall throughput with a 96.7% success rate. Under extreme CPU contention, ~3.3% of connections may experience relay timeouts. For environments expecting 2000+ simultaneous active transfers, tuning proxy limits (timeouts.idle_ms) and backend server timeouts is recommended.

The Verdict

ZeroSock acts as a highly optimized "scalpel" for SOCKS5 proxying. By stripping away heavy L7 features, it delivers raw, kernel-level networking performance and an exceptionally small resource footprint.

For full test methodology, metrics, and environment details, see STRESS_TEST_RESULTS.md.

Comparison with alternatives

| Characteristic / Tool | ZeroSock | Gost | Glider | HAProxy | Envoy | 3proxy | Xray-core |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SOCKS5 (inbound) | Yes (CONNECT only, IPv4/FQDN) | Yes (full) | Yes (full) | Limited (via TCP) | Yes (TCP proxy) | Yes (full) | Yes (full) |
| Round-robin balancing | Yes | Yes | Yes | Yes (advanced) | Yes (advanced) | Yes (via parent) | Yes |
| Active health checks | Yes (L4 TCP / L7 HTTP) | Yes | Yes | Yes (very flexible) | Yes (advanced) | Basic | Yes |
| Zero-copy (Linux splice) | Yes (out of the box) | No (standard io.Copy) | No | Yes | No (but C++ highly optimized) | No | Depends on protocol |
| Strict egress (whitelisting) | Yes (allowed hosts only) | Yes (via ACL/rules) | Yes (via rules) | Yes (ACL) | Yes (RBAC/routing) | Yes (powerful ACL) | Yes (powerful routing) |
| Resource usage (RAM) | Ultra-low (~30 MB) | Medium | Low | Medium | High | Ultra-low | Medium |
| Configuration complexity | Low (simple YAML) | Medium (CLI/JSON) | Medium | High | Very high | Medium (own syntax) | High (complex JSON) |
| Primary focus | Minimal SOCKS5 sidecar / scraping | Universal toolkit / tunnels | Forward proxy / router | Enterprise L4/L7 balancer | Enterprise service mesh / egress | Classic lightweight proxy | Bypass / complex routing |
| Code size & scope | No bloat (core only) | Many features (crypto, tunnels) | Many protocols | Large codebase | Large codebase | Legacy protocol support | Many obfuscation protocols |

Takeaways

  1. Maximum throughput with minimal CPU (zero-copy): ZeroSock has a strong position among Go tools thanks to splice(). HAProxy is the main competitor here but is much harder to configure and is not a native SOCKS5 server.
  2. Balancing + SOCKS5 but need UDP or IPv6: Consider Gost or Glider.
  3. Production-grade for Kubernetes microservices: Envoy or HAProxy are the industry standard — heavier, but battle-tested with rich metrics, logging, and community.

In short: ZeroSock is a scalpel; Envoy or Xray are Swiss Army knives with many blades.

Roadmap

  • Prometheus metrics
  • Active upstream health checks
  • Zero-copy relay (splice)
  • Hot-reload on SIGHUP
  • Auto-discovery for unknown hosts

About

High-performance, zero-copy SOCKS5 load balancer and strict egress gateway written in Go.
