feat(skynet): hold+reuse multihop routes via a yamux skyfwd mux (fix route-death) #3368
Merged
…route-death)

The skynet resolving proxy "only worked well for direct routes": over a multihop route the routing rules expire and the whole route re-sets-up on every reconnect (seconds). Root cause: a rule's 10-min TTL is refreshed only by an OPEN RouteGroup's keepalive, and the native `.skynet` proxy dialed a FRESH route per SOCKS5 connection with no caching (embedded_skynetweb.go), closing it when the connection ended. Direct re-setup is instant; multihop isn't. skysocks-lite never had this because it caches+yamux-muxes its route and holds it open. Generalize that pattern:

- **pkg/skyroute.Pool**: holds ONE route group per destination PK and yamux-muxes logical connections over it, so the held route's keepalive keeps every hop warm and reconnects reuse it with zero setup. Idle groups are reclaimed after ~10 min (DefaultIdleTTL); a holder that's done (e.g. a closing iframe window) can Release one eagerly. Because the route lives on the visor, this needs no browser "page-open" signal; it fixes the external-browser SOCKS5 case and the in-tab iframe case identically.
- **skyfwd mux server** (skyenv.SkyForwardingMuxPort = 59): one accepted route group carries a yamux session; each stream runs the SAME ready-byte + ClientMsg handshake + forward as the 1:1 SkyForwardingServerPort, reusing handleServerConn. The route group's peer PK is carried onto each stream (muxPeerConn) so the per-port PK whitelist still works. Additive + version-negotiated: a caller dials the mux port and falls back to the 1:1 port (ErrNoMux, negative-cached) against older visors.
- **routerSkynetDialer** dials through the pool for the route path (direct transport + explicit source routes are unchanged); a real route-setup failure is surfaced, only ErrNoMux falls through to the legacy 1:1 dial.

Unit-tested (pkg/skyroute): route-group reuse, ErrNoMux fallback + negative cache, idle reap, eager Release. Full build + vet clean.
Design + the wider unified routing-control plan (skysocks-lite + skychat as the other two consumers of RoutingPolicy, the iframe control surface, and a voice-chat assessment) in docs/skynet-routing-control-rfc.md. This is Phase 1.
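
To illustrate the pooling idea in this commit message, here is a minimal, stdlib-only sketch of "one held group per destination PK, reused by every logical connection, reaped when idle, releasable eagerly". It is not the code in pkg/skyroute: the `group` type, the `Pool` fields, the injectable clock and `runDemo` are hypothetical names chosen for the example, and a plain stream counter stands in for a real route group with a yamux session on top.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// group stands in for one held route group with a mux session on top.
// Hypothetical type: the real pkg/skyroute internals are not shown in the PR text.
type group struct {
	streams  int // logical connections opened over this group
	closed   bool
	lastUsed time.Time
}

// Pool keeps at most one group per destination public key.
type Pool struct {
	mu      sync.Mutex
	dial    func(pk string) (*group, error)
	idleTTL time.Duration
	now     func() time.Time // injectable clock so the demo can fake idleness
	groups  map[string]*group
}

func NewPool(dial func(string) (*group, error), idleTTL time.Duration) *Pool {
	return &Pool{dial: dial, idleTTL: idleTTL, now: time.Now, groups: map[string]*group{}}
}

// Open reuses the held group for pk, dialing only when none is held, and
// returns the new stream's index. The lock spans the dial to keep this short.
func (p *Pool) Open(pk string) (int, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	g, ok := p.groups[pk]
	if !ok {
		var err error
		if g, err = p.dial(pk); err != nil {
			return 0, err
		}
		p.groups[pk] = g
	}
	g.lastUsed = p.now()
	g.streams++
	return g.streams, nil
}

// Release eagerly closes and forgets the group held for pk, if any.
func (p *Pool) Release(pk string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if g, ok := p.groups[pk]; ok {
		g.closed = true
		delete(p.groups, pk)
	}
}

// Reap closes every group idle for at least idleTTL and reports how many.
func (p *Pool) Reap() int {
	p.mu.Lock()
	defer p.mu.Unlock()
	n := 0
	for pk, g := range p.groups {
		if p.now().Sub(g.lastUsed) >= p.idleTTL {
			g.closed = true
			delete(p.groups, pk)
			n++
		}
	}
	return n
}

// runDemo returns the dial count after 5 opens, the dial count after
// Release + reopen, and how many groups were reaped past idleTTL.
func runDemo() (afterFive, afterReopen, reaped int) {
	dials := 0
	clock := time.Unix(0, 0)
	p := NewPool(func(pk string) (*group, error) {
		dials++
		return &group{}, nil
	}, 10*time.Minute)
	p.now = func() time.Time { return clock }
	for i := 0; i < 5; i++ {
		_, _ = p.Open("pk-A")
	}
	afterFive = dials
	p.Release("pk-A")
	_, _ = p.Open("pk-A")
	afterReopen = dials
	clock = clock.Add(11 * time.Minute)
	return afterFive, afterReopen, p.Reap()
}

func main() {
	fmt.Println(runDemo())
}
```

In this sketch five opens against the same PK cost one dial, a Release forces exactly one more, and a group left untouched past the TTL is reclaimed by the next `Reap`. A production pool would likely avoid holding the lock across a slow dial; that is left out here for brevity.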
…ing server Wires the REAL yamux mux forwarding server (serveSkyForwardingMuxSession) to the REAL skyroute.Pool over one in-memory conn and asserts the core property: 5 logical connections reuse ONE route group (dial count == 1), each reaching a registered service through the real handleServerConn ready-byte + ClientMsg handshake + dispatch. A RouteGroup already satisfies net.Conn and runs under yamux in production (skysocks-lite), so this covers the mux-server↔pool integration that the skyroute unit tests (which mock the far end) don't, without standing up the full transport/route-setup stack. Also asserts muxPeerConn carries the route group's peer PK (keeps the per-port whitelist working over the muxed path).
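
The peer-PK point above can be pictured with a small wrapper. This is only a sketch under assumptions: the commit message does not say how muxPeerConn carries the PK, so the example assumes it is exposed through `RemoteAddr`, and `pkAddr`, `peerConn`, `allowed` and `runDemo` are hypothetical names. A `net.Pipe` conn stands in for a mux stream.

```go
package main

import (
	"fmt"
	"net"
)

// pkAddr is a hypothetical net.Addr whose string form is a peer public key.
type pkAddr string

func (a pkAddr) Network() string { return "skynet" }
func (a pkAddr) String() string  { return string(a) }

// peerConn wraps one mux stream and reports the route group's peer PK as the
// stream's remote address. Assumption: the real muxPeerConn may work differently.
type peerConn struct {
	net.Conn
	peer pkAddr
}

func (c peerConn) RemoteAddr() net.Addr { return c.peer }

// allowed is a stand-in per-port whitelist check keyed by the remote PK.
func allowed(whitelist map[string]bool, c net.Conn) bool {
	return whitelist[c.RemoteAddr().String()]
}

// runDemo checks a bare stream and a wrapped stream against the same whitelist.
func runDemo() (bare, wrapped bool) {
	stream, far := net.Pipe() // in-memory stand-in for a mux stream
	defer stream.Close()
	defer far.Close()
	whitelist := map[string]bool{"pk-A": true}
	bare = allowed(whitelist, stream)
	wrapped = allowed(whitelist, peerConn{Conn: stream, peer: "pk-A"})
	return bare, wrapped
}

func main() {
	bare, wrapped := runDemo()
	fmt.Println("bare stream allowed:", bare)
	fmt.Println("wrapped stream allowed:", wrapped)
}
```

The bare pipe reports a generic address, so the whitelist rejects it; the wrapped stream reports the group's PK and passes. Embedding `net.Conn` keeps reads, writes and deadlines untouched, so the handshake code that consumes the stream would not need to change.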
Problem
The skynet resolving proxy "only worked well for direct routes." Over a multihop route the routing rules expire and the whole route is set up again on every reconnect (a multi-second stall), which makes multihop skynet browsing unusable through a browser-configured SOCKS5 proxy.
The root cause isn't the route type; it's reuse. A rule's 10-min TTL is refreshed only by an open RouteGroup's keepalive, and the native `.skynet` proxy dialed a fresh route per SOCKS5 connection with no caching (`embedded_skynetweb.go`), closing it when the connection ended. Direct re-setup is instant; multihop isn't. `skysocks-client-lite` never had this problem because it caches + yamux-muxes its route and holds it open, so its multihop rules stay warm.

Fix
Generalize that pattern into a shared primitive:
- `pkg/skyroute.Pool`: holds ONE route group per destination PK and yamux-muxes logical connections over it, so the held route's keepalive keeps every hop warm and reconnects reuse it with zero setup. Idle groups are reclaimed after ~10 min (`DefaultIdleTTL`); a holder that's done (e.g. a closing iframe window) can `Release` one eagerly. Because the route lives on the visor, this needs no browser "page-open" signal; it fixes the external-browser SOCKS5 case and the in-tab iframe case identically.
- skyfwd mux server (`skyenv.SkyForwardingMuxPort = 59`): one accepted route group carries a yamux session; each stream runs the same ready-byte + `ClientMsg` handshake + forward as the 1:1 `SkyForwardingServerPort`, reusing `handleServerConn`. The route group's peer PK is carried onto each stream (`muxPeerConn`) so the per-port PK whitelist still works. Additive + version-negotiated: a caller dials the mux port and falls back to the 1:1 port (`ErrNoMux`, negative-cached) against older visors.
- `routerSkynetDialer` dials through the pool for the route path; the direct-transport fast path and explicit source routes are unchanged, and a real route-setup failure is surfaced (only `ErrNoMux` falls through to the legacy 1:1 dial).

Tests
`pkg/skyroute` unit tests: route-group reuse (one dial, many streams), `ErrNoMux` fallback + negative cache, idle reap, eager `Release`. Full build + vet clean.

Scope / follow-ups
This is Phase 1 of `docs/skynet-routing-control-rfc.md` (included). Phases 2–3 unify `skysocks-lite` and skychat's `skynet:1` path onto a shared `RoutingPolicy` and add the iframe routing-control surface. Live validation on a real 2-visor no-direct-transport (forced multihop) path is the recommended acceptance test before/after merge: the unit tests cover the pool↔yamux logic but not a live route group.
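
For readers checking the fallback rule from the Fix section against older visors, the following is a rough sketch of "try the mux port, fall back to the 1:1 port only on `ErrNoMux`, and remember that answer". Only the name `ErrNoMux` comes from this PR; its definition here, the `Dialer` type, the string-returning dial functions and `runDemo` are hypothetical, and the negative cache in this sketch never expires, which may not match the real behaviour.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrNoMux is named in the PR; this definition is a stand-in for the example.
var ErrNoMux = errors.New("peer does not serve the mux port")

// dialFunc returns a label instead of a net.Conn to keep the sketch small.
type dialFunc func(pk string) (string, error)

// Dialer prefers the mux path and remembers peers that lack it.
type Dialer struct {
	mu         sync.Mutex
	noMux      map[string]bool // negative cache; no expiry in this sketch
	dialMux    dialFunc
	dialLegacy dialFunc
}

func (d *Dialer) Dial(pk string) (string, error) {
	d.mu.Lock()
	skip := d.noMux[pk]
	d.mu.Unlock()
	if !skip {
		conn, err := d.dialMux(pk)
		if err == nil {
			return conn, nil
		}
		if !errors.Is(err, ErrNoMux) {
			return "", err // real route-setup failure: surface it, no fallback
		}
		d.mu.Lock()
		d.noMux[pk] = true
		d.mu.Unlock()
	}
	return d.dialLegacy(pk)
}

// runDemo dials an old visor three times, then an unreachable peer once.
// It returns the mux attempts spent on the old visor, the path finally
// used for it, and whether the setup failure was surfaced to the caller.
func runDemo() (oldVisorMuxAttempts int, via string, surfaced bool) {
	errSetup := errors.New("route setup failed")
	attempts := 0
	d := &Dialer{
		noMux: map[string]bool{},
		dialMux: func(pk string) (string, error) {
			attempts++
			if pk == "old-visor" {
				return "", ErrNoMux
			}
			return "", errSetup
		},
		dialLegacy: func(pk string) (string, error) { return "legacy:" + pk, nil },
	}
	for i := 0; i < 3; i++ {
		via, _ = d.Dial("old-visor")
	}
	oldVisorMuxAttempts = attempts
	_, err := d.Dial("unreachable")
	return oldVisorMuxAttempts, via, errors.Is(err, errSetup)
}

func main() {
	n, via, surfaced := runDemo()
	fmt.Println("mux attempts against old visor:", n)
	fmt.Println("path used:", via)
	fmt.Println("setup failure surfaced:", surfaced)
}
```

The old visor costs one mux attempt across three dials because the second and third go straight to the legacy path, while the unrelated setup error is returned as-is instead of being masked by a fallback.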