Skip to content

feat: native SDK fallback to ICANN when PubkyTLS direct endpoint is unreachable#352

Merged
SHAcollision merged 3 commits into
mainfrom
feat/native-icann-fallback
Apr 1, 2026
Merged

feat: native SDK fallback to ICANN when PubkyTLS direct endpoint is unreachable#352
SHAcollision merged 3 commits into
mainfrom
feat/native-icann-fallback

Conversation

@SHAcollision
Copy link
Copy Markdown
Contributor

@SHAcollision SHAcollision commented Mar 19, 2026

Problem

Native clients (Pubky Ring, CLI tools) can't reach homeservers behind Cloudflare Tunnel or NAT proxies.

A homeserver's PKARR record advertises two HTTPS endpoints:

  • Direct (priority 1): target="." + A/AAAA + port, PubkyTLS (RFC 7250)
  • ICANN (priority 10): `target="example.com", standard X.509 TLS

The native SDK delegates to pkarr's reqwest Resolve impl, which returns only the first endpoint (direct). When that IP is unreachable, typical for Umbrel behind Cloudflare Tunnel, the TCP connection hangs for ~75-120s and fails. The ICANN endpoint is never tried.

WASM builds are unaffected: they already resolve all endpoints and pick the ICANN one.

Prior work

  • pubky/pkarr#220, proposed moving endpoint selection into pkarr
  • #324, SDK-side fallback (superseded by this PR)

Why this belongs in pubky-sdk, not pkarr

The fallback requires choosing between two reqwest::Client instances with different TLS stacks (self.http for PubkyTLS vs self.icann_http for X.509). The reqwest::dns::Resolve trait only returns socket addresses, it cannot express "switch TLS config." Pkarr already exposes all endpoints via resolve_https_endpoints(); the SDK just wasn't consuming them on native.

Closes pubky/pkarr#220. Supersedes #324.

Changes

cross_request() now explicitly resolves PKARR endpoints and selects transport:

Scenario Behavior
Only direct endpoints PubkyTLS, unchanged, no probe
Only ICANN endpoints icann_http + pubky-host header
Both exist, direct reachable PubkyTLS, TCP probe succeeds (<1.5s)
Both exist, direct unreachable ICANN fallback, TCP probe times out (1.5s)

Decisions are cached per public key (60s TTL), this value is likely too small, we might want +1hour. Concurrent requests for the same host coalesce on a single probe via per-key async guards.

@SHAcollision SHAcollision force-pushed the feat/native-icann-fallback branch from 17ebc0e to a6eed11 Compare March 19, 2026 12:17
@SHAcollision SHAcollision force-pushed the feat/native-icann-fallback branch from a6eed11 to fb9dc82 Compare March 20, 2026 12:25
@SHAcollision SHAcollision requested a review from 86667 March 23, 2026 14:03
@SHAcollision SHAcollision marked this pull request as ready for review March 23, 2026 14:03
@SHAcollision SHAcollision changed the title (WIP) feat: native SDK fallback to ICANN when PubkyTLS direct endpoint is unreachable feat: native SDK fallback to ICANN when PubkyTLS direct endpoint is unreachable Mar 23, 2026
Copy link
Copy Markdown
Collaborator

@86667 86667 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reasonable to have this logic in cross_request() and solid caching stratergy.

The code is quite ugly though, I'm sure AI could clean it up a bit. eg:

  • cross_request() can be split up - currently mixes transport resolution with request building
  • A small struct owning caching and some of this logic would be nice

I especially dont like this pattern:

        let transport = if let Some(t) = transport {
            t
        } else {
             // 30 lines of code
        }

@86667
Copy link
Copy Markdown
Collaborator

86667 commented Mar 26, 2026

@SeverinAlexB made this useful comment in Homegate - pubky/homegate#20 (review)

Probably applies here too.

@SHAcollision SHAcollision force-pushed the feat/native-icann-fallback branch from 7369fa1 to 4336784 Compare March 30, 2026 14:25
@SHAcollision SHAcollision force-pushed the feat/native-icann-fallback branch from 4336784 to 917627a Compare March 30, 2026 14:30
@SHAcollision
Copy link
Copy Markdown
Contributor Author

SHAcollision commented Mar 30, 2026

@86667 Thanks a lot for the review! Pushed one commit addressing the function complexity problem. Much cleaner now, used Sev's "skills". One function resolve_from_pkarr remains ~30 lines because it's irreducible logic (breaking it will just make maintainability harder). Note that the Pubky SDK is using clippy warnings for high cognitive complexity functions, this PR passes the threshold of function complexity that we deem acceptable on the pubky-sdk crate.

Copy link
Copy Markdown
Collaborator

@86667 86667 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. Couple bits which could be added but not blockers imo.

Comment thread pubky-sdk/src/client/http_targets/native.rs Outdated
Comment thread pubky-sdk/src/client/http_targets/native.rs Outdated
Comment thread pubky-sdk/src/client/http_targets/native.rs
@SHAcollision SHAcollision merged commit 0d9c534 into main Apr 1, 2026
14 checks passed
@SHAcollision SHAcollision deleted the feat/native-icann-fallback branch April 1, 2026 20:43
@SeverinAlexB SeverinAlexB mentioned this pull request May 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Endpoint selection policy in pkarr: prefer reachable direct endpoint, fallback to ICANN

2 participants