
iOS v2 client: gate/pairing, reconnect hardening, host-key verification, one-finger scroll#7

Draft
omriariav wants to merge 25 commits into main from ios-reconnect-hardening

Conversation

@omriariav omriariav commented May 30, 2026

Collaborator

Long-running ios-reconnect-hardening branch, now up to date with main (merged Sagi's 29b51b6 — atomic, lock-protected devices.json writes in lib/handoff-common.sh). This PR is the branch's accumulated iOS work — ~20 commits / 35 files. Opening as a draft to track until we resume. (The phone notification bridge and bin/handoff attach hardening are already on main, so they're not part of this diff.)

iOS — v2 client + reconnect hardening

  • v2 gate/pairing: v2 QR payload (protocol version + nonce), gate-protocol client, verification handshake, terminal attach routed through the gate, device permissions + structured gate errors.
  • Reconnect hardening: survive backgrounding (reconnect Sessions + Terminal on resume), cancel stale silent refreshes, restart the Tailscale transport after resume, fix stale-state races.
  • UI: app lock, host-key verification (trust prompt + mismatch), settings + licenses screens, session filtering + pinned tabs, expanded mobile toolbar, sessions status line, read-only state surfaced into the terminal, gate/read-only error surfacing.
  • One-finger swipe → tmux scrollback (newest): ports Android's Termux doScroll to SwiftTerm — disables SwiftTerm mouse reporting so the pan is ours, converts drag→rows, emits SGR wheel events (ESC[<64;…M up / 65 down) or arrow keys on a bare alt-screen. Tap-to-focus + long/double/triple-tap selection still work. Built, installed, and verified on a physical iPhone.

Android

  • SSH host-key verification (trust-on-first-use + mismatch dialogs) and supporting screen updates.

Docs

  • ios/IOS_ROADMAP.md — parity-gap analysis.

Status

Draft — long-running branch we'll resume later. Contained work is functional; the iOS scroll change is verified on-device.

🤖 Generated with Claude Code

omriariav and others added 23 commits April 17, 2026 10:32
Captures the path from current v1 iOS client to full parity with the
Android app and the handoff CLI's gate/v2 permission model. P0 is
moving iOS onto QR v2 + gate commands + device verification; P1
covers settings/biometric/TOFU/renewal/reconnect; P2 is polish.

Explicitly flags non-goals (no hardware-bound SSH key — Android's
biometric is UI-only; no in-app access log; no networking rewrite)
and corrections to prior assumptions (Android still uses
StrictHostKeyChecking=no; Android's renewal UX isn't surfaced yet).

Authored by codex (CTO review role) at iosdev's request.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Folds in Sagi's last 24h of Android + CLI work:

- New P1 item: ios-background-survival. Android shipped
  HandoffConnectionService.kt; iOS needs a platform-idiomatic
  equivalent (background modes / URLSession bg task / fast resume),
  not a direct port of the foreground-service model.
- New P2 item: ios-terminal-keyboard-parity. Behavior-level port of
  Shift+Enter, modifier chaining, toolbar layout, and keyboard
  show/hide fixes (commits afa13b4, 093def6, 158776a, f2cfaf3,
  45cc7d2).
- P0 gate item explicitly defers visual parity with the redesigned
  SessionsScreen/SessionCard until after the functional port lands.
- P1 reconnect item references TerminalSessionHolder.kt and the new
  cache-driven refresh flow.
- Corrections note: Android just gained terminal/session persistence
  across navigation; iOS already had the concept in
  TerminalSessionStore.swift, so that gap is smaller than it
  looked.

Authored by codex (CTO review role) at iosdev's request.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Parses the compact v2 QR shape (i/u/k/t/n) alongside legacy v1
(ip/user/key/tmux), reconstructing the OpenSSH PEM from the inner
key blob and base64-encoding it so downstream SSH key parsing stays
format-agnostic.

ConnectionConfig gains protocolVersion (default 1 for legacy) and
nonce. ConfigStore persists both; load() treats a missing
protocolVersion as 1 so pre-v2 installs keep working after upgrade.
Unpair clears the new keys alongside the rest.
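
A minimal sketch of the dual-shape parse described above, assuming the QR text is a flat JSON object of strings (the commit does not state the encoding). `PairingPayload` and `decodePairingPayload` are hypothetical names, mapping `t` to the tmux path and `n` to the nonce is inferred from the v1 field order, and the PEM reconstruction step is deliberately left out:

```swift
import Foundation

/// Hypothetical container for a scanned pairing payload (not the project's ConnectionConfig).
struct PairingPayload: Equatable {
    var host: String
    var user: String
    var key: String            // key material exactly as carried in the QR
    var tmuxPath: String
    var protocolVersion: Int   // 1 for legacy payloads
    var nonce: String?         // only present on v2
}

/// Decodes either QR shape. Assumption: the payload is a JSON object whose values are all strings.
func decodePairingPayload(_ text: String) -> PairingPayload? {
    guard let data = text.data(using: .utf8),
          let fields = try? JSONDecoder().decode([String: String].self, from: data) else {
        return nil
    }
    // Compact v2 shape: i/u/k/t/n.
    if let host = fields["i"], let user = fields["u"], let key = fields["k"],
       let tmux = fields["t"], let nonce = fields["n"] {
        return PairingPayload(host: host, user: user, key: key, tmuxPath: tmux,
                              protocolVersion: 2, nonce: nonce)
    }
    // Legacy v1 shape: ip/user/key/tmux. A missing version means 1.
    if let host = fields["ip"], let user = fields["user"], let key = fields["key"],
       let tmux = fields["tmux"] {
        return PairingPayload(host: host, user: user, key: key, tmuxPath: tmux,
                              protocolVersion: 1, nonce: nil)
    }
    return nil
}
```

Checking the v2 shape first keeps a legacy payload from ever being promoted to protocol version 2 by accident.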

This is step 1 of the gate/v2 migration. Next: gate command layer in
SSHManager, then verification handshake, then gate-based attach.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SSHManager now branches on protocolVersion (captured at connect):
- v2 sends gate commands: list, windows <s>, create-session,
  create-window, kill-session, kill-window, pair <name>, renew.
- v1 keeps raw tmux shell-outs for backward compatibility.

Output-parsing updates:
- listSessions consumes the #permissions: header on v2 and exposes
  the parsed value on a new @Published devicePermissions; UI can
  bind directly to render read-only state and session scope.
- Every gate-aware call routes `error:<code>` lines through a typed
  GateError (enumerated: pending, soft_expired, not_found,
  read_only, denied, unknown_command, failed) so callers can route
  recoverable states without regex-matching localizedDescription.
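
One way the typed routing above could look, as a sketch only. The case names, raw values, and the `parseGateError` helper are assumptions for illustration, and the `unrecognized` sentinel is a hypothetical fallback chosen so it can never equal an `error:`-prefixed wire line:

```swift
import Foundation

/// Hypothetical typed gate error; the real GateError may be shaped differently.
struct GateError: Error, Equatable {
    enum Code: String {
        case pending = "error:pending"
        case softExpired = "error:soft_expired"
        case notFound = "error:not_found"
        case readOnly = "error:read_only"
        case denied = "error:denied"
        case unknownCommand = "error:unknown_command"
        case failed = "error:failed"
        // Fallback only: this raw value does not start with "error:", so no wire line decodes to it.
        case unrecognized = "client:unrecognized"
    }
    let code: Code
    let rawCode: String   // text after "error:", kept for display
}

/// Returns nil for ordinary output lines, a typed error for `error:<code>` lines.
func parseGateError(_ line: String) -> GateError? {
    let trimmed = line.trimmingCharacters(in: .whitespacesAndNewlines)
    guard trimmed.hasPrefix("error:") else { return nil }
    let code = GateError.Code(rawValue: trimmed) ?? .unrecognized
    return GateError(code: code, rawCode: String(trimmed.dropFirst("error:".count)))
}
```

Callers can then `switch` on `code` (for example, offering renewal on `.softExpired`) instead of inspecting a localized message.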

Two new gate-only APIs for the verification flow landing next:
- sendPairCommand(deviceName:) returns the wire `verify:<code>`.
- requestRenewal() returns `requested` on success.

disconnect() resets protocolVersion to 1 and clears
devicePermissions so a subsequent v1 pairing never inherits stale v2
state.

No call sites changed; SessionsView/TerminalView still call the
existing methods. The verification flow and gate-based attach land
in follow-up commits on this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds VerificationView, shown after a v2 QR scan while the phone
completes the gate pairing handshake with the Mac. Flow mirrors
Android's VerificationScreen.kt:

- SSH-connect over the Tailscale loopback proxy.
- `list` first: if the device is already active we skip the rest
  and route straight to Sessions. This also matters for force-quit
  recovery — the gate state is the source of truth, not local UI.
- If `list` fails with error:pending, send `pair <device-name>` and
  display the returned `verify:<code>` as "XXX YYY".
- Poll every 2s (up to 60s): reconnect + `list`. Success → verified.
  `error:not_found` → rejected on Mac. SSH auth failure → Mac
  deleted the device (rejected or revoked). Anything else is
  transient and retries within budget.
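
The decision table in the polling step can be sketched as pure functions. All names here are hypothetical, and the six-character code length is an assumption read off the "XXX YYY" display format:

```swift
import Foundation

/// What one reconnect + `list` attempt produced (hypothetical modelling).
enum PollResult {
    case listed                 // `list` succeeded
    case gateError(String)      // code after "error:"
    case authFailure            // SSH authentication was refused
    case otherFailure           // anything else
}

enum PollOutcome: Equatable {
    case verified, rejected, deviceRemoved, retry
}

let pollIntervalSeconds = 2
let pollBudgetSeconds = 60
let maxPollAttempts = pollBudgetSeconds / pollIntervalSeconds

/// Maps a single poll result to the next step of the flow.
func classify(_ result: PollResult) -> PollOutcome {
    switch result {
    case .listed: return .verified
    case .gateError("not_found"): return .rejected
    case .authFailure: return .deviceRemoved
    default: return .retry   // transient: try again while budget remains
    }
}

/// Formats `verify:<code>` for display. Assumes a six-character code; other lengths pass through ungrouped.
func displayCode(fromWireLine line: String) -> String? {
    let prefix = "verify:"
    let trimmed = line.trimmingCharacters(in: .whitespacesAndNewlines)
    guard trimmed.hasPrefix(prefix) else { return nil }
    let code = String(trimmed.dropFirst(prefix.count))
    guard code.count == 6 else { return code }
    return "\(code.prefix(3)) \(code.suffix(3))"
}
```

Keeping the classification separate from the networking makes the retry policy testable without an SSH connection.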

ConfigStore gains a persisted `pendingVerification` flag (true on
v2 save, false on markVerified or unpair). ContentView gates
between TailscaleAuthView → VerificationView → SessionsView on
this flag, so a mid-pair force-quit resumes the handshake cleanly
on relaunch.

UIDevice.current.name supplies the device label sent with the
`pair` command, matching Android's use of Build.MODEL.

Next: route terminal attach through the gate so `tmux attach`
stops bypassing read-only / allowed-session enforcement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TerminalChannel now branches on SSHManager.protocolVersion:
- v2: exec command is the bare `attach <session> <window>`. The
  Mac's `handoff gate` forced-command handles `exec env LANG=…
  tmux attach -t target`, including read-only via `-r`. Read-only
  attach semantics come from the gate, not client convention.
- v1: unchanged — raw `export LANG=en_US.UTF-8; <tmux> attach -t …`.

The fixed `export LANG=…;` prefix used to be added unconditionally
in the channel handler. That worked on v1 (SSH runs it as a shell)
but would have been parsed as literal args by the gate on v2 and
rejected as error:unknown_command. The prefix now lives at the
construction site in `openTerminal`, so each path controls its
full command.
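
A rough sketch of building the full command at the construction site, so that neither path gets a prefix it cannot parse. `attachCommand` and its parameters are hypothetical. The v1 target string is elided in the message above, so it is passed in by the caller rather than guessed, and no shell quoting is attempted here:

```swift
/// Builds the exec command for a terminal attach (illustrative only).
/// - v2: bare gate command; the Mac-side gate adds LANG and read-only flags.
/// - v1: shell command with the LANG prefix, since SSH runs it through a shell.
func attachCommand(protocolVersion: Int,
                   session: String,
                   window: String,
                   tmuxPath: String,
                   legacyTarget: String) -> String {
    if protocolVersion >= 2 {
        return "attach \(session) \(window)"
    }
    return "export LANG=en_US.UTF-8; \(tmuxPath) attach -t \(legacyTarget)"
}
```

With the prefix owned by the v1 branch, the v2 string reaches the gate exactly as `attach <session> <window>`.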

This closes the P0 gate migration: discovery, mutation, and attach
all go through the gate on v2. Read-only enforcement and
allowed-session filtering are now applied server-side for every
terminal action.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex review of ios-v2-pairing-gate flagged two races:

1. devicePermissions stale write. An in-flight v2 `list` could
   publish parsed permissions after disconnect() had already nilled
   state or after a later reconnect had begun, leaving stale
   read-only/session-scope UI across connection boundaries.

   Fix: stamp every successful connect() with a UUID connectionID.
   The v2 header-parsing path captures the ID pre-hop and the
   MainActor apply guards on `self.connectionID == capturedID`,
   dropping the write if the connection was torn down or rotated.
   disconnect() nils the ID up-front so the next write is dropped
   regardless of scheduling order. MainActor.run is now awaited
   inside listSessions (the call is already async), so the update
   is synchronous against the calling task's timeline.

2. VerificationView task leak. The 60s polling Task survived
   onDisappear: a mid-pair unpair or route change kept reconnecting
   with the now-orphaned config and could still flip
   pendingVerification off via markVerified().

   Fix: capture the Task in @State, cancel it on onDisappear, and
   bracket every step of the verify flow with Task.checkCancellation.
   CancellationError is caught and swallowed. finish() now also
   checks Task.isCancelled before calling markVerified, so even a
   last-ms cancel doesn't leave pendingVerification in an
   inconsistent state.
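
The connection-ID guard from item 1 can be illustrated with a single-threaded sketch. `ConnectionState` and its members are hypothetical stand-ins. The real code performs the apply on the MainActor, which this example does not model:

```swift
import Foundation

/// Hypothetical holder for per-connection state guarded by a generation ID.
final class ConnectionState {
    private(set) var connectionID: UUID?
    private(set) var devicePermissions: String?

    /// Stamps a fresh ID for each successful connect.
    func connect() -> UUID {
        let id = UUID()
        connectionID = id
        return id
    }

    /// Clears the ID first so any write still in flight is dropped.
    func disconnect() {
        connectionID = nil
        devicePermissions = nil
    }

    /// Applies parsed permissions only if the connection that produced them is still current.
    @discardableResult
    func apply(permissions: String, capturedID: UUID) -> Bool {
        guard connectionID == capturedID else { return false }
        devicePermissions = permissions
        return true
    }
}
```

A caller captures the ID before starting `list` and passes it back with the result, so a response that arrives after a disconnect or a reconnect is discarded instead of overwriting newer state.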

No call sites or public API changed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SessionsView and SessionCard now react to the gate's #permissions:
header, and catch blocks route through a new ErrorMessages.swift
that mirrors Android's friendly-string map.

Read-only surfacing:
- SessionCard gains a readOnly flag. When set, the "+ new tab"
  button is hidden and long-press-to-kill on session and tab is
  inert. Suppressing the affordance is clearer than showing a
  disabled button that fails server-side on tap.
- SessionsView hides its "+ new session" footer + empty-state
  button in read-only mode, and shows a "READ-ONLY" capsule next
  to "● Connected" in the header.

Error routing:
- New ErrorMessages.swift with friendlyGate / friendlyConnection /
  friendlyAction helpers. Mirrors Android/ErrorMessages.kt so copy
  stays aligned.
- loadSessions catches GateError separately. softExpired drives an
  alert sheet with a "Request renewal" action that calls the new
  gate renew command and surfaces the Mac's confirmation.
  notFound and other gate errors render the friendly string
  inline.
- Every create/kill catch now routes through friendlyAction.

No new runtime state paths: this is pure UI consumption of signals
the ios-v2-pairing-gate branch already publishes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Live-device testing surfaced the iOS-background-teardown footgun:
`parentChannel.isActive` can read true after the app returns from
background even though iOS has torn down the Tailscale SOCKS5
proxy underneath, so every gate command hangs for minutes before
eventually surfacing as "The request timed out".

Three targeted fixes:

1. SessionsView gains a forceReload() helper that disconnects
   before loading. The toolbar reload button, the errorView Retry
   button, and a new scenePhase observer (background→active) all
   route through it instead of trusting isConnected. Mirrors what
   TerminalView already does for the same class of failure.

2. loadSessions gains a forceReconnect parameter so non-user
   entrypoints (e.g., retry-after-error) can opt in too.

3. executeCommand in SSHManager caps each exec at 10s. On a
   zombie channel createChannel never fulfills its promise — the
   timeout turns that into a clean throw instead of a silent hang
   that piles up Tasks behind silentRefresh.

silentRefresh's catch block now also disconnects on error so the
next foreground iteration reconnects cleanly without waiting for
the user to tap Retry.

Reference: roadmap P1 item "Tighten sessions-level reconnect
behavior" (ios-reconnect-hardening branch name now matches).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Follow-up to 6c4aed5 after live-device testing turned up two
residual races that the first pass didn't cover.

1. SessionsView.silentRefresh was untracked. A refresh kicked off
   before backgrounding could finish after foreground recovery
   began and either overwrite the fresh list or, worse, call
   sshManager.disconnect() in its catch block — tearing down the
   newly re-established SSH session.

   Fix: route silentRefresh through a new @State refreshTask,
   cancel it alongside loadTask in the reload / unpair / sign-out
   paths, and guard the inner loop + both MainActor writes with
   Task.checkCancellation / Task.isCancelled. The guard at entry
   also dedupes concurrent refreshes.

2. TerminalView foreground resume gated the reconnect on
   !terminal.sshManager.isConnected. Same zombie-channel class of
   failure that SessionsView already worked around: isConnected
   can read true after iOS tore down the SOCKS5 proxy underneath.
   Terminal continuity is tmux's job, not the socket's, so always
   drop the stored terminal and reconnect fresh when we come back
   from background.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pulls Sagi's Mac CLI + Android work (no iOS files touched):
- Phone notification bridge (Mac→Android via gate)
- tmux mouse mode in setup, mouse-mode + verify defaults to yes
- Detect tmux server/client version mismatch on attach
- Fall back to script(1) when SSH skips PTY allocation
- Pairing UX hardening + foreground service specialUse
- Termux: keep cursor and trailing rows visible under keyboard

# Conflicts:
#	android/app/src/main/java/com/handoff/app/ui/screens/TerminalScreen.kt
#	android/app/src/main/java/com/handoff/app/ui/screens/VerificationScreen.kt
The iOS terminal had no working scrollback gesture: tmux runs in the
alternate screen (no local scrollback), and SwiftTerm turns a one-finger
pan into a mouse *drag* whenever the remote app has mouse tracking on
(tmux `mouse on`) — which tmux reads as a text selection, not scrollback.
SwiftTerm never emits scroll-wheel events on a pan, unlike Android's
Termux view.

Port Android's `doScroll` to iOS:
- Disable SwiftTerm's mouse reporting so the one-finger pan is ours.
  Tap-to-focus and long/double/triple-tap selection still work; only
  tap-as-mouse-click to the remote app is sacrificed.
- Add a one-finger pan that accumulates drag, converts pixels to whole
  rows by cell height (carrying the sub-row remainder), and emits one
  scroll step per row.
- Same branch as Termux: mouse tracking on -> SGR wheel events
  (ESC[<64;col;rowM up / 65 down); bare alternate screen -> arrow keys;
  normal buffer -> SwiftTerm's own scroll view.
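
The pixel-to-row conversion and the wheel escape sequence can be sketched independently of SwiftTerm. `ScrollAccumulator` and `sgrWheelEvent` are hypothetical names, and the sign convention for the drag delta is left to the caller:

```swift
/// Accumulates drag distance and yields whole rows, carrying the sub-row remainder.
struct ScrollAccumulator {
    let cellHeight: Double
    private var remainder: Double = 0

    init(cellHeight: Double) {
        self.cellHeight = cellHeight
    }

    /// Returns the number of whole rows covered so far (negative for the opposite direction).
    mutating func rows(forDragDelta delta: Double) -> Int {
        guard cellHeight > 0 else { return 0 }
        remainder += delta
        let whole = (remainder / cellHeight).rounded(.towardZero)
        remainder -= whole * cellHeight
        return Int(whole)
    }
}

/// Builds one SGR mouse wheel event: button 64 scrolls up, 65 scrolls down.
func sgrWheelEvent(up: Bool, column: Int, row: Int) -> String {
    let button = up ? 64 : 65
    return "\u{1B}[<\(button);\(column);\(row)M"
}
```

The gesture handler would call `rows(forDragDelta:)` on each pan update and send one event per returned row, so a slow drag of half a cell produces nothing until the next half cell arrives.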

Known limits: read-only sessions (tmux -r) ignore mouse input so they
won't scroll; remote TUIs no longer receive tap-as-click.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
omriariav changed the title from "iOS reconnect hardening: notification bridge, attach robustness, one-finger scroll" to "iOS v2 client: gate/pairing, reconnect hardening, host-key verification, one-finger scroll" May 30, 2026
HostKeyFingerprint.compute used String(describing: key) on a
NIOSSHPublicKey. NIOSSHPublicKey does NOT conform to
CustomStringConvertible, so this yielded Swift's reflection dump instead
of the OpenSSH wire format ("ssh-ed25519 <base64>"). The base64 decode of
parts[1] then failed and the code hashed the reflection string — a stable
but wrong fingerprint that never matches `ssh-keygen -lf` on the Mac.
TOFU mismatch detection still worked (stored == seen is self-consistent),
but the SHA256 shown to the user at first-trust was meaningless, defeating
the visual-verification step the feature exists for.

Use String(openSSHPublicKey:) (NIOSSH's public serializer) instead, which
emits the algorithm-prefixed base64 wire format that OpenSSH fingerprints are computed from.

Verified against the live Mac host key: the computed value now equals
`ssh-keygen -lf /etc/ssh/ssh_host_ed25519_key.pub` byte-for-byte.
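
For reference, a sketch of the fingerprint derivation from an OpenSSH public-key line, assuming an Apple platform where CryptoKit is available. `opensshFingerprint` is a hypothetical helper, not the project's `HostKeyFingerprint.compute`, and it returns nil on a malformed line rather than hashing whatever string it was given:

```swift
import Foundation
import CryptoKit

/// Computes the `SHA256:<base64>` fingerprint of an OpenSSH public-key line
/// such as "ssh-ed25519 <base64> [comment]". Returns nil if the line is malformed.
func opensshFingerprint(ofPublicKeyLine line: String) -> String? {
    let parts = line.split(separator: " ")
    // The second field is the base64-encoded wire-format key blob.
    guard parts.count >= 2, let blob = Data(base64Encoded: String(parts[1])) else {
        return nil
    }
    let digest = SHA256.hash(data: blob)
    // OpenSSH prints the digest as base64 with the trailing "=" padding removed.
    let base64 = Data(digest).base64EncodedString()
    return "SHA256:" + base64.replacingOccurrences(of: "=", with: "")
}
```

Failing closed on a bad input means a serializer mistake surfaces as a missing fingerprint instead of a stable but meaningless one.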

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
omriariav commented May 30, 2026

Collaborator Author

Review pass — 1 blocker found, fixed, build verified on device ✅

Full correctness/security pass over the diff (iOS reconnect + gate v2 + TOFU host-key verification, Android host-key verification, the scroll gesture).

🔴 BLOCKER (fixed in c83562b)

HostKeyFingerprint.compute used String(describing: key) on a NIOSSHPublicKey. That type doesn't conform to CustomStringConvertible, so it produced Swift's reflection dump, not the OpenSSH wire format — the base64 decode failed and it hashed the reflection string. Mismatch detection still worked (stored==seen is self-consistent), but the SHA256:… shown to the user at first-trust was meaningless, defeating the visual-verification step.

Fixed by using NIOSSH's public serializer String(openSSHPublicKey:) (NIOSSHPublicKey.swift:461), which emits "algorithm base64-wire".

Verified two ways:

  • Algorithm — SHA256-over-base64-wire now produces a byte-identical result to ssh-keygen -lf /etc/ssh/ssh_host_ed25519_key.pub on the live Mac host key (SHA256:bb56MkAaEJpL+BwGAQvkqVYdWvwzvNF8eakz9JwF7+w, MATCH).
  • Build: BUILD SUCCEEDED, 0 errors, for platform=iOS (iPhone 17, iOS 26.5); installed on device via devicectl (exit 0).

Build-environment note: the host's Xcode auto-updated to 26.5 mid-review, which removed device support and the Metal toolchain (SwiftTerm ships a .metal shader). Resolved with xcodebuild -downloadPlatform iOS + xcodebuild -downloadComponent MetalToolchain; the build is green only after both. Pure-environment, not a code issue.

🟡 Non-blocking follow-ups (not addressed here)

  • SSHManager.invalidatePendingTrust() lacks @MainActor though it touches @Published trust state. No crash today — its only caller disconnect() runs on the MainActor. Annotation gap worth closing later.
  • Android AndroidHostKeyRepository.add() decodes with java.util.Base64 while the rest of the file uses android.util.Base64 (NO_WRAP). Dead code under StrictHostKeyChecking=yes (JSch never calls it), so zero runtime impact — fix for consistency when convenient.
  • Minor: GateError.Code.unknown raw value can collide with a real error:unknown wire response; VerificationView poll loop swallows non-auth SSH errors and burns the 60s budget. Both low-severity.

✅ Verdict

Merge-blocking correctness bug resolved, algorithm-verified against ssh-keygen, and build-verified on device. Remaining items are minor and tracked above. Staying as draft per the plan to resume this branch later.

omriariav added a commit that referenced this pull request May 30, 2026
… consistency

Three non-blocking review findings:

- SSHManager.invalidatePendingTrust() touched @Published / MainActor-isolated
  trust state without isolation. Mark it @MainActor and move its call out of
  disconnect()'s synchronous body into the existing main-actor Task (disconnect
  does blocking NIO calls, so it can't itself be @MainActor). FIFO main-actor
  ordering means the old attempt is cleared before connect()'s new
  currentAttemptID lands, so a fresh first-trust prompt isn't invalidated.

- GateError.Code.unknown used raw value "error:unknown", which could collide
  with a real gate `error:unknown` wire response. Every gate code starts with
  `error:`, so change the fallback's raw value to "handoff:unrecognized" — it
  can never decode from the wire and is only reached via the `?? .unknown`
  fallback, which preserves rawCode for display.

- Android AndroidHostKeyRepository.add() decoded with java.util.Base64 while
  HostKeyStore round-trips via android.util.Base64 (NO_WRAP). Dead code under
  StrictHostKeyChecking=yes (JSch never calls add()), but align the decoder so
  it stays correct if strict checking is ever relaxed.

iOS builds clean and installs on device. The VerificationView poll-loop
"swallows non-auth errors" finding was intentionally NOT changed: retrying
transient non-auth errors within the 60s budget is correct pairing behavior;
fast-failing there would reduce robustness.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
omriariav commented May 30, 2026

Collaborator Author

Follow-up — non-blocking iOS review nits addressed (ae1faac)

Cleared the iOS items from the review pass. Builds clean (BUILD SUCCEEDED, 0 errors) and installs on device (iPhone 17, iOS 26.5).

  • SSHManager.invalidatePendingTrust() MainActor isolation ✅ — now @MainActor; its call moved out of disconnect()'s synchronous body into the existing main-actor Task (disconnect does blocking NIO, so it can't itself be @MainActor). FIFO main-actor ordering clears the old attempt before connect()'s new currentAttemptID lands, so an in-flight first-trust prompt for the new connection isn't invalidated.
  • GateError.Code.unknown wire collision ✅ — fallback raw value changed from "error:unknown" to "handoff:unrecognized". Every gate code starts with error:, so the fallback can never decode straight from the wire; it's only reached via ?? .unknown, which preserves rawCode for display.

Intentionally not changed:

  • The VerificationView poll loop "swallows non-auth errors." Retrying transient non-auth errors (Tailscale settling, Mac waking) within the 60s budget is correct pairing behavior — typed gate errors and SSH-auth failures already exit fast via their own catches. Left as-is by design.
  • The Android AndroidHostKeyRepository.add() Base64 nit — dropped from this branch. It's Android-side and this is an iOS-focused branch; keeping it out avoids scope creep. (It's dead code under StrictHostKeyChecking=yes regardless.)

Branch HEAD = ae1faac, in sync with origin. Still draft per the resume-later plan.

(Supersedes the earlier 1b38dea reference — that commit was amended to drop the Android change.)

Two non-blocking iOS review findings:

- SSHManager.invalidatePendingTrust() touched @Published / MainActor-isolated
  trust state without isolation. Mark it @MainActor and move its call out of
  disconnect()'s synchronous body into the existing main-actor Task (disconnect
  does blocking NIO calls, so it can't itself be @MainActor). FIFO main-actor
  ordering means the old attempt is cleared before connect()'s new
  currentAttemptID lands, so a fresh first-trust prompt isn't invalidated.

- GateError.Code.unknown used raw value "error:unknown", which could collide
  with a real gate `error:unknown` wire response. Every gate code starts with
  `error:`, so change the fallback's raw value to "handoff:unrecognized" — it
  can never decode from the wire and is only reached via the `?? .unknown`
  fallback, which preserves rawCode for display.

iOS builds clean and installs on device. The VerificationView poll-loop
"swallows non-auth errors" finding was intentionally NOT changed: retrying
transient non-auth errors within the 60s budget is correct pairing behavior.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
omriariav force-pushed the ios-reconnect-hardening branch from 1b38dea to ae1faac May 30, 2026 07:51
@omriariav

Collaborator Author

@SagiMedina — heads-up on the Android changes in this PR before you review.

This branch isn't iOS-only; it carries one Android commit from the April parity push:

  • 16d62b2 — "Android: add SSH host key verification" (Apr 19, 10 files, +565/−83)
    • New: data/HostKeyStore.kt, data/HostKeyValidation.kt, ui/screens/HostKeyDialogs.kt
    • Changed: data/SshManager.kt (TOFU via a custom HostKeyRepository, StrictHostKeyChecking=yes, mismatch/unknown-key exceptions), plus MainActivity.kt, data/ErrorMessages.kt, SessionsScreen.kt, SettingsScreen.kt, TerminalScreen.kt, VerificationScreen.kt.

This brings Android to host-key-verification parity with iOS (trust-on-first-use + mismatch dialogs).

Two things worth a closer look when you review the Android side:

  1. Base64 consistency in AndroidHostKeyRepository: HostKeyStore round-trips key bytes via android.util.Base64 (NO_WRAP), but add() decodes with java.util.Base64. It's currently dead code (under StrictHostKeyChecking=yes, JSch throws instead of calling add()), so there is zero runtime impact, but the decoders are mismatched if strict checking is ever relaxed. I had a one-line fix for it staged and then pulled it back out to keep this session's work iOS-only; flagging so it's on your radar as an Android-side decision rather than something I silently bundled.
  2. getDecoder().decode(config.privateKey) for the identity key in connect() also uses java.util.Base64 — fine as-is, just noting the same library split exists on the key-parsing path.

Everything else in the PR is iOS. The bulk of recent work (gate v2 client, reconnect hardening, host-key fingerprint fix, one-finger scroll) is Swift; full breakdown is in the review comments above.

Note: still marked draft — the plan is to resume this branch later, but flagging the Android scope now so you know what you're signing up to review.

@omriariav omriariav requested a review from SagiMedina May 30, 2026 08:00
2 participants