snap/read returns 0 elements on heavy JS SPAs (YouTube, X, Coupang) #1

@kiyeonjeon21

Problem

snact snap captures the DOM at Page.loadEventFired, which fires when the window load event fires, i.e. once the initial document and its subresources have loaded. On heavy JS SPAs like YouTube, Twitter/X, and Coupang, the actual content is rendered asynchronously after this event, so snap sees an empty or near-empty page.

Tested sites with 0 or minimal elements

| Site | Elements | Root cause |
|---|---|---|
| YouTube (youtube.com) | 3 | SPA: content rendered after load event |
| Twitter/X (x.com/explore) | 0 | Auth wall + SPA rendering |
| Coupang (coupang.com) | 0 | Bot detection + SPA |
| Amazon (amazon.com/dp/...) | 4 | Bot detection |

Sites like Reddit (SPA but server-rendered), GitHub (SPA with SSR), and Vercel (Next.js) work fine because their initial HTML includes meaningful content.

Proposed solution

Add a --wait strategy option to snap and read:

# Wait for network to be idle (no requests for 500ms)
snact snap https://youtube.com --wait=networkidle

# Wait for a specific selector to appear
snact snap https://youtube.com --wait="#contents"

# Wait N milliseconds after load
snact snap https://youtube.com --wait=3000
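
One way the three forms of the flag value could be told apart is sketched below. This is only an illustration of the proposal, not existing snact code: the `WaitStrategy` enum and `parse_wait` function are hypothetical names, and the rule "all digits means milliseconds, the literal `networkidle` means network idle, anything else is a CSS selector" is an assumption about how the flag would be disambiguated.

```rust
use std::time::Duration;

/// Hypothetical representation of the proposed `--wait` values.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum WaitStrategy {
    /// `--wait=networkidle`
    NetworkIdle,
    /// `--wait=3000`: fixed delay in milliseconds after the load event.
    Delay(Duration),
    /// `--wait="#contents"`: wait until a CSS selector matches.
    Selector(String),
}

/// Parse the raw flag value (assumed rules, see the text above).
pub fn parse_wait(raw: &str) -> Result<WaitStrategy, String> {
    let value = raw.trim();
    if value.is_empty() {
        return Err("--wait requires a value".to_string());
    }
    if value.eq_ignore_ascii_case("networkidle") {
        return Ok(WaitStrategy::NetworkIdle);
    }
    // An all-digit value is read as a delay in milliseconds.
    if value.bytes().all(|b| b.is_ascii_digit()) {
        let ms: u64 = value
            .parse()
            .map_err(|_| format!("delay out of range: {}", value))?;
        return Ok(WaitStrategy::Delay(Duration::from_millis(ms)));
    }
    // Everything else is passed through as a selector, unvalidated.
    Ok(WaitStrategy::Selector(value.to_string()))
}
```

With these rules a bare element selector spelled `networkidle` would be shadowed by the keyword. If that matters, a prefixed form such as a separate selector flag might be a cleaner design.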

Implementation notes

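  • Network idle: Use CDP Network.requestWillBeSent together with Network.loadingFinished and Network.loadingFailed events to track in-flight requests (without loadingFailed, a failed request would never be counted as done). Treat the page as settled once nothing is in flight and no new request has started for 500ms.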
  • Selector wait: Use DOM.querySelector in a polling loop (already have this pattern in action/wait.rs).
  • Timeout wait: Simple tokio::time::sleep.
  • The existing snact wait command already handles selector and navigation waits — this would integrate similar logic into snap/read as a pre-step.
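
The network-idle and selector bullets above could look roughly like the sketch below. It uses only the standard library so it can be read in isolation. `NetworkIdleTracker` and `poll_until` are hypothetical names, the CDP event wiring is left out, and the blocking `std::thread::sleep` stands in for `tokio::time::sleep` in an async implementation. Treating "idle" as "nothing in flight and no activity for the quiet period" is one possible reading of the 500ms rule.

```rust
use std::collections::HashSet;
use std::time::{Duration, Instant};

/// Tracks in-flight requests by CDP request id and remembers when the
/// network last changed state. Illustrative only, not snact's actual type.
pub struct NetworkIdleTracker {
    in_flight: HashSet<String>,
    last_activity: Instant,
    quiet_period: Duration,
}

impl NetworkIdleTracker {
    pub fn new(quiet_period: Duration, now: Instant) -> Self {
        Self { in_flight: HashSet::new(), last_activity: now, quiet_period }
    }

    /// Call on `Network.requestWillBeSent`. Redirects reuse the request id,
    /// so the set keeps a single entry per logical request.
    pub fn request_started(&mut self, request_id: &str, now: Instant) {
        self.in_flight.insert(request_id.to_string());
        self.last_activity = now;
    }

    /// Call on `Network.loadingFinished` and `Network.loadingFailed`.
    pub fn request_ended(&mut self, request_id: &str, now: Instant) {
        self.in_flight.remove(request_id);
        self.last_activity = now;
    }

    /// Idle: nothing in flight and no activity for at least the quiet period.
    pub fn is_idle(&self, now: Instant) -> bool {
        self.in_flight.is_empty()
            && now.saturating_duration_since(self.last_activity) >= self.quiet_period
    }
}

/// Generic polling loop for the selector wait. `probe` stands in for a
/// `DOM.querySelector` round trip and returns true once the node exists.
/// Returns false if the timeout elapses first.
pub fn poll_until(
    mut probe: impl FnMut() -> bool,
    interval: Duration,
    timeout: Duration,
) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if probe() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        std::thread::sleep(interval);
    }
}
```

Passing `now` in explicitly keeps the tracker testable without real timers. Whatever form the real code takes, the network-idle wait probably also needs an overall timeout, since pages with long polling or periodic analytics requests may never go quiet.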

What this does NOT solve

  • Bot detection (Coupang, Amazon): These sites actively block headless/automated browsers. This is outside snact's scope — users should use a real browser session with snact session load.
  • Auth walls (Twitter/X when logged out): Users need to log in via the browser first.

Priority

Medium — most popular sites (GitHub, Reddit, StackOverflow, Wikipedia, Google, news sites) already work. This primarily affects YouTube-class pure SPAs where SSR is minimal.

Labels: enhancement
