diff --git a/README.md b/README.md
index 9428793..647dc32 100644
--- a/README.md
+++ b/README.md
@@ -38,7 +38,7 @@ A community gallery for sharing single-card presets and adopting others' designs
 
 **Currently implemented.** 28 SVG card endpoints (`/api/*`), 17 built-in themes plus gist-hosted custom palettes via `theme_url=`, five bundled variable fonts, `/api/stack` composition with namespaced child IDs, a live playground at [profilekit.vercel.app](https://profilekit.vercel.app), and an MCP wrapper at [`@heznpc/profilekit-mcp`](https://www.npmjs.com/package/@heznpc/profilekit-mcp). Zero runtime dependencies, 30-minute CDN cache, deployed on Vercel.
 
-**Planned.** A single-card preset gallery at `/gallery` — adopt someone else's design URL as a starting point, then tweak parameters in the editor. Cross-agent preset compile (one preset → Claude Code, Cursor, Codex CLI configs). `theme_url=` adoption across the rest of the catalog.
+**Planned.** A single-card preset gallery at `/gallery` — adopt someone else's design URL as a starting point, then tweak parameters in the editor. Cross-agent preset compile (one preset → Claude Code, Cursor, Codex CLI configs).
 
 **Design intent.** *No ranking, composable presentation.* Each card is a parameter-only URL — every visual property exposed as a query string so the same endpoint renders in a GitHub README, a dev.to bio, a Hashnode header, or a slide cover with no template forking. The gallery is for *adoption*, not voting: you start from someone else's preset and edit it; we do not show which preset is "most popular." Pure SVG with CSS / SMIL keeps animations alive inside GitHub's image proxy and removes the JavaScript attack surface.
 
@@ -191,7 +191,7 @@ ProfileKit cards are plain SVG. They render anywhere a platform allows external
 | `/api/timeline` | Vertical timeline |
 | `/api/tags` | Tag cloud / skill pills |
 | `/api/toc` | Table of contents |
-| `/api/posts` | Latest posts from dev.to / Hashnode / RSS |
+| `/api/posts` | Latest posts from dev.to / Medium / RSS (Hashnode via its RSS feed) |
 | **Animations** | |
 | `/api/typing` | Typewriter text |
 | `/api/wave` | Layered animated sin waves |
@@ -277,7 +277,7 @@ The JSON shape mirrors the entries in `src/common/themes.js`:
 - Responses are cached for 30 minutes per URL.
 - On any failure (host not allowed, network, schema mismatch) the card falls back to the default `dark` palette and the response carries an `X-Theme-Error` header explaining why.
 
-**Currently supported by**: `/api/stats`, `/api/stack` (and any cards rendered through `/api/stack`). Other endpoints will adopt `theme_url=` as a follow-up — no behavior change for callers that don't use the parameter.
+**Currently supported by**: every card endpoint — `?theme_url=` is parsed by the shared option resolver (`src/common/options.js`), so `/api/stats`, `/api/hero`, `/api/posts`, `/api/stack`, and the rest all accept it. Cards rendered through `/api/stack` inherit the resolved palette.
 
 ## Common Options
 
@@ -468,11 +468,13 @@ Pick a card from the sidebar, tweak parameters in the right panel, copy the URL
 #### `/api/posts`
 | Param | Description |
 |-------|-------------|
-| `source` | `devto` (default) / `hashnode` / `medium` / `rss` |
-| `username` | Author username (devto / hashnode / medium) |
+| `source` | `devto` (default) / `medium` / `rss` |
+| `username` | Author username (devto / medium) |
 | `url` | Feed URL (rss source) |
 | `count` | Number of posts (default 5, max 10) |
 
+> Note: `source=hashnode` was retired in 2026-05 when Hashnode moved their GraphQL API behind a Pro-tier auth wall. Use `source=rss&url=https://.hashnode.dev/rss.xml` instead — the RSS path still works without auth.
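The migration described in the note can be sketched as a small URL builder. This is a hedged illustration rather than part of ProfileKit: the helper name `hashnodePostsCardUrl`, the blog host `example.hashnode.dev`, and the assumption that the API is served from the same origin as the playground are all hypothetical. Whether a given blog subdomain passes the feed-host allowlist is also assumed here, not confirmed.

```javascript
// Hypothetical helper (not in the ProfileKit codebase): builds a /api/posts
// card URL that reads a Hashnode blog through source=rss instead of the
// retired source=hashnode.
function hashnodePostsCardUrl(blogHost, count = 5) {
  // Assumption: the API lives on the same origin as the public playground.
  const cardUrl = new URL("https://profilekit.vercel.app/api/posts");
  cardUrl.searchParams.set("source", "rss");
  // The note above points at the blog's /rss.xml feed.
  cardUrl.searchParams.set("url", `https://${blogHost}/rss.xml`);
  // The README documents count as "default 5, max 10"; clamp to that range.
  cardUrl.searchParams.set("count", String(Math.max(1, Math.min(count, 10))));
  return cardUrl.toString();
}

// Usage with a placeholder blog host.
console.log(hashnodePostsCardUrl("example.hashnode.dev", 3));
```

Going through `URLSearchParams` means the nested feed URL is percent-encoded automatically, which avoids hand-escaping the `url=` value when the card URL is pasted into a README.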
+
 ### Animations
 
 #### `/api/typing`
diff --git a/SECURITY.md b/SECURITY.md
index faedf23..82cf2a4 100644
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -3,16 +3,28 @@
 ## Reporting a vulnerability
 
 ProfileKit is a serverless SVG endpoint service that fetches data from
-external URLs on behalf of users. The two user-controlled URL surfaces are:
+external URLs on behalf of users. The user-controlled URL surfaces are:
 
-- `?theme_url=` on `/api/stats` and `/api/stack` — see [`src/common/theme-url.js`](src/common/theme-url.js)
-- `?source=rss|medium&url=` on `/api/posts` — see [`src/fetchers/posts.js`](src/fetchers/posts.js)
+- `?theme_url=` on **every card endpoint** — parsed by the shared option
+  resolver in [`src/common/options.js`](src/common/options.js); the
+  underlying fetch lives in [`src/common/theme-url.js`](src/common/theme-url.js)
+  with a single-host allowlist (`gist.githubusercontent.com`), a 5-second
+  timeout that covers the body read, a 256 KB streaming byte cap, and
+  `redirect: "error"`.
+- `?source=rss|medium&url=` on `/api/posts` — see
+  [`src/fetchers/posts.js`](src/fetchers/posts.js). 13-host allowlist plus
+  https-only, `redirect: "error"`, 5-second timeout across the body read,
+  and a 2 MB streaming byte cap (the cap aborts the underlying controller
+  mid-stream if exceeded, so chunked / no-content-length responses cannot
+  OOM the function).
 
-Both apply a host allowlist, https-only, `redirect: "error"`, a hard timeout,
-and a response-body size cap. If you find a bypass — or any other issue
-(prompt injection through user-controlled text, XSS in rendered SVG,
-auth/scope issue with GitHub API token pool, theme schema escape) — please
-report it privately.
+Both fetch paths spread caller `init` BEFORE forcing `redirect: "error"`
+and `signal: controller.signal`, so a future caller cannot weaken either
+guard through their own `init`.
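The ordering guarantee in that paragraph rests on plain object-spread semantics: a key written after the spread overwrites the same key coming from the spread. A minimal sketch, assuming a hypothetical `buildGuardedInit` helper that stands in for the merge done inside the fetch wrapper (the name and shape are illustrative, not the project's code):

```javascript
// Hypothetical stand-in for the init merge inside a guarded fetch wrapper.
function buildGuardedInit(callerInit, internalSignal) {
  return {
    ...callerInit,          // caller-supplied options are copied first
    redirect: "error",      // written later, so it replaces any caller value
    signal: internalSignal, // likewise replaces a caller-supplied signal
  };
}

const timerController = new AbortController();
const callerSignal = new AbortController().signal;

const mergedInit = buildGuardedInit(
  {
    redirect: "follow",
    signal: callerSignal,
    headers: { Accept: "application/rss+xml" },
  },
  timerController.signal
);

console.log(mergedInit.redirect);                          // "error"
console.log(mergedInit.signal === timerController.signal); // true
console.log(mergedInit.headers.Accept);                    // "application/rss+xml"
```

Unrelated caller options such as `headers` pass through untouched, so the wrapper stays usable for POST bodies or custom headers while the two security-relevant keys remain fixed.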
+
+If you find a bypass — or any other issue (prompt injection through
+user-controlled text, XSS in rendered SVG, auth/scope issue with GitHub
+API token pool, theme schema escape) — please report it privately.
 
 **How to report:**
 
@@ -29,7 +41,7 @@ Please do **not** open a public issue for security reports.
 | Acknowledgement | within 7 days |
 | Coordinated disclosure | up to 90 days from report |
 
-Single-maintainer project — only the two endpoints above are committed.
+Single-maintainer project — only the surfaces above are committed.
 If a fix lands earlier, disclosure happens at fix time. If a fix needs more
 than 90 days (e.g., upstream dependency), we coordinate a longer window with
 the reporter.
@@ -44,10 +56,13 @@ Self-hosters: re-deploy from `main` after any security advisory.
 
 ## Out of scope
 
-- Denial of service via expensive `?source=rss` feeds — the per-request
+- Denial of service via expensive `?source=rss` feeds — the streaming
   body cap (2 MB) and timeout (5s) bound the surface; Vercel function
   duration (10s, see `vercel.json`) is the upper bound.
 - Rate-limit consumption of the deployer's GitHub API token pool — public
   data only, mitigated via the token pool design (`src/common/github-token.js`).
 - Issues in third-party platforms (GitHub Camo proxy, Notion image proxy,
   etc.) that affect how cards render — report to those platforms directly.
+- `source=hashnode` on `/api/posts` — Hashnode retired the free public
+  GraphQL API in 2026-05; the source is now disabled at the entry point
+  and returns a clean "use source=rss instead" error. No SSRF surface.
diff --git a/src/common/card.js b/src/common/card.js
index 5833e64..34042fd 100644
--- a/src/common/card.js
+++ b/src/common/card.js
@@ -51,6 +51,10 @@ function renderCard({ width, height, title, ariaLabel, colors, hideBorder, hideT
 
   // titleTarget is an optional data-cas-target hook for the playground
   // composer's inline-edit feature. Pure HTML attribute, no visual effect.
+  // TODO(2nd-pass-audit-2026-05-21): wrap titleTarget in escapeHtml as
+  // defense-in-depth. All current callers pass the literal "username", but
+  // an attribute interpolation without escape is a latent XSS regression
+  // path if a future caller threads user input through here.
   const titleAttr = titleTarget ? ` data-cas-target="${titleTarget}"` : "";
   const titleMarkup = hideTitle
     ? ""
diff --git a/src/common/options.js b/src/common/options.js
index 6ccfee8..9c704a0 100644
--- a/src/common/options.js
+++ b/src/common/options.js
@@ -66,11 +66,21 @@ function parseCardOptions(params) {
   };
 }
 
-async function resolveCardOptions(params) {
+// `prefetched` lets /api/stack avoid N+1 gist fetches: stack.js resolves the
+// top-level theme_url once, then passes { url, palette } (or { url, error })
+// here for every child slot whose theme_url matches. A child that overrides
+// with `.theme_url=` falls through to the live fetch path.
+async function resolveCardOptions(params, prefetched = null) {
   const opts = parseCardOptions(params);
   const themeUrl = params.get("theme_url");
   if (!themeUrl) return { opts, themeError: null };
 
+  if (prefetched && prefetched.url === themeUrl) {
+    if (prefetched.error) return { opts, themeError: prefetched.error };
+    const colors = applyOverrides(prefetched.palette, readColorOverrides(params));
+    return { opts: { ...opts, colors }, themeError: null };
+  }
+
   try {
     const externalPalette = await fetchExternalTheme(themeUrl);
     // External palette becomes the base; per-param color overrides still
@@ -85,9 +95,23 @@ async function resolveCardOptions(params) {
   }
 }
 
+// Single-shot prefetch for a top-level theme_url. Used by /api/stack to
+// resolve the gist once and reuse for every slot via `resolveCardOptions`'s
+// `prefetched` arg.
+async function prefetchExternalTheme(themeUrl) {
+  if (!themeUrl) return null;
+  try {
+    const palette = await fetchExternalTheme(themeUrl);
+    return { url: themeUrl, palette };
+  } catch (err) {
+    return { url: themeUrl, error: err.message };
+  }
+}
+
 module.exports = {
   parseCardOptions,
   resolveCardOptions,
+  prefetchExternalTheme,
   parseSearchParams,
   CARD_WIDTH_MIN,
   CARD_WIDTH_MAX,
diff --git a/src/common/theme-url.js b/src/common/theme-url.js
index cd4082a..e6437f6 100644
--- a/src/common/theme-url.js
+++ b/src/common/theme-url.js
@@ -29,6 +29,10 @@ const ALLOWED_HOSTS = new Set(["gist.githubusercontent.com"]);
 const REQUIRED_KEYS = ["bg", "title", "text", "muted", "icon", "border", "accentStops"];
 const TTL_MS = 30 * 60 * 1000;
 const FETCH_TIMEOUT_MS = 5000;
+// Real theme palettes are ~500 bytes. 256 KB is a generous ceiling that still
+// bounds the body buffer so a hostile or malformed gist cannot OOM the
+// function via a multi-MB JSON payload.
+const MAX_BODY_BYTES = 256 * 1024;
 const MAX_CACHE_ENTRIES = 128;
 
 // Insertion-ordered Map → cheapest possible LRU. On every set() we drop the
@@ -91,7 +95,7 @@ function validatePalette(json) {
 
 async function fetchExternalTheme(
   rawUrl,
-  { now = Date.now, fetchImpl = globalThis.fetch } = {}
+  { now = Date.now, fetchImpl = globalThis.fetch, timeoutMs = FETCH_TIMEOUT_MS } = {}
 ) {
   const url = validateUrl(rawUrl);
   const cached = cache.get(url.href);
@@ -100,37 +104,52 @@ async function fetchExternalTheme(
   }
 
   const controller = new AbortController();
-  const timeoutId = setTimeout(() => controller.abort(), FETCH_TIMEOUT_MS);
+  const timeoutId = setTimeout(() => controller.abort(), timeoutMs);
 
-  let res;
+  let palette;
   try {
-    res = await fetchImpl(url.href, {
+    // The timer MUST stay alive across the body read — a drip-fed body would
+    // otherwise hold the connection open after the fetch promise resolved.
+    // Both the headers-only fetch AND the body read are inside this try;
+    // clearTimeout fires in the outer finally only.
+    const res = await fetchImpl(url.href, {
       headers: { "User-Agent": "ProfileKit/1.0 (+theme_url)" },
       redirect: "error",
       signal: controller.signal,
     });
+    if (!res.ok) {
+      throw new ThemeUrlError(`theme_url fetch failed: HTTP ${res.status}`);
+    }
+    // Pre-read body cap. Sanitized as non-negative integer so a
+    // `Content-Length: -1` from a buggy upstream can't bypass the check.
+    const declaredLen = Number(res.headers.get("content-length"));
+    if (Number.isInteger(declaredLen) && declaredLen >= 0 && declaredLen > MAX_BODY_BYTES) {
+      throw new ThemeUrlError(`theme_url payload too large: ${declaredLen} bytes`);
+    }
+    // Streaming byte counter — caps memory even for chunked / lying-content-
+    // length responses. Aborts mid-read once the cap is exceeded.
+    const text = await readBodyCapped(res, controller);
+    let json;
+    try {
+      json = JSON.parse(text);
+    } catch {
+      throw new ThemeUrlError("theme_url payload is not valid JSON");
+    }
+    palette = validatePalette(json);
   } catch (err) {
-    if (err.name === "AbortError") {
+    // Distinguish timer-induced aborts from upstream-surfaced AbortErrors by
+    // consulting the controller, not just the duck-typed error name. Undici
+    // can surface AbortError for non-timeout reasons (mid-stream server
+    // reset) and we don't want to mislabel those as timeouts.
+    if (controller.signal.aborted && err.name === "AbortError") {
       throw new ThemeUrlError("theme_url fetch timed out");
     }
+    if (err instanceof ThemeUrlError) throw err;
     throw new ThemeUrlError(`theme_url fetch failed: ${err.message}`);
   } finally {
     clearTimeout(timeoutId);
   }
 
-  if (!res.ok) {
-    throw new ThemeUrlError(`theme_url fetch failed: HTTP ${res.status}`);
-  }
-
-  const text = await res.text();
-  let json;
-  try {
-    json = JSON.parse(text);
-  } catch {
-    throw new ThemeUrlError("theme_url payload is not valid JSON");
-  }
-
-  const palette = validatePalette(json);
   cache.set(url.href, { palette, expiresAt: now() + TTL_MS });
   if (cache.size > MAX_CACHE_ENTRIES) {
     cache.delete(cache.keys().next().value);
@@ -138,6 +157,42 @@ async function fetchExternalTheme(
   return palette;
 }
 
+// Read a Response body chunk by chunk, tracking bytes against the cap.
+// Aborts the underlying controller and throws if the cap is exceeded mid-
+// stream. Falls back to res.text() for mocks / older Response shapes that
+// don't expose a streamable body — in that case the cap is enforced after
+// the read using Buffer.byteLength (UTF-8), which is correct for bytes but
+// only as good as the mock's willingness to materialize the whole body.
+async function readBodyCapped(res, controller) {
+  if (!res.body || typeof res.body.getReader !== "function") {
+    const text = await res.text();
+    if (Buffer.byteLength(text, "utf8") > MAX_BODY_BYTES) {
+      throw new ThemeUrlError(`theme_url payload too large: ${Buffer.byteLength(text, "utf8")} bytes`);
+    }
+    return text;
+  }
+  const reader = res.body.getReader();
+  const decoder = new TextDecoder("utf-8");
+  const parts = [];
+  let bytes = 0;
+  try {
+    while (true) {
+      const { done, value } = await reader.read();
+      if (done) break;
+      bytes += value.byteLength;
+      if (bytes > MAX_BODY_BYTES) {
+        try { controller.abort(); } catch { /* ignore */ }
+        throw new ThemeUrlError(`theme_url payload too large: ${bytes} bytes`);
+      }
+      parts.push(decoder.decode(value, { stream: true }));
+    }
+    parts.push(decoder.decode());
+    return parts.join("");
+  } finally {
+    try { reader.releaseLock(); } catch { /* ignore */ }
+  }
+}
+
 function clearCache() {
   cache.clear();
 }
@@ -152,5 +207,6 @@ module.exports = {
   REQUIRED_KEYS,
   TTL_MS,
   FETCH_TIMEOUT_MS,
+  MAX_BODY_BYTES,
   MAX_CACHE_ENTRIES,
 };
diff --git a/src/endpoints/catalog.js b/src/endpoints/catalog.js
index 6ad77a1..80e9670 100644
--- a/src/endpoints/catalog.js
+++ b/src/endpoints/catalog.js
@@ -83,7 +83,7 @@ const CARDS = {
     common_params: ["theme", "width"],
   },
   posts: {
-    description: "Latest posts from devto/hashnode/medium/rss",
+    description: "Latest posts from devto/medium/rss (hashnode source retired 2026-05 — use rss against your Hashnode blog's /rss feed)",
     required: ["source"],
     common_params: ["username", "url", "count", "theme"],
   },
@@ -160,21 +160,56 @@ const CARDS = {
   },
 };
 
+// Params accepted by every card endpoint via the shared option resolver in
+// src/common/options.js. Injected into each card's common_params at response
+// time rather than duplicated into every CARDS entry — single edit if a new
+// universal param appears.
+const UNIVERSAL_PARAMS = [
+  "theme",
+  "theme_url",
+  "font",
+  "bg_color",
+  "text_color",
+  "title_color",
+  "icon_color",
+  "border_color",
+  "accent_color",
+  "hide_border",
+  "hide_title",
+  "hide_bar",
+  "border_radius",
+  "card_width",
+];
+
+function buildCatalogResponse() {
+  const cards = {};
+  for (const [name, entry] of Object.entries(CARDS)) {
+    // health/catalog are utility endpoints — they don't render cards, so
+    // they don't accept the universal card-rendering params.
+    if (name === "health" || name === "catalog") {
+      cards[name] = entry;
+      continue;
+    }
+    const merged = new Set([
+      ...(entry.common_params || []),
+      ...UNIVERSAL_PARAMS,
+    ]);
+    cards[name] = { ...entry, common_params: Array.from(merged) };
+  }
+  return {
+    version: CATALOG_VERSION,
+    cards,
+    themes: Object.keys(themes),
+  };
+}
+
 module.exports = async (req, res) => {
   res.setHeader("Content-Type", "application/json; charset=utf-8");
   res.setHeader("Cache-Control", cacheHeaders());
-  return res.send(
-    JSON.stringify(
-      {
-        version: CATALOG_VERSION,
-        cards: CARDS,
-        themes: Object.keys(themes),
-      },
-      null,
-      2
-    )
-  );
+  return res.send(JSON.stringify(buildCatalogResponse(), null, 2));
 };
 
 module.exports.CATALOG_VERSION = CATALOG_VERSION;
 module.exports.CARDS = CARDS;
+module.exports.UNIVERSAL_PARAMS = UNIVERSAL_PARAMS;
+module.exports.buildCatalogResponse = buildCatalogResponse;
diff --git a/src/endpoints/posts.js b/src/endpoints/posts.js
index f233836..59795b6 100644
--- a/src/endpoints/posts.js
+++ b/src/endpoints/posts.js
@@ -22,7 +22,7 @@ module.exports = async (req, res) => {
   res.setHeader("Content-Type", "image/svg+xml");
   if (themeError) res.setHeader("X-Theme-Error", themeError);
 
-  if ((source === "devto" || source === "hashnode") && !username) {
+  if (source === "devto" && !username) {
     res.setHeader("Cache-Control", errorCacheHeaders("bad_input"));
     return res.send(renderError("Missing ?username= parameter", { colors, font }));
   }
diff --git a/src/endpoints/stack.js b/src/endpoints/stack.js
index c728b78..932d8d9 100644
--- a/src/endpoints/stack.js
+++ b/src/endpoints/stack.js
@@ -22,6 +22,7 @@ const { stackVertical } = require("../common/stack");
 const {
   parseSearchParams,
   resolveCardOptions,
+  prefetchExternalTheme,
 } = require("../common/options");
 const {
   parseArray,
@@ -151,8 +152,14 @@ function scopeParams(params, cardName) {
 
 async function buildStack(params) {
   const themeErrors = [];
+
+  // Resolve the top-level theme_url ONCE and reuse for every slot whose
+  // theme_url matches. Avoids the 1+N concurrent gist fetch fan-out a naive
+  // per-slot resolveCardOptions would issue on cold cache.
+  const prefetched = await prefetchExternalTheme(params.get("theme_url"));
+
   const { opts: baseOpts, themeError: baseThemeError } =
-    await resolveCardOptions(params);
+    await resolveCardOptions(params, prefetched);
   if (baseThemeError) themeErrors.push(`base: ${baseThemeError}`);
 
   const cardList = parseArray(params.get("cards"));
@@ -196,8 +203,12 @@ async function buildStack(params) {
       };
     }
     const scoped = scopeParams(params, cardName);
+    // Pass the prefetched top-level theme so the per-slot resolveCardOptions
+    // doesn't re-fetch the gist. A child overriding with its own
+    // `.theme_url=` URL falls through to the live fetch path inside
+    // resolveCardOptions (different URL → no cache hit on `prefetched`).
     const { opts: cardOpts, themeError: cardThemeError } =
-      await resolveCardOptions(scoped);
+      await resolveCardOptions(scoped, prefetched);
     try {
       const svg = await builder(scoped, cardOpts);
       return { svg, themeError: cardThemeError, cardName };
diff --git a/src/fetchers/posts.js b/src/fetchers/posts.js
index 360d548..3cbc485 100644
--- a/src/fetchers/posts.js
+++ b/src/fetchers/posts.js
@@ -3,6 +3,12 @@
 // and a body cap (a 100 MB feed would OOM the function). The timer must stay
 // alive across the body read — drip-fed responses can otherwise hold the
 // connection open after fetch() resolves.
+// Both this file and src/common/theme-url.js use the same 5s fetch timeout
+// and the same streaming byte-count cap (different size: 2 MB here, 256 KB
+// for theme palettes — they're intentionally different). If a third
+// user-controlled fetch surface appears, co-locate the constants in a
+// dedicated fetch-config module — NOT in utils.js, which has fanout to
+// every endpoint and shouldn't inherit network concerns.
 const FETCH_TIMEOUT_MS = 5000;
 const MAX_BODY_BYTES = 2_000_000;
 const HEADERS = { "User-Agent": "profilekit-posts-card" };
@@ -15,16 +21,14 @@ const HEADERS = { "User-Agent": "profilekit-posts-card" };
 // loopback-mounted services, or cloud provider instance metadata. The host
 // allowlist plus `redirect: "error"` (set in fetchCapped) plus scheme check
 // in validateFeedUrl closes the three classic SSRF bypass routes (direct,
-// redirect, scheme smuggling). Hashnode's own GraphQL API endpoint
-// (gql.hashnode.com) is here too because fetchHashnode pipes through
-// fetchCapped — its allowlist membership is what keeps that call path from
-// becoming a different SSRF surface if the host is ever parameterized.
+// redirect, scheme smuggling). hashnode.dev is kept for `source=rss`
+// against a Hashnode blog's /rss feed; the retired gql.hashnode.com is no
+// longer hit by any code path.
 const ALLOWED_FEED_HOSTS = [
   "medium.com",
   "dev.to",
   "hashnode.dev",
   "hashnode.com",
-  "gql.hashnode.com",
   "substack.com",
   "github.io",
   "wordpress.com",
@@ -64,34 +68,49 @@ function validateFeedUrl(raw) {
   return url;
 }
 
-async function fetchCapped(url, init = {}) {
+async function fetchCapped(url, init = {}, opts) {
+  // Guard against callers passing `null` as the options bag — destructuring
+  // defaults only kick in for `undefined`. `opts || {}` covers a `null` bag;
+  // `fetchImpl || globalThis.fetch` below covers `{ fetchImpl: null }`.
+  const { fetchImpl, timeoutMs } = opts || {};
+  const realFetch = fetchImpl || globalThis.fetch;
+  const timeout = typeof timeoutMs === "number" ? timeoutMs : FETCH_TIMEOUT_MS;
+
   const controller = new AbortController();
-  const timer = setTimeout(() => controller.abort(), FETCH_TIMEOUT_MS);
+  const timer = setTimeout(() => controller.abort(), timeout);
   try {
     // `redirect: "error"` defends against a classic SSRF bypass where an
     // allowlisted host returns a 302 pointing at an internal resource — we
-    // refuse to follow any redirect at all. A caller that needs to follow
-    // redirects has to override this explicitly.
-    const res = await fetch(url, {
-      redirect: "error",
+    // refuse to follow any redirect at all. Caller's init is spread FIRST so
+    // `redirect` and `signal` are non-overridable — any future caller passing
+    // `{ redirect: "follow" }` or `{ signal: theirOwn }` cannot weaken these.
+    const res = await realFetch(url, {
       ...init,
+      redirect: "error",
       signal: controller.signal,
     });
     if (!res.ok) {
       throw new Error(`HTTP ${res.status}`);
     }
-    const len = Number(res.headers.get("content-length"));
-    if (Number.isFinite(len) && len > MAX_BODY_BYTES) {
-      throw new Error(`Response too large: ${len} bytes`);
+    // Pre-read cap. Sanitized as non-negative integer so a malformed
+    // `Content-Length: -1` or `Content-Length: 1.5e10` from a buggy or
+    // hostile upstream cannot bypass the check via NaN / negative-comparison.
+    const declaredLen = Number(res.headers.get("content-length"));
+    if (Number.isInteger(declaredLen) && declaredLen >= 0 && declaredLen > MAX_BODY_BYTES) {
+      throw new Error(`Response too large: ${declaredLen} bytes`);
     }
-    const text = await res.text();
-    if (text.length > MAX_BODY_BYTES) {
-      throw new Error(`Response too large: ${text.length} bytes`);
-    }
-    return text;
+    // Streaming byte counter — caps real bytes (not UTF-16 code units), and
+    // prevents OOM mid-read for chunked / no-content-length responses by
+    // aborting the controller the moment the cap is crossed.
+    return await readBodyCapped(res, controller);
   } catch (e) {
-    if (e.name === "AbortError") {
-      throw new Error(`Fetch timed out after ${FETCH_TIMEOUT_MS}ms`);
+    // Distinguish timer-induced aborts from upstream-surfaced AbortErrors
+    // by consulting the controller, not just the duck-typed error name.
+    // Undici can throw AbortError for non-timeout reasons (mid-stream
+    // server reset); mislabeling those as "timed out" sends the operator
+    // chasing a phantom slowness bug.
+    if (controller.signal.aborted && (e.name === "AbortError" || e.message === "aborted")) {
+      throw new Error(`Fetch timed out after ${timeout}ms`);
     }
     throw e;
   } finally {
@@ -99,6 +118,41 @@ async function fetchCapped(url, init = {}) {
   }
 }
 
+// Stream-read a Response body, throwing if the running byte count exceeds
+// MAX_BODY_BYTES. Falls back to res.text() for mock responses (and pre-fetch
+// Response shapes) that don't expose a streamable body; the cap is then
+// applied post-read against UTF-8 byte length.
+async function readBodyCapped(res, controller) {
+  if (!res.body || typeof res.body.getReader !== "function") {
+    const text = await res.text();
+    const byteLen = Buffer.byteLength(text, "utf8");
+    if (byteLen > MAX_BODY_BYTES) {
+      throw new Error(`Response too large: ${byteLen} bytes`);
+    }
+    return text;
+  }
+  const reader = res.body.getReader();
+  const decoder = new TextDecoder("utf-8");
+  const parts = [];
+  let bytes = 0;
+  try {
+    while (true) {
+      const { done, value } = await reader.read();
+      if (done) break;
+      bytes += value.byteLength;
+      if (bytes > MAX_BODY_BYTES) {
+        try { controller.abort(); } catch { /* ignore */ }
+        throw new Error(`Response too large: ${bytes} bytes`);
+      }
+      parts.push(decoder.decode(value, { stream: true }));
+    }
+    parts.push(decoder.decode());
+    return parts.join("");
+  } finally {
+    try { reader.releaseLock(); } catch { /* ignore */ }
+  }
+}
+
 function decodeEntities(str) {
   if (!str) return "";
   return String(str)
@@ -177,44 +231,16 @@ async function fetchDevTo(username, count) {
   }));
 }
 
-async function fetchHashnode(username, count) {
-  const query = `query Posts($host: String!, $first: Int!) {
-    publication(host: $host) {
-      posts(first: $first) {
-        edges {
-          node {
-            title
-            url
-            publishedAt
-            brief
-            readTimeInMinutes
-            reactionCount
-          }
-        }
-      }
-    }
-  }`;
-  const text = await fetchCapped("https://gql.hashnode.com/", {
-    method: "POST",
-    headers: { "Content-Type": "application/json", ...HEADERS },
-    body: JSON.stringify({
-      query,
-      variables: { host: `${username}.hashnode.dev`, first: count },
-    }),
-  }).catch((e) => {
-    throw new Error(`Hashnode API error: ${e.message}`);
-  });
-  const json = JSON.parse(text);
-  const edges = json?.data?.publication?.posts?.edges;
-  if (!edges) throw new Error("Hashnode publication not found");
-  return edges.map(({ node }) => ({
-    title: node.title,
-    url: node.url,
-    published: node.publishedAt,
-    description: node.brief,
-    readingTime: node.readTimeInMinutes,
-    reactions: node.reactionCount,
-  }));
+// Hashnode retired the free public GraphQL API in 2026-05 — gql.hashnode.com
+// now returns 301 to https://hashnode.com/announcements/graphql-api. There is
+// no free path forward; the paid Pro-tier API requires per-user auth that
+// ProfileKit's "drop URL in a README" model cannot supply. The source stays
+// listed as removed (with a helpful error) instead of silently failing the
+// fetch via redirect:error.
+function fetchHashnodeRetired() {
+  throw new Error(
+    "hashnode source is retired (gql.hashnode.com requires Pro-tier auth as of 2026-05). Use source=rss with your Hashnode blog's /rss feed URL instead."
+  );
 }
 
 async function fetchRssUrl(rawUrl, count) {
@@ -234,7 +260,7 @@ async function fetchRssUrl(rawUrl, count) {
 async function fetchPosts({ source, username, url, count }) {
   const n = Math.max(1, Math.min(count, 10));
   if (source === "devto") return fetchDevTo(username, n);
-  if (source === "hashnode") return fetchHashnode(username, n);
+  if (source === "hashnode") return fetchHashnodeRetired();
   if (source === "rss" || source === "medium") {
     let feedUrl = url;
     if (source === "medium" && username && !feedUrl) {
@@ -267,4 +293,7 @@ module.exports = {
   validateFeedUrl,
   isAllowedFeedHost,
   ALLOWED_FEED_HOSTS,
+  fetchCapped,
+  FETCH_TIMEOUT_MS,
+  MAX_BODY_BYTES,
 };
diff --git a/tests/catalog.test.js b/tests/catalog.test.js
new file mode 100644
index 0000000..9459952
--- /dev/null
+++ b/tests/catalog.test.js
@@ -0,0 +1,81 @@
+const test = require("node:test");
+const assert = require("node:assert/strict");
+const {
+  buildCatalogResponse,
+  UNIVERSAL_PARAMS,
+  CARDS,
+} = require("../src/endpoints/catalog");
+
+// /api/catalog is the machine-readable discovery surface consumed by
+// @heznpc/profilekit-mcp and any client that doesn't scrape README. The
+// README promise of "theme_url on every card endpoint" has to be reflected
+// here too — otherwise discovery clients silently disagree with the docs.
+
+test("buildCatalogResponse advertises theme_url on every card endpoint", () => {
+  const { cards } = buildCatalogResponse();
+  for (const [name, entry] of Object.entries(cards)) {
+    if (name === "health" || name === "catalog") continue;
+    assert.ok(
+      entry.common_params.includes("theme_url"),
+      `card "${name}" must list theme_url in common_params`
+    );
+  }
+});
+
+test("buildCatalogResponse advertises the full universal param set on every card", () => {
+  const { cards } = buildCatalogResponse();
+  for (const [name, entry] of Object.entries(cards)) {
+    if (name === "health" || name === "catalog") continue;
+    for (const p of UNIVERSAL_PARAMS) {
+      assert.ok(
+        entry.common_params.includes(p),
+        `card "${name}" missing universal param "${p}"`
+      );
+    }
+  }
+});
+
+test("buildCatalogResponse preserves card-specific common_params alongside universals", () => {
+  // stats had username, hide, layout, etc. — none of those should disappear.
+  const { cards } = buildCatalogResponse();
+  const originalStats = CARDS.stats.common_params || [];
+  for (const p of originalStats) {
+    assert.ok(
+      cards.stats.common_params.includes(p),
+      `stats common_params lost original "${p}"`
+    );
+  }
+});
+
+test("buildCatalogResponse does NOT add card params to the health endpoint", () => {
+  // health is a diagnostic endpoint — it doesn't render cards and shouldn't
+  // pretend to accept the universal palette params.
+  const { cards } = buildCatalogResponse();
+  assert.deepEqual(
+    cards.health.common_params,
+    CARDS.health.common_params,
+    "health should not inherit card universal params"
+  );
+});
+
+test("buildCatalogResponse deduplicates if a card already listed a universal param", () => {
+  // hero's common_params already include "theme" and "font". The merge must
+  // not produce duplicates.
+  const { cards } = buildCatalogResponse();
+  const set = new Set();
+  for (const p of cards.hero.common_params) {
+    assert.ok(!set.has(p), `hero common_params has duplicate "${p}"`);
+    set.add(p);
+  }
+});
+
+test("posts source description no longer claims hashnode works directly", () => {
+  // README + catalog have to agree: hashnode source is retired. The catalog
+  // description should not say "devto/hashnode/medium" without the retirement
+  // note — otherwise a discovery client offers a dead option.
+  assert.match(
+    CARDS.posts.description,
+    /retired|rss/i,
+    "posts description must signal that hashnode is no longer a direct source"
+  );
+});
diff --git a/tests/posts-security.test.js b/tests/posts-security.test.js
index 470c7db..b7472f9 100644
--- a/tests/posts-security.test.js
+++ b/tests/posts-security.test.js
@@ -4,6 +4,9 @@ const {
   validateFeedUrl,
   isAllowedFeedHost,
   ALLOWED_FEED_HOSTS,
+  fetchCapped,
+  FETCH_TIMEOUT_MS,
+  MAX_BODY_BYTES,
 } = require("../src/fetchers/posts");
 
 // --- isAllowedFeedHost ---
@@ -126,3 +129,280 @@ test("ALLOWED_FEED_HOSTS includes the baseline blog platforms", () => {
     );
   }
 });
+
+// --- fetchCapped wire-level guards (mocked fetch) ---
+//
+// SECURITY.md promises four properties of the user-controlled fetch path:
+//   1. redirect: "error" — no following 302 to internal IPs
+//   2. AbortController timeout — slow remote can't hold the function open
+//   3. content-length cap — declared-length large responses are rejected
+//   4. body-byte cap — bodies that exceed cap during streaming are aborted
+//
+// These tests verify each property by spying on the init args passed to fetch
+// and by returning crafted responses. Without them, a refactor that drops any
+// one guard would not fail a single existing test.
+//
+// Header fidelity: use real `Headers` (Node 18+) rather than a `Map`, so a
+// future refactor to `get('Content-Length')` (capital) still matches against
+// the same case-insensitive `Headers.get` production uses.
+
+function makeResponse({ ok = true, status = 200, headers = {}, body = "" } = {}) {
+  return {
+    ok,
+    status,
+    headers: new Headers(headers),
+    text: async () => body,
+    // No `body` stream — exercises fetchCapped's res.text() fallback path,
+    // which is what production also hits when Undici delivers a small or
+    // non-streamable response.
+  };
+}
+
+test("fetchCapped passes redirect:'error' to the underlying fetch", async () => {
+  let capturedInit;
+  await fetchCapped(
+    "https://medium.com/feed/@user",
+    {},
+    {
+      fetchImpl: async (_url, init) => {
+        capturedInit = init;
+        return makeResponse();
+      },
+    }
+  );
+  assert.equal(
+    capturedInit.redirect,
+    "error",
+    "redirect:error must be set so an allowlisted host's 302 to IMDS can't bypass SSRF guards"
+  );
+});
+
+test("fetchCapped's redirect:'error' is non-overridable by caller init", async () => {
+  // Defense in depth: even if a caller mistakenly passes `{ redirect: "follow" }`,
+  // the wrapper must still enforce redirect:error. Caller init is spread BEFORE
+  // the security defaults in the implementation, so the guard always wins.
+  let capturedInit;
+  await fetchCapped(
+    "https://medium.com/feed/@user",
+    { redirect: "follow" },
+    {
+      fetchImpl: async (_url, init) => {
+        capturedInit = init;
+        return makeResponse();
+      },
+    }
+  );
+  assert.equal(
+    capturedInit.redirect,
+    "error",
+    "redirect must remain 'error' even when caller asks for 'follow'"
+  );
+});
+
+test("fetchCapped's signal is non-overridable by caller init", async () => {
+  // Same defense-in-depth as redirect, but for the signal property. A caller
+  // passing their own signal would (without this guard) replace the internal
+  // timer-controller, silently defeating the hard-timeout contract.
+  let capturedInit;
+  const callerSignal = new AbortController().signal;
+  await fetchCapped(
+    "https://medium.com/feed/@user",
+    { signal: callerSignal },
+    {
+      fetchImpl: async (_url, init) => {
+        capturedInit = init;
+        return makeResponse();
+      },
+    }
+  );
+  assert.notEqual(
+    capturedInit.signal,
+    callerSignal,
+    "signal must remain the internal timer controller's signal, never the caller's"
+  );
+});
+
+test("fetchCapped rejects when declared content-length exceeds the cap, and does NOT read the body", async () => {
+  // The whole point of the pre-read cap is to avoid materializing a giant
+  // body. A test that only checks the rejection message can't distinguish
+  // pre-read rejection from post-read rejection. Spy text() to lock that in.
+  let textCalled = false;
+  await assert.rejects(
+    fetchCapped(
+      "https://medium.com/feed/@user",
+      {},
+      {
+        fetchImpl: async () => ({
+          ok: true,
+          status: 200,
+          headers: new Headers({ "content-length": String(MAX_BODY_BYTES + 1) }),
+          text: async () => {
+            textCalled = true;
+            return "should not be read";
+          },
+        }),
+      }
+    ),
+    /Response too large/
+  );
+  assert.equal(textCalled, false, "pre-read cap must throw before text() is invoked");
+});
+
+test("fetchCapped accepts a sane content-length", async () => {
+  // Mirror of the cap test: a legitimate declared length should NOT trip the
+  // pre-read check. Catches a refactor that flips the > to >=.
+  const out = await fetchCapped(
+    "https://medium.com/feed/@user",
+    {},
+    {
+      fetchImpl: async () =>
+        makeResponse({ headers: { "content-length": String(MAX_BODY_BYTES) }, body: "ok" }),
+    }
+  );
+  assert.equal(out, "ok");
+});
+
+test("fetchCapped ignores negative or non-integer content-length", async () => {
+  // A buggy/hostile upstream sending `Content-Length: -1` must not bypass
+  // the cap by exploiting Number('-1') === -1 (finite, not greater than cap).
+  // The post-read fallback should still reject if the actual body is over.
+  const oversize = "x".repeat(MAX_BODY_BYTES + 10);
+  await assert.rejects(
+    fetchCapped(
+      "https://medium.com/feed/@user",
+      {},
+      {
+        fetchImpl: async () =>
+          makeResponse({
+            headers: { "content-length": "-1" },
+            body: oversize,
+          }),
+      }
+    ),
+    /Response too large/
+  );
+});
+
+test("fetchCapped rejects body that exceeds the cap (counted in bytes, not chars)", async () => {
+  // The cap is named MAX_BODY_BYTES; enforce it in bytes. Multibyte UTF-8
+  // text whose char count fits the cap but whose byte count doesn't must
+  // still be rejected. 800K emoji = ~3.2 MB UTF-8 (4 bytes each).
+  const multibyte = "🔥".repeat(800_000);
+  await assert.rejects(
+    fetchCapped(
+      "https://medium.com/feed/@user",
+      {},
+      {
+        fetchImpl: async () => makeResponse({ body: multibyte }),
+      }
+    ),
+    /Response too large/
+  );
+});
+
+test("fetchCapped translates timer-induced AbortError into a clean timeout message", async () => {
+  // Use a 10ms override timeout so we don't wait the real 5s budget. The
+  // fetchImpl hangs on the signal so the abort-via-timer is the only way to
+  // resolve it — that's the exact production path being asserted.
+  await assert.rejects(
+    fetchCapped(
+      "https://medium.com/feed/@user",
+      {},
+      {
+        timeoutMs: 10,
+        fetchImpl: (_url, init) =>
+          new Promise((_resolve, reject) => {
+            init.signal.addEventListener("abort", () => {
+              const err = new Error("aborted");
+              err.name = "AbortError";
+              reject(err);
+            });
+          }),
+      }
+    ),
+    /timed out after 10ms/
+  );
+});
+
+test("fetchCapped does NOT mislabel non-timer AbortError as a timeout", async () => {
+  // Undici can surface AbortError for non-timeout reasons (server-side
+  // reset mid-stream). The catch path checks controller.signal.aborted
+  // before translating, so this synthetic non-timer abort must propagate
+  // with its own message — not be re-labeled "timed out".
+  await assert.rejects(
+    fetchCapped(
+      "https://medium.com/feed/@user",
+      {},
+      {
+        fetchImpl: async () => {
+          const err = new Error("upstream connection reset");
+          err.name = "AbortError";
+          throw err;
+        },
+      }
+    ),
+    /upstream connection reset/
+  );
+});
+
+test("fetchCapped throws on non-ok response with the status surfaced", async () => {
+  await assert.rejects(
+    fetchCapped(
+      "https://medium.com/feed/@user",
+      {},
+      {
+        fetchImpl: async () => makeResponse({ ok: false, status: 503 }),
+      }
+    ),
+    /HTTP 503/
+  );
+});
+
+test("fetchCapped tolerates null options arg (treats as defaults)", async () => {
+  // `null` is a common 'no options' sentinel. The wrapper must not crash
+  // destructuring it; should fall back to globalThis.fetch via the internal
+  // `opts || {}` guard.
+  let called = false;
+  const realFetch = globalThis.fetch;
+  globalThis.fetch = async () => {
+    called = true;
+    return makeResponse({ body: "global-ok" });
+  };
+  try {
+    const out = await fetchCapped("https://medium.com/feed/@user", {}, null);
+    assert.equal(out, "global-ok");
+    assert.equal(called, true);
+  } finally {
+    globalThis.fetch = realFetch;
+  }
+});
+
+test("fetchCapped tolerates { fetchImpl: null } (falls back to global)", async () => {
+  let called = false;
+  const realFetch = globalThis.fetch;
+  globalThis.fetch = async () => {
+    called = true;
+    return makeResponse({ body: "global-ok" });
+  };
+  try {
+    const out = await fetchCapped(
+      "https://medium.com/feed/@user",
+      {},
+      { fetchImpl: null }
+    );
+    assert.equal(out, "global-ok");
+    assert.equal(called, true);
+  } finally {
+    globalThis.fetch = realFetch;
+  }
+});
+
+// --- fetchHashnodeRetired (source=hashnode is intentionally dead) ---
+
+test("fetchPosts with source=hashnode throws a 'retired' error", async () => {
+  const { fetchPosts } = require("../src/fetchers/posts");
+  await assert.rejects(
+    fetchPosts({ source: "hashnode", username: "anyone", count: 5 }),
+    /retired/
+  );
+});
diff --git a/tests/theme-url.test.js b/tests/theme-url.test.js
index 13a63ba..34fb655 100644
--- a/tests/theme-url.test.js
+++ b/tests/theme-url.test.js
@@ -126,10 +126,23 @@ function mockFetch(response) {
   return async () => ({
     ok: response.ok ?? true,
     status: response.status ?? 200,
+    // Use real `Headers` (Node 18+) — case-insensitive lookup like production.
+    headers: new Headers(response.headers ?? {}),
     text: async () => response.text ?? JSON.stringify(VALID_PALETTE),
   });
 }
 
+// Inline-built mock responses elsewhere in this file omit `headers`; this
+// helper wraps a raw mock and fills in an empty Headers so the production
+// `res.headers.get('content-length')` call doesn't crash.
+function withHeaders(mock) {
+  return async (...args) => {
+    const r = await mock(...args);
+    if (!r.headers) r.headers = new Headers();
+    return r;
+  };
+}
+
 test("fetchExternalTheme returns parsed palette on success", async () => {
   clearCache();
   const palette = await fetchExternalTheme(VALID_URL, {
@@ -143,7 +156,7 @@ test("fetchExternalTheme caches results within TTL", async () => {
   let callCount = 0;
   const fetchImpl = async () => {
     callCount++;
-    return { ok: true, status: 200, text: async () => JSON.stringify(VALID_PALETTE) };
+    return { ok: true, status: 200, headers: new Headers(), text: async () => JSON.stringify(VALID_PALETTE) };
   };
   await fetchExternalTheme(VALID_URL, { fetchImpl });
   await fetchExternalTheme(VALID_URL, { fetchImpl });
@@ -156,7 +169,7 @@ test("fetchExternalTheme refetches after TTL expires", async () => {
   let callCount = 0;
   const fetchImpl = async () => {
     callCount++;
-    return { ok: true, status: 200, text: async () => JSON.stringify(VALID_PALETTE) };
+    return { ok: true, status: 200, headers: new Headers(), text: async () => JSON.stringify(VALID_PALETTE) };
   };
   let fakeNow = 1_000_000;
   const now = () => fakeNow;
@@ -170,7 +183,7 @@ test("fetchExternalTheme throws on HTTP error", async () => {
   clearCache();
   await assert.rejects(
     fetchExternalTheme(VALID_URL, {
-      fetchImpl: async () => ({ ok: false, status: 404, text: async () => "" }),
+      fetchImpl: async () => ({ ok: false, status: 404, headers: new Headers(), text: async () => "" }),
     }),
     /HTTP 404/
   );
@@ -180,7 +193,7 @@ test("fetchExternalTheme throws on invalid JSON", async () => {
   clearCache();
   await assert.rejects(
     fetchExternalTheme(VALID_URL, {
-      fetchImpl: async () => ({ ok: true, status: 200, text: async () => "not json{" }),
+      fetchImpl: async () => ({ ok: true, status: 200, headers: new Headers(), text: async () => "not json{" }),
     }),
     /not valid JSON/
   );
@@ -193,6 +206,7 @@ test("fetchExternalTheme throws on schema-invalid payload", async () => {
       fetchImpl: async () => ({
         ok: true,
         status: 200,
+        headers: new Headers(),
         text: async () => JSON.stringify({ bg: "#000" }),
       }),
     }),
@@ -214,19 +228,85 @@ test("fetchExternalTheme rejects disallowed hosts before any request", async ()
   assert.equal(called, false, "fetch should not have been invoked");
 });
 
-test("fetchExternalTheme surfaces an AbortError as a clean timeout message", async () => {
+test("fetchExternalTheme surfaces a timer-induced AbortError as a clean timeout message", async () => {
+  // Use a short timeoutMs override so the abort fires via the actual timer
+  // path (not synchronous throw) — that's what production hits and what the
+  // new controller.signal.aborted guard distinguishes from upstream-surfaced
+  // AbortErrors.
   clearCache();
-  const fetchImpl = async () => {
-    const err = new Error("aborted");
-    err.name = "AbortError";
-    throw err;
-  };
   await assert.rejects(
-    fetchExternalTheme(VALID_URL, { fetchImpl }),
+    fetchExternalTheme(VALID_URL, {
+      timeoutMs: 10,
+      fetchImpl: (_url, init) =>
+        new Promise((_resolve, reject) => {
+          init.signal.addEventListener("abort", () => {
+            const err = new Error("aborted");
+            err.name = "AbortError";
+            reject(err);
+          });
+        }),
+    }),
     /timed out/
   );
 });
 
+test("fetchExternalTheme does NOT mislabel non-timer AbortError as a timeout", async () => {
+  // Non-timer AbortError (e.g., upstream connection reset in some Undici
+  // versions) must propagate with its own message, not be re-labeled
+  // 'timed out'. controller.signal.aborted is false in this path.
+  clearCache();
+  await assert.rejects(
+    fetchExternalTheme(VALID_URL, {
+      fetchImpl: async () => {
+        const err = new Error("upstream connection reset");
+        err.name = "AbortError";
+        throw err;
+      },
+    }),
+    /upstream connection reset/
+  );
+});
+
+test("fetchExternalTheme rejects responses whose declared content-length exceeds the cap", async () => {
+  const { MAX_BODY_BYTES } = require("../src/common/theme-url");
+  clearCache();
+  let textCalled = false;
+  await assert.rejects(
+    fetchExternalTheme(VALID_URL, {
+      fetchImpl: async () => ({
+        ok: true,
+        status: 200,
+        headers: new Headers({ "content-length": String(MAX_BODY_BYTES + 1) }),
+        text: async () => {
+          textCalled = true;
+          return "should not be read";
+        },
+      }),
+    }),
+    /too large/
+  );
+  assert.equal(textCalled, false, "pre-read cap must reject before text() runs");
+});
+
+test("fetchExternalTheme rejects bodies whose actual size exceeds the cap (bytes, not chars)", async () => {
+  const { MAX_BODY_BYTES } = require("../src/common/theme-url");
+  clearCache();
+  // 4 bytes per emoji in UTF-8; (cap/4 + 100) emoji exceeds the cap in bytes
+  // even though text.length is much smaller.
+  const oversize = "🔥".repeat(Math.ceil(MAX_BODY_BYTES / 4) + 100);
+  await assert.rejects(
+    fetchExternalTheme(VALID_URL, {
+      fetchImpl: async () => ({
+        ok: true,
+        status: 200,
+        headers: new Headers(),
+        text: async () => oversize,
+      }),
+    }),
+    /too large/
+  );
+});
+
 test("fetchExternalTheme rejects http:// before any request", async () => {
   clearCache();
   let called = false;
@@ -273,6 +353,7 @@ test("resolveCardOptions per-param color overrides win over external palette", a
     fetchImpl: async () => ({
       ok: true,
       status: 200,
+      headers: new Headers(),
       text: async () => JSON.stringify(VALID_PALETTE),
     }),
   });
@@ -284,3 +365,55 @@ test("resolveCardOptions per-param color overrides win over external palette", a
   assert.equal(opts.colors.bg, "#ff0000"); // per-param wins
   assert.equal(opts.colors.title, "#ffffff"); // from external palette
 });
+
+// --- prefetched palette reuse (stack N+1 dedup) ---
+
+test("resolveCardOptions reuses a prefetched palette without a network call", async () => {
+  // /api/stack passes a prefetched { url, palette } to every child slot so
+  // the gist URL is fetched once, not 1+N times. Pin that contract here.
+  const { resolveCardOptions } = require("../src/common/options");
+  clearCache();
+  const url = VALID_URL;
+  const prefetched = { url, palette: VALID_PALETTE };
+  const params = new URLSearchParams(`theme_url=${encodeURIComponent(url)}`);
+  // No fetchImpl available — if resolveCardOptions tried to call fetch, it
+  // would hit the real network or crash. The prefetched path must short-circuit.
+  const { opts, themeError } = await resolveCardOptions(params, prefetched);
+  assert.equal(themeError, null);
+  assert.equal(opts.colors.bg, VALID_PALETTE.bg);
+});
+
+test("resolveCardOptions surfaces a prefetched error without retry", async () => {
+  // If the top-level fetch failed, child slots inherit the same error rather
+  // than each retrying — preserves the single-fetch contract on failure too.
+  const { resolveCardOptions } = require("../src/common/options");
+  clearCache();
+  const url = VALID_URL;
+  const prefetched = { url, error: "theme_url payload too large: 999999 bytes" };
+  const params = new URLSearchParams(`theme_url=${encodeURIComponent(url)}`);
+  const { themeError } = await resolveCardOptions(params, prefetched);
+  assert.match(themeError, /too large/);
+});
+
+test("resolveCardOptions falls through when a child overrides with a different theme_url", async () => {
+  // .theme_url= in /api/stack must NOT reuse the prefetched
+  // palette — it's a different URL. Falls back to fetchExternalTheme via the
+  // standard cache, which we pre-populate to avoid hitting the network.
+  const { resolveCardOptions } = require("../src/common/options");
+  const { fetchExternalTheme } = require("../src/common/theme-url");
+  clearCache();
+  const otherUrl = "https://gist.githubusercontent.com/u/zzz/raw/other.json";
+  const otherPalette = { ...VALID_PALETTE, bg: "#abcdef" };
+  await fetchExternalTheme(otherUrl, {
+    fetchImpl: async () => ({
+      ok: true,
+      status: 200,
+      headers: new Headers(),
+      text: async () => JSON.stringify(otherPalette),
+    }),
+  });
+  const prefetched = { url: VALID_URL, palette: VALID_PALETTE };
+  const params = new URLSearchParams(`theme_url=${encodeURIComponent(otherUrl)}`);
+  const { opts } = await resolveCardOptions(params, prefetched);
+  assert.equal(opts.colors.bg, "#abcdef");
+});
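
For readers of this diff, the contract that the `fetchCapped` tests pin down can be sketched as a single wrapper. This is only an illustration inferred from the test assertions, not the code in `src/fetchers/posts.js`. The name `fetchCappedSketch`, the 2 MiB value of `MAX_BODY_BYTES`, and the exact error wording are assumptions chosen to satisfy the regexes used in the tests; the 5 s timeout comes from a test comment.

```javascript
// Hypothetical sketch of the guards the tests describe (Node 18+).
const FETCH_TIMEOUT_MS = 5000; // "the real 5s budget" per the test comment
const MAX_BODY_BYTES = 2 * 1024 * 1024; // assumed value; the real cap is not shown in the diff

async function fetchCappedSketch(url, init, opts) {
  // `opts || {}` tolerates a null options argument; a null fetchImpl falls back to the global.
  const { fetchImpl, timeoutMs = FETCH_TIMEOUT_MS } = opts || {};
  const doFetch = fetchImpl || globalThis.fetch;

  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // Caller init is spread first, so `redirect` and `signal` below always win.
    const res = await doFetch(url, {
      ...(init || {}),
      redirect: "error",
      signal: controller.signal,
    });
    if (!res.ok) throw new Error(`HTTP ${res.status}`);

    // Pre-read cap: reject on an oversized declared length before touching the body.
    // A missing, negative, or non-integer header never exceeds the cap here,
    // so those cases fall through to the post-read check.
    const declared = Number(res.headers.get("content-length"));
    if (Number.isInteger(declared) && declared > MAX_BODY_BYTES) {
      throw new Error(`Response too large: ${declared} bytes`);
    }

    // Post-read cap, measured in UTF-8 bytes rather than string length.
    const text = await res.text();
    const bytes = Buffer.byteLength(text, "utf8");
    if (bytes > MAX_BODY_BYTES) {
      throw new Error(`Response too large: ${bytes} bytes`);
    }
    return text;
  } catch (err) {
    // Relabel only aborts caused by our own timer; other AbortErrors propagate unchanged.
    if (err && err.name === "AbortError" && controller.signal.aborted) {
      throw new Error(`Request timed out after ${timeoutMs}ms`);
    }
    throw err;
  } finally {
    clearTimeout(timer);
  }
}
```

Spreading the caller's `init` before the security fields is what makes `redirect` and `signal` non-overridable, and checking `controller.signal.aborted` is what separates a timer abort from an upstream reset. The sketch covers only the `res.text()` path; the streaming byte cap mentioned in the test comments is left out.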