Chunked media uploads to bypass Vercel's 4.5 MB request limit by carlosjdelgado · Pull Request #408 · hunvreus/pagescms

carlosjdelgado · 2026-06-24T10:34:14Z

Problem

Vercel caps request bodies at 4.5 MB. Any media upload above that fails outright — including images embedded from the rich-text editor and direct uploads from the media browser.

Solution

Split uploads into chunks ≤ 4 MB, stage them server-side, and reassemble at finalize time. The first chunk rides inline in the finalize request, so files ≤ 4 MB still complete in a single round-trip with zero DB writes.
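The split described above can be sketched as a pure function. This is a minimal illustration, not the PR's code: `planChunks` and `ChunkRange` are hypothetical names, and only the 4 MB chunk size comes from the description.

```typescript
const MB = 1024 * 1024;
const CHUNK_BYTES = 4 * MB; // value taken from the Limits table

interface ChunkRange {
  idx: number;   // 0 is the chunk that rides inline in the finalize request
  start: number; // inclusive byte offset
  end: number;   // exclusive byte offset
}

// Hypothetical helper: compute the byte ranges a file would be sliced into.
function planChunks(totalBytes: number, chunkBytes: number = CHUNK_BYTES): ChunkRange[] {
  // A 0-byte file (e.g. a .gitkeep marker) still yields one empty inline chunk.
  if (totalBytes === 0) return [{ idx: 0, start: 0, end: 0 }];
  const ranges: ChunkRange[] = [];
  for (let start = 0, idx = 0; start < totalBytes; start += chunkBytes, idx++) {
    ranges.push({ idx, start, end: Math.min(start + chunkBytes, totalBytes) });
  }
  return ranges;
}
```

In a browser, each range would map to `file.slice(range.start, range.end)`. A file of 4 MB or less produces a single range, which is the case that needs no staged chunks.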

Wire surface

All media writes — file uploads and folder markers — share one endpoint prefix: /api/[owner]/[repo]/[branch]/media/[name]/[path].

| Method | Path | Purpose |
| --- | --- | --- |
| `POST` | `/media/[name]/[path]/chunk` | Stage one chunk (`uploadId`, `idx`, `chunk` blob). `idx` must be in `[1, MAX_CHUNK_IDX]`; chunk 0 rides inline in finalize. |
| `POST` | `/media/[name]/[path]` | Finalize: reassemble inline `firstChunk` + DB-staged chunks, commit to GitHub. Accepts a 0-byte `firstChunk` when path ends in `.gitkeep` (folder marker). |

Chunks are scoped to (uploadId, userId); one user cannot read another's staging buffer.
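The finalize step's reassembly could look roughly like the sketch below. The names `StagedChunk` and `reassemble` are hypothetical, and the contiguity check is an assumption about how a server might guard against missing chunks. The description only states that the inline `firstChunk` and the staged chunks are joined and that the size ceiling is enforced server-side.

```typescript
interface StagedChunk {
  idx: number;      // 1-based; chunk 0 is the inline firstChunk
  data: Uint8Array; // raw bytes as stored in the staging table
}

// Hypothetical helper: join the inline first chunk with the staged chunks in idx order.
function reassemble(
  firstChunk: Uint8Array,
  staged: StagedChunk[],
  maxTotalBytes: number,
): Uint8Array {
  const ordered = [...staged].sort((a, b) => a.idx - b.idx);
  // Assumed guard: staged indices must be exactly 1..n with no gaps or duplicates.
  ordered.forEach((chunk, i) => {
    if (chunk.idx !== i + 1) throw new Error(`missing or duplicate chunk at idx ${i + 1}`);
  });
  const total = ordered.reduce((sum, c) => sum + c.data.byteLength, firstChunk.byteLength);
  if (total > maxTotalBytes) throw new Error("upload exceeds the per-upload ceiling");
  const out = new Uint8Array(total);
  out.set(firstChunk, 0);
  let offset = firstChunk.byteLength;
  for (const chunk of ordered) {
    out.set(chunk.data, offset);
    offset += chunk.data.byteLength;
  }
  return out;
}
```

With an empty `staged` array the function simply returns a copy of `firstChunk`, which matches the single-request path for small files.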

Limits

| Constant | Value | Why |
| --- | --- | --- |
| `CHUNK_BYTES` | 4 MB | Fits in multipart body under the 4.5 MB Vercel cap (overhead < 1 KB) |
| `MAX_TOTAL_BYTES` | 15 MB | Per-upload ceiling, enforced server-side after reassembly |
| `MAX_CHUNK_IDX` | derived (`ceil(MAX_TOTAL_BYTES / CHUNK_BYTES) - 1`) | Tight cap at the chunk-staging endpoint; prevents DB and memory DoS |
| `CHUNK_CONCURRENCY` | 4 | Parallel chunk uploads (batched) |
| `STALE_CHUNK_AGE_MS` | 10 min | Stale-chunk TTL; cleanup runs piggyback on every finalize |

CHUNK_BYTES and MAX_TOTAL_BYTES live in lib/utils/upload-media.ts and are imported by both the client helper and the server routes. MAX_CHUNK_IDX is derived locally in the chunk handler.
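As a worked example of the derivation, the sketch below plugs in the values from the table. `isValidStagedIdx` is a hypothetical name for the kind of check the chunk handler might perform; the formula and the `[1, MAX_CHUNK_IDX]` range are from the description.

```typescript
const MB = 1024 * 1024;
const CHUNK_BYTES = 4 * MB;
const MAX_TOTAL_BYTES = 15 * MB;

// ceil(15 / 4) - 1 = 4 - 1 = 3: a 15 MB upload has chunks 0..3, and chunk 0 is never staged.
const MAX_CHUNK_IDX = Math.ceil(MAX_TOTAL_BYTES / CHUNK_BYTES) - 1;

// Hypothetical validator for the `idx` field of the chunk-staging endpoint.
function isValidStagedIdx(idx: number): boolean {
  return Number.isInteger(idx) && idx >= 1 && idx <= MAX_CHUNK_IDX;
}
```

Under these values a single upload can stage at most three rows, which is what keeps the staging endpoint's worst case bounded.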

Storage

New table upload_chunk (db/migrations/0013_upload_chunks.sql):

Composite PK: (upload_id, chunk_idx)
data stored as bytea (not base64) for size + perf
user_id and created_at for ownership and TTL cleanup
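To illustrate how `user_id` and `created_at` could serve ownership and TTL cleanup, here is an in-memory sketch. The row shape mirrors the columns listed above, but `UploadChunkRow`, `chunksForUpload`, and `staleChunks` are hypothetical names; the real routes presumably express the same filters as SQL.

```typescript
const STALE_CHUNK_AGE_MS = 10 * 60 * 1000; // 10 min, from the Limits table

interface UploadChunkRow {
  uploadId: string;
  chunkIdx: number;
  userId: string;
  createdAt: Date;
}

// Ownership: only rows matching both uploadId and userId are visible to a finalize call.
function chunksForUpload(rows: UploadChunkRow[], uploadId: string, userId: string): UploadChunkRow[] {
  return rows
    .filter((row) => row.uploadId === uploadId && row.userId === userId)
    .sort((a, b) => a.chunkIdx - b.chunkIdx);
}

// TTL: rows older than STALE_CHUNK_AGE_MS are candidates for cleanup.
function staleChunks(rows: UploadChunkRow[], now: Date): UploadChunkRow[] {
  const cutoff = now.getTime() - STALE_CHUNK_AGE_MS;
  return rows.filter((row) => row.createdAt.getTime() < cutoff);
}
```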

Client surface

lib/utils/upload-media.ts exports uploadMediaChunked — a single source of truth used by every callsite that writes to media:

uploadMediaChunked({
  file,                // a File (including 0-byte for .gitkeep markers)
  owner, repo, branch,
  mediaName,           // media schema name
  targetPath,          // path inside schema.input
  onConflict,          // "error" | "rename" (default rename)
});

Callers wired to it in this PR:

components/media/media-upload.tsx — media browser uploader (drag-drop, click)
fields/core/rich-text/edit-component.tsx — inline image upload from the rich-text editor
components/folder-create.tsx (media branch) — .gitkeep folder marker creation

Refactors bundled

githubSaveFile extracted to lib/utils/github-save-file.ts. Previously inline in files/[path]/route.ts; now shared between that route and the media POST.
Rich-text image upload migrated from the JSON POST /files/[path] route (type: "media", base64 in body) to the chunked endpoint. Per-upload cap rises from ~3.3 MB (Vercel's 4.5 MB body limit after base64 overhead) to 15 MB. Drops the local FileReader/base64 conversion.
.gitkeep for media folders moved from POST /files/[path] (type: "media", JSON content: "") to the chunked endpoint with a 0-byte inline firstChunk. The server special-cases .gitkeep: skips the extension check and allows an empty firstChunk.
POST /files/[path] case "media" removed entirely. No live callers remain after the rich-text and .gitkeep migrations above. The files endpoint now handles only content and settings.
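The `.gitkeep` special case might be expressed as in the sketch below. This is an assumption-laden illustration: `validateFinalize` and the `allowedExtensions` parameter are hypothetical, and rejecting an empty `firstChunk` for ordinary files is inferred from the statement that the empty chunk is accepted for folder markers.

```typescript
// Hypothetical validator: returns an error message, or null when the request is acceptable.
function validateFinalize(
  path: string,
  firstChunkBytes: number,
  allowedExtensions: string[], // hypothetical: extensions permitted by the media schema
): string | null {
  // Folder marker: skip the extension check and allow a 0-byte firstChunk.
  if (path.endsWith(".gitkeep")) return null;
  if (firstChunkBytes === 0) return "firstChunk must not be empty";
  const name = path.slice(path.lastIndexOf("/") + 1);
  const dot = name.lastIndexOf(".");
  const ext = dot > 0 ? name.slice(dot + 1).toLowerCase() : "";
  if (!allowedExtensions.includes(ext)) return `extension "${ext}" is not allowed`;
  return null;
}
```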

What did NOT change

POST /files/[path] for content and settings (collection/file folder markers and the .pages.yml config) — unchanged.
DELETE / rename surfaces for media — unchanged.

Test plan

Upload media file under 4 MB → single request, zero rows in upload_chunk
Upload media file between 4 MB and 15 MB → multi-chunk path, staged then cleaned
Upload > 15 MB → client-side cap rejects before any network call
Upload via rich-text editor → uses the same flow; image renders in document
Create media folder via the media browser → 0-byte .gitkeep round-trips through the chunked endpoint
Create content folder via the collection view → still goes through POST /files/[path]
Migration 0013 applies cleanly on a fresh DB and on a DB already at 0012

Vercel serverless functions cap request bodies at 4.5 MB, which after base64 overhead limited media uploads to ~3.3 MB. The browser now slices files into ~3 MB chunks, POSTs each to /api/upload/chunk (multipart), and then calls /api/upload/finalize which reassembles, pushes to GitHub, and deletes the chunks. Chunks are staged in a new upload_chunk table (Postgres text, base64). Stale chunks are reaped opportunistically on each insert. Max file size is 50 MB / 50 chunks, configurable in both endpoints and the client. githubSaveFile is extracted to lib/utils/github-save-file.ts so both the existing files endpoint and the new finalize endpoint share rename-on-conflict logic.

Client now uploads chunks in batches of 4 instead of sequentially. Server moves opportunistic stale-chunk cleanup (chunk endpoint) and per-upload chunk deletion (finalize endpoint) into next/server `after`, so neither blocks the response.
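Batched uploading of this kind can be sketched as follows. `toBatches` and `runInBatches` are hypothetical names, not the PR's helpers; the batch size of 4 matches `CHUNK_CONCURRENCY`. Each task is assumed to be a function that starts one chunk POST when called.

```typescript
const CHUNK_CONCURRENCY = 4;

// Split a list into consecutive groups of at most `size` items.
function toBatches<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Run tasks batch by batch: tasks within a batch run in parallel,
// and the next batch starts only after the current one has finished.
async function runInBatches<T>(
  tasks: Array<() => Promise<T>>,
  size: number = CHUNK_CONCURRENCY,
): Promise<T[]> {
  const results: T[] = [];
  for (const batch of toBatches(tasks, size)) {
    results.push(...(await Promise.all(batch.map((task) => task()))));
  }
  return results;
}
```

Because `Promise.all` rejects as soon as any task fails, one failed chunk aborts the whole upload in this sketch; a retry policy would need to live inside each task.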

When the file fits in Vercel's 4.5 MB request body (after base64 overhead and JSON envelope), upload directly via the existing files endpoint instead of the chunked path. Cuts DB write/read traffic by the share of small uploads, which is most of them in practice.

The last chunk is sent inline with the finalize metadata (multipart) instead of going through the DB, saving one INSERT+SELECT per upload. For a 4 MB file (2 chunks) this halves DB writes; for larger files the ratio drops but every saved chunk still helps under Neon Free quotas.

Eliminates the +33% base64 overhead in the upload_chunk table. The INSERT/SELECT bandwidth per chunk drops by ~25% with no client change and no CPU cost. Migration uses decode('base64') in USING so any chunks in flight at upgrade time are converted instead of dropped. For a 20 MB upload, total DB bandwidth goes from ~48 MB to ~36 MB (2.4x -> 1.8x file size).

Multipart carries the chunk as raw binary, not base64, so 4 MB fits in Vercel's 4.5 MB body with about 500 KB of headroom. For a 20 MB upload this drops the chunk count from 7 to 5 (4 staged in DB, 1 inline) and total DB bandwidth from ~36 MB to ~32 MB.

The first chunk is always CHUNK_BYTES (4 MB) for any multi-chunk upload; the last one can be smaller. Sending the first inline keeps the largest chunk out of the DB. For files whose size is a multiple of CHUNK_BYTES there is no change; for the rest, DB bandwidth drops by up to (CHUNK_BYTES - lastChunkSize) * 2.

Limits each upload to 22 MB of DB bandwidth in the worst case, sized to fit Neon Free quotas with margin for other traffic.
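The bandwidth figures quoted in these commit messages can be reproduced with a small model. The model is an assumption inferred from the "INSERT+SELECT" wording: every staged chunk crosses the DB connection twice (one write, one read), the inline chunk never touches the DB, and "~3 MB" is taken as exactly 3 MiB. The function names are hypothetical.

```typescript
const MB = 1024 * 1024;

// Sizes of the chunks a file of `totalBytes` is sliced into.
function chunkSizes(totalBytes: number, chunkBytes: number): number[] {
  const sizes: number[] = [];
  for (let offset = 0; offset < totalBytes; offset += chunkBytes) {
    sizes.push(Math.min(chunkBytes, totalBytes - offset));
  }
  return sizes;
}

// Length of the base64 encoding of `rawBytes` bytes (with padding).
function base64Length(rawBytes: number): number {
  return 4 * Math.ceil(rawBytes / 3);
}

// Assumed model: each staged chunk costs one INSERT plus one SELECT of its stored size.
function dbBandwidthBytes(
  totalBytes: number,
  chunkBytes: number,
  inline: "first" | "last",
  storedAsBase64: boolean,
): number {
  const sizes = chunkSizes(totalBytes, chunkBytes);
  const staged = inline === "first" ? sizes.slice(1) : sizes.slice(0, -1);
  let bytes = 0;
  for (const size of staged) {
    bytes += 2 * (storedAsBase64 ? base64Length(size) : size);
  }
  return bytes;
}

// 20 MB file, 3 MB chunks, last inline, base64 text: 6 staged chunks -> 48 MB
// 20 MB file, 3 MB chunks, last inline, bytea:                       -> 36 MB
// 20 MB file, 4 MB chunks, last inline, bytea:      4 staged chunks -> 32 MB
// 15 MB file, 4 MB chunks, first inline, bytea:     4 + 4 + 3 MB    -> 22 MB
```

Under the same model, a 15 MB upload with the last chunk inline would cost 24 MB, so sending the first chunk inline saves (4 MB - 3 MB) * 2 = 2 MB, matching the `(CHUNK_BYTES - lastChunkSize) * 2` expression above.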

Merges the old 0013 (CREATE TABLE with text data) and 0014 (ALTER to bytea) into a single 0013 that creates the table with bytea directly. Also drops the unused chunk-assembly self-check script.

Files <=3 MB previously skipped the chunked path and posted base64+JSON to the files endpoint. Removed because the chunked path already short-circuits to inline-only when totalChunks=1, no DB rows are created, and the request body is smaller (binary multipart vs base64 JSON). One code path now handles every size up to 15 MB.

carlosjdelgado changed the title ~~Feat/chunked media upload~~ [WIP] Feat/chunked media upload Jun 24, 2026

carlosjdelgado force-pushed the feat/chunked-media-upload branch from 2d16dab to 9614425 Compare June 24, 2026 10:39

carlosjdelgado changed the title ~~[WIP] Feat/chunked media upload~~ [WIP] Chunked media uploads to bypass Vercel's 4.5 MB request limit Jun 24, 2026

carlosjdelgado changed the title ~~[WIP] Chunked media uploads to bypass Vercel's 4.5 MB request limit~~ Chunked media uploads to bypass Vercel's 4.5 MB request limit Jun 24, 2026

carlosjdelgado mentioned this pull request Jun 24, 2026

Can't upload large files (>= 4.5Mb) #393

Open

carlosjdelgado marked this pull request as ready for review June 24, 2026 11:54

carlosjdelgado changed the base branch from main to development June 24, 2026 15:00

carlosjdelgado added 12 commits June 24, 2026 17:05

Bump chunk size to 4 MB

63359c7

Cap media uploads at 15 MB

9e644e4

Consolidate upload_chunk into a single migration

43d543a

Drop ponytail tag from comments

a98b3aa

Move stale chunk cleanup into finalize and shorten TTL to 10m

543c2d1

carlosjdelgado force-pushed the feat/chunked-media-upload branch from ad2b11f to 543c2d1 Compare June 24, 2026 15:06

carlosjdelgado added 5 commits June 26, 2026 07:49

Move chunk and finalize routes under files path prefix

73dcda9

Move chunk and finalize routes under media path prefix

e592289

Migrate rich-text image upload to chunked media endpoint

a9b00c7

Share upload-media constants and tighten chunk idx cap

7f80f7e

Route media .gitkeep folder creation through chunked endpoint

db6ffd8

carlosjdelgado wants to merge 17 commits into hunvreus:development from carlosjdelgado:feat/chunked-media-upload
