
Next.js Self-Hosted on AWS

A production-oriented reference for running a modern Next.js app on Amazon Web Services using OpenNext and SST.

graph TD
  U[User] --> CF[CloudFront CDN]
  CF -->|SSR requests| L[Lambda]
  CF -->|Static assets| S3[S3 Bucket]
  L -->|Read/write| ISR[S3 ISR Cache]

Why this exists

Most Next.js apps today are deployed on Vercel. That’s a great default, but it hides a lot of important infrastructure decisions.

This project explores:

  • What Next.js actually requires at runtime
  • How those requirements map onto AWS primitives
  • The tradeoffs between platform convenience and infrastructure control
  • When you might choose this custom setup over Vercel

Architecture overview

OpenNext compiles a Next.js app into AWS-native components:

  • CloudFront — CDN, routing, edge handling
  • S3 — static assets and ISR cache
  • Lambda — SSR, API routes, image optimization
  • (Optional) Edge Functions — middleware

This effectively recreates the “Vercel runtime” using standard AWS services.
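
As a sketch, the SST config that wires all of this together can be very small. This assumes SST v3's sst.aws.Nextjs component; the app name and the component name "Web" are illustrative:

/// <reference path="./.sst/platform/config.d.ts" />

export default $config({
  app(input) {
    return {
      name: "nextjs-self-hosted",
      home: "aws",
      // Protect production from accidental removal; retain its resources
      protect: ["production"].includes(input?.stage),
      removal: input?.stage === "production" ? "retain" : "remove",
    };
  },
  async run() {
    // This one component provisions CloudFront, S3, Lambda, and IAM roles
    new sst.aws.Nextjs("Web");
  },
});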

Key design decisions

Serverless over containers

Using Lambda keeps the system:

  • horizontally scalable by default
  • cost-efficient at low to medium traffic

Tradeoff:

  • cold starts (mitigated but not eliminated)

CloudFront as the entry point

All traffic flows through CloudFront, which:

  • caches static and dynamic responses
  • reduces origin load
  • enables global low-latency delivery

Tradeoff:

  • more complex debugging vs a simple Node server

ISR backed by S3

Incremental Static Regeneration is implemented via:

  • S3 for cache storage
  • Lambda for revalidation

Tradeoff:

  • less seamless than Vercel’s proprietary cache layer
  • requires understanding cache invalidation explicitly
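
From the app's point of view, nothing changes: a route opts into ISR as usual, and OpenNext stores the rendered output in S3. A sketch (the route and API URL are illustrative):

// app/news/page.tsx — illustrative ISR route
export const revalidate = 60; // regenerate at most once per minute

export default async function News() {
  const items: { title: string }[] = await fetch("https://api.example.com/news")
    .then((r) => r.json());
  return (
    <ul>
      {items.map((item) => (
        <li key={item.title}>{item.title}</li>
      ))}
    </ul>
  );
}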

Infrastructure as code with SST

SST provides:

  • a higher-level abstraction over CDK
  • fast local development (sst dev)
  • tight integration with Lambda

Tradeoff:

  • abstraction layer to learn
  • less “raw control” than hand-written CDK

Local development

npm install
npx sst dev

This runs a hybrid environment where:

  • the frontend runs locally
  • AWS resources are deployed on-demand
  • logs stream in real time

Deployment

npx sst deploy

See DEPLOYMENT.md for stages (production / staging / ad-hoc), Cloudflare DNS, GitHub Actions OIDC (including pull_request and workflow_dispatch), secrets, and optional GitHub Environment approvals.

CI behaviour:

  • Pushes to main deploy staging (staging.deal-drill.com)
  • Pull requests from the same repo get preview stages named pr-<number>, torn down automatically when the PR closes
  • Production is deployed only via the manual Deploy production workflow

A deploy provisions:

  • CloudFront distribution
  • S3 buckets
  • Lambda functions
  • IAM roles

Tear down

To delete the AWS resources for a given stage (with AWS credentials configured for the same account and region used to deploy):

npx sst remove --stage <stage-name>

Non-production stages (dev, staging, preview stages, etc.): this is the normal path. The repo sets removal: "remove" for these stages, so sst remove deletes all of their resources.

Production (--stage production): sst.config.ts sets protect: true, which blocks sst remove so production is not deleted by accident. It also sets removal: "retain" for production, which can leave some resources in AWS when the stack is deleted.

To tear down production (e.g. end of an experiment):

  1. In sst.config.ts, inside app(), temporarily set protect: false (or remove the protect line) so sst remove is allowed.
  2. Optionally also set removal: "remove" for production (e.g. use removal: "remove" for every stage while tearing down) so the retain policy does not leave resources behind; a sketch of the edited app() block follows this list.
  3. Save the file and run npx sst remove --stage production with AWS credentials in the right account/region.
  4. Tear down other stages the same way, e.g. npx sst remove --stage staging and npx sst remove --stage pr-<n> for any preview stacks (non-prod stages are usually removable without step 1).
  5. Revert sst.config.ts to the previous protect / removal values if you keep developing this repo.
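
For steps 1 and 2, the temporarily edited app() block might look like this (values are illustrative; the rest of sst.config.ts is unchanged):

// sst.config.ts — temporary teardown values inside $config({ ... }); revert afterwards
app() {
  return {
    name: "nextjs-self-hosted",
    home: "aws",
    protect: false,     // allow `sst remove` on production
    removal: "remove",  // delete resources instead of retaining them
  };
},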

Retention: with removal: "retain", some physical resources may persist until you delete them in the AWS console, or until you change removal and run remove again. Switching to removal: "remove" before running remove avoids most of this.

DNS: The domain can stay in Cloudflare; teardown removes only the AWS resources. Review the Cloudflare zone and delete any DNS records that pointed at this app if you no longer need them.

Secrets: Values managed with sst secret live in your AWS account; check the SST docs for your version to confirm whether anything needs manual cleanup after remove.

More context: DEPLOYMENT.md.

Tradeoffs vs Vercel

Advantages:

  • Full infrastructure control
  • No vendor lock-in
  • Easier integration with existing AWS systems
  • Predictable cost model at scale

Disadvantages:

  • More operational complexity
  • Weaker developer experience out of the box
  • Manual handling of caching and invalidation
  • Observability requires additional setup

When this approach makes sense:

  • You already run significant infrastructure on AWS
  • You need tighter backend integration
  • You care about vendor independence
  • You want to understand and control your runtime

When it doesn’t:

  • You want zero DevOps overhead
  • You’re building a small app with no special requirements
  • You value speed of iteration over infrastructure control

Future work

  • Observability (structured logging, tracing)
  • Performance tuning (cold starts, caching strategy)

Gotchas & Lessons Learned

This setup works well, but it surfaces a number of non-obvious behaviours that are usually abstracted away by platforms like Vercel.

Cold starts are real (and uneven)

Lambda-backed SSR introduces cold starts, typically in the 100–500ms range.

Observations:

  • First request after inactivity is noticeably slower
  • Subsequent requests are fast due to container reuse
  • API routes and page rendering can have different cold start profiles

Mitigations:

  • Prefer static generation where possible
  • Keep functions small to reduce init time
  • Accept that some latency variance is part of the model
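
As a concrete example of the first mitigation, a route with no request-time data can be forced static, so it is served from S3/CloudFront and never invokes Lambda at all (sketch):

// app/about/page.tsx — built once at deploy time, served entirely from the CDN
export const dynamic = "force-static";

export default function About() {
  return <h1>About this project</h1>;
}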

CloudFront caching is powerful, and easy to get wrong

Amazon CloudFront sits in front of everything, which means:

  • Responses may be cached even when you don’t expect it
  • Headers like Cache-Control and Vary become critical
  • Debugging often requires bypassing cache entirely

Gotcha: A misconfigured cache policy can make your app appear “stuck” or inconsistent across users.
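
Being explicit with Cache-Control on dynamic responses avoids most surprises. For example, a route handler that must never be cached at the CDN (illustrative path):

// app/api/now/route.ts — opts this response out of CDN caching
export async function GET() {
  return Response.json(
    { now: new Date().toISOString() },
    // Without an explicit no-store, a permissive CloudFront cache policy may cache this
    { headers: { "Cache-Control": "no-store" } }
  );
}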

ISR is not magic anymore

With OpenNext, Incremental Static Regeneration is implemented using:

  • S3 for storage
  • Lambda for revalidation

This means:

  • Cache invalidation is explicit, not implicit
  • Race conditions can occur under load
  • Stale content can persist longer than expected

This is the biggest difference from Vercel’s managed experience.
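
In practice, this means triggering and verifying revalidation yourself. A minimal on-demand revalidation endpoint might look like the following; the path and the REVALIDATE_SECRET name are illustrative, and the secret would live in sst secret:

// app/api/revalidate/route.ts — illustrative on-demand revalidation endpoint
import { revalidatePath } from "next/cache";
import { NextRequest, NextResponse } from "next/server";

export async function POST(req: NextRequest) {
  // Hypothetical shared secret, so strangers cannot purge the cache
  if (req.nextUrl.searchParams.get("secret") !== process.env.REVALIDATE_SECRET) {
    return NextResponse.json({ ok: false }, { status: 401 });
  }
  const path = req.nextUrl.searchParams.get("path") ?? "/";
  revalidatePath(path); // OpenNext maps this onto the S3-backed cache
  return NextResponse.json({ ok: true, revalidated: path });
}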

Logs are fragmented by default

There is no single “app log”.

Logs are split across:

  • Lambda (CloudWatch)
  • CloudFront (optional access logs)
  • Client-side logs (browser)

Without aggregation, debugging requires jumping between multiple systems.
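
A cheap first step is emitting structured JSON from Lambda code: every console.log line becomes one CloudWatch event, and JSON lines are queryable with Logs Insights. A minimal sketch (the helper is made up):

// lib/log.ts — one JSON object per log line, one CloudWatch event per call
export function log(event: string, data: Record<string, unknown> = {}): void {
  console.log(JSON.stringify({ ts: new Date().toISOString(), event, ...data }));
}

// Usage: log("isr.revalidate", { path: "/news", durationMs: 42 });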

Middleware ≠ full Node environment

Next.js middleware runs at the edge, not in Node:

  • Limited APIs
  • No access to full Node libraries
  • Different performance characteristics

This can break assumptions if you’re used to server-side code.
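
Concretely, middleware can rely on Web APIs (fetch, crypto, URL) but not Node built-ins. A safe example (sketch):

// middleware.ts — executes in the edge runtime, not Node
import { NextRequest, NextResponse } from "next/server";

export function middleware(req: NextRequest) {
  // Fine: Web Crypto exists at the edge
  const res = NextResponse.next();
  res.headers.set("x-request-id", crypto.randomUUID());
  return res;
  // Not fine here: fs, net, child_process, or npm packages that assume Node
}

export const config = { matcher: "/:path*" };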

Deployment is eventually consistent

Unlike a single-server deploy:

  • CloudFront distributions take time to update
  • Cache invalidation is not instant
  • You can briefly serve mixed versions of your app

This is subtle but important when debugging production issues.

IAM and permissions can fail in non-obvious ways

Because everything is provisioned via infrastructure as code:

  • Missing permissions surface as runtime failures
  • Errors are often indirect (“AccessDenied” from a downstream service)

Using broad permissions initially (e.g. AdministratorAccess) helps reduce noise during setup.
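
Once things work, SST's resource linking is one way to walk permissions back from broad to scoped: linking a resource generates just the IAM access the function needs (sketch; the Uploads bucket is illustrative):

// inside run() in sst.config.ts — linking auto-generates scoped IAM permissions
const uploads = new sst.aws.Bucket("Uploads");
new sst.aws.Nextjs("Web", { link: [uploads] });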

Local development is hybrid, not local

With SST:

  • Some code runs locally
  • Some runs in AWS
  • Network latency is part of your dev loop

This is powerful, but different from a purely local Node environment.

Takeaway

Self-hosting Next.js on AWS gives you control and flexibility, but it also exposes the underlying complexity that platforms usually hide.

The biggest shift is moving from “the platform handles it” to “I am responsible for how this behaves in production”.
