A production-oriented reference for running a modern Next.js app on Amazon Web Services using OpenNext and SST.
```mermaid
graph TD
  U[User] --> CF[CloudFront CDN]
  CF -->|SSR requests| L[Lambda]
  CF -->|Static assets| S3[S3 Bucket]
  L -->|Read/write| ISR[S3 ISR Cache]
```
Most Next.js apps today are deployed on Vercel. That’s a great default, but it hides a lot of important infrastructure decisions.
This project explores:
- What Next.js actually requires at runtime
- How those requirements map onto AWS primitives
- The tradeoffs between platform convenience and infrastructure control
- When you might choose this custom setup over Vercel
OpenNext compiles a Next.js app into AWS-native components:
- CloudFront — CDN, routing, edge handling
- S3 — static assets and ISR cache
- Lambda — SSR, API routes, image optimization
- (Optional) Edge Functions — middleware
This effectively recreates the “Vercel runtime” using standard AWS services.
Using Lambda keeps the system:
- horizontally scalable by default
- cost-efficient at low to medium traffic
Tradeoff:
- cold starts (mitigated but not eliminated)
All traffic flows through CloudFront, which:
- caches static and dynamic responses
- reduces origin load
- enables global low-latency delivery
Tradeoff:
- more complex debugging vs a simple Node server
Incremental Static Regeneration is implemented via:
- S3 for cache storage
- Lambda for revalidation
Tradeoff:
- less seamless than Vercel’s proprietary cache layer
- requires understanding cache invalidation explicitly
SST provides:
- a higher-level abstraction over CDK
- fast local development (sst dev)
- tight integration with Lambda
Tradeoff:
- abstraction layer to learn
- less “raw control” than hand-written CDK
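For context, the SST v3 config that drives all of this is short. A minimal sketch (the app name is a placeholder; component and property names per SST v3, not necessarily this repo's exact file):

```typescript
/// <reference path="./.sst/platform/config.d.ts" />

export default $config({
  app(input) {
    return {
      name: "my-next-app", // placeholder app name
      home: "aws",
    };
  },
  async run() {
    // One component wires up CloudFront, S3, and Lambda via OpenNext.
    new sst.aws.Nextjs("Web");
  },
});
```

The single `Nextjs` component is the abstraction layer mentioned above: concise, but it hides the CDK-level resources it creates.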
```bash
npm install
npx sst dev
```

This runs a hybrid environment where:
- the frontend runs locally
- AWS resources are deployed on-demand
- logs stream in real time
```bash
npx sst deploy
```

See DEPLOYMENT.md for stages (production / staging / ad-hoc), Cloudflare DNS, GitHub Actions OIDC, and secrets.
CI: Pushes to `main` deploy staging (staging.deal-drill.com). Pull requests (same repo) get preview stages `pr-<number>`, with teardown when the PR closes. Production is deployed only via the manual Deploy production workflow. See DEPLOYMENT.md for OIDC (including `pull_request` and `workflow_dispatch`) and optional GitHub Environment approvals.
This provisions:
- CloudFront distribution
- S3 buckets
- Lambda functions
- IAM roles
To delete the AWS resources for a given stage (with credentials configured, same account/region as deploy):
```bash
npx sst remove --stage <stage-name>
```

Non-production stages (dev, staging, preview stages, etc.): this is the normal path. This repo sets `removal: "remove"` for those stages, so the stack is torn down when you remove it.
Production (`--stage production`): `sst.config.ts` sets `protect: true`, which blocks `sst remove` so production is not deleted by accident. It also sets `removal: "retain"` for production, which can leave some resources in AWS when the stack is deleted.
To tear down production (e.g. end of an experiment):
1. In `sst.config.ts`, inside `app()`, temporarily set `protect: false` (or remove the `protect` line) so `sst remove` is allowed.
2. Optionally set `removal: "remove"` for production as well (e.g. use `removal: "remove"` for every stage while tearing down) so resources are not left behind by the retain policy.
3. Save the file and run `npx sst remove --stage production` with AWS credentials in the right account/region.
4. Tear down other stages the same way, e.g. `npx sst remove --stage staging` and `npx sst remove --stage pr-<n>` for any preview stacks (non-prod stages are usually removable without step 1).
5. Revert `sst.config.ts` to the previous `protect`/`removal` values if you keep developing this repo.
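The `protect`/`removal` switches referenced above live in `app()` in `sst.config.ts`. A sketch of the stage-dependent logic (the app name is a placeholder; this is not necessarily the repo's exact file):

```typescript
// sst.config.ts (sketch)
export default $config({
  app(input) {
    const isProd = input?.stage === "production";
    return {
      name: "my-next-app", // placeholder app name
      home: "aws",
      // Temporarily set to false (or delete) to allow `sst remove` on production.
      protect: isProd,
      // Switch "retain" to "remove" before teardown to avoid leftover resources.
      removal: isProd ? "retain" : "remove",
    };
  },
  async run() {
    new sst.aws.Nextjs("Web");
  },
});
```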
Retention: With `removal: "retain"`, some physical resources may persist until you delete them in the AWS console, or until you change `removal` and run `sst remove` again; switching to `removal: "remove"` before removing reduces what is left behind.
DNS: The domain can stay in Cloudflare; teardown removes AWS resources. Review the Cloudflare zone and delete any DNS records that pointed at this app if you no longer need them.
Secrets: Values managed with `sst secret` live in your AWS account; check the SST docs for your version to confirm whether anything needs manual cleanup after `remove`.
More context: DEPLOYMENT.md.
Advantages:
- Full infrastructure control
- No vendor lock-in
- Easier integration with existing AWS systems
- Predictable cost model at scale
Disadvantages:
- More operational complexity
- Weaker developer experience out of the box
- Manual handling of caching and invalidation
- Observability requires additional setup
When this approach makes sense:
- You already run significant infrastructure on AWS
- You need tighter backend integration
- You care about vendor independence
- You want to understand and control your runtime
When it doesn’t:
- You want zero DevOps overhead
- You’re building a small app with no special requirements
- You value speed of iteration over infrastructure control
- Observability (structured logging, tracing)
- Performance tuning (cold starts, caching strategy)
This setup works well, but it surfaces a number of non-obvious behaviours that are usually abstracted away by platforms like Vercel.
Lambda-backed SSR introduces cold starts, typically in the 100–500ms range.
Observations:
- First request after inactivity is noticeably slower
- Subsequent requests are fast due to container reuse
- API routes and page rendering can have different cold start profiles
Mitigations:
- Prefer static generation where possible
- Keep functions small to reduce init time
- Accept that some latency variance is part of the model
Amazon CloudFront sits in front of everything, which means:
- Responses may be cached even when you don’t expect it
- Headers like `Cache-Control` and `Vary` become critical
- Debugging often requires bypassing cache entirely
Gotcha: A misconfigured cache policy can make your app appear “stuck” or inconsistent across users.
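One way to avoid that ambiguity is to state caching intent explicitly on dynamic responses rather than relying on the distribution's defaults. A sketch of a Next.js route handler (the route path is hypothetical; it uses only standard Web APIs):

```typescript
// app/api/now/route.ts (hypothetical route handler)
export async function GET(): Promise<Response> {
  return new Response(JSON.stringify({ now: Date.now() }), {
    headers: {
      "Content-Type": "application/json",
      // Explicit caching intent: the CDN may cache this for 60s and serve
      // stale content for up to 5 minutes while revalidating in the
      // background. Without this header, the cache policy decides for you.
      "Cache-Control": "public, s-maxage=60, stale-while-revalidate=300",
    },
  });
}
```

When debugging, comparing this header against CloudFront's `x-cache` response header (hit vs miss) usually explains "stuck" responses faster than guessing.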
With OpenNext, Incremental Static Regeneration is implemented using:
- S3 for storage
- Lambda for revalidation
This means:
- Cache invalidation is explicit, not implicit
- Race conditions can occur under load
- Stale content can persist longer than expected
This is the biggest difference from Vercel’s managed experience.
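Revalidation timing therefore has to be stated explicitly in the app. The simplest form is route segment config (a config fragment; the route is hypothetical):

```typescript
// app/products/[id]/page.tsx (hypothetical route)
// Serve the cached page, but regenerate it in the background at most once
// every 60 seconds. With OpenNext, the regenerated output is written to the
// S3 ISR cache by the revalidation Lambda, so updates are eventually
// consistent rather than instant.
export const revalidate = 60;
```

On-demand invalidation (`revalidatePath` / `revalidateTag` from `next/cache`) is also available, and with OpenNext it goes through the same S3-backed cache, with the race-condition and staleness caveats above.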
There is no single “app log”.
Logs are split across:
- Lambda (CloudWatch)
- CloudFront (optional access logs)
- Client-side logs (browser)
Without aggregation, debugging requires jumping between multiple systems.
Next.js middleware runs at the edge, not in Node:
- Limited APIs
- No access to full Node libraries
- Different performance characteristics
This can break assumptions if you’re used to server-side code.
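A middleware that sticks to Web APIs avoids those surprises. A sketch (the path and auth check are placeholders for illustration):

```typescript
// middleware.ts (sketch): runs in the Edge runtime, not Node.
// Only Web APIs are available here (fetch, URL, Request/Response, crypto);
// Node built-ins like fs or net will fail at build time.
export function middleware(request: Request): Response | undefined {
  const url = new URL(request.url);
  if (url.pathname.startsWith("/admin") && !request.headers.get("authorization")) {
    // Short-circuit at the edge before the Lambda origin is ever hit.
    return new Response("Unauthorized", { status: 401 });
  }
  // Returning undefined lets the request continue to the app.
  return undefined;
}

// Limit which paths invoke the middleware at all.
export const config = { matcher: ["/admin/:path*"] };
```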
Unlike a single-server deploy:
- CloudFront distributions take time to update
- Cache invalidation is not instant
- You can briefly serve mixed versions of your app
This is subtle but important when debugging production issues.
Because everything is provisioned via infrastructure as code:
- Missing permissions surface as runtime failures
- Errors are often indirect (“AccessDenied” from a downstream service)
Using broad permissions initially (e.g. AdministratorAccess) helps reduce noise during setup.
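Once the setup works, SST's resource linking is the usual way to tighten permissions without hand-writing IAM policies: linking a resource generates scoped access for the functions that use it. A sketch (resource names are placeholders):

```typescript
// Inside run() in sst.config.ts (sketch)
const uploads = new sst.aws.Bucket("Uploads"); // placeholder resource

new sst.aws.Nextjs("Web", {
  // Linking generates least-privilege IAM access to the bucket for the
  // app's server functions, replacing broad manually-written policies.
  link: [uploads],
});
```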
With SST:
- Some code runs locally
- Some runs in AWS
- Network latency is part of your dev loop
This is powerful, but different from a purely local Node environment.
Self-hosting Next.js on AWS gives you control and flexibility, but it also exposes the underlying complexity that platforms usually hide.
The biggest shift is moving from “the platform handles it” to “I am responsible for how this behaves in production”.