Self-hosted, S3-compatible object storage. One small Go binary, a React admin UI, and a filesystem-backed store. Speaks the AWS S3 wire protocol on one port and exposes the same operations through a browser-friendly admin API on another.
docker pull ghcr.io/byteink/bytebucket:latest
- S3 API on port `9000`: AWS Signature V4, XML responses. Works with the AWS SDK, `aws s3`, `rclone`, `s3cmd`, `mc`, Terraform, boto3, anything that speaks S3.
- Admin API + UI on port `9001`: header-authenticated JSON surface plus an embedded React dashboard at `/`.
- Multipart upload, per-bucket CORS, presigned URLs, real ETags, request IDs, structured JSON logs, Prometheus metrics.
Heads up: security. The admin port (`9001`) must not be exposed to the public internet. Put it behind a private network, VPN, SSH tunnel, or reverse proxy with access control. Details and the deferred-hardening list are in SECURITY.md.
Working on the code? See DEVELOPMENT.md for the contributor guide (repo layout, local setup, Vite dev loop, testing, release flow, conventions).
- Quick start
- Configuration
- Admin web UI
- S3 API (port 9000)
- Admin API (port 9001)
- Per-bucket CORS
- Observability
- Using it from code
- Storage layout and persistence
- Limits
- Troubleshooting
- License
docker run -d \
--name bytebucket \
-p 9000:9000 \
-p 9001:9001 \
-v bytebucket-data:/data \
-e ENCRYPTION_KEY="$(openssl rand -base64 32)" \
-e ACCESS_KEY_ID="admin" \
-e SECRET_ACCESS_KEY="$(openssl rand -base64 32)" \
ghcr.io/byteink/bytebucket:latest

Then open http://localhost:9001 and log in with the admin access key / secret you just set.
services:
  bytebucket:
    image: ghcr.io/byteink/bytebucket:latest
    restart: unless-stopped
    ports:
      - "9000:9000"
      - "9001:9001"
    environment:
      ENCRYPTION_KEY: "32-byte-random-or-base64-encoded-key"
      ACCESS_KEY_ID: "admin"
      SECRET_ACCESS_KEY: "your-strong-secret"
    volumes:
      - bytebucket-data:/data
volumes:
  bytebucket-data:

On first boot only, the server reads ACCESS_KEY_ID / SECRET_ACCESS_KEY to create a super-user in BoltDB. After that, those env vars are ignored; rotate credentials through the admin API. ENCRYPTION_KEY is required on every boot (it decrypts stored secrets at rest).
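The compose file needs a real value for ENCRYPTION_KEY. As a rough sketch, the Python below generates one and checks that a candidate matches either format named in this README (32 raw bytes, or base64 that decodes to 32 bytes). The function names are illustrative, and the check is an assumption about what the server accepts, not a copy of its parsing logic.

```python
import base64
import binascii
import secrets


def generate_key() -> str:
    """Return a fresh random 32-byte key, base64-encoded (44 characters)."""
    return base64.b64encode(secrets.token_bytes(32)).decode("ascii")


def looks_like_valid_key(value: str) -> bool:
    """Accept 32 raw bytes, or a base64 string that decodes to 32 bytes.

    Assumption: this mirrors the two formats the README names; the server's
    actual validation may be stricter or looser.
    """
    if len(value.encode("utf-8")) == 32:
        return True
    try:
        return len(base64.b64decode(value, validate=True)) == 32
    except (binascii.Error, ValueError):
        return False
```

Store the generated value somewhere durable (a secrets manager or an env file outside the repo), since the same key is needed on every later boot.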
All configuration is via environment variables.
| Variable | Required | Default | Description |
|---|---|---|---|
| `ENCRYPTION_KEY` | yes | - | 32 raw bytes or base64-encoded 32-byte key. Encrypts stored user secrets at rest. Lose it, lose every credential. Rotate carefully. |
| `ACCESS_KEY_ID` | first boot only | - | Super-user access key, used once to seed the user database. |
| `SECRET_ACCESS_KEY` | first boot only | - | Super-user secret, same. |
| `GIN_MODE` | no | `debug` | Set to `release` in production. The provided Docker image sets this. |
| `LOG_LEVEL` | no | `info` | `debug`, `info`, `warn`, `error`. |
| `LOG_FORMAT` | no | `json` | `json` for production / log aggregators, `text` for local dev readability. |
| `RATE_LIMIT_ENABLED` | no | `false` | Master switch for per-client request rate limiting. Off by default; when disabled the middleware short-circuits after one atomic read, so the per-request cost is negligible. Seeds the baseline that a runtime override (see below) can replace. |
| `RATE_LIMIT_RPS` | no | `0` | Sustained requests per second allowed per client IP (token refill rate). Only meaningful when limiting is enabled. |
| `RATE_LIMIT_BURST` | no | `0` | Token-bucket depth: the largest instantaneous spike one client may make before the sustained RPS rate gates it. |
| `RATE_LIMIT_TRUSTED_PROXIES` | no | `0` | Number of reverse-proxy hops in front of the server. Selects which `X-Forwarded-For` entry is the real client. `0` ignores `X-Forwarded-For` and keys on the socket peer. |
Off by default. Set RATE_LIMIT_ENABLED=true to throttle requests per client IP with a token bucket. It runs early in the chain on both ports (after logging/metrics, before auth), so an unauthenticated flood is rejected before it reaches signature verification or the filesystem. Both ports share one per-IP budget, so a client cannot double its allowance by splitting traffic across the S3 and admin surfaces.
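To make the interaction between the sustained rate and the burst depth concrete, here is a minimal token-bucket sketch. It is not ByteBucket's implementation: the class and function names are made up for illustration, the store is a plain dict with no entry cap or eviction, and the optional `now` argument exists only so the behaviour can be reasoned about deterministically.

```python
import time
from typing import Optional


class TokenBucket:
    """One client's bucket: refills at `rps` tokens per second, holds at most `burst`."""

    def __init__(self, rps: float, burst: int, now: Optional[float] = None):
        self.rps = rps
        self.burst = burst
        self.tokens = float(burst)  # start full so a new client can spend its burst
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill for the time that has passed, never beyond the bucket depth.
        self.tokens = min(float(self.burst), self.tokens + elapsed * self.rps)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # caller would answer 503 with Retry-After


# Illustrative store: one bucket per client IP, shared by both ports.
buckets = {}


def allow_request(client_ip: str, rps: float, burst: int, now: Optional[float] = None) -> bool:
    bucket = buckets.setdefault(client_ip, TokenBucket(rps, burst, now))
    return bucket.allow(now)
```

With `rps=2` and `burst=3`, a client can make three requests back to back, is then refused, and earns one more request every half second.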
- Set `RATE_LIMIT_RPS` (sustained rate) and `RATE_LIMIT_BURST` (spike depth) together. A request that exceeds its bucket gets `503 Slow Down` with a `Retry-After` header; AWS SDKs treat `SlowDown` as retryable and back off automatically.
- Behind a proxy, set `RATE_LIMIT_TRUSTED_PROXIES` to the number of hops in front of ByteBucket (e.g. `1` for a single nginx / traefik / ALB). The client IP is resolved by counting that many trusted hops in from the right of `X-Forwarded-For`; the nearest proxy is the connection peer and is not in the header. Leaving it at `0` ignores `X-Forwarded-For` and keys on the socket peer, which is correct only when ByteBucket is directly exposed. Match it to your actual topology: setting it too low lets a client spoof its limiter key by prepending `X-Forwarded-For` entries.
- The limiter store is bounded (hard entry cap plus idle eviction), so it cannot be turned into a memory-exhaustion vector by an attacker minting source IPs.
- Runtime override. The `RATE_LIMIT_*` variables are only the startup baseline. An admin can enable, disable, or retune limiting at runtime from the dashboard's Settings page (or `PUT /api/config/ratelimit`) without a restart; changes apply live to both ports. A saved override is persisted and wins over the environment until you clear it ("Reset to defaults" / `DELETE /api/config/ratelimit`), which reverts to the `RATE_LIMIT_*` baseline.
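The proxy-hop rule in the list above is easier to check against a worked example. The sketch below is one plausible reading of it, not the server's code: the function name is hypothetical, and falling back to the socket peer when the header has fewer entries than expected is an assumption.

```python
def client_ip(peer_ip: str, x_forwarded_for: str, trusted_proxies: int) -> str:
    """Pick the address used as the rate-limiter key.

    With trusted_proxies == 0 the header is ignored. Otherwise the nearest
    proxy is the socket peer, and each trusted hop appended one entry, so the
    client is the entry `trusted_proxies` positions from the right.
    """
    if trusted_proxies <= 0:
        return peer_ip
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    if len(hops) < trusted_proxies:
        # Assumption: a header shorter than the configured hop count falls
        # back to the socket peer rather than trusting what is there.
        return peer_ip
    return hops[-trusted_proxies]
```

With a single nginx in front (`trusted_proxies=1`), a header of `1.2.3.4, 203.0.113.7` resolves to `203.0.113.7`: the entry nginx appended is used, and the entry the client prepended is ignored.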
| Port | Role | Auth | Expose publicly? |
|---|---|---|---|
| `9000` | S3 wire protocol | AWS SigV4 | Yes, if that's the point. |
| `9001` | Admin API + web UI + `/metrics` | `X-Admin-AccessKey` + `X-Admin-Secret` headers | No. Keep private. |
One volume at /data. Layout:
/data
  users.db              # BoltDB: users, ACLs, encrypted secrets
  objects/<bucket>/...  # object bytes + .meta / .tags.json / .acl.json / .cors.json sidecars
  uploads/<bucket>/...  # in-flight multipart uploads
Back up /data as a unit. Objects and metadata are on a filesystem; any snapshot / rsync / restic flow works.
Port 9001 serves a minimal React dashboard at /. It is same-origin with the admin API: no CORS, no AWS SDK in the browser, no third-party calls.
- Log in with the admin access key and secret.
- Manage users and per-user ACLs.
- Create / list / delete buckets.
- Browse, upload, download, delete objects.
- Edit per-bucket CORS as a JSON document.
Credentials live in the browser's localStorage for the session and are sent on every request as X-Admin-* headers. There are no session cookies, no CSRF tokens, no login rate limiting, which is why the admin port must not be public. See SECURITY.md for the hardening backlog.
Standard S3 surface. Any S3 client pointed at http://<host>:9000 with forcePathStyle: true and a user's access key / secret works.
- Buckets: `PUT /:bucket`, `GET /`, `GET /:bucket` (list objects), `DELETE /:bucket`, `HEAD /:bucket`.
- Objects: `PUT /:bucket/:key`, `GET /:bucket/:key`, `HEAD /:bucket/:key`, `DELETE /:bucket/:key`. `GET` honours the `Range:` header for partial / resumable downloads, returning `206 Partial Content` with `Content-Range`; `HEAD` and full `GET` advertise `Accept-Ranges: bytes`.
- Multipart upload: `POST /:bucket/:key?uploads`, `PUT /:bucket/:key?partNumber=N&uploadId=X`, `POST /:bucket/:key?uploadId=X` (complete), `DELETE /:bucket/:key?uploadId=X` (abort), `GET /:bucket?uploads` (list uploads), `GET /:bucket/:key?uploadId=X` (list parts).
- Object tagging: `PUT /:bucket/:key?tagging`, `GET /:bucket/:key?tagging`, `DELETE /:bucket/:key?tagging`. Up to 10 tags per object (key 1-128, value 0-256 UTF-8 chars, no duplicate keys). Tags are stored in a `.tags.json` sidecar independently of object data, so setting or removing them never changes the object's ETag.
- CORS: `PUT /:bucket?cors`, `GET /:bucket?cors`, `DELETE /:bucket?cors`.
- Presigned URLs: SigV4 `X-Amz-*` query-string style, TTL up to the configured expiry, no server-side state needed.
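Since `GET` honours `Range:`, a client can resume an interrupted download by asking only for the bytes it is missing. The helpers below are a small sketch of that bookkeeping; the function names are illustrative, and the request still has to be signed and sent with whichever S3 client you use.

```python
import re
from typing import Dict, Optional, Tuple


def resume_range(bytes_already_downloaded: int) -> Dict[str, str]:
    """Header asking for everything from the given offset to the end of the object."""
    return {"Range": "bytes={}-".format(bytes_already_downloaded)}


def parse_content_range(value: str) -> Optional[Tuple[int, int, int]]:
    """Parse 'bytes 100-199/1000' into (first, last, total); None if it does not match."""
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+)", value.strip())
    if match is None:
        return None
    first, last, total = (int(group) for group in match.groups())
    return first, last, total
```

A `206 Partial Content` response whose `Content-Range` ends at `total - 1` means the download is complete once that body has been appended.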
XML in, XML out. Matches AWS S3 response shapes for ListAllMyBucketsResult, ListBucketResult, CORSConfiguration, InitiateMultipartUploadResult, CompleteMultipartUploadResult, and the standard <Error> body. ETags are the hex MD5 of object bytes, quoted. Multipart ETags are <hex>-<partCount>, matching S3's composite format.
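The two ETag shapes can be reproduced client-side to verify an upload. This sketch assumes ByteBucket derives the hex part of a multipart ETag the way S3 does (MD5 over the concatenated binary MD5 digests of the parts) and that the composite value is quoted like the single-part one; the README only states the `<hex>-<partCount>` format, so treat both points as assumptions.

```python
import hashlib
from typing import Iterable


def single_etag(data: bytes) -> str:
    """Quoted hex MD5 of the object bytes."""
    return '"' + hashlib.md5(data).hexdigest() + '"'


def multipart_etag(parts: Iterable[bytes]) -> str:
    """S3-style composite: MD5 of the concatenated per-part MD5 digests, then '-<count>'."""
    digests = [hashlib.md5(part).digest() for part in parts]
    combined = hashlib.md5(b"".join(digests)).hexdigest()
    return '"{}-{}"'.format(combined, len(digests))
```

Note that a multipart ETag depends on how the object was split, so the same bytes uploaded with a different part size produce a different value.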
export AK=your_access_key
export SK=your_secret_key
# Create a bucket
curl -X PUT http://localhost:9000/my-bucket \
--aws-sigv4 "aws:amz:us-east-1:s3" --user "$AK:$SK"
# Upload an object
curl -X PUT http://localhost:9000/my-bucket/hello.txt \
--aws-sigv4 "aws:amz:us-east-1:s3" --user "$AK:$SK" \
--data-binary 'hello'
# Download it back
curl http://localhost:9000/my-bucket/hello.txt \
--aws-sigv4 "aws:amz:us-east-1:s3" --user "$AK:$SK"

All admin API endpoints live under /api/* so they cannot collide with the React SPA's client-side routes (/users, /buckets, /buckets/:name/cors, ...) served at the root. /health and /metrics stay at the root as operational endpoints.
Every authenticated request carries:
X-Admin-AccessKey: <your-admin-access-key>
X-Admin-Secret: <your-admin-secret>
- `GET /health`: returns `{ "status": "ok" }`. Unauthenticated, suitable for readiness probes.
- `POST /api/users`: create a user. Server generates the access key + secret and returns them once in the response. Body takes an `acl` array.
- `GET /api/users`: list users (secrets never returned).
- `PUT /api/users/:accessKeyID`: replace ACL.
- `DELETE /api/users/:accessKeyID`: remove.
Admin vs regular users. "Admin" is not a flag; it's an ACL pattern. A user is considered admin (can log in to the dashboard and hit the admin API) if and only if their ACL contains {"effect":"Allow","buckets":["*"],"actions":["*"]}. Anything narrower is an S3-only user, scoped to whatever the ACL allows, and cannot access the admin surface. Multiple admins are fine. New users created from the admin UI start with an empty ACL; edit the ACL afterwards to grant exactly the access they need.
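The admin rule can be written as a small predicate, which is handy when auditing a user list fetched from `GET /api/users`. This is a sketch of the rule as stated above, not the server's check; the function name is hypothetical, and it assumes an entry counts only when both lists are exactly `["*"]`.

```python
from typing import Any, Dict, List


def is_admin(acl: List[Dict[str, Any]]) -> bool:
    """True when some ACL entry is the allow-everything rule."""
    return any(
        rule.get("effect") == "Allow"
        and rule.get("buckets") == ["*"]
        and rule.get("actions") == ["*"]
        for rule in acl
    )
```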
Examples:
// Admin: full access, can use the dashboard
{ "acl": [{ "effect": "Allow", "buckets": ["*"], "actions": ["*"] }] }
// Read-only user on one bucket: no dashboard access
{ "acl": [{ "effect": "Allow", "buckets": ["reports"], "actions": ["s3:GetObject", "s3:ListBucket"] }] }
// Write-only uploader: no dashboard, no reads
{ "acl": [{ "effect": "Allow", "buckets": ["uploads"], "actions": ["s3:PutObject"] }] }

Every S3 bucket and object operation is mounted at /api/s3/* with a JSON wire format. Same handlers, same storage, just admin auth instead of SigV4. This is what the embedded UI uses; external tooling can use it too.
- `GET /api/s3/`: list buckets.
- `PUT /api/s3/:bucket`: create.
- `GET /api/s3/:bucket`: list objects.
- `DELETE /api/s3/:bucket`: delete.
- `PUT /api/s3/:bucket/:key`: upload (raw body).
- `GET /api/s3/:bucket/:key`: download (raw body).
- `HEAD /api/s3/:bucket/:key`: metadata only.
- `DELETE /api/s3/:bucket/:key`: delete.
- `GET /api/s3/:bucket/:key` honours `Range:` (partial download) exactly as the SigV4 surface does.
- `PUT|GET|DELETE /api/s3/:bucket/:key?tagging`: per-object tags as JSON (`{"tagSet":[{"key":"env","value":"prod"}]}`).
- `PUT|GET|DELETE /api/s3/:bucket?cors`: per-bucket CORS as JSON.
export ADMIN_AK=...
export ADMIN_SK=...
# Create a bucket
curl -X PUT http://localhost:9001/api/s3/my-bucket \
-H "X-Admin-AccessKey: $ADMIN_AK" -H "X-Admin-Secret: $ADMIN_SK"
# Upload an object
curl -X PUT http://localhost:9001/api/s3/my-bucket/hello.txt \
-H "X-Admin-AccessKey: $ADMIN_AK" -H "X-Admin-Secret: $ADMIN_SK" \
--data-binary 'hello'
# Create a user with full access
curl -X POST http://localhost:9001/api/users \
-H "X-Admin-AccessKey: $ADMIN_AK" -H "X-Admin-Secret: $ADMIN_SK" \
-H "Content-Type: application/json" \
-d '{"acl":[{"effect":"Allow","buckets":["*"],"actions":["*"]}]}'

CORS lives on the bucket, exactly like AWS S3. There is no global allowlist, no CORS_ALLOWED_ORIGINS env var. A bucket with no CORS configuration rejects cross-origin browser requests; that is the S3 contract.
- `PUT /:bucket?cors` (port 9000, XML body) or `PUT /api/s3/:bucket?cors` (port 9001, JSON body)
- `GET /:bucket?cors` / `GET /api/s3/:bucket?cors`
- `DELETE /:bucket?cors` / `DELETE /api/s3/:bucket?cors`
{
  "CORSRules": [
    {
      "AllowedMethods": ["GET", "PUT"],
      "AllowedOrigins": ["https://app.example.com"],
      "AllowedHeaders": ["*"],
      "ExposeHeaders": ["ETag"],
      "MaxAgeSeconds": 600
    }
  ]
}

Same grammar as AWS PutBucketCors:

<CORSConfiguration>
  <CORSRule>
    <AllowedMethod>GET</AllowedMethod>
    <AllowedOrigin>https://app.example.com</AllowedOrigin>
    <MaxAgeSeconds>600</MaxAgeSeconds>
  </CORSRule>
</CORSConfiguration>

Every response carries an x-amz-request-id header (UUIDv4, minted per request). Error bodies repeat the same ID as <RequestId> in XML or requestId in JSON. Use this to correlate a client-visible failure with a server log line.
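A client-side helper can pull the request ID out of either error shape so it can be included in a bug report. The sketch below assumes the XML error is the standard un-namespaced S3 `<Error>` document and that the JSON error is an object with a top-level `requestId` key; the function name is illustrative.

```python
import json
import xml.etree.ElementTree as ET
from typing import Optional


def request_id_from_error(body: str, content_type: str) -> Optional[str]:
    """Return the request ID carried in an error body, or None if absent."""
    if "json" in content_type.lower():
        payload = json.loads(body)
        return payload.get("requestId") if isinstance(payload, dict) else None
    # Assumption: <RequestId> is a direct child of the root <Error> element.
    return ET.fromstring(body).findtext("RequestId")
```

When the body is unavailable (for example on a `HEAD` request), the `x-amz-request-id` response header carries the same value.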
One JSON line per request at the end of handling:
{"time":"2026-04-14T07:15:03.882Z","level":"INFO","msg":"http_request","method":"GET","path":"/s3/:bucket","status":200,"duration_ms":3.1,"remote_ip":"10.0.0.4","request_id":"5aba...","auth_method":"sigv4","user_access_key":"AKIA...","bytes_in":0,"bytes_out":482}

Stable fields. path is always the route template (no object keys, no signatures). Query strings are stripped. Status drives the level: 5xx logs at ERROR, 4xx at WARN, everything else at INFO. Configure with LOG_LEVEL and LOG_FORMAT.
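Because each request is one JSON line, correlating a failure with its log entry needs little more than a filter. The sketch below mirrors the status-to-level rule and picks out the lines for one request ID; it relies only on the `status` and `request_id` fields shown in the sample line, and the function names are illustrative.

```python
import json
from typing import Dict, Iterable, Iterator


def level_for_status(status: int) -> str:
    """Log level implied by the HTTP status, per the rule described above."""
    if status >= 500:
        return "ERROR"
    if status >= 400:
        return "WARN"
    return "INFO"


def lines_for_request(log_lines: Iterable[str], request_id: str) -> Iterator[Dict]:
    """Yield parsed log records whose request_id matches."""
    for line in log_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not JSON, e.g. LOG_FORMAT=text output
        if isinstance(record, dict) and record.get("request_id") == request_id:
            yield record
```

Typical use is to pipe container logs into it, for instance iterating over `sys.stdin` fed by `docker logs bytebucket`.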
GET /metrics on port 9001 serves Prometheus text format. ByteBucket speaks the format; it does not bundle a scraper. Point any Prometheus-compatible collector at it (Prometheus, Grafana Agent, VictoriaMetrics, the OpenTelemetry Collector's Prometheus receiver).
Exposed series:
- `http_requests_total{method,path,status}`: counter.
- `http_request_duration_seconds{method,path}`: latency histogram.
- `http_request_size_bytes`, `http_response_size_bytes`: payload histograms.
- `bytebucket_multipart_uploads_in_progress`: gauge.
- `bytebucket_objects_bytes_total{bucket}`: per-bucket byte total (best-effort delta, not reconciled on restart).
- Standard `go_*` and `process_*` collectors.
The endpoint is unauthenticated. It relies on the same network boundary that protects the admin port (see SECURITY.md).
On SIGTERM or SIGINT the server stops accepting new connections and drains in-flight requests for up to 30 seconds before exiting. Kubernetes' default terminationGracePeriodSeconds is 30s, which means Shutdown wins the race to SIGKILL in a normal rollout.
import { S3Client, PutObjectCommand, GetObjectCommand } from '@aws-sdk/client-s3';
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';
const s3 = new S3Client({
region: 'us-east-1',
endpoint: 'http://localhost:9000',
forcePathStyle: true,
credentials: { accessKeyId: 'AK', secretAccessKey: 'SK' },
});
await s3.send(new PutObjectCommand({ Bucket: 'b', Key: 'k.txt', Body: 'hi' }));
const url = await getSignedUrl(s3, new GetObjectCommand({ Bucket: 'b', Key: 'k.txt' }), { expiresIn: 900 });

Multipart, presigned URLs, and streaming uploads all work as expected.
import boto3
s3 = boto3.client(
's3',
endpoint_url='http://localhost:9000',
aws_access_key_id='AK',
aws_secret_access_key='SK',
region_name='us-east-1',
config=boto3.session.Config(s3={'addressing_style': 'path'}),
)
s3.upload_file('big.bin', 'my-bucket', 'big.bin')  # uses multipart automatically

aws --endpoint-url http://localhost:9000 s3 cp ./big.bin s3://my-bucket/big.bin

[bytebucket]
type = s3
provider = Other
access_key_id = AK
secret_access_key = SK
endpoint = http://localhost:9000
force_path_style = true
It's plain HTTP + JSON; use fetch, axios, requests, httpx, or curl. No SDK, no SigV4.
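As a standard-library illustration of the admin surface, the sketch below builds a request carrying the two `X-Admin-*` headers. The helper name and its parameters are hypothetical; the header names and paths come from this README. It only constructs the request, so nothing is sent until you pass it to `urllib.request.urlopen`.

```python
import json
import urllib.request
from typing import Optional


def admin_request(base_url: str, method: str, path: str,
                  access_key: str, secret: str,
                  body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) an authenticated admin API request."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(base_url.rstrip("/") + path, data=data, method=method)
    req.add_header("X-Admin-AccessKey", access_key)
    req.add_header("X-Admin-Secret", secret)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


# Example: create a user with read-only access to one bucket.
create_user = admin_request(
    "http://localhost:9001", "POST", "/api/users", "ADMIN_AK", "ADMIN_SK",
    body={"acl": [{"effect": "Allow", "buckets": ["reports"],
                   "actions": ["s3:GetObject", "s3:ListBucket"]}]},
)
# To send it: urllib.request.urlopen(create_user)
```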
Everything lives under /data:
/data/
  users.db                      # BoltDB
  objects/
    <bucket>/
      <object>                  # raw bytes
      <object>.meta             # JSON sidecar: ETag, checksums, user metadata
      <object>.tags.json        # JSON sidecar: object tag set (independent of ETag)
      .acl.json                 # per-bucket canned ACL
      .cors.json                # per-bucket CORS config
  uploads/
    <bucket>/
      <uploadId>/
        manifest.json           # metadata + state
        <partNumber>            # raw part bytes
- Backups. Snapshot the whole `/data` volume. Object bytes + their sidecar must travel together. BoltDB's single file is consistent on snapshot thanks to its copy-on-write design.
- Corruption recovery. If a `.meta` sidecar is missing, the ETag is recomputed lazily on next read. Stored objects are never mutated after PUT, so bitrot detection is a matter of periodically verifying MD5 against the stored ETag.
- Deletion removes the object and its sidecar, then collapses empty parent directories.
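The bitrot check mentioned above could look roughly like this. It is a sketch under assumptions: the README says the `.meta` sidecar is JSON containing the ETag but does not name the field, so `etag_field` is a hypothetical parameter you would need to adjust, and composite multipart ETags are skipped because they are not a plain MD5 of the bytes.

```python
import hashlib
import json
from pathlib import Path
from typing import Optional


def verify_object(object_path: Path, etag_field: str = "etag") -> Optional[bool]:
    """Compare an object's MD5 with the ETag in its .meta sidecar.

    Returns True/False for a match/mismatch, or None when the check does not
    apply (no sidecar, no ETag field, or a composite multipart ETag).
    """
    meta_path = Path(str(object_path) + ".meta")
    if not meta_path.exists():
        return None
    meta = json.loads(meta_path.read_text(encoding="utf-8"))
    stored = str(meta.get(etag_field, "")).strip('"')
    if not stored or "-" in stored:
        return None
    md5 = hashlib.md5()
    with open(object_path, "rb") as handle:
        # Read in 1 MiB chunks so large objects are not loaded into memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            md5.update(chunk)
    return md5.hexdigest() == stored
```

Run it from a cron job over `/data/objects`, skipping the sidecar files themselves, and alert on any `False` result.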
- Max header size: 1 MiB.
- Max request body: 5 GiB on port 9000 (S3 single-PUT ceiling), 100 MiB on port 9001 (admin surface).
- Per-connection timeouts: 10 s on headers, 5 min on read/write, 120 s idle. Very large single-PUT or GET on slow links may hit the 5-min bound; prefer multipart upload for anything above a few hundred MiB.
- Multipart: 1 to 10000 parts per upload, no minimum part size enforced (real S3 requires 5 MiB for all but the last part; ByteBucket is lenient).
- Object tags: up to 10 per object; key 1-128 and value 0-256 UTF-8 chars; no duplicate keys; tagging document capped at 16 KiB.
- Rate limiting: off by default (see Configuration). When enabled, requests are throttled per client IP and over-limit calls get `503 SlowDown` with `Retry-After`.
- Presigned URL expiry: bounded by the request's `X-Amz-Expires` claim; no server-side cap beyond what the client signed.
- Versioning, object locking, server-side encryption, replication, and lifecycle policies: not implemented.
- BoltDB is a single-writer embedded DB. Fine for up to tens of thousands of users on a single node; don't expect horizontal scale.
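The object-tag limits listed above can be checked client-side before calling the admin tagging endpoint. This sketch validates the JSON shape shown earlier (`{"tagSet":[{"key":...,"value":...}]}`); it assumes lengths are counted in Unicode characters and the 16 KiB cap applies to the UTF-8 encoded document, which the README implies but does not spell out.

```python
import json
from typing import List

MAX_TAGS = 10
MAX_DOCUMENT_BYTES = 16 * 1024


def validate_tagging(document: str) -> List[str]:
    """Return a list of problems; an empty list means the document fits the limits."""
    problems = []
    if len(document.encode("utf-8")) > MAX_DOCUMENT_BYTES:
        problems.append("tagging document exceeds 16 KiB")
    tags = json.loads(document).get("tagSet", [])
    if len(tags) > MAX_TAGS:
        problems.append("more than 10 tags")
    seen = set()
    for tag in tags:
        key, value = tag.get("key", ""), tag.get("value", "")
        if not 1 <= len(key) <= 128:
            problems.append("key length must be 1-128: {!r}".format(key))
        if len(value) > 256:
            problems.append("value longer than 256 for key {!r}".format(key))
        if key in seen:
            problems.append("duplicate key: {!r}".format(key))
        seen.add(key)
    return problems
```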
- `SignatureDoesNotMatch`: clock skew between client and server, wrong region (ByteBucket treats all requests as `us-east-1`), or trailing slash / header canonicalisation differences. The error body's `<RequestId>` matches a server log line with the full canonical request trace at `DEBUG`.
- `NoSuchCORSConfiguration` on a preflight: set one via the admin UI or the `?cors` endpoint.
- Admin UI says "Invalid credentials": you're hitting `/api/users` with `X-Admin-*` headers; the super-user bootstrap only runs when the user DB is empty. Check that `ENCRYPTION_KEY` matches what was used on first boot.
- Lost admin credentials or `ENCRYPTION_KEY`: delete `/data/users.db` and restart with fresh env vars. Objects survive; users and ACLs are gone.
- Empty `<Owner>` or `dummy-*` in responses: you're on an older build. Upgrade to `ghcr.io/byteink/bytebucket:latest`.
- Connection hangs on large uploads: use multipart. Per-connection write timeout is 5 minutes.
- Metrics endpoint returns 404: you hit port 9000. `/metrics` is on 9001.
Licensed under the Server Side Public License. Free for open-source and commercial use; offering ByteBucket itself as a managed, paid service requires open-sourcing the complete service stack.