[Bug]: Docker v0.9.0 read-only compose breaks boot — ~/.crawl4ai not mounted & writable tmpfs are root-owned, so the worker crash-loops and nothing binds :11235

### crawl4ai version

0.9.0 — Docker server image `unclecode/crawl4ai:latest` (the secure-by-default Compose stack).

### Expected Behavior

`docker compose up` (with `CRAWL4AI_API_TOKEN` set so gunicorn binds a non-loopback interface) should start the FastAPI/gunicorn server and bind `:11235`. `curl http://localhost:11235/health` should return `{"status":"ok"}` and the crawl endpoints should work.

### Current Behavior

The gunicorn worker crash-loops on boot and nothing ever binds `:11235`; `curl http://localhost:11235/health` returns `curl: (56) Recv failure: Connection reset by peer` (the host port-proxy accepts the TCP connection, but there is no upstream listener).

**Root causes** (all verified on `unclecode/crawl4ai:latest`, Docker Desktop, container running as `appuser`, uid/gid 999):

1. **`~/.crawl4ai` is never mounted — the primary failure.** crawl4ai creates its home dir `/home/appuser/.crawl4ai` (DB, logs, seeder index) via `os.makedirs(...)` at **import time** (`crawl4ai/async_database.py`). No tmpfs covers it and `read_only: true` makes the rootfs read-only, so import raises `OSError: [Errno 30] Read-only file system: '/home/appuser/.crawl4ai'`. The worker fails to boot, respawns, and supervisord gives up (`gunicorn entered FATAL state, too many start retries`).

2. **The "writable" tmpfs mounts come up `root:root` and are unwritable by `appuser`.** Observed inside the running container:

   ```
   /tmp                        1777 root:root  WRITABLE
   /home/appuser/.cache         755 root:root  denied
   /var/lib/crawl4ai/outputs    700 root:root  denied   (mode=0700 in compose)
   /var/lib/redis               750 root:root  denied
   ```

   A bare tmpfs mounted over a directory that already exists in the image inherits the **mountpoint's mode** (`.cache` 755, `redis` 750) and comes up owned by `root:root`, so only `/tmp` (underlying mode 1777) is writable by `appuser`. Since **both gunicorn and redis run as `appuser`** (`supervisord.conf` → `user=appuser`), redis can't persist to `/var/lib/redis` and the server can't write artifacts to `/var/lib/crawl4ai/outputs`. (Note: this corrects an earlier claim that bare tmpfs mounts are always `1777`/writable. That holds for a *non-existent* mountpoint like `--tmpfs /foo`, but not for these pre-existing image dirs.)

3. **The `~/.cache` tmpfs shadows the baked-in Chromium.** The Dockerfile bakes Playwright's Chromium into `/home/appuser/.cache/ms-playwright` and `PLAYWRIGHT_BROWSERS_PATH` is unset. A tmpfs over the whole `~/.cache` hides it, so even after (1) and (2) are fixed, crawling fails because Playwright can't find the browser. Verified: with `--tmpfs /home/appuser/.cache`, `ls ~/.cache/ms-playwright` → `No such file or directory`.

4. **gunicorn ≥26 control socket (non-fatal, but noisy).** gunicorn opens a control socket under `$HOME/.gunicorn` by default; `$HOME` (`/home/appuser`) is on the read-only rootfs with no tmpfs, so every boot logs `[ERROR] Control server error: [Errno 30] Read-only file system: '/home/appuser/.gunicorn'`.

Net effect: import-time / worker failure → no listener on `:11235` → connection reset from the host.
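
The ownership table in root cause 2 can be reproduced with a small check along these lines. This is a minimal sketch, not the exact commands used for the report: `check_writable` is a hypothetical helper name, and it assumes a POSIX `sh` inside the container plus GNU `stat` (it prints `?` where `stat -c` is unavailable).

```shell
#!/bin/sh
# Sketch: report mode, owner, and writability of each directory for the
# user running the script. check_writable is a hypothetical helper and is
# not part of crawl4ai.
check_writable() {
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      state="missing"
    elif [ -w "$dir" ]; then
      state="WRITABLE"
    else
      state="denied"
    fi
    # GNU stat prints "<mode> <owner>:<group>"; fall back to "?" if it fails.
    meta=$(stat -c '%a %U:%G' "$dir" 2>/dev/null) || meta="?"
    printf '%-30s %-16s %s\n' "$dir" "$meta" "$state"
  done
}

# Check whatever paths are passed on the command line (none: prints nothing).
check_writable "$@"
```

Saved as, say, `check_writable.sh`, it could be piped into the running container so it executes as `appuser`: `docker exec -i <container> sh -s -- /tmp /home/appuser/.cache /var/lib/crawl4ai/outputs /var/lib/redis < check_writable.sh`.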

### Is this reproducible?

Yes

### Inputs Causing the Bug

```yaml
# Image:  unclecode/crawl4ai:latest  (v0.9.0)
# Config: the secure-by-default docker-compose.yml shipped on the 0.9.0 branch
# Relevant settings:
read_only: true
user: "appuser"
tmpfs:
  - /tmp
  - /var/lib/redis
  - /var/lib/crawl4ai/outputs:mode=0700
  - /home/appuser/.cache
```

### Steps to Reproduce

```text
1. Use the v0.9.0 docker-compose.yml (read_only: true + user: appuser + the tmpfs list above).
2. Set CRAWL4AI_API_TOKEN (so gunicorn binds [::]:11235 instead of loopback) and run:
     docker compose up -d
3. docker logs <container>  ->  OSError: Read-only file system: '/home/appuser/.crawl4ai';
   the worker exits and respawns; "gunicorn entered FATAL state".
4. curl http://localhost:11235/health  ->  curl: (56) Recv failure: Connection reset by peer
```

### Code snippets

```yaml
# Proposed fix: give the non-root runtime user (uid/gid 999 = appuser) ownership of
# the writable tmpfs, add the missing ~/.crawl4ai mount, and stop shadowing the
# baked Chromium by scoping the cache tmpfs to just the writable subdir.
read_only: true
tmpfs:
  - /tmp
  - /var/lib/redis:uid=999,gid=999,mode=0700
  - /var/lib/crawl4ai/outputs:uid=999,gid=999,mode=0700
  - /home/appuser/.crawl4ai:uid=999,gid=999,mode=0700
  - /home/appuser/.cache/url_seeder:uid=999,gid=999,mode=0700   # NOT all of ~/.cache (that shadows ms-playwright)
  - /home/appuser/.gunicorn:uid=999,gid=999,mode=0700           # or pass gunicorn --no-control-socket
```

Verified with this change: the container boots healthy, `/health` returns `200`, a live crawl of `https://example.com` returns `success: true` with rendered markdown, and no read-only / permission errors remain in the logs.
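
The boot check above can be scripted so a fixed stack is easy to re-verify. A minimal sketch, assuming `curl` on the host and the default `11235` port mapping; `wait_for_health` is a hypothetical helper, and the retry count and delay are arbitrary defaults rather than values from the project.

```shell
#!/bin/sh
# Sketch: poll the health endpoint until it reports ok or attempts run out.
# wait_for_health is a hypothetical helper; URL, attempts, and delay are
# assumed defaults and can be overridden by positional arguments.
wait_for_health() {
  url=${1:-http://localhost:11235/health}
  attempts=${2:-30}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors as failure, -s: silent, --max-time: bound each probe
    if body=$(curl -fs --max-time 5 "$url") &&
       printf '%s' "$body" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'; then
      echo "healthy after $i attempt(s): $body"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "no healthy response from $url after $attempts attempts" >&2
  return 1
}
```

After `docker compose up -d`, calling `wait_for_health` with no arguments should return 0 once the server is up. With the unpatched compose file it is expected to exhaust its attempts and return 1, since nothing ever listens on the port.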

## Supporting Information

### OS

macOS (Docker Desktop); the container itself is Linux (`python:3.12-slim-bookworm`).

### Python version

3.12 (inside the `unclecode/crawl4ai:latest` image)

### Browser

Chromium (Playwright, baked into the image)

### Browser version

Playwright Chromium build `chromium-1223`

### Error logs & Screenshots (if applicable)

```text
OSError: [Errno 30] Read-only file system: '/home/appuser/.crawl4ai'
  File ".../crawl4ai/async_database.py", line 20, in <module>
    os.makedirs(DB_PATH, exist_ok=True)
[ERROR] Worker (pid:68) exited with code 3.
[ERROR] Shutting down: Master
[ERROR] Reason: Worker failed to boot.
WARN exited: gunicorn (exit status 3; not expected)
INFO gave up: gunicorn entered FATAL state, too many start retries too quickly

# non-fatal, every boot:
[ERROR] Control server error: [Errno 30] Read-only file system: '/home/appuser/.gunicorn'
```
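
To enumerate every path that needs a writable mount, the `Errno 30` lines can be pulled out of the container logs. A small sketch under the assumption that the messages keep the `Read-only file system: '<path>'` shape shown above; `readonly_paths` is a hypothetical helper that reads log text on stdin.

```shell
#!/bin/sh
# Sketch: list the distinct paths that hit "Read-only file system" in a log
# stream. readonly_paths is a hypothetical helper; it reads log text on stdin.
readonly_paths() {
  # Keep only the "Read-only file system: '<path>'" fragments, strip the
  # prefix and the closing quote, then de-duplicate.
  grep -o "Read-only file system: '[^']*'" \
    | sed "s/^Read-only file system: '//; s/'\$//" \
    | sort -u
}
```

For example, `docker logs <container> 2>&1 | readonly_paths` prints one path per line; each path that is not already covered by a tmpfs or volume is a candidate for the mount list in the proposed fix.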

