[RFC] Resolving App Cold Starts and Idle RAM Bloat via SQLite-First, Container Tuning, and Host-Level zswap

# RFC: Reducing App Cold Starts and Idle RAM Bloat via SQLite-First, Container Tuning, and Host-Level zswap

## 1. The Problem
Currently, the Shard Core manages resource consumption on low-end VPS hosts by stopping inactive containers (`docker stop`) via `app_lifecycle.py`. While this frees up RAM, it introduces severe UX drawbacks:
* **High Latency (Cold Starts):** Re-activating heavy apps takes 15–30+ seconds because database and application runtimes must boot from scratch.
* **Poor UX Feedback:** Traefik's `app-error` middleware catches the resulting `502 Bad Gateway` and displays an unstyled splash page reading `"Unknown Status..."` that hard-reloads every 2 seconds, producing a visible flash.

---

## 2. Architectural Design Decision: Rejecting Shared DB/Redis Engines
We explicitly reject the approach of using a single shared database or Redis instance across all applications. Doing so introduces severe architectural anti-patterns:
* **Security & Isolation:** Shared databases break isolation; credential leaks or SQL injections in one app could expose all other apps.
* **Single Point of Failure:** If the shared DB crashes due to memory limits, all apps on the shard go offline.
* **Version Lock-in:** Apps requiring different DB versions (e.g., PostgreSQL 15 vs. 17) would block each other from updating.

Instead, we propose keeping **strictly isolated containers** but optimizing their idle footprint down to a minimum using three technical levers: **SQLite-First (A)**, **Aggressive Database Tuning (B)**, and **Host-Level Swap/zswap (C)**.

---

## 3. Concrete Application Implementations

Here is how we will apply these rules to three target applications on the Shard:

| Application | Database Engine | Sidecars | Optimization Target |
| :--- | :--- | :--- | :--- |
| **Vaultwarden** | SQLite (Embedded) | None | Idle at ~15MB RSS; cap the container at 30MB. |
| **Paperless-ngx** | SQLite (Embedded) | Redis (Single task broker) | Cap Redis memory via command flags. |
| **Immich** | PostgreSQL (Tuned) | Redis, ML Container | Restrict PG buffers & limit ML workers. |

### 1. Vaultwarden (Rust Bitwarden Clone)
Vaultwarden is highly optimized and runs SQLite by default without external dependencies.
* **Database:** SQLite. No dedicated database container.
* **Optimizations:** Set the container memory limit to 30MB (it idles at ~15MB RSS). No further tuning required.

### 2. Paperless-ngx
Paperless-ngx supports SQLite natively for single-user scenarios (our primary target).
* **Database:** Enforce SQLite in our templates (avoid Postgres).
* **Sidecars:** Requires Redis for Celery task queuing.
* **Optimizations:** Tune the `paperless-redis` container by passing memory-cap flags directly to the start command:
  ```yaml
  command: ["redis-server", "--maxmemory", "50mb", "--maxmemory-policy", "allkeys-lru"]
  ```
  This caps Redis at 50MB while fully isolating the queue; an idle broker is expected to stay below 3MB.
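
As a minimal sketch of how a template generator could render that `command` list, the helper below builds it from a memory cap and an eviction policy. The function name `redis_command` and its defaults are hypothetical and not part of the existing codebase; the policy names are the ones Redis accepts for `maxmemory-policy`.

```python
# Hypothetical helper for rendering the Redis sidecar command in a compose template.
# The function name and defaults are illustrative assumptions, not existing project code.

VALID_POLICIES = {
    "noeviction", "allkeys-lru", "allkeys-lfu", "allkeys-random",
    "volatile-lru", "volatile-lfu", "volatile-random", "volatile-ttl",
}


def redis_command(maxmemory_mb: int = 50, policy: str = "allkeys-lru") -> list:
    """Build the compose `command` list for a memory-capped Redis sidecar."""
    if maxmemory_mb <= 0:
        raise ValueError("maxmemory_mb must be positive")
    if policy not in VALID_POLICIES:
        raise ValueError(f"unknown maxmemory-policy: {policy}")
    return [
        "redis-server",
        "--maxmemory", f"{maxmemory_mb}mb",
        "--maxmemory-policy", policy,
    ]
```

Calling `redis_command()` with the defaults reproduces the list shown in the YAML snippet above; rejecting unknown policies early keeps a typo in a template from surfacing only when the container starts.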

### 3. Immich
Immich is heavy and strictly requires PostgreSQL. It cannot run on SQLite.
* **Database:** PostgreSQL.
* **Sidecars:** Redis, Machine Learning (ML) container, Server.
* **Optimizations:**
  1. **Tune PostgreSQL:** Limit database caches and connections in the compose template:
     ```yaml
     command: ["postgres", "-c", "shared_buffers=16MB", "-c", "max_connections=10", "-c", "work_mem=1MB"]
     ```
     This drops the Postgres idle overhead from 45MB to ~12MB RSS.
  2. **Tune ML Container:** Set `IMMICH_MACHINE_LEARNING_WORKERS=1` and set strict CPU/RAM limits (e.g. `mem_limit: 150m`) to prevent RAM spikes during photo uploads.
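
To make the PostgreSQL settings easier to reason about, here is a small sketch that renders the `command` list and computes a rough memory envelope from the same three parameters. The function names are hypothetical. The estimate is deliberately simple (`shared_buffers + max_connections * work_mem`): it ignores per-backend overhead, and a single query may use `work_mem` more than once, so treat the result as an orientation value rather than a hard ceiling.

```python
# Hypothetical helpers; names and defaults mirror the values proposed in this RFC.

def postgres_command(shared_buffers_mb: int = 16,
                     max_connections: int = 10,
                     work_mem_mb: int = 1) -> list:
    """Build the compose `command` list for a tuned PostgreSQL container."""
    return [
        "postgres",
        "-c", f"shared_buffers={shared_buffers_mb}MB",
        "-c", f"max_connections={max_connections}",
        "-c", f"work_mem={work_mem_mb}MB",
    ]


def rough_memory_envelope_mb(shared_buffers_mb: int = 16,
                             max_connections: int = 10,
                             work_mem_mb: int = 1) -> int:
    """Rough envelope: shared cache plus one work_mem allocation per connection.

    Assumption: each connection uses work_mem at most once, which real
    queries with several sort/hash nodes can exceed.
    """
    return shared_buffers_mb + max_connections * work_mem_mb


if __name__ == "__main__":
    print(postgres_command())
    print(rough_memory_envelope_mb())  # 16 + 10 * 1 = 26
```

With the proposed values the envelope comes to 26MB under full connection load, which gives a starting point when choosing a `mem_limit` for the database container.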

---

## 4. The Host-Level Mitigation: Active zswap to Prevent I/O Lockups
Even with tuned containers, running 5+ apps with their own mini-Postgres/Redis instances will total ~100–150MB of idle memory.
To keep this memory from clogging physical RAM on XS/S instances without causing I/O lockups (thrashing), the host OS must be prepared accordingly:
* **Host Setup:** Configure a **4–8 GB swapfile** on the host VPS.
* **zswap Pool:** Enable **`zswap` (Compressed Cache for Swap)** using fast compression (e.g., `lz4` or `zstd`) during OS provisioning.
* **Swappiness:** Set `vm.swappiness` to a value between 80 and 100 (e.g., `sysctl vm.swappiness=80`). With zswap enabled, the host kernel compresses idle PostgreSQL/Redis pages in RAM (typically at about a 3:1 ratio) before they reach the swapfile. When accessed, those pages are decompressed in microseconds, which avoids the disk I/O stalls common on cheap VPS providers such as Netcup.
* **Docker safety margins:** Every container must have a strict `memswap_limit` (the combined memory-plus-swap budget). If a container exceeds that budget, the OOM killer terminates it instead of letting the host freeze in `%iowait`.
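
The host setup above could be sketched as a command generator that a provisioning step might run over SSH. This is an assumption about how `ssh.py` could be extended, not its current interface: `host_swap_commands` and `container_limits` are hypothetical names. The commands apply settings at runtime only; persisting them across reboots (e.g. via `/etc/fstab`, a sysctl drop-in, or kernel boot parameters) is left out, and `fallocate`-created swapfiles may not work on every filesystem.

```python
# Hypothetical provisioning helpers; not part of the existing controller code.

ALLOWED_COMPRESSORS = {"lz4", "zstd"}  # the two options named in this RFC


def host_swap_commands(swap_gb: int = 4,
                       compressor: str = "lz4",
                       swappiness: int = 80,
                       swapfile: str = "/swapfile") -> list:
    """Return shell commands (to be run as root) for swapfile + zswap setup."""
    if swap_gb <= 0:
        raise ValueError("swap_gb must be positive")
    if compressor not in ALLOWED_COMPRESSORS:
        raise ValueError(f"unsupported compressor: {compressor}")
    if not 0 <= swappiness <= 100:
        raise ValueError("swappiness must be between 0 and 100")
    return [
        f"fallocate -l {swap_gb}G {swapfile}",   # allocate the swapfile
        f"chmod 600 {swapfile}",                 # swapfiles must not be world-readable
        f"mkswap {swapfile}",
        f"swapon {swapfile}",
        f"echo {compressor} > /sys/module/zswap/parameters/compressor",
        "echo 1 > /sys/module/zswap/parameters/enabled",
        f"sysctl -w vm.swappiness={swappiness}",
    ]


def container_limits(mem_mb: int, swap_mb: int) -> dict:
    """Compose limits: `memswap_limit` is memory plus swap, not swap alone."""
    return {
        "mem_limit": f"{mem_mb}m",
        "memswap_limit": f"{mem_mb + swap_mb}m",
    }
```

For example, `container_limits(150, 100)` yields a 150MB RAM limit with a 250MB combined budget, so the container may push at most 100MB into swap before it is killed.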

---

## 5. Action Items
- [ ] **[freeshard-controller]** Add swapfile, `zswap` configuration, and `vm.swappiness` setups to the VM provisioning steps in `ssh.py`.
- [ ] **[freeshard]** Implement a client-side fetch-polling script in `splash.html` to query app status asynchronously and prevent the white page reload flash.
- [ ] **[app-repository]** Apply tuning parameters (shared buffers, Redis maxmemory, etc.) and SQLite default setups to Vaultwarden, Paperless-ngx, and Immich templates.
- [ ] **[freeshard]** Update `app_lifecycle.py` to monitor `%iowait` from `/proc/stat` and dynamically adjust container states if I/O bottlenecks occur.
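
For the `%iowait` monitoring item, a minimal sketch of the measurement itself: the aggregate `cpu` line in `/proc/stat` holds cumulative tick counters, so the percentage has to be computed from the difference between two samples. The function names are hypothetical, and how `app_lifecycle.py` would react to the value is left open.

```python
# Hypothetical helpers for sampling %iowait; not existing project code.

def parse_cpu_line(stat_text: str) -> tuple:
    """Return (iowait_ticks, total_ticks) from the aggregate `cpu` line.

    Field order after the label is: user nice system idle iowait irq
    softirq steal. The guest fields are skipped because guest time is
    already counted inside user time.
    """
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            fields = [int(value) for value in parts[1:9]]
            return fields[4], sum(fields)
    raise ValueError("no aggregate cpu line found")


def iowait_percent(before: str, after: str) -> float:
    """Share of CPU time spent in iowait between two /proc/stat snapshots."""
    io_before, total_before = parse_cpu_line(before)
    io_after, total_after = parse_cpu_line(after)
    delta_total = total_after - total_before
    if delta_total <= 0:
        return 0.0  # no time elapsed between the samples
    return 100.0 * (io_after - io_before) / delta_total
```

In practice the two snapshots would come from reading `/proc/stat` a few seconds apart; a sustained high value across several intervals is a more reliable trigger than a single spike.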

