pi-web server becomes unresponsive over time: event-loop saturation from per-token broadcast fan-out + session-file fd leak #24

## Summary

After ~24h of uptime with active streaming and several large sessions, the pi-web server (the supervised child) becomes **completely unresponsive**: it accepts TCP connections but never writes a response. The browser UI shows the sessions drawer stuck on "No saved sessions yet." with the `/api/sessions` request hanging forever, and the tunnel/proxy eventually returns a **"Tunnel Timed Out"** page.

This is **not** a memory problem — it's **single-threaded event-loop / syscall saturation**.

## Symptoms

- Sessions drawer renders the empty placeholder ("No saved sessions yet.") because the `/api/sessions` fetch never resolves.
- `curl` requests to both the supervisor (`127.0.0.1:8787`) and the child directly (`127.0.0.1:8788`) time out (curl reports HTTP code `000`, meaning no response was received), even though both processes are alive and `LISTEN`ing.
- Connections are accepted but never get a response, so they pile up (38 `ESTAB` connections to the child were observed backing up).
- Restarting the child clears it immediately (sessions persist on disk, nothing lost).

## Evidence it is CPU / event loop, not memory

System:
- 494 GB RAM, ~320 GB free, **no swap**, no OOM.

Child process:
- RSS is only ~833 MB (far below V8's default ~4 GB old-space limit). The 37 GB VmSize is reserved virtual address space, not resident memory.

Main thread is saturated in the kernel:
```
MainThread:  utime 291,825  vs  stime 1,715,164   (~85% of CPU in kernel/syscalls)
voluntary_ctxt_switches: 31,571,312               (31.5M wakeups)
state: running / ep_poll, ~20-30% CPU sustained for the full ~24h uptime
```
V8Worker / libuv threads are idle — this is a **main-thread** saturation, which is exactly why Node accepts sockets but never responds.
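
As a sanity check, the tick counters above can be converted into the two percentages quoted. This is a minimal sketch, assuming `utime`/`stime` are clock ticks at the common Linux rate of 100 per second (`getconf CLK_TCK`; the report does not state it) and that uptime is exactly 24 h. The function and type names are illustrative.

```typescript
// Assumption: 100 clock ticks per second (typical Linux CLK_TCK); not stated in the report.
const TICKS_PER_SECOND = 100;

interface CpuShare {
  kernelFraction: number; // stime / (utime + stime)
  averageCpu: number;     // CPU seconds consumed / wall-clock seconds
}

function cpuShare(utimeTicks: number, stimeTicks: number, uptimeSeconds: number): CpuShare {
  const totalTicks = utimeTicks + stimeTicks;
  return {
    kernelFraction: stimeTicks / totalTicks,
    averageCpu: totalTicks / TICKS_PER_SECOND / uptimeSeconds,
  };
}

// MainThread counters from the report, over an assumed 24 h of uptime.
const share = cpuShare(291825, 1715164, 24 * 60 * 60);
console.log(share.kernelFraction.toFixed(3)); // 0.855
console.log(share.averageCpu.toFixed(3));     // 0.232
```

Under those assumptions the counters agree with both figures: about 85% of main-thread CPU time is spent in the kernel, and the thread averages roughly 23% of one core over the whole uptime.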

## Root cause (two compounding issues in `server.ts`)

1. **Per-streaming-event fan-out is O(events × clients × messages).**
   - Every `pi_event` (fires on *every* token/chunk while a turn streams) triggers **two `broadcast()` calls**.
   - `broadcast()` does a `JSON.stringify` and then a `.send()` (a `write` syscall) **per connected realtime client**.
   - On `message_end` / `agent_end` / `compaction_end` it also calls `sessionStats(value)`, which **iterates the entire message branch** (one affected session had **892 messages**).
   - With high-frequency events × several clients × large sessions, this becomes a syscall storm (the 31.5M context switches and the huge `stime`).

2. **File-descriptor leak on session files.**
   - The same session `.jsonl` files are held open repeatedly — observed one file open **18x**, others 9x / 7x / 6x — totaling **244 open fds**.
   - Session listing/reading opens files without closing them; with frequent `/api/sessions` + state polling over a long-lived server, fds and parse/I/O work accumulate.
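
One way to rule out this class of leak is to scope every handle to a helper that always closes it. The sketch below is an assumption about shape only: the report does not show how `server.ts` reads session files, so `withHandle` and `readSessionFile` are hypothetical names, and the one-JSON-value-per-line parsing is inferred from the `.jsonl` extension.

```typescript
import { open } from "node:fs/promises";

// Minimal shape of a closable handle; Node's FileHandle satisfies it.
interface Closable {
  close(): Promise<void>;
}

// Runs `use` with a freshly opened handle and always closes it afterwards,
// whether `use` resolves or throws.
async function withHandle<H extends Closable, R>(
  openHandle: () => Promise<H>,
  use: (handle: H) => Promise<R>,
): Promise<R> {
  const handle = await openHandle();
  try {
    return await use(handle);
  } finally {
    await handle.close();
  }
}

// Hypothetical reader for a session .jsonl file: one JSON value per non-empty line.
async function readSessionFile(path: string): Promise<unknown[]> {
  return withHandle(
    () => open(path, "r"),
    async (handle) => {
      const text = await handle.readFile({ encoding: "utf8" });
      return text
        .split("\n")
        .filter((line) => line.trim() !== "")
        .map((line) => JSON.parse(line));
    },
  );
}
```

If the server only ever needs the whole file, `readFile(path, "utf8")` from `node:fs/promises` opens and closes the descriptor itself, which removes the need to manage a handle at all.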

## Why it tips from "slow" to "fully hung"

The server runs at ~20-30% CPU continuously, most of it in the kernel (the chronic state). Streaming a turn on top of large sessions and accumulated connections/fds pushes the event loop past the point where it can drain incoming HTTP requests between broadcast bursts. Requests then queue indefinitely, which produces the empty drawer, the perpetually pending `/api/sessions` request, and the tunnel timeout.
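
The tipping behaviour can be illustrated with a toy backlog model. The numbers are made up for illustration and are not measurements from pi-web. The model assumes that in each time slice the main thread has a fixed budget, broadcast work is served first, and queued HTTP requests only get what is left.

```typescript
// Toy model: per time slice the loop has `budgetMs` of main-thread time.
// Broadcast work runs first; queued HTTP requests are served with the remainder.
function pendingRequestsOverTime(
  budgetMs: number,
  broadcastMsPerSlice: number,
  requestCostMs: number,
  arrivalsPerSlice: number,
  slices: number,
): number[] {
  const backlog: number[] = [];
  let pending = 0;
  for (let i = 0; i < slices; i++) {
    pending += arrivalsPerSlice;
    const spareMs = Math.max(0, budgetMs - broadcastMsPerSlice);
    const served = Math.min(pending, Math.floor(spareMs / requestCostMs));
    pending -= served;
    backlog.push(pending);
  }
  return backlog;
}

// Chronic load: broadcasts leave slack, so the request queue stays empty.
console.log(pendingRequestsOverTime(100, 30, 10, 2, 5).join(","));  // 0,0,0,0,0
// Streaming burst: broadcasts consume the whole slice, so requests only accumulate.
console.log(pendingRequestsOverTime(100, 100, 10, 2, 5).join(",")); // 2,4,6,8,10
```

Nothing degrades gradually in this model: once broadcast work fills the slice, no request is ever served, which matches the jump from "slow" to "fully hung".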

## Suggested fixes

- Throttle / coalesce `pi_event` broadcasts (batch instead of fanning out every token).
- Prune dead realtime clients; avoid the double-broadcast per event.
- Memoize or incrementally maintain `sessionStats` instead of re-scanning the full branch on every event.
- Close session-file handles after reading (fix the fd leak).
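
The first two fixes could be combined into one small batching layer. This is a sketch under stated assumptions, not the actual `broadcast()` in `server.ts`: `CoalescingBroadcaster`, `RealtimeClient`, the `pi_event_batch` message type, and the 50 ms interval are all hypothetical, and clients would need to understand the batched message shape.

```typescript
// Minimal client shape; a WebSocket exposes a compatible `readyState` and `send`.
interface RealtimeClient {
  readyState: number;
  send(data: string): void;
}

const OPEN = 1; // numeric value of WebSocket.OPEN

class CoalescingBroadcaster<E> {
  private pending: E[] = [];
  private timer: ReturnType<typeof setTimeout> | undefined = undefined;
  private readonly clients: Set<RealtimeClient>;
  private readonly flushIntervalMs: number;

  // flushIntervalMs defaults to a hypothetical 50 ms; tune against perceived latency.
  constructor(clients: Set<RealtimeClient>, flushIntervalMs = 50) {
    this.clients = clients;
    this.flushIntervalMs = flushIntervalMs;
  }

  // Queue an event; at most one flush is scheduled per interval.
  enqueue(event: E): void {
    this.pending.push(event);
    if (this.timer === undefined) {
      this.timer = setTimeout(() => this.flush(), this.flushIntervalMs);
    }
  }

  // Serialize the queued events once and send one message per open client.
  flush(): void {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
    if (this.pending.length === 0) return;
    const payload = JSON.stringify({ type: "pi_event_batch", events: this.pending });
    this.pending = [];
    this.clients.forEach((client) => {
      if (client.readyState !== OPEN) {
        this.clients.delete(client); // prune dead clients instead of writing to them
        return;
      }
      client.send(payload);
    });
  }
}
```

With this shape, a burst of N token events inside one interval costs one `JSON.stringify` and one `send` per client instead of 2N of each. The trade-off is up to `flushIntervalMs` of added latency before tokens appear in the UI.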

## Workaround

Restart the supervised child (sessions persist on disk):
```bash
curl -X POST http://127.0.0.1:8787/api/restart
# or
kill -9 <child-pid>   # supervisor auto-respawns
```

