Skip to content

fix(hub): generous read-pool FD cap + safe rows.Close defer-conversion (#79) #290

Description

@physercoe

Two cheap resource-management fixes from #79 (the WAL/VACUUM half shipped in #288 / ADR-045 D4).

Part A — generous read-pool FD safety cap

The reader pools are currently uncapped (server.go s.db; per-team eventsR/digestR via openStorePool(path,false,...)). This is deliberate (WAL lets readers run concurrently) but unbounded — under pathological SSE fan-out it can exhaust file descriptors.

Add a generous, env-tunable SetMaxOpenConns to all reader poolss.db and the per-team reader pools (openStorePool reader branch):

  • Default high enough to preserve WAL read concurrency (e.g. 64), overridable via HUB_DB_MAX_READ_CONNS.
  • This is an FD-exhaustion safety bound, NOT a throughput cap — do not set it tight.
  • Update the server.go comment that currently says the reader pool "stays the uncapped general/reader pool" to reflect the new generous bound (don't leave the doc contradicting the code).

Part B — convert SAFE manual rows.Close() to defer (#79.5)

Some handlers rows.Close() manually instead of defer rows.Close(), fragile against a future early return.

Convert to defer rows.Close() only where a function runs a SINGLE query and closes once at the end.

TRAP — do NOT blanket-convert

Many manual closes are loop-body or chunked-multi-query closes (e.g. handlers_search_sessions.go chunks, handlers_insights.go, otlp_export.go, phase_completion_gate.go, handlers_criteria.go, loop_sweep.go, handlers_admin_teams.go). There, defer would accumulate open result sets until the function returns (a leak / lock-holder) — that is WRONG. Leave every manual close that is inside a loop, or where the function opens more than one *sql.Rows sequentially. Convert ONLY the lone-single-query case. The genuine candidate set may be small or empty — report what you converted and what you left, with one-line reasons.

Verify (REQUIRED)

From hub/: PATH=/usr/local/go/bin:$PATH go build ./... && go vet ./... && go test ./internal/server/ -timeout=20m. gofmt -w. All gh pr checks green.

Done

PR Closes #<this>, label ticket:in-review. State the chosen cap default/env and the convert/leave breakdown for Part B.

Pitfalls

  • Go toolchain at /usr/local/go/bin; not Flutter.
  • Part A: apply to READER pools only — never the writer pools (already SetMaxOpenConns(1)).
  • Part B: when in doubt, LEAVE the manual close. A wrong defer-conversion is a latent resource leak; a left-alone manual close is harmless.

Metadata

Metadata

Assignees

No one assigned

    Labels

    ticket:in-reviewPR open, CI green, awaiting maintainer reviewtier:mediumNeeds some reasoning

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions