Skip to content

fix(auto-creation): batch merge journal writes#474

Open
Mirabis wants to merge 1 commit into
MotWakorb:devfrom
Mirabis:feat/batch-journal-wal
Open

fix(auto-creation): batch merge journal writes#474
Mirabis wants to merge 1 commit into
MotWakorb:devfrom
Mirabis:feat/batch-journal-wal

Conversation

@Mirabis

@Mirabis Mirabis commented Jun 3, 2026

Copy link
Copy Markdown

Summary

  • add journal.log_entries() to persist multiple journal rows in one transaction
  • buffer auto-creation merge_stream audit rows in ActionExecutor and flush them at threshold / pipeline completion
  • preserve the existing per-merge audit payload, including batch_id, stream-id-only before/after values, and scored-fuzzy provenance
  • update focused unit coverage for batch journal writes and buffered merge flushes

Fixes #473.

Why

Large auto-creation runs can perform many merge journal writes. Committing every merge row individually increases SQLite WAL churn and can produce very large WAL files during a single run. Batching keeps the same audit trail while reducing transaction frequency.

Validation

Passed:

uv run --python 3.12 --with-requirements backend/requirements.txt python -m pytest backend/tests/unit/test_journal.py backend/tests/unit/test_auto_creation_executor.py backend/tests/unit/test_auto_creation_engine.py backend/tests/unit/test_scored_fuzzy_merge.py backend/tests/unit/test_0hjrk_run_summary_invariants.py --tb=short --no-header -p no:warnings --no-cov
261 passed in 7.96s

Passed:

uv run --python 3.12 --with-requirements backend/requirements.txt python -m py_compile backend/journal.py backend/auto_creation_executor.py backend/auto_creation_engine.py backend/tests/unit/test_journal.py backend/tests/unit/test_auto_creation_executor.py backend/tests/unit/test_auto_creation_engine.py backend/tests/unit/test_scored_fuzzy_merge.py backend/tests/unit/test_0hjrk_run_summary_invariants.py

Local broader gate note:

uv run --python 3.12 --with-requirements backend/requirements.txt python -m pytest backend/tests/unit backend/tests/integration backend/tests/routers --tb=short --no-header -p no:warnings
7 failed, 4475 passed, 1 skipped in 285.20s

The 7 local failures are outside this change:

  • backend/tests/unit/test_date_placeholders.py::test_expand_date_placeholders_format is platform/libc-sensitive locally: %Q expands to Q on this macOS runner.
  • 6 backend/tests/routers/test_alert_methods.py SMTP cases fail because this local test registry exposes only create_test and probe_failures_threshold_test, not smtp.

Full backend/tests/ also reported E2E setup errors locally because those tests attempted to connect to a live server and received Connection refused.

@Mirabis Mirabis force-pushed the feat/batch-journal-wal branch from f810e5d to b786275 Compare June 3, 2026 12:08
@Mirabis Mirabis changed the title Batch auto-creation merge journal writes fix(auto-creation): batch merge journal writes Jun 3, 2026
@Mirabis Mirabis force-pushed the feat/batch-journal-wal branch from b786275 to b4383c0 Compare June 3, 2026 12:12
@Mirabis Mirabis marked this pull request as draft June 3, 2026 12:12
@Mirabis Mirabis marked this pull request as ready for review June 3, 2026 12:15

@MotWakorb MotWakorb left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, @Mirabis — and for the unusually clear write-up and repro on #473. The batching approach is the right lever for the WAL churn, the shared _build_journal_entry helper is clean, and the audit payload (batch_id, stream-ids-only before/after, scored-fuzzy provenance) is faithfully preserved. We ran a correctness review plus an SRE reliability pass and want to take this — with one required change and two test additions.

Required before merge

1. Guarantee the end-of-pipeline flush on the exception path (durability).
executor._flush_journal_buffer() in auto_creation_engine._process_streams sits just before return results with no try/finally. _process_streams has several await points after live merges have already executed against Dispatcharr (reorder, reconcile-orphans, EPG retry, probe). If any of those raises, the exception propagates up and the flush is skipped — so up to _journal_flush_threshold (100) buffered rows describing merges that actually happened are lost. That's exactly the aborted/bad-run scenario the per-merge journal (bd-0emgo.5) exists to make recoverable, and the old per-row commit didn't have this gap. Please wrap so the flush always runs:

try:
    ...  # passes 2–6
    return results
finally:
    executor._flush_journal_buffer()

_flush_journal_buffer already no-ops on an empty buffer and isolates its own failure, so a finally flush is safe on the exception path.

2. Regression test for #1 — a test where a later pipeline pass raises after a merge, asserting the buffered entry is still flushed.

3. Threshold-flush test — none of the current tests drive ≥100 merges to prove the automatic threshold flush fires and clears the buffer; please add one.

Minor

  • The _journal_merge docstring still says failures are "swallowed (journal.log_entry already does this internally)" — that path no longer calls log_entry; please reword to describe the buffer/flush model. A one-line note that timestamp is now stamped at flush time (not per-merge) would also help the next reader.

Everything else — correctness equivalence (incl. the falsy empty-dict case), credential safety, dry-run isolation, batch_id — checks out.

One heads-up for #473 itself: this PR meaningfully reduces WAL growth, but our reliability review found the ~15 GiB memory crash has a separate root cause (the in-memory execution_log + end-of-run json.dumps, on a disk-backed /config), so we'll address that separately — details in my comment on #473. Thanks again; genuinely good contribution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants