Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
211 changes: 211 additions & 0 deletions .planning/phases/19-collapse-orchestration-shim/19-01-PLAN.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,211 @@
---
phase: 19-collapse-orchestration-shim
plan: "01"
type: execute
wave: 1
depends_on: []
files_modified:
- packages/parser-core/src/bankstatements_core/processor.py
- packages/parser-core/src/bankstatements_core/services/pdf_processing_orchestrator.py
- packages/parser-core/tests/test_iban_grouping.py
- packages/parser-core/tests/test_empty_row_filtering.py
- packages/parser-core/tests/test_header_row_filtering.py
- packages/parser-core/tests/test_repository_integration.py
autonomous: true
requirements:
- RFC-19-dead-code
- RFC-19-inline-passthrough

must_haves:
truths:
- "processor.py contains none of the 10 private methods: _filter_rows, _has_row_data, _filter_empty_rows, _is_header_row, _filter_header_rows, _has_valid_transaction_date, _filter_invalid_date_rows, _group_rows_by_iban, _write_csv_with_totals, _append_totals_to_csv"
- "pdf_processing_orchestrator.py contains no _process_single_pdf method"
- "process_all_pdfs calls extraction_orchestrator.extract_from_pdf directly at the former call site"
- "Full test suite passes with coverage >= 91%"
artifacts:
- path: "packages/parser-core/src/bankstatements_core/processor.py"
provides: "BankStatementProcessor without dead private methods"
- path: "packages/parser-core/src/bankstatements_core/services/pdf_processing_orchestrator.py"
provides: "PDFProcessingOrchestrator with _process_single_pdf inlined"
key_links:
- from: "pdf_processing_orchestrator.py process_all_pdfs"
to: "self.extraction_orchestrator.extract_from_pdf(pdf)"
via: "direct call where result = self._process_single_pdf(pdf) was"
pattern: "extraction_orchestrator\\.extract_from_pdf"
---

<objective>
Remove 10 dead private methods from BankStatementProcessor and inline the 3-line passthrough PDFProcessingOrchestrator._process_single_pdf. Delete the four test files that exclusively test those dead methods; remove one dead-method test from test_repository_integration.py.

Purpose: Eliminate stale code that duplicates logic already owned by injected services (TransactionFilterService, TransactionOrchestrator, OutputOrchestrator). Inline a passthrough that adds an indirection layer with no logic.
Output: Leaner processor.py, leaner pdf_processing_orchestrator.py, test suite still green at >= 91% coverage.
</objective>

<execution_context>
@/Users/I313149/.claude/get-shit-done/workflows/execute-plan.md
@/Users/I313149/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@packages/parser-core/src/bankstatements_core/processor.py
@packages/parser-core/src/bankstatements_core/services/pdf_processing_orchestrator.py

<interfaces>
<!-- Key facts confirmed from codebase inspection. -->

<!-- processor.py — methods to DELETE (lines confirmed by grep): -->
<!-- Line 280: def _write_csv_with_totals -->
<!-- Line 307: self._append_totals_to_csv (called only from _write_csv_with_totals) -->
<!-- Line 316: def _append_totals_to_csv -->
<!-- Line 361: def _filter_rows -->
<!-- Line 395: def _has_row_data -->
<!-- Line 412: def _filter_empty_rows -->
<!-- Line 425: return self._filter_rows(rows, self._has_row_data, "empty rows") -->
<!-- Line 427: def _is_header_row -->
<!-- Line 475: def _filter_header_rows -->
<!-- Line 489: def _has_valid_transaction_date -->
<!-- Line 526: def _filter_invalid_date_rows -->
<!-- Line 555: def _group_rows_by_iban -->
<!-- None of these are called from run() or _process_transaction_group() — confirmed. -->
<!-- _process_transaction_group() calls self._filter_service.filter_empty_rows() and self._filter_service.filter_header_rows() — the SERVICE, not self. -->
<!-- run() calls self._transaction_orchestrator.group_by_iban() — the ORCHESTRATOR, not self._group_rows_by_iban. -->

<!-- pdf_processing_orchestrator.py — method to INLINE: -->
<!-- Line 126: result = self._process_single_pdf(pdf) <-- call site -->
<!-- Line 127: rows, page_count, iban = result <-- unpack -->
<!-- Line 185: def _process_single_pdf(self, pdf: Path) -> tuple[list[dict], int, str | None]: -->
<!-- Body (lines 198-200): -->
<!-- rows, page_count, iban = self.extraction_orchestrator.extract_from_pdf(pdf) -->
<!-- return rows, page_count, iban -->
<!-- AFTER inline, lines 126-127 become: -->
<!-- rows, page_count, iban = self.extraction_orchestrator.extract_from_pdf(pdf) -->
<!-- Then delete the entire _process_single_pdf method (lines 185 to ~203). -->

<!-- Test files to DELETE entirely (all tests call deleted private methods): -->
<!-- tests/test_iban_grouping.py (199 lines — tests _group_rows_by_iban only) -->
<!-- tests/test_empty_row_filtering.py (192 lines — tests _filter_empty_rows only) -->
<!-- tests/test_header_row_filtering.py (309 lines — tests _is_header_row, _filter_header_rows) -->

<!-- test_repository_integration.py — partial: -->
<!-- DELETE only: class TestRepositoryIntegration.test_append_totals_uses_repository (lines 89-111) -->
<!-- Keep all other tests in that file (TestFileSystemTransactionRepository, TestRepositoryMockability, etc.) -->
</interfaces>
</context>

<tasks>

<task type="auto">
<name>Task 1: Delete 10 dead private methods from BankStatementProcessor</name>
<files>
packages/parser-core/src/bankstatements_core/processor.py
</files>
<action>
Open processor.py. Delete ALL of the following private method definitions and their bodies:

1. `_write_csv_with_totals` (starts ~line 280) — delete the full def block
2. `_append_totals_to_csv` (starts ~line 316) — delete the full def block
3. `_filter_rows` (starts ~line 361) — delete the full def block
4. `_has_row_data` (starts ~line 395) — delete the full def block
5. `_filter_empty_rows` (starts ~line 412) — delete the full def block
6. `_is_header_row` (starts ~line 427) — delete the full def block
7. `_filter_header_rows` (starts ~line 475) — delete the full def block
8. `_has_valid_transaction_date` (starts ~line 489) — delete the full def block
9. `_filter_invalid_date_rows` (starts ~line 526) — delete the full def block
10. `_group_rows_by_iban` (starts ~line 555) — delete the full def block

None of these methods are called from `run()` or `_process_transaction_group()`:
- `_process_transaction_group()` calls `self._filter_service.filter_empty_rows()` and `self._filter_service.filter_header_rows()` (the injected service, not self)
- `run()` calls `self._transaction_orchestrator.group_by_iban()` (the orchestrator, not self)
- `_write_csv_with_totals` and `_append_totals_to_csv` have zero callers anywhere in src/

After deletion, verify by grep: no references to any of the 10 deleted method names should remain in processor.py.

Also delete the 4 test files that exclusively test these dead methods:
- `packages/parser-core/tests/test_iban_grouping.py` — delete entirely
- `packages/parser-core/tests/test_empty_row_filtering.py` — delete entirely
- `packages/parser-core/tests/test_header_row_filtering.py` — delete entirely

In `packages/parser-core/tests/test_repository_integration.py`, delete ONLY the `test_append_totals_uses_repository` test method (the one that calls `processor._append_totals_to_csv`). The `appended_csv` tracking on `MockTransactionRepository` can also be removed if it is only used by that test — check before deleting. Preserve all other tests in the file.
</action>
<verify>
<automated>cd packages/parser-core && python -m pytest tests/ -x -q --no-header 2>&1 | tail -5</automated>
</verify>
<done>
- processor.py contains none of the 10 deleted method names (confirm with grep)
- test_iban_grouping.py, test_empty_row_filtering.py, test_header_row_filtering.py no longer exist
- test_repository_integration.py still passes (no call to _append_totals_to_csv remains)
- Test suite passes with coverage >= 91%
</done>
</task>

<task type="auto">
<name>Task 2: Inline PDFProcessingOrchestrator._process_single_pdf</name>
<files>
packages/parser-core/src/bankstatements_core/services/pdf_processing_orchestrator.py
</files>
<action>
Open pdf_processing_orchestrator.py.

Find the call site in `process_all_pdfs` (around line 126):
```python
result = self._process_single_pdf(pdf)
rows, page_count, iban = result
```

Replace those two lines with the inlined single call:
```python
rows, page_count, iban = self.extraction_orchestrator.extract_from_pdf(pdf)
```

Then delete the entire `_process_single_pdf` method definition (lines ~185-203). The method body is:
```python
def _process_single_pdf(
self,
pdf: Path,
) -> tuple[list[dict], int, str | None]:
"""Process a single PDF file. ..."""
rows, page_count, iban = self.extraction_orchestrator.extract_from_pdf(pdf)
return rows, page_count, iban
```

No other code changes needed. The variable names `rows`, `page_count`, `iban` are already used in the lines immediately after the old call site (e.g., `pages_read += page_count`, `if iban is None and len(rows) == 0`), so the inlined call slots in exactly.
</action>
<verify>
<automated>cd packages/parser-core && python -m pytest tests/ -x -q --no-header 2>&1 | tail -5</automated>
</verify>
<done>
- pdf_processing_orchestrator.py has no `_process_single_pdf` method definition
- `process_all_pdfs` directly calls `self.extraction_orchestrator.extract_from_pdf(pdf)` at the former call site
- grep finds no `_process_single_pdf` anywhere in src/
- Test suite passes with coverage >= 91%
</done>
</task>

</tasks>

<verification>
Run the full test suite:

```bash
cd packages/parser-core && python -m pytest tests/ -x -q --no-header
```

Confirm dead-code gone:
```bash
grep -rn "_write_csv_with_totals\|_append_totals_to_csv\|_group_rows_by_iban\|_filter_rows\|_has_row_data\|_filter_empty_rows\|_is_header_row\|_filter_header_rows\|_has_valid_transaction_date\|_filter_invalid_date_rows\|_process_single_pdf" packages/parser-core/src/
```
Expected: zero hits.
</verification>

<success_criteria>
- All 10 dead private methods deleted from processor.py
- _process_single_pdf deleted from pdf_processing_orchestrator.py; call site inlined
- test_iban_grouping.py, test_empty_row_filtering.py, test_header_row_filtering.py deleted
- test_repository_integration.py test_append_totals_uses_repository removed
- pytest passes, coverage >= 91%
- No dead method names remain anywhere in packages/parser-core/src/
</success_criteria>

<output>
After completion, create `.planning/phases/19-collapse-orchestration-shim/19-01-SUMMARY.md`
</output>
102 changes: 102 additions & 0 deletions .planning/phases/19-collapse-orchestration-shim/19-01-SUMMARY.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,102 @@
---
phase: 19-collapse-orchestration-shim
plan: "01"
subsystem: api
tags: [python, refactor, dead-code, processor, orchestrator]

# Dependency graph
requires: []
provides:
- "BankStatementProcessor without 10 dead private methods (_write_csv_with_totals, _append_totals_to_csv, _filter_rows, _has_row_data, _filter_empty_rows, _is_header_row, _filter_header_rows, _has_valid_transaction_date, _filter_invalid_date_rows, _group_rows_by_iban)"
- "PDFProcessingOrchestrator with _process_single_pdf inlined into process_all_pdfs"
affects: [20-delete-facade-passthroughs, 21-unify-word-utils]

# Tech tracking
tech-stack:
added: []
patterns:
- "Dead code purge: private methods superseded by injected services are deleted outright, not kept as wrappers"
- "Inline passthrough: single-statement forwarding methods are inlined at call site"

key-files:
created: []
modified:
- packages/parser-core/src/bankstatements_core/processor.py
- packages/parser-core/src/bankstatements_core/services/pdf_processing_orchestrator.py
- packages/parser-core/tests/test_repository_integration.py

key-decisions:
- "Deleted 10 dead private methods from BankStatementProcessor; logic already owned by injected TransactionFilterService, TransactionProcessingOrchestrator, and OutputOrchestrator"
- "Inlined PDFProcessingOrchestrator._process_single_pdf (3-line passthrough with no logic) directly at its single call site in process_all_pdfs"
- "Deleted test_iban_grouping.py, test_empty_row_filtering.py, test_header_row_filtering.py (tested only dead methods)"
- "Removed test_append_totals_uses_repository and appended_csv tracking from MockTransactionRepository (only used by that deleted test)"
- "test_architecture.py failure (row_merger.py shim import) is pre-existing and out of scope for this plan"

patterns-established:
- "Injected service owns logic: when a service is injected for a concern, the processor must NOT duplicate that logic as private methods"

requirements-completed:
- RFC-19-dead-code
- RFC-19-inline-passthrough

# Metrics
duration: 5min
completed: 2026-03-24
---

# Phase 19 Plan 01: Delete Dead Private Methods and Inline Passthrough Summary

**Deleted 10 dead private methods from BankStatementProcessor and inlined the 3-line PDFProcessingOrchestrator._process_single_pdf passthrough; 3 test files removed, coverage at 92.35%**

## Performance

- **Duration:** 5 min
- **Started:** 2026-03-24T15:31:18Z
- **Completed:** 2026-03-24T15:36:23Z
- **Tasks:** 2
- **Files modified:** 4 (1 deleted test file group + 2 src files + 1 test file)

## Accomplishments
- Removed 10 dead private methods from `processor.py`: `_write_csv_with_totals`, `_append_totals_to_csv`, `_filter_rows`, `_has_row_data`, `_filter_empty_rows`, `_is_header_row`, `_filter_header_rows`, `_has_valid_transaction_date`, `_filter_invalid_date_rows`, `_group_rows_by_iban`
- Inlined `PDFProcessingOrchestrator._process_single_pdf` at its single call site (`process_all_pdfs`), eliminating a pure-indirection wrapper
- Deleted 3 test files (`test_iban_grouping.py`, `test_empty_row_filtering.py`, `test_header_row_filtering.py`) that tested only the deleted dead methods
- Removed `test_append_totals_uses_repository` and `appended_csv` tracking from `test_repository_integration.py`
- Test suite: 1301 passed, 9 skipped, 92.35% coverage (threshold 91%)

## Task Commits

1. **Task 1: Delete 10 dead private methods from BankStatementProcessor** - `2cd7c23` (refactor)
2. **Task 2: Inline PDFProcessingOrchestrator._process_single_pdf** - `7752df7` (refactor)

## Files Created/Modified
- `packages/parser-core/src/bankstatements_core/processor.py` - Removed 10 dead private methods and unused `Callable` import
- `packages/parser-core/src/bankstatements_core/services/pdf_processing_orchestrator.py` - Inlined `_process_single_pdf`; deleted method definition
- `packages/parser-core/tests/test_repository_integration.py` - Removed `test_append_totals_uses_repository` test and `appended_csv` tracking from mock
- `packages/parser-core/tests/test_iban_grouping.py` - Deleted entirely
- `packages/parser-core/tests/test_empty_row_filtering.py` - Deleted entirely
- `packages/parser-core/tests/test_header_row_filtering.py` - Deleted entirely

## Decisions Made
- Deleted `Callable` import from `processor.py` as it was only used by the now-deleted `_filter_rows` method
- Preserved `_sort_transactions_by_date` in `processor.py` — it is NOT in the 10 dead methods list and is tested by `test_processor_refactored_methods.py`
- `transaction_filter.py` references to `_is_header_row` and similar are the canonical owners, not violations
- Pre-existing `test_architecture.py` failure (`row_merger.py` shim import) is out of scope; logged to deferred items

## Deviations from Plan

None - plan executed exactly as written.

## Issues Encountered
- `test_architecture.py::test_no_production_shim_imports` was failing before this plan's changes (4 violations pre-existing, reduced to 1 violation by earlier `feat(19-02)` commit). The remaining `row_merger.py` violation is pre-existing and out of scope. All 1301 non-architecture tests pass.

## User Setup Required
None - no external service configuration required.

## Next Phase Readiness
- `processor.py` and `pdf_processing_orchestrator.py` are clean of dead code
- Ready for phase 20 (delete thin facade pass-throughs) and phase 21 (unify word utils)
- The pre-existing `row_merger.py` shim import violation should be addressed in a follow-up plan

---
*Phase: 19-collapse-orchestration-shim*
*Completed: 2026-03-24*
Loading
Loading