Why this exists
Transaction.confidence_score: float = 1.0 was introduced in v1.1 as part of the Transaction model enrichment (phase 21). The intent was to carry a per-row confidence signal from the extraction pipeline to consumers.
However, no code in the extraction pipeline ever sets this field. Every Transaction produced by the pipeline has confidence_score = 1.0 (the default). The field is serialised and deserialised correctly, but the value is always meaningless.
The design decision logged at the time: "confidence_score weight constants not configurable (CURR-03) — deferred."
What exists today
# Transaction model
confidence_score: float = 1.0 # default, never overwritten
The pipeline flow:
- RowBuilder builds raw row dicts, with no confidence signal
- PDFTableExtractor.extract() calls dicts_to_transactions(rows), so confidence_score defaults to 1.0
- No code in extraction/, services/, or processor.py sets confidence_score to anything else
The only place confidence_score is used at all is in iban_spatial_filter.py — but that is a completely separate IBANCandidate.confidence_score field, not Transaction.confidence_score.
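To make the current behaviour concrete, here is a minimal sketch of how the default propagates. The `Transaction` class below is a simplified stand-in (the `date`, `description`, and `amount` fields are assumptions), and the body of `dicts_to_transactions` is hypothetical; only the `confidence_score: float = 1.0` default comes from the real model.

```python
from dataclasses import dataclass, fields


@dataclass
class Transaction:
    # Simplified stand-in for the real model; these three fields are assumed.
    date: str
    description: str
    amount: float
    confidence_score: float = 1.0  # default, never overwritten


def dicts_to_transactions(rows: list) -> list:
    # Hypothetical conversion: keep only keys that are Transaction fields.
    names = {f.name for f in fields(Transaction)}
    return [
        Transaction(**{k: v for k, v in row.items() if k in names})
        for row in rows
    ]


# Raw row dicts carry no confidence signal, so the default always applies.
rows = [
    {"date": "2024-01-05", "description": "Card payment", "amount": -12.5},
    {"date": "2024-01-06", "description": "Salary", "amount": 2500.0},
]
transactions = dicts_to_transactions(rows)
scores = {tx.confidence_score for tx in transactions}
print(scores)  # {1.0}
```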
What needs to change
Define what factors should contribute to Transaction.confidence_score and wire them in the extraction pipeline. Candidate signals already present in the pipeline:
| Signal | Source | Suggested weight |
| --- | --- | --- |
| Row has missing balance | RowPostProcessor | -0.2 |
| Date parse used fallback | DateParserService | -0.1 |
| Row flagged by filter_invalid_dates but kept | TransactionFilterService | -0.3 |
| Row from low-density page | ContentDensityService | -0.1 |
| Multi-line merge applied | RowMergerService | -0.05 |
The score should be clamped to [0.0, 1.0].
The weight constants should live in a ScoringConfig dataclass (injectable, following the pattern established for ScoringConfig in PR #36) rather than being hardcoded in the extractor.
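One possible shape for the scoring step is sketched below. The default weights are the suggested values from the table; the field names, the `compute_confidence` helper, and the idea of passing signals as plain strings are all assumptions for illustration, not an existing API.

```python
from dataclasses import dataclass, fields
from typing import Iterable


@dataclass(frozen=True)
class ScoringConfig:
    """Penalty weights; defaults mirror the 'Suggested weight' column."""
    missing_balance: float = -0.2
    date_fallback: float = -0.1
    borderline_date: float = -0.3
    low_density_page: float = -0.1
    multiline_merge: float = -0.05


def compute_confidence(
    signals: Iterable[str],
    config: ScoringConfig = ScoringConfig(),
) -> float:
    """Start at 1.0, add the penalty for each signal, clamp to [0.0, 1.0]."""
    known = {f.name for f in fields(config)}
    score = 1.0
    for name in signals:
        if name not in known:
            raise ValueError(f"unknown signal: {name}")
        score += getattr(config, name)
    return max(0.0, min(1.0, score))


# No signals: the score stays at the default.
print(compute_confidence([]))  # 1.0

# Two penalties applied with the suggested weights (about 0.5).
print(compute_confidence(["missing_balance", "borderline_date"]))

# Injected, harsher weights push the raw sum below zero; the clamp holds it at 0.0.
harsh = ScoringConfig(missing_balance=-0.9, borderline_date=-0.9)
print(compute_confidence(["missing_balance", "borderline_date"], harsh))  # 0.0
```

Because the penalties are summed as floats, tests for intermediate values are safer with a tolerance than with exact equality.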
What will change
| Area | Change |
| --- | --- |
| New: domain/models/scoring_config.py | ScoringConfig with penalty weight constants |
| extraction/pdf_extractor.py | Compute and set tx.confidence_score based on extraction signals |
| extraction/row_post_processor.py | Emit a signal when balance is missing |
| services/transaction_filter.py | Optionally emit signal for borderline date rows |
| PDFTableExtractor / PDFProcessingOrchestrator | Accept optional ScoringConfig |
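The last row could look roughly like the sketch below. The class names come from this issue, but the constructor signatures are hypothetical, and `ScoringConfig` is reduced to a single assumed field to keep the example short.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScoringConfig:
    # Stand-in with one weight; the proposed class would carry all of them.
    missing_balance: float = -0.2


class PDFTableExtractor:
    def __init__(self, scoring_config: Optional[ScoringConfig] = None) -> None:
        # Hypothetical signature: fall back to defaults when nothing is injected.
        self.scoring_config = (
            scoring_config if scoring_config is not None else ScoringConfig()
        )


class PDFProcessingOrchestrator:
    def __init__(self, scoring_config: Optional[ScoringConfig] = None) -> None:
        # Pass the config through so callers set it in one place.
        self.extractor = PDFTableExtractor(scoring_config=scoring_config)


# Existing callers keep working without arguments.
default_orchestrator = PDFProcessingOrchestrator()

# A caller that wants a harsher penalty injects its own weights.
strict_orchestrator = PDFProcessingOrchestrator(ScoringConfig(missing_balance=-0.5))
```

Keeping the parameter optional means existing call sites do not need to change when the config is introduced.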
Risk
- Medium. The field is currently inert, so there is no regression risk on existing behaviour. However, introducing scoring logic in the extraction layer requires that layer to know about warning events, which ties into CURR-02 (structured ExtractionWarning). Recommend implementing CURR-02 first.
- Weight constants are subjective. They should be documented and tested, not just hardcoded.
Acceptance criteria
- confidence_score is set to a value other than 1.0 on affected rows
- ScoringConfig is injectable with documented weight constants
- confidence_score is clamped to [0.0, 1.0]