[DEV] Refactor DatabaseConfig usages in reconcile runtime and tests#2503
[DEV] Refactor DatabaseConfig usages in reconcile runtime and tests#2503BesikiML wants to merge 18 commits into
Conversation
…onnection configs Replace DatabaseConfig usage across reconcile runtime paths with SourceConnectionConfig and TargetConnectionConfig, update service wiring and integration tests, and keep ReconCapture backward-compatible for legacy constructor callers during migration.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2503 +/- ##
==========================================
- Coverage 69.10% 69.10% -0.01%
==========================================
Files 105 105
Lines 9482 9471 -11
Branches 1050 1050
==========================================
- Hits 6553 6545 -8
+ Misses 2735 2732 -3
Partials 194 194 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
✅ 169/169 passed, 8 flaky, 2 skipped, 1h19m5s total Flaky tests:
Running from acceptance #4923 |
| @@ -10,7 +10,7 @@ resolution-markers = [ | |||
| ] | |||
|
|
|||
| [options] | |||
| exclude-newer = "2026-06-04T14:11:46.713298Z" | |||
| exclude-newer = "0001-01-01T00:00:00Z" # This has no effect and is included for backwards compatibility when using relative exclude-newer values. | |||
There was a problem hiding this comment.
this needs to be reverted. please use the Makefile commands e.g. make dev to install dependencies otherwise you will get different dependencies
m-abulazm
left a comment
There was a problem hiding this comment.
Please use the Makefile commands e.g. make dev to install dependencies otherwise you will get different dependencies
Align the lockfile with the current main branch state to keep dependency resolution consistent across local and CI runs.
Provide uc_connection_name for the snowflake SourceConnectionConfig in the aggregate mismatch test to satisfy non-databricks source validation.
m-abulazm
left a comment
There was a problem hiding this comment.
please remove all DatabaseConfig usage from the code
Drop legacy DatabaseConfig constructor support in ReconCapture and require SourceConnectionConfig/TargetConnectionConfig explicitly, aligning runtime reconcile code with the config migration.
| full_target_table = ( | ||
| f"{self.database_config.target_catalog}.{self.database_config.target_schema}.{table_conf.target_name}" | ||
| f"{self.source_connection.schema}.{table_conf.source_name}" | ||
| if self.source_connection.catalog is None |
There was a problem hiding this comment.
catalog is not optional. the goal of this refactoring is to simplify the code with the new type classes. please reflect that
Align ReconCapture with SourceConnectionConfig by removing legacy optional catalog handling and always using catalog.schema.table for source identifiers.
Temporary validation-only commit for updated routine search-path text; intended for follow-up PR, not this branch's final scope.
Format a multiline assignment into Black-preferred single-line style so formatting checks pass with no behavior changes.
m-abulazm
left a comment
There was a problem hiding this comment.
remove the class itself as well since it is not used anymore
| assert joined_df is not None | ||
| assert src_data is not None | ||
| assert tgt_data is not None |
There was a problem hiding this comment.
please do not use assert in production code. should revert or what is the goal here?
There was a problem hiding this comment.
Agreed, those asserts were a mypy workaround, not intentional runtime checks. mypy couldn’t narrow joined_df / src_data / tgt_data after the try/except when they were initialized as None.
|
|
||
| return DataReconcileOutput( | ||
| mismatch=mismatch, | ||
| mismatch=cast(Any, mismatch), |
There was a problem hiding this comment.
Same story, that was a mypy hack. mismatch starts as None but DataReconcileOutput.mismatch expects MismatchOutput.
Remove the unused DatabaseConfig shim, replace production asserts with try/except control flow in aggregate reconciliation, and use MismatchOutput() instead of cast(Any, mismatch).
|
@m-abulazm |
Type DataReconcileOutput.mismatch as optional and restore passing through None so reconcile tests match pre-refactor behavior without cast(Any).
Add None checks before accessing mismatch fields so mypy passes after typing DataReconcileOutput.mismatch as optional.
| missing_in_src_count: int = 0 | ||
| missing_in_tgt_count: int = 0 | ||
| mismatch: MismatchOutput = field(default_factory=MismatchOutput) | ||
| mismatch: MismatchOutput | None = None |
There was a problem hiding this comment.
why are you changing this class
There was a problem hiding this comment.
This was not part of the original DatabaseConfig refactor.
In 507f692fa I removed cast(Any, mismatch) from _get_sample_data and temporarily defaulted to MismatchOutput() when sampling was skipped. That broke integration tests that expect actual.mismatch is None when there are no mismatches (e.g. test_reconcile_data_without_mismatches_and_missing).
On main, _get_sample_data already returns mismatch=None in those cases — the default_factory=MismatchOutput on the dataclass field wasn't reflected at runtime. 2a8da1fab restores that behavior and makes the type match:
mismatch: MismatchOutput | None = NoneNoneguard before accessing.mismatch.*inrecon_captureand tests.
| if data_source_exception: | ||
| rule_reconcile_output = DataReconcileOutput(exception=str(data_source_exception)) | ||
| else: | ||
| for rule in src_query_with_rules.rules: |
There was a problem hiding this comment.
This restructure is from your earlier feedback to remove production asserts in aggregate reconciliation.
Before:
try/exceptcaptureddata_source_exception- A single
for ruleloop usedif data_source_exception: ... else: assert joined_df/src_data/tgt_data is not None; reconcile....
After:
- Success path:
read → join → for rule: reconcileinsidetry - Failure path:
for rule: DataReconcileOutput(exception=...)inexcept
No functional change , still one output per rule, with exceptions propagated the same way. The diff is noisy because the loop moved inside thetry/exceptblocks instead of sitting between them.
Restore the original dataclass field and keep None handling at the call site in _get_sample_data, matching main behavior per review feedback.
Resolve reconcile test conflicts by keeping SourceConnectionConfig/ TargetConnectionConfig wiring and adopting ws/run_by_user fixtures from main.
Summary
SourceConnectionConfigandTargetConnectionConfiginstead ofDatabaseConfig.Reconciliation,TriggerReconService, andTriggerReconAggregateServiceto pass source/target connection config objects explicitly.DatabaseConfigusage fromtests/integration/reconcile/test_recon_capture.py.Backward Compatibility
ReconCaptureconstructor compatibility for legacyDatabaseConfig-style callers during the migration period.Validation
uv run python -m pytest tests/integration/reconcile/test_recon_capture.py -q23 skippeduv run python -m pytest tests/integration/reconcile/query_builder/test_execute.py -q1 passed, 15 skippeduv run python -m pytest tests/integration/reconcile/test_aggregates_reconcile.py -q4 passed, 2 skippeduv run python -m pytest tests/integration/reconcile/test_aggregates_recon_capture.py -q1 skippedNotes
uv.lockchanges from local environment sync were intentionally excluded from this PR.Linked issues
Fixed issue : #2417
Tests