Skip to content

[DEV] Refactor DatabaseConfig usages in reconcile runtime and tests#2503

Open
BesikiML wants to merge 18 commits into
mainfrom
2417-dev-refactor-databaseconfig-usages
Open

[DEV] Refactor DatabaseConfig usages in reconcile runtime and tests#2503
BesikiML wants to merge 18 commits into
mainfrom
2417-dev-refactor-databaseconfig-usages

Conversation

@BesikiML

Copy link
Copy Markdown
Contributor

Summary

  • Refactored reconcile runtime internals to use SourceConnectionConfig and TargetConnectionConfig instead of DatabaseConfig.
  • Updated Reconciliation, TriggerReconService, and TriggerReconAggregateService to pass source/target connection config objects explicitly.
  • Updated integration tests in reconcile query-builder and aggregate/capture paths to align with the new config model.
  • Removed remaining DatabaseConfig usage from tests/integration/reconcile/test_recon_capture.py.

Backward Compatibility

  • Preserved ReconCapture constructor compatibility for legacy DatabaseConfig-style callers during the migration period.

Validation

  • uv run python -m pytest tests/integration/reconcile/test_recon_capture.py -q
    • 23 skipped
  • uv run python -m pytest tests/integration/reconcile/query_builder/test_execute.py -q
    • 1 passed, 15 skipped
  • uv run python -m pytest tests/integration/reconcile/test_aggregates_reconcile.py -q
    • 4 passed, 2 skipped
  • uv run python -m pytest tests/integration/reconcile/test_aggregates_recon_capture.py -q
    • 1 skipped

Notes

  • uv.lock changes from local environment sync were intentionally excluded from this PR.

Linked issues

Fixed issue : #2417

Tests

  • manually tested
  • added unit tests
  • added integration tests

…onnection configs

Replace DatabaseConfig usage across reconcile runtime paths with SourceConnectionConfig and TargetConnectionConfig, update service wiring and integration tests, and keep ReconCapture backward-compatible for legacy constructor callers during migration.
@BesikiML BesikiML requested a review from a team as a code owner June 11, 2026 20:38
@BesikiML BesikiML linked an issue Jun 11, 2026 that may be closed by this pull request
@CLAassistant

CLAassistant commented Jun 11, 2026

Copy link
Copy Markdown

CLA assistant check
All committers have signed the CLA.

@codecov

codecov Bot commented Jun 11, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 14.28571% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.10%. Comparing base (7ebd945) to head (f97f624).

Files with missing lines Patch % Lines
...bricks/labs/lakebridge/reconcile/reconciliation.py 0.00% 8 Missing ⚠️
...abricks/labs/lakebridge/reconcile/recon_capture.py 20.00% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2503      +/-   ##
==========================================
- Coverage   69.10%   69.10%   -0.01%     
==========================================
  Files         105      105              
  Lines        9482     9471      -11     
  Branches     1050     1050              
==========================================
- Hits         6553     6545       -8     
+ Misses       2735     2732       -3     
  Partials      194      194              

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown

✅ 169/169 passed, 8 flaky, 2 skipped, 1h19m5s total

Flaky tests:

  • 🤪 test_installs_and_runs_local_bladebridge (12.48s)
  • 🤪 test_installs_and_runs_pypi_bladebridge (22.531s)
  • 🤪 test_transpiles_informatica_to_sparksql (23.335s)
  • 🤪 test_transpiles_informatica_to_sparksql_non_interactive[True] (22.029s)
  • 🤪 test_transpiles_informatica_to_sparksql_non_interactive[False] (4.163s)
  • 🤪 test_transpile_teradata_sql (7.044s)
  • 🤪 test_transpile_teradata_sql_non_interactive[False] (24.456s)
  • 🤪 test_transpile_teradata_sql_non_interactive[True] (24.129s)

Running from acceptance #4923

@BesikiML BesikiML self-assigned this Jun 12, 2026
Comment thread uv.lock Outdated
@@ -10,7 +10,7 @@ resolution-markers = [
]

[options]
exclude-newer = "2026-06-04T14:11:46.713298Z"
exclude-newer = "0001-01-01T00:00:00Z" # This has no effect and is included for backwards compatibility when using relative exclude-newer values.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to be reverted. please use the Makefile commands e.g. make dev to install dependencies otherwise you will get different dependencies

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@m-abulazm m-abulazm left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the Makefile commands e.g. make dev to install dependencies otherwise you will get different dependencies

BesikiML added 2 commits June 16, 2026 08:55
Align the lockfile with the current main branch state to keep dependency resolution consistent across local and CI runs.
Provide uc_connection_name for the snowflake SourceConnectionConfig in the aggregate mismatch test to satisfy non-databricks source validation.
@BesikiML BesikiML requested a review from m-abulazm June 16, 2026 13:52

@m-abulazm m-abulazm left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove all DatabaseConfig usage from the code

Drop legacy DatabaseConfig constructor support in ReconCapture and require SourceConnectionConfig/TargetConnectionConfig explicitly, aligning runtime reconcile code with the config migration.
@BesikiML BesikiML requested a review from m-abulazm June 18, 2026 16:13
full_target_table = (
f"{self.database_config.target_catalog}.{self.database_config.target_schema}.{table_conf.target_name}"
f"{self.source_connection.schema}.{table_conf.source_name}"
if self.source_connection.catalog is None

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

catalog is not optional. the goal of this refactoring is to simplify the code with the new type classes. please reflect that

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Align ReconCapture with SourceConnectionConfig by removing legacy optional catalog handling and always using catalog.schema.table for source identifiers.
@BesikiML BesikiML requested a review from m-abulazm June 22, 2026 12:08
BesikiML and others added 3 commits June 22, 2026 08:24
Temporary validation-only commit for updated routine search-path text; intended for follow-up PR, not this branch's final scope.
Format a multiline assignment into Black-preferred single-line style so formatting checks pass with no behavior changes.

@m-abulazm m-abulazm left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the class itself as well since it is not used anymore

Comment on lines +300 to +302
assert joined_df is not None
assert src_data is not None
assert tgt_data is not None

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please do not use assert in production code. should revert or what is the goal here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, those asserts were a mypy workaround, not intentional runtime checks. mypy couldn’t narrow joined_df / src_data / tgt_data after the try/except when they were initialized as None.


return DataReconcileOutput(
mismatch=mismatch,
mismatch=cast(Any, mismatch),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same story, that was a mypy hack. mismatch starts as None but DataReconcileOutput.mismatch expects MismatchOutput.

Remove the unused DatabaseConfig shim, replace production asserts with try/except control flow in aggregate reconciliation, and use MismatchOutput() instead of cast(Any, mismatch).
@BesikiML

BesikiML commented Jun 23, 2026

Copy link
Copy Markdown
Contributor Author

@m-abulazm
Good catch, agreed. DatabaseConfig and the database_config property was kept temporarily as part of the v2 migration but production code no longer uses them

@BesikiML BesikiML requested a review from m-abulazm June 23, 2026 14:25
BesikiML added 2 commits June 23, 2026 10:43
Type DataReconcileOutput.mismatch as optional and restore passing through None so reconcile tests match pre-refactor behavior without cast(Any).
Add None checks before accessing mismatch fields so mypy passes after typing DataReconcileOutput.mismatch as optional.
missing_in_src_count: int = 0
missing_in_tgt_count: int = 0
mismatch: MismatchOutput = field(default_factory=MismatchOutput)
mismatch: MismatchOutput | None = None

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you changing this class

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was not part of the original DatabaseConfig refactor.
In 507f692fa I removed cast(Any, mismatch) from _get_sample_data and temporarily defaulted to MismatchOutput() when sampling was skipped. That broke integration tests that expect actual.mismatch is None when there are no mismatches (e.g. test_reconcile_data_without_mismatches_and_missing).
On main, _get_sample_data already returns mismatch=None in those cases — the default_factory=MismatchOutput on the dataclass field wasn't reflected at runtime. 2a8da1fab restores that behavior and makes the type match:

  • mismatch: MismatchOutput | None = None
  • None guard before accessing .mismatch.* in recon_capture and tests.

if data_source_exception:
rule_reconcile_output = DataReconcileOutput(exception=str(data_source_exception))
else:
for rule in src_query_with_rules.rules:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this diff looks off

@BesikiML BesikiML Jun 23, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This restructure is from your earlier feedback to remove production asserts in aggregate reconciliation.

Before:

  • try/except captured data_source_exception
  • A single for rule loop used if data_source_exception: ... else: assert joined_df/src_data/tgt_data is not None; reconcile....

After:

  • Success path: read → join → for rule: reconcile inside try
  • Failure path: for rule: DataReconcileOutput(exception=...) in except
    No functional change , still one output per rule, with exceptions propagated the same way. The diff is noisy because the loop moved inside the try/except blocks instead of sitting between them.

Restore the original dataclass field and keep None handling at the call
site in _get_sample_data, matching main behavior per review feedback.
@BesikiML BesikiML requested a review from m-abulazm June 23, 2026 17:06
BesikiML and others added 2 commits June 24, 2026 08:41
Resolve reconcile test conflicts by keeping SourceConnectionConfig/
TargetConnectionConfig wiring and adopting ws/run_by_user fixtures from main.

@m-abulazm m-abulazm left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@BesikiML BesikiML requested a review from bishwajit-db June 26, 2026 12:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[DEV] Refactor DatabaseConfig usages

3 participants