feat(versioning): capture and expose version history for charts, dashboards, and datasets#39603
Draft
mikebridge wants to merge 93 commits into
Draft
feat(versioning): capture and expose version history for charts, dashboards, and datasets#39603mikebridge wants to merge 93 commits into
mikebridge wants to merge 93 commits into
Conversation
mikebridge
pushed a commit
to mikebridge/superset
that referenced
this pull request
Apr 23, 2026
Phase 1 of versioning added ``sqlalchemy-continuum==1.6.0`` to ``requirements/base.txt`` directly, but the pin was missing from ``pyproject.toml``'s ``[project.dependencies]``. CI's ``check-python-deps`` job regenerates the pinned files from the ``.in`` sources via ``scripts/uv-pip-compile.sh``; without the pyproject declaration, regeneration strips the pin out, causing: ModuleNotFoundError: No module named 'sqlalchemy_continuum' …on every Python-based job (test-sqlite, test-postgres, test-mysql, unit-tests, test-postgres-hive, test-postgres-presto, test-load-examples, docker-build) because ``superset/extensions/ __init__.py`` unconditionally imports from it at module load time. Adds ``"sqlalchemy-continuum>=1.6.0, <2.0.0"`` to pyproject and re-runs ``uv-pip-compile.sh`` to sync ``base.txt`` and ``development.txt``. One package regenerates in place; the only other diffs are uv-resolver comment-graph updates (numpy's ``# via`` list) which CI's filter ignores. Fixes CI failures on PR apache#39603. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mikebridge
pushed a commit
to mikebridge/superset
that referenced
this pull request
Apr 24, 2026
Phase 1 of versioning added ``sqlalchemy-continuum==1.6.0`` to ``requirements/base.txt`` directly, but the pin was missing from ``pyproject.toml``'s ``[project.dependencies]``. CI's ``check-python-deps`` job regenerates the pinned files from the ``.in`` sources via ``scripts/uv-pip-compile.sh``; without the pyproject declaration, regeneration strips the pin out, causing: ModuleNotFoundError: No module named 'sqlalchemy_continuum' …on every Python-based job (test-sqlite, test-postgres, test-mysql, unit-tests, test-postgres-hive, test-postgres-presto, test-load-examples, docker-build) because ``superset/extensions/ __init__.py`` unconditionally imports from it at module load time. Adds ``"sqlalchemy-continuum>=1.6.0, <2.0.0"`` to pyproject and re-runs ``uv-pip-compile.sh`` to sync ``base.txt`` and ``development.txt``. One package regenerates in place; the only other diffs are uv-resolver comment-graph updates (numpy's ``# via`` list) which CI's filter ignores. Fixes CI failures on PR apache#39603. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
8774778 to
a1f0ddb
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #39603 +/- ##
==========================================
+ Coverage 64.31% 64.37% +0.05%
==========================================
Files 2652 2680 +28
Lines 144799 146727 +1928
Branches 33415 33852 +437
==========================================
+ Hits 93122 94449 +1327
- Misses 50016 50551 +535
- Partials 1661 1727 +66
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
6 tasks
mikebridge
pushed a commit
to mikebridge/superset
that referenced
this pull request
Apr 27, 2026
Phase 1 of versioning added ``sqlalchemy-continuum==1.6.0`` to ``requirements/base.txt`` directly, but the pin was missing from ``pyproject.toml``'s ``[project.dependencies]``. CI's ``check-python-deps`` job regenerates the pinned files from the ``.in`` sources via ``scripts/uv-pip-compile.sh``; without the pyproject declaration, regeneration strips the pin out, causing: ModuleNotFoundError: No module named 'sqlalchemy_continuum' …on every Python-based job (test-sqlite, test-postgres, test-mysql, unit-tests, test-postgres-hive, test-postgres-presto, test-load-examples, docker-build) because ``superset/extensions/ __init__.py`` unconditionally imports from it at module load time. Adds ``"sqlalchemy-continuum>=1.6.0, <2.0.0"`` to pyproject and re-runs ``uv-pip-compile.sh`` to sync ``base.txt`` and ``development.txt``. One package regenerates in place; the only other diffs are uv-resolver comment-graph updates (numpy's ``# via`` list) which CI's filter ignores. Fixes CI failures on PR apache#39603. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7979999 to
70e21bc
Compare
✅ Deploy Preview for superset-docs-preview ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
mikebridge
pushed a commit
to mikebridge/superset
that referenced
this pull request
Apr 28, 2026
Phase 1 of versioning added ``sqlalchemy-continuum==1.6.0`` to ``requirements/base.txt`` directly, but the pin was missing from ``pyproject.toml``'s ``[project.dependencies]``. CI's ``check-python-deps`` job regenerates the pinned files from the ``.in`` sources via ``scripts/uv-pip-compile.sh``; without the pyproject declaration, regeneration strips the pin out, causing: ModuleNotFoundError: No module named 'sqlalchemy_continuum' …on every Python-based job (test-sqlite, test-postgres, test-mysql, unit-tests, test-postgres-hive, test-postgres-presto, test-load-examples, docker-build) because ``superset/extensions/ __init__.py`` unconditionally imports from it at module load time. Adds ``"sqlalchemy-continuum>=1.6.0, <2.0.0"`` to pyproject and re-runs ``uv-pip-compile.sh`` to sync ``base.txt`` and ``development.txt``. One package regenerates in place; the only other diffs are uv-resolver comment-graph updates (numpy's ``# via`` list) which CI's filter ignores. Fixes CI failures on PR apache#39603. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
db1b08c to
c338d49
Compare
mikebridge
pushed a commit
to mikebridge/superset
that referenced
this pull request
Apr 30, 2026
Phase 1 of versioning added ``sqlalchemy-continuum==1.6.0`` to ``requirements/base.txt`` directly, but the pin was missing from ``pyproject.toml``'s ``[project.dependencies]``. CI's ``check-python-deps`` job regenerates the pinned files from the ``.in`` sources via ``scripts/uv-pip-compile.sh``; without the pyproject declaration, regeneration strips the pin out, causing: ModuleNotFoundError: No module named 'sqlalchemy_continuum' …on every Python-based job (test-sqlite, test-postgres, test-mysql, unit-tests, test-postgres-hive, test-postgres-presto, test-load-examples, docker-build) because ``superset/extensions/ __init__.py`` unconditionally imports from it at module load time. Adds ``"sqlalchemy-continuum>=1.6.0, <2.0.0"`` to pyproject and re-runs ``uv-pip-compile.sh`` to sync ``base.txt`` and ``development.txt``. One package regenerates in place; the only other diffs are uv-resolver comment-graph updates (numpy's ``# via`` list) which CI's filter ignores. Fixes CI failures on PR apache#39603. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mikebridge
pushed a commit
to mikebridge/superset
that referenced
this pull request
May 4, 2026
Phase 1 of versioning added ``sqlalchemy-continuum==1.6.0`` to ``requirements/base.txt`` directly, but the pin was missing from ``pyproject.toml``'s ``[project.dependencies]``. CI's ``check-python-deps`` job regenerates the pinned files from the ``.in`` sources via ``scripts/uv-pip-compile.sh``; without the pyproject declaration, regeneration strips the pin out, causing: ModuleNotFoundError: No module named 'sqlalchemy_continuum' …on every Python-based job (test-sqlite, test-postgres, test-mysql, unit-tests, test-postgres-hive, test-postgres-presto, test-load-examples, docker-build) because ``superset/extensions/ __init__.py`` unconditionally imports from it at module load time. Adds ``"sqlalchemy-continuum>=1.6.0, <2.0.0"`` to pyproject and re-runs ``uv-pip-compile.sh`` to sync ``base.txt`` and ``development.txt``. One package regenerates in place; the only other diffs are uv-resolver comment-graph updates (numpy's ``# via`` list) which CI's filter ignores. Fixes CI failures on PR apache#39603. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
c338d49 to
04c6b8b
Compare
mikebridge
pushed a commit
to mikebridge/superset
that referenced
this pull request
May 5, 2026
Phase 1 of versioning added ``sqlalchemy-continuum==1.6.0`` to ``requirements/base.txt`` directly, but the pin was missing from ``pyproject.toml``'s ``[project.dependencies]``. CI's ``check-python-deps`` job regenerates the pinned files from the ``.in`` sources via ``scripts/uv-pip-compile.sh``; without the pyproject declaration, regeneration strips the pin out, causing: ModuleNotFoundError: No module named 'sqlalchemy_continuum' …on every Python-based job (test-sqlite, test-postgres, test-mysql, unit-tests, test-postgres-hive, test-postgres-presto, test-load-examples, docker-build) because ``superset/extensions/ __init__.py`` unconditionally imports from it at module load time. Adds ``"sqlalchemy-continuum>=1.6.0, <2.0.0"`` to pyproject and re-runs ``uv-pip-compile.sh`` to sync ``base.txt`` and ``development.txt``. One package regenerates in place; the only other diffs are uv-resolver comment-graph updates (numpy's ``# via`` list) which CI's filter ignores. Fixes CI failures on PR apache#39603. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
d632e5e to
e4f548e
Compare
5549d67 to
f031a2c
Compare
f031a2c to
c2b6db7
Compare
Three operational signals on the nightly prune, matching the activity-view orchestrator's emission shape so a single Grafana filter ``superset.versioning.*`` catches both sides of the feature: * ``superset.versioning.retention.pruned_transactions`` — gauge of rows actually pruned this run. ``0`` is a meaningful signal (means retention is enabled but nothing aged out); a sustained ``0`` over multiple runs is the signature of a misconfigured beat schedule that previously was log-only. * ``superset.versioning.retention.skipped`` — counter; fires for the ``retention_days <= 0`` (operator disabled) and the ``no versioned classes resolved`` (init-order regression) branches. Lets an alert distinguish "disabled" from "running and producing zero" without log scraping. * ``superset.versioning.retention.retried`` — counter; fires on every serialization conflict that triggered an inline retry. A sudden rise correlates with concurrent-write pressure and is the leading indicator for the ``OperationalError`` give-up path that re-raises after ``_MAX_RETRY_ATTEMPTS``. The activity-view side already uses ``stats_logger_manager`` for the per-phase ``superset.activity_view.*`` timings; the retention side was log-only. Bringing it up to the same standard closes the v4 CD review's "observability deficit on the nightly job" concern. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Operators who redefine ``CeleryConfig`` in ``superset_config.py`` — instead of subclassing or merging the default — silently lose the ``version_history.prune_old_versions`` entry that registers the nightly prune. The capture path keeps writing shadow rows; the prune never runs; disk grows until paged. Add a startup check inside ``init_versioning`` that inspects the resolved ``CELERY_CONFIG.beat_schedule`` and emits a WARNING when the entry is absent. The misconfiguration is now visible in the deploy log instead of waiting for disk pressure to surface it at 03:00 some weeks later. Same shape as the existing ``CORS_OPTIONS["expose_headers"]`` operator note in UPDATING.md — a known-misconfiguration mode the codebase catches at startup so the team doesn't relearn it in production. From the v4 continuous-delivery review (Farley + Humble): "hidden coordination" anti-pattern — the change-set assumes the operator will do something correct (merge their override with the new default) that the code does not verify at runtime. This commit verifies it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The migration's ``downgrade()`` is correct against an empty schema — verified throughout development — but the realistic operational scenario is: we deployed, accumulated versioning rows from real saves for hours or days, then need to roll back. That code path was unexercised, and the CD review v4 flagged that absence as load- bearing: an operator hitting the rollback in anger is a worse place to find a downgrade bug than CI. New test file ``versioning_round_trip__tests.py`` matches the existing ``composite_pk_round_trip__tests.py`` pattern (in-memory SQLite + Alembic ``MigrationContext``) and walks three scenarios: * **Populated round-trip** — upgrade both migrations, insert rows into all 8 versioning tables (live + closed shadow rows, both parent and child, with field-level ``version_changes`` records and M2M shadow rows), downgrade both migrations, assert every table is gone, then re-upgrade and assert the schema shape matches the first upgrade byte-for-byte (idempotency under round-trip). * **Empty downgrade** — sanity belt-and-braces that downgrade run immediately after upgrade (no rows) is also clean. Catches the case where the population step somehow influenced the drop path. * **Indexes-downgrade idempotency** — runs ``8f3a1b2c4d5e.downgrade`` twice in a row. The second call must be a no-op (the migration uses ``if_exists=True`` on every drop) so an operator who interrupts and re-runs doesn't hit a missing-index error. The MEDIUMTEXT cross-backend dimension is delegated to the CI matrix (SQLite collapses every text column to TEXT regardless of declared type); the shape pinned here is reversibility under load. From the v4 continuous-delivery review's "untested operational rollback" finding. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…d code) Four small follow-ups from the v5 review cycle, all on the operational infrastructure that landed in v4: * **`_warn_if_retention_beat_missing` runs before the kill-switch early-return** (clean-code C1 / tidy-first W1 / python W3). The retention task lives in its own beat entry and runs against existing shadow data regardless of capture; the warn-log was previously bypassed on exactly the deployment path (kill-switch flipped in anger) where the operator is most likely to also have a hand-rolled ``CeleryConfig`` with a missing prune entry. Move the call above the early-return so both misconfigurations surface at the same restart. * **`_warn_if_retention_beat_missing` handles dict-form and None `CELERY_CONFIG`** (python W2 / sqlalchemy W1 / CD-2). The default shape is ``type[CeleryConfig] | None``, but Celery itself accepts a dict via ``config_from_object``, and ``None`` is the documented "disable Celery entirely" path. The prior ``getattr(_, "beat_schedule", None) if _ else None`` fell through to the WARNING in both cases, emitting a false positive for operators who chose either shape on purpose. Discriminate by ``isinstance(dict)`` and short-circuit on ``None``. Also extract the retention task name to a class-level ``_RETENTION_TASK_NAME`` constant so the previous 90-char line shortens and the literal is no longer duplicated against the default in ``config.py``. * **`ENABLE_VERSIONING_CAPTURE` env-var-to-bool uses ``utils.parse_boolean_string``** (python W1). The hand-rolled ``.lower() == "true"`` only matched the literal ``"true"``; operators setting ``1``, ``yes``, ``on``, ``True`` (no .lower call) silently got ``False`` (capture on — the safe direction for a default-on kill-switch, but surprising and inconsistent with the rest of ``config.py`` which uses the helper). * **`from __future__ import annotations` in `versioning/factory.py`** (python W4). Every other versioning module has it; this was the lone outlier. PEP 604 union syntax in a local-variable annotation worked without it on Python 3.10+, so this is consistency, not correctness. * **Drop the dead `last_exc` / unreachable `RuntimeError`** in ``_prune_old_versions_impl`` (tidy-first dead-code finding). The retry loop always returns or re-raises; the post-loop fallback was defensive code for a control-flow path the loop's exit condition cannot reach. Replaced with a short ``AssertionError`` that mypy needs for type-narrowing and that documents the invariant (post-loop = "the loop's exit condition was changed incorrectly"). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…etrics
Three v4 operational features shipped without direct test coverage; the
v5 review flagged this as load-bearing for the next refactor pass. Add
ten focused unit tests:
**`tests/unit_tests/initialization_test.py::TestInitVersioning`** — six
tests covering the kill-switch flow and the four ``CELERY_CONFIG``
shapes the warn-log helper now discriminates:
* ``test_kill_switch_off_skips_listener_registration`` — pins the
contract that ``ENABLE_VERSIONING_CAPTURE=False`` short-circuits
``init_versioning`` before either listener registers.
* ``test_warn_when_celery_beat_schedule_missing_retention_entry`` —
class-shaped config with the entry absent fires WARNING.
* ``test_no_warn_when_celery_beat_schedule_includes_retention_entry``
— class with the entry present is silent.
* ``test_no_warn_when_celery_config_is_none`` — Celery-disabled
deployment doesn't see the false-positive that motivated the v5
``isinstance / None`` guard.
* ``test_dict_form_celery_config_with_entry_does_not_warn`` — Celery's
documented dict-shape works.
* ``test_dict_form_celery_config_without_entry_warns`` — and the
dict-shape symmetry holds when the entry is missing.
**`tests/unit_tests/tasks/test_version_history_retention.py`** — four
tests pinning the statsd-emission contract on every branch:
* ``test_retention_disabled_emits_skipped_metric`` — ``retention_days=0``
fires ``.skipped``.
* ``test_no_versioned_classes_resolved_emits_skipped_metric`` — the
init-order-regression branch also fires ``.skipped`` (same metric on
purpose; dashboard alert is "no work happening", WARNING log carries
the why).
* ``test_serialization_failure_then_success_increments_retried_once``
— one ``OperationalError`` on attempt 1 fires ``.retried`` exactly
once, succeeds on attempt 2, records ``retried=1`` in the stats
dict, emits the ``.pruned_transactions`` gauge.
* ``test_all_attempts_fail_reraises_after_max_retries`` — exhausted
retries fire ``.retried`` exactly ``_MAX_RETRY_ATTEMPTS`` times and
re-raise so the outer Celery wrapper catches.
Total: 10 new tests, all passing in <1s wall-clock. Closes the v5 CD
+ clean-code finding ("the operational instrumentation that just
shipped isn't itself pipeline-gated by tests"). A future refactor that
restructures ``init_versioning`` or renames a metric now has to
deliberately update these tests rather than silently breaking the
contract.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ed models The versioning bootstrap probed each model with version_class() and swallowed any failure with a bare `except Exception: pass`. A model that failed to wire was silently dropped from VERSIONED_MODELS while the listeners still registered — change capture would degrade with no log, metric, or error, making the failure invisible to operators. Keep degraded-mode boot (don't fail startup), but surface the failure at WARNING with the model name and traceback so a broken wiring is visible in the deploy log. Surfaced by a Codex amin-review pass (M1). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
VERSIONED_MODELS is module-level state; test fixtures that initialize multiple Superset apps per process appended duplicate entries on each re-init. Guard the append. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
op.drop_index(if_exists=True) emits DROP INDEX IF EXISTS, which stock MySQL 5.7/8.x rejects (MariaDB-only grammar) — the downgrade raised a syntax error on the first table. Probe the inspector for existence instead, which keeps the partial-application robustness on every dialect. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The pre-migration graceful-degradation sites caught OperationalError only, but a missing relation raises ProgrammingError (UndefinedTable) on PostgreSQL — and the failed statement aborts the enclosing transaction there, so the user's save failed anyway despite the catch. Catch both exception classes and run each probe under a connection-level SAVEPOINT so a failure can't poison the outer transaction. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AuditMixinNullable stamps created_on/changed_on with naive datetime.now(); the restore stamp used datetime.utcnow(), skewing changed_on ordering on non-UTC servers (and utcnow is deprecated as of Python 3.12). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The natural-key upsert setattr'd every incoming property, so a payload 'id' on a name-matched column rewrote a live primary key, and a renamed column still carrying its old id INSERTed with a live PK while the old row's DELETE was pending in the same flush — INSERTs flush before DELETEs, colliding on the PK/UNIQUE constraints. Strip id/table_id in both branches; regression test included. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tion With override_columns, RefreshDatasetCommand commits its own transaction between the update and the new_version reads, so the response attributed the refresh's version to the user's update. Capture the identifiers before the refresh; re-read only the ETag afterwards (it must reflect the current live version). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ion_id) The dataset child-diff path (shadow_rows_valid_at + the prior-tx probe in shadow_queries.py) filters table_columns_version / sql_metrics_version by parent table_id plus a transaction-range bound. The base migration only indexed transaction_id / end_transaction_id / operation_type, and the live-row index leads with id — nothing served the table_id access pattern, so those queries fell to a seq scan as version history grew. Add a plain composite (table_id, transaction_id) on both child shadow tables (serves both queries on every dialect, no partial-index split). Folded into the existing shadow-index migration; round-trip test asserts the index is present on the child shadows and absent on the parents. Surfaced by a Codex sqlalchemy-review pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
make_versioned() runs at import of superset.extensions, so ENABLE_VERSIONING_CAPTURE=False skipped only the custom baseline/change-record listeners while Continuum's own mapper/session/engine listeners kept writing shadow and version_transaction rows on every save — contradicting the documented operator contract ('they just stop accumulating new rows'). Verified empirically: a chart save with the flag off grew slices_version and version_transaction.
Detach Continuum's write listeners in the flag-off branch. Deliberately a targeted subset of sqlalchemy_continuum.remove_versioning(): that helper also calls manager.reset(), which clears version_class_map — version_class() would then silently return the live model class and break the read-only /versions/ endpoints the flag promises to keep working. Verified both directions: flag off → no shadow/tx growth, reads intact; flag on → capture unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The marshmallow schemas in versioning/schemas.py were dead code drifting from the actual responses: the real rows carry version_uuid (absent from the schema), operation_type documented a 'restore' value that is never emitted, and the raw dicts went through Flask's jsonify — rendering issued_at as an RFC-1123 http-date instead of ISO-8601 and leaving version_uuid a UUID instance in the list but a string in the snapshot. Dump list rows and the snapshot's _version block through VersionListItemSchema (ISO datetimes, string UUIDs, single source of shape), add the missing version_uuid field, fix the operation_type description, register the schema in all three APIs' openapi_spec_component_schemas, and $ref it from the endpoint docstrings instead of bare 'type: object'. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
factory.py, api_helpers.py and UPDATING.md referenced superset.versioning.activity and /activity/ endpoints, which live on the follow-up branch (sc-107283) and don't exist in this change. For an apache/superset reviewer these point at nothing. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Three cleanups to the unmerged migration, mirrored in changes/table.py: - Drop the parent shadow tables' audit columns (created_on/changed_on/created_by_fk/changed_by_fk, plus slices' last_saved_at/last_saved_by_fk): every one is in the models' __versioned__ exclude lists, so Continuum's runtime version Table omits them and nothing ever writes them — permanently-NULL drift that autogenerate would flag. Verified against a live app: no runtime shadow table carries any of these. - Widen version_changes.sequence SmallInteger → Integer: per-entity sequence is assigned by unbounded enumerate(); a pathological diff could overflow 32767. - Drop ix_version_changes_transaction_id: the UNIQUE(transaction_id, ...) constraint's backing index already serves transaction_id-prefix lookups. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The adhoc_filter subject was derived from chart id + db name — stable across runs, so on a persistent test DB only the FIRST run added a new natural key; every later run re-appended an existing key and the keyed differ correctly emitted nothing, failing the assertion. Use a per-run uuid4 suffix so each run adds a genuinely new filter key. (Belongs upstream on sc-103156 — the file rides this branch until the chain merges.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Version-history UI feedback (PR apache#40988): system-managed writes polluted the activity timeline with phantom rows the client had to suppress. - Exclude the perm-string class (perm / schema_perm / catalog_perm) from Slice and SqlaTable versioning: bulk permission maintenance rewrites these across many entities, producing phantom transactions (10 'Chart updated' rows for one user save). They're derived security state, not user content — excluding them means Continuum sees no versioned change, so the phantom transactions disappear entirely (records, version rows, and shadow churn). Restore also can no longer resurrect stale permission strings. Shadow DDL columns dropped from the unmerged migration. - Add shared_label_colors to DASHBOARD_JSON_METADATA_AUDIT_KEYS: the DAO rewrites it when a dashboard is merely viewed, producing phantom 'Properties updated' records. The view-time write itself is a separate round-trip-asymmetry issue (cf. apache#39706); this stops the change-record noise regardless. Lets the frontend revert the NOISE_PATHS workaround for these paths. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Version-history UI feedback (PR apache#40988): a single dataset save produced two Continuum transactions stamped the same second (update + refresh each committed), rendered as two 'Dataset updated' rows for one user action. Wrap UpdateDatasetCommand + RefreshDatasetCommand in one @transaction at the API layer — the decorator nests via g.in_transaction, so the commands' own decorators pass through and the wrapper commits once. The pass-through also skips the commands' on_error conversion, so it's replicated inline. This simplifies the earlier attribution fix: with a single transaction the post-commit version reads and the ETag are correct by construction. Intended semantic change: a failed refresh now rolls back the whole save instead of leaving the update applied with the refresh lost. If the reporter's repro involved two separate HTTP requests rather than the override_columns path, that half is client-side; the regression test pins the backend path to one transaction. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Version-history UI feedback (PR apache#40988): action_kind='restore' carried no pointer to WHICH version was restored, so 'Restored to X from [date]' couldn't be rendered from API data. Implements the __meta__ headline convention: the restore command sets ACTION_META_KEY on session.info alongside ACTION_KIND_KEY; the change-record listener pops it on its first firing for the transaction and prepends a synthetic record (kind='__meta__', path=['__meta__','restore'], to_value={version_uuid, version_number}) at sequence 0. Same lifecycle as action_kind: popped on use, cleaned up on commit/rollback. The convention generalizes to import/clone headlines later. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e row uuid is identity, not content. A snapshot can carry a DIFFERENT uuid for the same integer id — id reuse after a hard delete (SQLite and MySQL reuse max(id)+1) while retention still holds the old entity's shadow rows. Continuum's Reverter applied it, rewriting the live row's uuid: every uuid-routed endpoint 404'd immediately after the restore. Surfaced while converting the AV-015 xfail on the activity-view branch — the strict xfail had been 'passing' on this 404 rather than its stated reason. Preserve the pre-restore uuid (with a WARNING when a foreign uuid is detected); regression test plants a foreign uuid on the oldest shadow row and proves the round-trip survives. Verified the test fails with the fix removed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…g test The schema-wiring fix changed issued_at from Flask's RFC-1123 http-date to ISO-8601 (intended); this test still parsed the old format with email.utils.parsedate_to_datetime and failed on the new shape. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reverts the single-transaction dataset save (one logical save = one Continuum tx). Review + instrumentation uncovered two fatal interactions: 1. The listener's tx-dedup guard silently dropped the refresh's column change records (the refresh's validate() autoflushes the update; the commit flush's records were then skipped as already-processed). 2. DBEventLogger writes request logs through the SHARED scoped session and commits/rolls back mid-request (superset/utils/log.py): with the save held uncommitted across a logged sub-action, a transient logger failure ROLLED BACK the user's save (observed: 'database is locked' on SQLite → silent data loss with a 200), and on Postgres/MySQL the logger's commit would persist half-done state — breaking the atomicity the merge promised. Until the event logger gets its own session, per-command commit boundaries are the only shape with honest failure modes. The two-transaction consequence for override_columns saves is documented at the call site; the version-history UI's adjacent-merge handling stands for this path. Keeps the attribution fix (capture identifiers before the refresh; ETag re-read after) and the DatasetRefreshFailedError 422 mapping. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Multi-lens review of the headline work (PR apache#40988 feedback): - build_action_headline() is the single owner of the record shape; the restore command no longer hand-rolls the dict (three-module string contract). - operation='announce' (new vocabulary member): a headline announces an action, it doesn't edit a field. The verb no longer rides in path — path is pure navigation per the ChangeRecord contract; renderers join kind='__meta__' with the transaction's action_kind. - The injector peeks instead of popping when the buffer is empty, and runs after the child-record append: a headline-only buffer no longer defeats the multi-flush short-circuit (which would have silently dropped a later flush's records once import/clone headlines arrive). - _BUFFER_KEY is cleared on the processed-tx early return and popped in both after_commit/after_rollback cleanups — buffered records can no longer leak into the next transaction under the wrong tx id. - Module docstring re-synced with the session keys it manages. - Test hygiene: the perm-rewrite probe restores the perm trio in a finally (security-routing state on a persistent DB), and the foreign-uuid test un-plants the shadow row it corrupts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Move the time-based version-history retention work out of this branch so it ships as its own stacked PR (sc-111099). Removed here: - superset/tasks/version_history_retention.py (the prune_old_versions task) - config: SUPERSET_VERSION_HISTORY_RETENTION_DAYS + the version_history.prune_old_versions CELERYBEAT_SCHEDULE entry - celery_app.py task import - initialization: _warn_if_retention_beat_missing beat-schedule check + call - retention unit test + the TestDashboardVersionRetention integration class - UPDATING.md retention config row Doc comments in versioning/queries.py and the init test that referenced the retention task are softened to point at sc-111099. Net cumulative diff for this branch no longer contains the retention cleanup job. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
apache/master advances continuously; rebasing the versioning chain onto a master commit leaves the chain's root (2bee73611e32) pointing at whatever was master's head at rebase time, which forks alembic into multiple heads once master lands its next migration (CI runs on the PR-merge commit, so the fork surfaces there even when the branch alone is single-head). Point the root at master's current head (78a40c08b4be) so the merge stays linear. Re-run this re-point whenever master advances with a new migration before merge. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The rebase onto current master brought ruff 0.9.7 + a stricter mypy that flagged previously-clean base code: - drop 4 now-unused `type: ignore[attr-defined]` comments (queries.py, changes/state.py, versioning_round_trip__tests.py) - test_pin_audit_columns: the dynamic `declarative_base()` local can't be used as a typed base class — ignore [valid-type, misc] on the class, drop the now-unused attr-defined ignore - initialization_test: type the `_initializer(config)` param as dict[str, Any] - ruff-format reflow (queries.py, test_diff.py) ruff check + format clean locally (0.9.7). mypy not runnable in this env; fixes target the exact hook-reported errors. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two CI integration failures (surfaced once the alembic head-fork was resolved and the test jobs actually ran): - VersioningFlaskPlugin.transaction_args recorded whatever get_user_id() returned, guarding only `is None`. A mocked `g` (master's test_query_context_update_requires_chart_access, apache#40648) yields a Mock g.user.id, which then hit the integer version_transaction.user_id column and blew up the flush with a SQL bind error. Guard on isinstance(int) — production-safe (user_id is always int/None there), fixes the test. - test_list_versions_denies_non_owner (chart/dashboard/dataset) asserted Alpha -> 403 on a "row-level ownership" theory the endpoint never implemented: it gates via raise_for_access, and Alpha has all_datasource_access + can_write (FR-013), so 200 is correct. Renamed to _allows_can_write_non_owner and assert 200; model-level denial stays covered by the Gamma tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
7dd245b to
a655e0c
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
SUMMARY
Adds backend plumbing to capture a version history for every save of a chart, dashboard, or dataset, and to expose that history via three new REST endpoints per entity (list, get, restore). No frontend in this PR. Second PR in the Versioning epic (sc-103156), depending on #39859 (composite-PK reshape on M2M association tables — sc-105349) and orthogonal to #39286 (sc-103157 soft-delete). See SIP-210 / issue #39492 for full design rationale.
🚧🚧🚧 This is still a draft/spike, not ready for final review. Branch contains two
temp(*)commits (demo UI dropdowns + French i18n; URL-param stripping on restore navigation) that will revert before merge. 🚧🚧🚧Stacked PRs — merge order. This PR is the base of the versioning stack and must merge first. Two follow-up PRs build on it and merge after (either order relative to each other):
sc-111099-version-history-retention. The time-based Celery-beat cleanup job, extracted from this PR so it ships/reviews independently. Hard-depends on the schema introduced here.sc-107283-versioning-activity-view. Reads this PR'sversion_changesdata; adds no migrations.Upstream dependency unchanged: #39859 (composite-PK reshape, sc-105349) → this PR → {retention, activity view}.
What changed:
Continuum wiring. Adds
sqlalchemy-continuumas a base dependency. Wired insuperset/extensions/__init__.pywith thevaliditystrategy; a customVersionTransactionFactoryrenames the transaction table toversion_transaction(the wordtransactionis reserved in several dialects) and aVersioningFlaskPluginsupplies the acting user viaget_user_id()(not Flask-Login'scurrent_user) so CLI / Celery / JWT-auth API saves all attribute correctly.Six shadow tables, all Continuum-native:
dashboards_version,slices_version,tables_versiontable_columns_version,sql_metrics_versiondashboard_slices_versionPlus one
version_transactiontable (the per-flush "who/when/where" envelope) and oneversion_changestable (structured diff records, FK toversion_transactionwithON DELETE CASCADE).Three endpoints per entity type (
/chart,/dashboard,/dataset):GET /api/v1/<resource>/<uuid>/versions/— list history (version_number,version_uuid,issued_at,changed_by)GET /api/v1/<resource>/<uuid>/versions/<version_uuid>/— one snapshot (scalar fields; pluscolumns/metricsfor datasets,slicesfor dashboards)POST /api/v1/<resource>/<uuid>/versions/<version_uuid>/restore— restore entity to that version<version_uuid>is a deterministic UUIDv5 (fixed namespace, derived from the entity UUID + Continuum transaction id). Stable across replicas and retention pruning — the same transaction always produces the same version uuid, so API consumers can cache references safely.ETag headers (
ETag: W/"<version_uuid>") on all three GET endpoints + the live entity GET. Foundation for optimistic-locking enforcement on writes (Phase 2); not enforced in this PR.Restore uses Continuum's native
Reverterwrapped in asingle_flush_scopecontext manager (suppresses autoflush inside the block, emits one trailing flush). The single-revert / single-flush shape was the spike outcome — earlier attempts at split-revert and JSON-snapshot tables were abandoned (seespike-continuum-restore.mdand the revised ADR-004 in the spec folder).Baseline capture. First save under versioning of an entity that pre-existed the migration inserts a synthetic
operation_type=0row capturing the pre-edit state, attributed to the entity's existingchanged_on/changed_by_fk. Listener runs before Continuum's ownbefore_flushso the baselinetransaction_idis lower than the edit's (correct ordering).No-op suppression. A
SkipUnmodifiedPluginmarks Continuum Operationsprocessed=Truewhen post-flush column values are content-equal to the previous shadow row — including JSON-aware comparison forDashboard.json_metadatathat strips frontend-stamped audit sub-keys (map_label_colors,chart_configuration, …) so saves that only re-stamp those don't pollute history.Force-parent-dirty on child changes. A
before_flushlistener flags the versioned parent (SqlaTable) as dirty when only its versioned children (TableColumn/SqlMetric) changed, so child-only edits surface in the parent's version dropdown.Structured change records. Every save writes per-field diff records to
version_changeskeyed to the sametransaction_id. Records carrykind/path/from_value/to_value— backbone for the Phase-2 UI's "Added column X" rendering, captured in V1 so the data is available from day one without a backfill.Retention (the time-based Celery-beat prune) has been extracted to the stacked follow-up PR (sc-111099) so it ships and is reviewed independently. This PR captures history; the cleanup job that ages shadow rows out lands in the retention PR.
ON DELETE CASCADEonversion_changes.transaction_id(added in this PR) keeps diffs in sync once the prune runs.Composite PK reshape on M2M associations (sc-105349, PR refactor(db): composite PK on M2M association tables (sc-105349) #39859 — required for Continuum's M2M tracker to populate
dashboard_slices_versioncorrectly). The PRs are intended to merge in order refactor(db): composite PK on M2M association tables (sc-105349) #39859 → this one; the migration is included on this branch because the rebased history depends on it.Authorisation. Version endpoints reuse the resource's existing
can_writepermission. No new FAB permissions. Row-level access enforced viasecurity_manager.raise_for_ownership(entity)in the restore command.What is NOT versioned in v1 (see
specs/sc-103156-entity-versioning/future-work.md):position_jsonis versioned as an opaque blob (restored wholesale on dashboard restore); finer-grained layout versioning is Phase 2.Coordination with #39286 (sc-103157 soft-delete) — orthogonal in design; merge order can go either way. When sc-103157 merges, one small change hooks
deleted_atintofind_active_by_uuid()and the versioned models' Continuumexcludelists. Tracked as T043 in the spec.BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A — backend-only. No UI is wired to the new endpoints yet. The
temp(*)commits add non-final demo dropdowns and i18n strings for manual testing only; they will revert before merge.TESTING INSTRUCTIONS
Expect an ordered array with
version_number,version_uuid,issued_at,changed_by. Response will include anETag: W/"<version_uuid>"header for the most recent version.Expect 200 + ETag header. A second request with
If-None-Match: "<that-etag>"returns 304.Expect 200.
GET /api/v1/chart/$CHART_UUIDshould now reflect the restored state, and a new version row (the restore itself) appears in the version list.Each change carries
{kind, path, from_value, to_value}.Repeat for dashboards and datasets using
/api/v1/dashboard/and/api/v1/dataset/. Datasets exercise child shadows (columns/metrics); dashboards exercise the M2M shadow (slices).Run the test suite:
pytest tests/integration_tests/charts/version_history_tests.py \ tests/integration_tests/dashboards/version_history_tests.py \ tests/integration_tests/datasets/version_history_tests.py \ tests/integration_tests/versioning/ -vAsserts the three Success Criteria: list < 1 s, restore < 3 s, save p95 overhead < 50 ms.
ADDITIONAL INFORMATION
Migration list (in dependency order):
2bee73611e32dashboard_slices+ 7 other association tables (sc-105349 / #39859)56cd24c07170version_transaction+ parent shadow tables (dashboards_version,slices_version,tables_version)e1f3c5a7b9d0version_changes(structured diff records, ON DELETE CASCADE FK toversion_transaction)f7a2b3c4d5e6table_columns_version,sql_metrics_version) + M2M shadow (dashboard_slices_version)All migrations are additive on the pre-existing
slices/dashboards/tables/ child tables — no existing columns altered. The composite-PK migration (2bee73611e32) reshapes the M2M association tables; round-trip tested on PostgreSQL, MySQL, and SQLite, including the MySQL FK /AUTO_INCREMENTquirks that required raw SQL workarounds (commits56c36fde54,65a3491861).Write cost per save:
version_transaction*_versionparent shadowSkipUnmodifiedPluginfiltered the save; 1 if scalars changedversion_changesNo write-path retention overhead — pruning is asynchronous via Celery beat (shipped in the stacked retention PR, sc-111099).
Performance:
The numbers below were captured before the ADR-004 reversal (JSON-snapshot → full Continuum). Architecture has changed since — child writes now go through Continuum shadows instead of
dataset_snapshots/dashboard_snapshotsJSON tables. Re-validation against the final architecture pending before review. Targets unchanged:Harness:
SUPERSET_PERF_VALIDATION=1 pytest tests/integration_tests/versioning/perf_validation_tests.py -v -s.