Refresh on re-run for completed pull sub-commands#257
Closed
epeicher wants to merge 3 commits into
Closed
Conversation
Re-invoking files-pull, db-pull, or db-apply on a completed state now performs a refresh instead of erroring or no-oping, matching the composite pull orchestrator's re-pull behavior: - files-pull: runs a delta sync — keeps the local index, re-indexes the remote, diffs, and fetches only changes. The stale remote index and download lists are cleared first (download_remote_index() appends when the file exists). - db-pull: full refresh via the new ImportClient::reset_db_pull_state() (also reused by Pull::prepare_repull()) — resets DB state, deletes db.sql, .import-domains.json, and the .sql-buffer crash-recovery file, then re-downloads the entire dump. - db-apply: re-applies the dump from statement 0. The dump is idempotent (DROP TABLE IF EXISTS + full INSERTs), so remote edits, inserts, and deletes all propagate. This unblocks callers that drive the sub-commands directly (notably Studio's pull-reprint) and could previously never re-sync a database: edits and deletions to existing rows stayed frozen at the first completed pull. Also fixed along the way: - --filter=skipped-earlier runs never marked status=complete, leaving the state stuck in_progress after a successful pull. The branch now marks completion, and a legacy-state repair in run() heals sessions written by older versions. - The skipped download list is now truncated instead of deleted everywhere. apply-runtime mounts that file into the generated runtime for the remote-uploads proxy, and PHP-WASM re-resolves mounts on every runtime rotation — deleting the file crashed server starts and WP-CLI with ENOENT. - db-apply strips _edit_lock postmeta after applying. The dump carries the remote's ephemeral editor locks, which render as phantom "X is currently editing" badges on the imported copy. - A stale .sql-buffer could be replayed into the target on refresh (latent prepare_repull bug, fixed by the shared reset helper). In-progress/partial states still resume from their cursors; --abort remains the way to clear state without running.
1 task
Three E2E tests asserted the old "re-run refuses / deletes" contract that this branch changes; update them to verify the new behavior: - import-02: re-running db-sync on a completed state now performs a full refresh (exit 0, fresh db.sql, audit log records the refresh) instead of erroring with a "--abort" message. - import-40: the skipped download list is truncated to empty rather than deleted (apply-runtime mounts it; deletion crashed the runtime with ENOENT). Assert it exists and is zero-length. - import-01: a completed files-sync re-run now performs a delta sync; rename the test and assert the completed status it ends on (it was passing only incidentally under the old "refuses" name). Verified in the Dockerized E2E environment: all three files pass (31 tests). The full suite's only remaining failures (import-07, import-19) are pre-existing local-Docker permission/fault-injection flakiness — they fail identically on trunk and pass on CI.
Collaborator
Author
|
Superseded by #260 — moved from the fork to an in-repo branch so CI runs with full token permissions (the Performance Report perf-comment step couldn't post from a fork PR). Same branch and commits. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Re-pulling a site never propagated edits or deletions to existing database rows — the local database stayed frozen at the content captured by the first completed pull. The root cause is an asymmetry in the engine: the composite
pullorchestrator resets state for a delta re-pull (Pull::prepare_repull()), but callers that drive the sub-commands directly (notably Studio'spull-reprint) had no refresh path:files-pull--abort")db-pulldb-apply--abortis destructive (re-downloads everything), so there was no clean incremental re-sync.Change
Re-invoking a sub-command on a completed state now performs a refresh, matching the composite
pull's re-pull semantics:files-pull→ delta sync. Keeps the local index, re-indexes the remote, diffs, fetches only changes. Stale remote index / download lists are cleared first (download_remote_index()appends when the file exists).db-pull→ full refresh. NewImportClient::reset_db_pull_state()(now also used byPull::prepare_repull()) resets DB state and deletesdb.sql,.import-domains.json, and the.sql-buffercrash-recovery file, then re-downloads the entire dump.db-apply→ re-apply from statement 0. The dump is idempotent (DROP TABLE IF EXISTS+ full INSERTs), so edits, inserts, and deletes all land.In-progress/partial states still resume from their cursors;
--abortstill clears state without running.Bugs fixed along the way
--filter=skipped-earliernever marked completion — the branch returned before thestatus = "complete"marking, leaving every successful essential+skipped pull stuckin_progressforever. Fixed, plus a legacy-state repair inrun()for sessions written by older versions.apply-runtimemountsstate/.import-download-list-skipped.jsonlinto the generated runtime (remote-uploads proxy), and PHP-WASM re-resolves mounts on every runtime rotation. Deleting the file crashed server starts and WP-CLI with ENOENT. It is now truncated to empty instead (all has-skipped checks testsize > 0, so semantics are unchanged)._edit_lockpostmeta, which renders as an editor lock on the imported copy for 150s after the captured timestamp.db-applynow strips_edit_lockrows after applying (WordPress regenerates them locally; core's WXR exporter skips them too)..sql-bufferreplay — a leftover crash-recovery buffer could be replayed into the target on a refresh (latentprepare_repull()gap, fixed via the shared reset helper).Known limitation
A table dropped on the remote between pulls lingers in the local target (the new dump no longer mentions it). WP core tables always exist, so this is rare; recreating the SQLite file on refresh is a possible hardening follow-up.
Testing
tests/Import/DbResyncStateTest.php: round-trip via the sub-command path proving an edited, inserted, and deleted row all propagate on re-run; db-pull refresh resets state and deletes stale artifacts; the stuck skipped-earlier state is repaired;_edit_lockis stripped while other postmeta survives.tests/Import/FilesSyncStateTest.phpupdated: completedfiles-pullre-runs as a delta and clears the stale remote index.tests/Import/+tests/UrlRewriting/suites green (493 tests).