
pull: fix the flattened runtime root and delta re-pull after deferred files #268

Open
epeicher wants to merge 1 commit into trunk from pull/studio-single-invocation


@epeicher epeicher commented Jun 18, 2026

Collaborator

What it does

Two fixes to the composite pull command so it correctly runs, and re-runs, the full preflight -> files-pull -> db-pull -> db-apply -> flat-docroot -> apply-runtime pipeline from a single invocation:

  1. pull --flatten-to=DIR --runtime=... now generates a runtime rooted at the flattened layout (DIR) instead of the raw download tree. The two options now compose: --flatten-to builds the flattened docroot and --runtime targets it.
  2. A completed pull can be re-pulled even after its deferred-files tail ran. Re-running pull --filter=essential-files once the skipped-earlier files have been fetched no longer fails the mid-flight --filter guard; it proceeds and re-syncs as a delta.

Rationale

--flatten-to did not reach apply-runtime. pull forwards --flatten-to to the flat-docroot stage as flatten_to, but run_apply_runtime() reads a different key, flat_document_root, and otherwise falls back to fs_root + remote document_root (the raw tree). So a pull that both flattens and generates a runtime produced a runtime pointing at the wrong root: the flatten and runtime stages did not compose.
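
The mismatch can be sketched as follows. This is a minimal illustration, not the actual reprint code: `resolve_runtime_root()` is a hypothetical stand-in for the lookup that `run_apply_runtime()` performs, the paths are made up, and the way `fs_root` and the remote `document_root` are joined is an assumption. Only the option keys themselves come from the description above.

```php
<?php
// Hypothetical stand-in for the root lookup described above; the real
// run_apply_runtime() may differ in detail.
function resolve_runtime_root(array $options, string $remote_document_root): string
{
    if (!empty($options['flat_document_root'])) {
        return $options['flat_document_root'];
    }
    // Fallback: the raw download tree (fs_root + remote document_root).
    // The exact joining rule is assumed for this sketch.
    return rtrim($options['fs_root'], '/') . '/' . ltrim($remote_document_root, '/');
}

// Before the fix: pull only sets flatten_to, which this lookup never reads.
$options = [
    'fs_root'    => '/tmp/import/fs-root', // illustrative path
    'flatten_to' => '/tmp/import/flat',    // illustrative path
];

$root_before_fix = resolve_runtime_root($options, '/srv/htdocs');
echo $root_before_fix, PHP_EOL;
```

Under these assumptions the lookup ignores `flatten_to` entirely and returns a path inside the raw tree, which is the symptom described above.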

Delta re-pull was blocked after a deferred-files tail. pull supports delta re-pull: re-running a completed pull resets its own state via prepare_repull() and re-syncs. But after a --filter=essential-files pull fetches its deferred skipped-earlier files, the state reads filter=skipped-earlier, status=in_progress. The next pull --filter=essential-files loads that state in ImportClient::run(), and the mid-flight guard throws:

Cannot change --filter from 'skipped-earlier' to 'essential-files' while a sync is in progress.

That guard runs before Pull::run() reaches prepare_repull(), which is the step that would clear the stale state, so the re-pull can never start. A caller that drives the sub-commands directly can sidestep this with files-sync --abort; through pull, the reset has to happen inside prepare_repull().

Implementation

validate_and_default_options() derives the missing key:

if (!empty($options['flatten_to']) && empty($options['flat_document_root'])) {
    $options['flat_document_root'] = $options['flatten_to'];
}
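
To see the bridge in isolation, the same check can be wrapped in a small function. `bridge_flatten_to()` is a hypothetical name used only for this sketch, and the paths are illustrative; in the PR the check lives inside `validate_and_default_options()`.

```php
<?php
// Hypothetical wrapper around the check shown above, for illustration only.
function bridge_flatten_to(array $options): array
{
    if (!empty($options['flatten_to']) && empty($options['flat_document_root'])) {
        $options['flat_document_root'] = $options['flatten_to'];
    }
    return $options;
}

// flatten_to alone: flat_document_root is derived from it.
$derived = bridge_flatten_to(['flatten_to' => '/tmp/flat']);

// Both supplied: the explicit flat_document_root is left untouched.
$explicit = bridge_flatten_to([
    'flatten_to'         => '/tmp/flat',
    'flat_document_root' => '/tmp/custom',
]);

// Neither supplied: nothing is added, so the raw-tree fallback still applies.
$neither = bridge_flatten_to(['fs_root' => '/tmp/fs-root']);

var_dump($derived['flat_document_root'] === '/tmp/flat');
var_dump($explicit['flat_document_root'] === '/tmp/custom');
var_dump(isset($neither['flat_document_root']));
```

Because the condition uses `empty()`, an explicit `flat_document_root` always wins over the derived value, and a pull without `--flatten-to` behaves as before.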

The filter guard exempts a completed pull that is about to delta re-pull (its prepare_repull() resets the state):

$is_repull = $command === "pull" && (($this->state["pull"]["stage"] ?? null) === "complete");
$is_mid_flight =
    !$is_repull &&
    $prev !== null && $prev !== $next && $status !== null && $status !== "complete";
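
The condition can be exercised on its own as a pure function. This is a sketch under assumptions: `is_mid_flight_filter_change()` is a hypothetical name, the state array only mimics the `pull.stage` key used above, and the `files-pull` stage value in the last case is invented for illustration.

```php
<?php
// Hypothetical pure-function version of the guard condition above.
function is_mid_flight_filter_change(
    string $command,
    array $state,
    ?string $prev,
    ?string $next,
    ?string $status
): bool {
    $is_repull = $command === 'pull'
        && (($state['pull']['stage'] ?? null) === 'complete');

    return !$is_repull
        && $prev !== null
        && $prev !== $next
        && $status !== null
        && $status !== 'complete';
}

// State left behind by the deferred-files tail of a completed pull.
$after_tail = ['pull' => ['stage' => 'complete']];

// Re-pull of a completed pull: exempt from the guard.
$repull = is_mid_flight_filter_change(
    'pull', $after_tail, 'skipped-earlier', 'essential-files', 'in_progress'
);

// Same filter change from a directly driven sub-command: still guarded.
$direct = is_mid_flight_filter_change(
    'files-sync', $after_tail, 'skipped-earlier', 'essential-files', 'in_progress'
);

// A pull that has not completed (stage name assumed): still guarded.
$unfinished = is_mid_flight_filter_change(
    'pull', ['pull' => ['stage' => 'files-pull']],
    'skipped-earlier', 'essential-files', 'in_progress'
);

var_dump($repull, $direct, $unfinished);
```

The exemption is deliberately narrow: it applies only when the command is `pull` and the recorded pull stage is `complete`, so a genuine mid-flight filter change is still rejected.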

Testing instructions

cd tests && phpunit Import/PullFilterOptionTest.php

Two new tests:

  • testPullDerivesFlatDocumentRootFromFlattenTo: a pull with --flatten-to plus --runtime hands apply-runtime a matching flat_document_root.
  • testRepullBypassesTheMidFlightFilterGuard: a pull whose state carries the tail's filter=skipped-earlier / status=in_progress re-pulls to completion instead of throwing.

Studio is moving from driving reprint as separate sub-commands
(files-sync, db-sync, db-apply, flat-docroot, apply-runtime) to a single
`reprint pull`. Two gaps surfaced:

- apply-runtime targeted the raw download tree, not the flattened site,
  when a pull used --flatten-to. `pull` forwards `flatten_to` to
  flat-docroot, but `run_apply_runtime` reads `flat_document_root`. Derive
  the latter from the former in `validate_and_default_options` so a single
  `pull --flatten-to=X --runtime=...` roots the runtime at the flattened
  layout.

- A delta re-pull tripped the mid-flight --filter guard. After a pull, its
  deferred "skipped-earlier" tail leaves filter=skipped-earlier /
  status=in_progress; the next `pull --filter=essential-files` hit the
  guard in ImportClient::run() before Pull::run() could prepare_repull()
  and clear that stale state. A completed pull (pull.stage===complete)
  about to re-pull now bypasses the guard.

Adds unit tests for both in PullFilterOptionTest.
epeicher added a commit to Automattic/studio that referenced this pull request Jun 18, 2026
Replace the per-sub-command orchestration (downloadEssentialSiteFiles,
refreshFlattenedSiteDirectory, downloadRemoteDatabase,
applyDownloadedDatabase, generateRuntimeConfiguration) and the
clearCompletedSubcommandState/--abort delta-reset with one `reprint pull`
call. reprint owns the pipeline ordering (files-pull -> db-pull ->
db-apply -> flat-docroot -> apply-runtime) and resets its own state for a
delta re-pull via prepare_repull(), so the Studio-side --abort wiring and
per-phase stage gating go away.

- runFullPull() issues the single pull with the same sqlite geometry the
  old db-apply used (target sqlite under the raw content dir) plus
  --flatten-to, --runtime=playground-cli, --start-runtime=none and
  --output-dir, mounting the site + runtime dirs up front.
- ensurePort moves before the pull so --new-site-url is available.
- Collapse the stage machine from 9 stages to 5 (initialized -> pulled ->
  site-registered -> site-started -> completed).
- Bump the PHP-WASM memory_limit to 1024M: the single long-lived fork
  holds the file-index high-water-mark across phases.

Requires reprint's flatten_to->flat_document_root bridge and the re-pull
filter-guard fix (WordPress/reprint#268).
@github-actions

Contributor

Pull pipeline performance — large-directory

Site: large-directory · 2,000+ plus targeted file-transfer scenarios files · 10,000 posts · 25,000 postmeta · PHP 8.5.7

| Stage | PR | trunk | Δ | Status | Details |
| --- | --- | --- | --- | --- | --- |
| playground-sqlite-db-pull | 9.75 s | 9.32 s | ⚪ +428 ms (+4.6%) | | condition=db-pull in PHP.wasm; runtime=php.wasm 8.3; wp_mysql_parser=enabled; mode=lexer; native_lexer=verified; native_token_stream=WP_MySQL_Native_Token_Stream; native_token_count=18; native_parser=selected (trunk: identical details) |
| playground-sqlite-db-apply | 3.65 s | 3.64 s | ⚪ +7 ms (+0.2%) | | condition=db-apply to SQLite in PHP.wasm; runtime=php.wasm 8.3; wp_mysql_parser=enabled; mode=parser; native_lexer=verified; native_token_stream=WP_MySQL_Native_Token_Stream; native_token_count=18; native_parser=verified; native_ast=WP_MySQL_Native_Parser_Node; sqlite_driver_parser=verified (trunk: identical details) |
| Total | 13.39 s | 12.96 s | ⚪ +434 ms (+3.4%) | | |

Numbers carry runner noise; treat single-run deltas as directional, not authoritative.

📈 Trunk performance history — commit-by-commit timeline.

@epeicher epeicher changed the title from "pull: support Studio's single-invocation flow" to "pull: fix the flattened runtime root and delta re-pull after deferred files" Jun 18, 2026
@epeicher epeicher self-assigned this Jun 18, 2026
@epeicher epeicher requested a review from Copilot June 18, 2026 17:11
@epeicher epeicher requested a review from adamziel June 18, 2026 17:13

Copilot AI left a comment


Pull request overview

This PR fixes two bugs in the composite pull command so that the full pipeline (preflight → files-pull → db-pull → db-apply → flat-docroot → apply-runtime) works correctly both on first run with flatten + runtime options and on delta re-pulls after deferred files have been fetched.

Changes:

  • Derives flat_document_root from flatten_to in validate_and_default_options() so that pull --flatten-to=DIR --runtime=... generates a runtime rooted at the flattened layout rather than the raw download tree.
  • Exempts completed pull re-pulls (pull.stage === "complete") from the mid-flight filter guard in ImportClient::run(), allowing prepare_repull() to clear the stale sub-command state before the guard would otherwise block the filter change.
  • Adds two focused unit tests covering both fixes.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| packages/reprint-importer/src/lib/pull/class-pull.php | Bridges flatten_to → flat_document_root in option validation so apply-runtime targets the flattened directory |
| packages/reprint-importer/src/import.php | Adds $is_repull check to the filter guard to exempt completed pulls about to delta re-pull |
| tests/Import/PullFilterOptionTest.php | Adds PullBridgeFakeClient test helper and two new tests for the flatten bridge and the re-pull guard bypass |
