Optimize critical path update, v15 schema for new (extended) index. by dtzSiFive · Pull Request #1857 · sifiveinc/wake

Will Dietz (dtzSiFive) · 2026-05-15T18:38:38Z

Chunk the update: even optimized (~60s -> ~12s), it is still much too long for a single transaction.

Otherwise other wake runs cannot make forward progress for the duration.

Replaces #1850. This PR has a single index that works for the various filetree queries that today rely on filesearch, while also being usable for this work.
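As a rough illustration of the chunking idea (not the code in this PR, which lives in wake's C++ database layer), the sketch below applies a large batch of updates as several short SQLite transactions, so other writers get a chance to run between chunks. The `jobs` table layout, the `pathtime` column, the chunk size, and the function name `chunked_update` are all assumptions made for the example.

```python
import sqlite3


def chunked_update(db, new_values, chunk_size=1000):
    """Apply (job_id, value) updates as several short transactions.

    Hypothetical sketch: table and column names are illustrative only.
    Returns the number of transactions committed.
    """
    commits = 0
    for i in range(0, len(new_values), chunk_size):
        chunk = new_values[i:i + chunk_size]
        # One transaction per chunk: "with db" commits on success and
        # rolls back if the body raises, releasing the write lock each time.
        with db:
            db.executemany(
                "UPDATE jobs SET pathtime = ? WHERE job_id = ?",
                [(value, job_id) for job_id, value in chunk],
            )
        commits += 1
    return commits
```

The trade-off is that the update is no longer atomic as a whole: a concurrent reader can observe some chunks applied and others not.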

Use per-run lock files (.wake/locks/run_N.lock) to detect dead wake processes. Each wake process creates and holds an exclusive fcntl lock on its lock file for the duration of the run. On startup, probe the lock files of incomplete runs: if we can acquire the lock, the original process is dead and we mark the run as reaped (end_time == -1). Additionally, if lock files are missing for whatever reason, assume such runs are dead if they started more than 24 hours ago, to avoid blocking the GC watermark indefinitely. This mostly covers a strange case that shouldn't happen and is not part of anyone's normal flow.
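The probing scheme described above can be sketched with Python's `fcntl` module. This is an illustration only, not wake's implementation (which is C++); the function names `hold_run_lock` and `run_is_dead` and their parameters are hypothetical. Only the lock-file idea, the "can we acquire it?" probe, and the 24-hour fallback come from the commit message.

```python
import fcntl
import os

STALE_AFTER_S = 24 * 60 * 60  # fallback window from the commit message


def hold_run_lock(path):
    """Create the lock file and take an exclusive lock on it.

    The returned descriptor must stay open for the whole run; the kernel
    drops the lock when the process exits, however it exits.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    return fd


def run_is_dead(path, start_time, now):
    """Probe an incomplete run's lock file from a different process."""
    try:
        fd = os.open(path, os.O_RDWR)
    except FileNotFoundError:
        # Missing lock file: only assume dead once the run is old enough.
        return now - start_time > STALE_AFTER_S
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        return False  # someone still holds the lock, so the run is alive
    else:
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True   # lock was free, so the original holder is gone
    finally:
        os.close(fd)
```

Note that POSIX record locks belong to the process, not the descriptor, so the probe only makes sense from a process other than the one holding the lock: probing your own lock file succeeds, and closing the probe descriptor releases the lock.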

use_id only tracked the most recent run using a job, which breaks when multiple wakes run concurrently - each would overwrite the other's use_id, corrupting GC decisions. run_jobs is a junction table (run_id, job_id) allowing many runs to reference the same job.

GC uses a watermark approach:

    watermark = min(run_id where end_time is null) - 1

Jobs whose newest run_id <= watermark are safe to delete, meaning all runs using them have completed.

Schema changes (migration 9 to 10):
- Remove use_id column from jobs table
- Add run_jobs(run_id, job_id) with CASCADE delete
- Add end_time column to runs (NULL while running)
- Mark existing runs complete so GC works immediately

Query changes:
- Insert into run_jobs on job create and reuse
- Overlap detection joins run_jobs for current run only
- GC queries use watermark instead of use_id filter
- --last, --last-used, --last-executed, digraph filter to last completed run (end_time IS NOT NULL)
- Critical path (revtop_order) binds this run's run_id

Adds finish_run() to set end_time before GC runs.
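To make the watermark rule concrete, here is a small sketch against an in-memory SQLite database. The table and column names follow the commit message, but the exact DDL, the helper names `gc_watermark` and `deletable_jobs`, and the behaviour when no run is open (treating every run as complete) are assumptions, not taken from wake's schema.

```python
import sqlite3

# Hypothetical, simplified schema; wake's real tables carry more columns.
SCHEMA = """
CREATE TABLE runs (run_id INTEGER PRIMARY KEY, end_time INTEGER);
CREATE TABLE jobs (job_id INTEGER PRIMARY KEY);
CREATE TABLE run_jobs (
    run_id INTEGER NOT NULL REFERENCES runs(run_id) ON DELETE CASCADE,
    job_id INTEGER NOT NULL REFERENCES jobs(job_id) ON DELETE CASCADE,
    PRIMARY KEY (job_id, run_id)
);
"""


def gc_watermark(db):
    """watermark = min(run_id where end_time is null) - 1."""
    (watermark,) = db.execute(
        "SELECT MIN(run_id) - 1 FROM runs WHERE end_time IS NULL"
    ).fetchone()
    if watermark is None:
        # Assumption: with no open runs, every recorded run is complete.
        (watermark,) = db.execute(
            "SELECT COALESCE(MAX(run_id), 0) FROM runs"
        ).fetchone()
    return watermark


def deletable_jobs(db):
    """Jobs whose newest referencing run is at or below the watermark."""
    rows = db.execute(
        "SELECT job_id FROM run_jobs GROUP BY job_id "
        "HAVING MAX(run_id) <= ? ORDER BY job_id",
        (gc_watermark(db),),
    )
    return [job_id for (job_id,) in rows]
```

One consequence worth noticing: a job used only by a completed run is still protected while any older run remains open, because the watermark is bounded by the oldest open run rather than by each job's own runs.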

* schema: 12: -stale, no longer unique on path
* Send in all path info to job prims
* Find files using richer information.
* Consider multiple candidates in reuse_job, since find_prior doesn't check specifics of visible.
* Update add_hash for multi-wake. Don't remove records, but do support new mtime.
* Update detect_overlap, delete_overlap. Keep these for time being.

Record starttime eagerly via start_job() at fork time rather than writing it in finish_job, so queued jobs (starttime==0) are distinguishable from running ones. Also allows showing how long jobs have actually been running. Add --active, --queued, --in-flight capture filters, and fix --canceled (now only jobs not part of live runs). These all are filters on unfinished jobs.
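A small sketch of why the eager starttime matters: once queued jobs carry starttime == 0 and forked jobs carry a real timestamp, a listing can tell the two apart and show elapsed time for the running ones. The record layout, the output format, and the name `ps_lines` below are invented for illustration and do not reflect wake's actual `--ps` output.

```python
from collections import defaultdict


def ps_lines(jobs, now):
    """Render unfinished jobs grouped by run_id.

    jobs: iterable of dicts with hypothetical keys run_id, label,
    starttime (0 while queued), and finished.
    """
    by_run = defaultdict(list)
    for job in jobs:
        if job["finished"]:
            continue  # only unfinished jobs are listed
        if job["starttime"] == 0:
            status = "[queued]"  # not yet forked
        else:
            status = f"{now - job['starttime']}s"  # time since fork
        by_run[job["run_id"]].append(f"  {job['label']} {status}")
    lines = []
    for run_id in sorted(by_run):
        lines.append(f"run {run_id}:")
        lines.extend(by_run[run_id])
    return lines
```

If starttime were only written in finish_job, both kinds of unfinished job would look identical here, which is the ambiguity the commit removes.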

The job overhead for wake-stage's trivial amount of work amplifies costs, especially startup latency when sourcing many files. Additionally, the point of staging is to get the required information ASAP so as not to race with concurrent modifications. Putting it into the job queue is more overhead than just doing the reflink directly, and as a primitive we ensure the staging is done immediately, not stuck behind a queue of hashing jobs. This was especially noticeable with concurrent runs (multi-wake). Issue diagnosed with help of `wake --ps`! :)

Will Dietz (dtzSiFive) · 2026-05-15T18:46:07Z

This chunking, and the chunking added in #1852, should probably be made adaptive, sizing chunks from a time budget instead of magic numbers. That would mean a magic time budget instead.
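One possible shape for such adaptive chunking is sketched below; it is not something this PR implements. Each chunk is timed, and the next chunk is resized so that, at the observed rate, it would take roughly the budget. The function name, the `process_chunk(start, count)` callback, and the default values are all assumptions.

```python
import time


def run_in_budgeted_chunks(total, process_chunk, budget=0.05, initial=100,
                           clock=time.monotonic):
    """Process `total` items in chunks sized to fit a time budget.

    budget is in the same units as clock() (seconds by default).
    Returns the list of chunk sizes actually used.
    """
    done = 0
    size = initial
    sizes = []
    while done < total:
        count = min(size, total - done)
        started = clock()
        process_chunk(done, count)  # e.g. one short transaction
        elapsed = clock() - started
        done += count
        sizes.append(count)
        if elapsed > 0:
            # Scale so the next chunk would take about `budget`.
            size = max(1, int(count * budget / elapsed))
        else:
            size = count * 2  # too fast to measure; grow and retry
    return sizes
```

The clock is injectable so the sizing logic can be exercised deterministically; in real use the default monotonic clock would be the natural choice.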

Regardless, these "stop the bleeding" and appear to be sufficient in my testing.

Will Dietz (dtzSiFive) added 30 commits May 1, 2026 07:57

share/wake: Always set WAKE_CAS=1, panic otherwise.

624594d

tests: add gc-cross-run test for cross-run GC

d434526

tests: Set WAKE_CAS in additional place.

baada8f

wakefs: Check WAKE_CAS, always specify --use-cas.

04eb81b

database: write pid to run lock files

0a8b294

We don't use this information but it might be of value for debugging or external integration purposes.

run_jobs: swap primary key and index.

d5eae48

Allows us to drop index later if we want. Per reviewer feedback, thanks!

main: WAKE_CAS must be set during transition period.

197d1e4

[NFC] database: refactor out gettime_ns helper.

9322866

Implement using std::chrono.

wake: use case style consistent with other queries in this file.

65db73b

Per reviewer feedback, thanks!

hardlink test: fail to write

fabe9a2

Don't check error message presently due to variations across environments.

database: pull reap_dead_runs() out to a function.

98a21b2

database: swap parameters in detect_overlap.

a7d1903

Second parameter is used first now. Per reviewer suggestion, thanks!

Makefile: Set WAKE_CAS=1.

dc43097

tests: --clean multi-wake safety

84db6dd

multi-wake: remove entropy table exclusive lock

a3795b2

The schema had 'update entropy set seed=0 where 0;' which was used to acquire an exclusive lock on database open, preventing any concurrent access. This is no longer needed with proper transaction handling.

database: refactor run_lock out to RAII helper.

48a8654

database: print to stderr if unexpected but benign happens.

256a9ff

Per reviewer feedback, thanks!

Makefile: Thread WAKE_ENV into an invocation that didn't have it.

bc80332

database: add --clean safety check for active builds

a841fee

Refuse to clean while builds are in progress. First reap dead runs (refactor), then with RW lock held:
* Check if incomplete jobs, exit
* Delete files

multi-wake: remove .wake-build-lock mechanism

bd1eac0

The file-based build lock prevented concurrent wake invocations. This is no longer needed, remove it!

Merge pull request #1760 from dtzSiFive/pr/lock-files

f58f92c

database: add lock files for dead run detection

[CI] Thread WAKE_CAS through testing and docker invocations.

418a970

tests: disable test until #1781 is fixed.

215e0c2

Merge pull request #1761 from dtzSiFive/pr/remove-build-lock

b40db0d

multi-wake: remove .wake-build-lock mechanism

database: Don't check workspace for files as precondition for reuse.

4055b63

rsc: Plumb WAKE_CAS here as well.

1116c0f

--history: show run status, end time, and duration

bcf818f

Display whether each run is running, crashed, or duration. Adds end_time to RunReflection. Migrated runs show 0s since we lack their actual end time.
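The three display states can be read off the end_time conventions mentioned in the earlier commits (NULL while running, -1 once reaped). A minimal sketch, assuming timestamps in seconds and a hypothetical helper name `run_status`; the real formatting in `--history` may differ:

```python
def run_status(start_time, end_time):
    """Map a run's timestamps to a --history style status string."""
    if end_time is None:
        return "running"   # end_time is NULL until finish_run()
    if end_time == -1:
        return "crashed"   # reaped: the owning process was found dead
    # Completed run: show its duration, never a negative value.
    return f"{max(0, end_time - start_time)}s"
```

Under this sketch a run whose recorded end time equals its start time prints as 0s, which is one way the migrated-run behaviour described above could arise; how migrated rows are actually stored is not stated here.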

Will Dietz (dtzSiFive) added 26 commits May 6, 2026 15:13

tests/inspection/history: remove now-unnecessary run to reap

43c6fce

Also check that making lock files read-only doesn't prevent wake --history from determining liveness.

Merge pull request #1847 from sifiveinc/fix/stringpiece-multi

75f54db

Targeted fix for re2's StringPiece differences across versions.

Merge pull request #1837 from sifiveinc/feature/is_live_ro_history

92beea7

Support RO run lock probing, use in multi-wake's --history.

wake: tighten unfinished filters to stat_is is null.

c41e40b

Don't report jobs that failed but finished. These could be produced via: job_fail_launch -> finish_job.

[NFC] database: update table in matching().

b16a44d

wake: add --ps command to show currently running jobs

e5a0c1a

Lists jobs grouped by run_id, showing elapsed time for running jobs and [queued] for jobs not yet forked.

Merge pull request #1840 from sifiveinc/feature/active-queued-filters

28d392f

wake: record starttime early, add capture filters; fix --canceled

database: delete orphan files rows in clean().

0fde4f9

Prune unreachable file records, they have no use. Runs after clearing run_files for this run and delete_jobs.

cas: add enumerate_blobs_strings, remove_blob.

28c9d6b

Ignore errors encountered, instead soldier on! Errors shouldn't happen, but don't bail if there's a stray file from a user's editor or the like.

database: add gc_if_dead to callback on dead hashes.

706823e

wake: at end of run, GC dead CAS blobs.

783bb24

Return usual exit code even if cas GC encounters error.

tests: add cas-gc test.

b19fda3

wake: add comment documenting invariant relied on for safety.

bb402eb

wake: refactor --ps to use filter options.

8514362

Drop open_run_jobs from database, unused.

wake: build -> run, phrasing alignment/fixup.

0c3ee16

Per reviewer feedback, thanks!

wake: refactor to optimize matching performance when don't need tag concat.

ee82ecb

For most use cases we don't need the tags concatenated and grouped by job_id, which is very slow. Before, filters such as used by `--ps` were taking over 0.5s! Now it completes too fast for `\time -v`.

Merge pull request #1842 from sifiveinc/feature/ps

8ec1f08

wake: add --ps command to show currently running jobs

job.wake: add comment about gc safety requirement.

c149d9e

Per reviewer suggestion, thanks!

[NFC] cas: Prefer constants over magic numbers.

4cc7430

unit: Add enumerate_blobs_strings tests, enable cas tests.

08e756a

Merge pull request #1852 from sifiveinc/feature/cas-gc-multi-wake

e36127a

multi-wake: CAS GC implementation

Merge pull request #1853 from sifiveinc/feature/enable-cas-unit-tests-and-enumerate_blobs

ab95680

enable cas unit tests, refactor cas to use constants

stage prim: prefer wcl::result over exceptions.

941fa5f

Optimize critical path update, v15 schema for new (extended) index.

d63a874

Chunk the update: even optimized (~60s -> ~12s), it is still much too long for a single transaction.

Will Dietz (dtzSiFive) mentioned this pull request May 15, 2026

database: add index to speed up revtop query. #1850

Closed

Abrar Quazi (AbrarQuazi) force-pushed the feature/stage-as-prim branch from 941fa5f to bb492d0 Compare May 26, 2026 19:36

Base automatically changed from feature/stage-as-prim to master May 26, 2026 22:48

Will Dietz (dtzSiFive) wants to merge 96 commits into master from fix/revtop-index-schema-15-chunked
