Optimize critical path update, v15 schema for new (extended) index.#1857
Open
Will Dietz (dtzSiFive) wants to merge 96 commits into
Open
Optimize critical path update, v15 schema for new (extended) index.#1857Will Dietz (dtzSiFive) wants to merge 96 commits into
Will Dietz (dtzSiFive) wants to merge 96 commits into
Conversation
Use per-run lock files (.wake/locks/run_N.lock) to detect dead wake processes. Each wake process creates and holds an exclusive fcntl lock on its lock file for the duration of the run. On startup, probe lock files of incomplete runs -- if we can acquire the lock, the original process is dead and we mark it as reaped (end_time == -1). Additionally: If for whatever reason lock files are missing, to avoid blocking GC watermark indefinitely assumes such runs are dead if they started more than 24 hours ago. This is mostly to cover a strange case that shouldn't happen, not a part of anyone's normal flow.
use_id only tracked the most recent run using a job, which breaks when multiple wakes run concurrently - each would overwrite the other's use_id, corrupting GC decisions. run_jobs is a junction table (run_id, job_id) allowing many runs to reference the same job. GC uses a watermark approach: watermark = min(run_id where end_time is null) - 1 Jobs whose newest run_id <= watermark are safe to delete, meaning all runs using them have completed. Schema changes (migration 9 to 10): - Remove use_id column from jobs table - Add run_jobs(run_id, job_id) with CASCADE delete - Add end_time column to runs (NULL while running) - Mark existing runs complete so GC works immediately Query changes: - Insert into run_jobs on job create and reuse - Overlap detection joins run_jobs for current run only - GC queries use watermark instead of use_id filter - --last, --last-used, --last-executed, digraph filter to last completed run (end_time IS NOT NULL) - Critical path (revtop_order) binds this run's run_id Adds finish_run() to set end_time before GC runs.
We don't use this information but it might be of value for debugging or external integration purposes.
Allows us to drop index later if we want. Per reviewer feedback, thanks!
Implement using std::chrono.
Per reviewer feedback, thanks!
Don't check error message presently due to variations across environments.
Second parameter is used first now. Per reviewer suggestion, thanks!
The schema had 'update entropy set seed=0 where 0;' which was used to acquire an exclusive lock on database open, preventing any concurrent access. This is no longer needed with proper transaction handling.
Per reviewer feedback, thanks!
Refuse to clean while builds are in progress. First reap dead runs (refactor), then with RW lock held: * Check if incomplete jobs, exit * Delete files
The file-based build lock prevented concurrent wake invocations. This is no longer needed, remove it!
database: add lock files for dead run detection
* schema: 12: -stale, no longer unique on path * Send in all path info to job prims * Find files using richer information. * Consider multiple candidates in reuse_job, since find_prior doesn't check specifics of visible. * Update add_hash for multi-wake Don't remove records, but do support new mtime. * Update detect_overlap, delete_overlap. Keep these for time being.
multi-wake: remove .wake-build-lock mechanism
Display whether each run is running, crashed, or duration. Adds end_time to RunReflection. Migrated runs show 0s since we lack their actual end time.
Also check that making lock files read-only doesn't prevent wake --history from determining liveness.
Targeted fix for re2's StringPiece differences across versions.
Record starttime eagerly via start_job() at fork time rather than writing it in finish_job, so queued jobs (starttime==0) are distinguishable from running ones. Also allows showing how long jobs have actually been running. Add --active, --queued, --in-flight capture filters, and fix --canceled (now only jobs not part of live runs). These all are filters on unfinished jobs.
Support RO run lock probing, use in multi-wake's --history.
Don't report jobs that failed but finished. These coud be produced via: job_fail_launch -> finish_job .
Lists jobs grouped by run_id, showing elapsed time for running jobs and [queued] for jobs not yet forked.
wake: record starttime early, add capture filters; fix --canceled
Prune unreachable file records, they have no use. Runs after clearing run_files for this run and delete_jobs.
Ignore errors encountered, instead soldier on! Errors shouldn't happen, but don't bail if there's a stray file from a user's editor or the like.
Return usual exit code even if cas GC encounters error.
Drop open_run_jobs from database, unused.
Per reviewer feedback, thanks!
…oncat. For most use cases we don't need the tags concatenated and grouped by job_id, which is very slow. Before, filters such as used by `--ps` were taking over 0.5s! Now it completes too fast for `\time -v`.
wake: add --ps command to show currently running jobs
Per reviewer suggestion, thanks!
multi-wake: CAS GC implementation
…-and-enumerate_blobs enable cas unit tests, refactor cas to use constants
The job overhead for wake-stage's trivial amount of work amplifies costs especially of startup latency when sourcing many files. Additionally, the point of staging is to get the required information ASAP so as to not race with concurrent modifications. Putting into job queue is more overhead than just doing the reflink directly, and as a primitive we ensure the staging is done immediately not stuck behind a queue of hashing jobs. This was especially noticeable with concurrent runs (multi-wake). Issue diagnosed with help of `wake --ps`! :)
Chunk the update, as even optimized (~60s -> ~12s) that's still much too long for a single transaction.
Contributor
Author
|
This chunking, and that added in #1852 , should probably be moved to adaptively chunk based on a time budget instead of magic numbers. Magic time budget instead. Regardless, these "stop the bleeding" and appear to be sufficient in my testing. |
941fa5f to
bb492d0
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Chunk the update, as even optimized (~60s -> ~12s) that's still much too long for a single transaction.
Otherwise other wake runs cannot make forward progress for the duration.
Replaces #1850 ; this has a single index that works for various filetree queries that today rely on
filesearch, while also being usable for this work.