perf(startup): add report-only startup measurement #149
Merged
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,131 @@ | ||
| # Startup Time Under 10ms Measurement Plan | ||
|
|
||
| ## Roadmap Item | ||
|
|
||
| - `ROADMAP.md`: `Startup time: < 10ms (vs ~100ms for interpreter)` | ||
|
|
||
| ## Scope Decision | ||
|
|
||
| This PR does **not** claim the `<10ms` target is achieved. It establishes repeatable startup measurement and report-only CI visibility so the next optimization PR can be judged against real data. The roadmap checkbox must stay unchecked until the measured target is met. | ||
|
|
||
| The target should apply to the standalone/native execution path, not ordinary `forge run app.fg`. A source file run still has to start the Rust CLI, parse CLI args, read source, lex, parse, typecheck, initialize runtime state, and execute. The native/standalone path is the only realistic place for `<10ms`. | ||
|
|
||
| ## Current State | ||
|
|
||
| - `forge run app.fg` goes through the full CLI/frontend/interpreter path. | ||
| - `forge run app.fgc` skips lex/parse but still starts the CLI and VM. | ||
| - `forge build --native` can now produce a standalone source-runtime binary when `libforge_lang.a` is present. | ||
| - Existing `benches/fork_for_serving.rs` measures per-request fork cost, not process startup. | ||
| - There is no repeatable startup benchmark, no CI trend signal, and no agreed measurement definition. | ||
|
|
||
| ## Measurement Definition | ||
|
|
||
| Measure cold-ish process startup wall time from parent process spawn to child process exit for short-lived programs. | ||
|
|
||
| Initial modes: | ||
|
|
||
| 1. `source-run`: `forge run hello.fg` | ||
| 2. `bytecode-run`: `forge run hello.fgc` | ||
| 3. `native-source-runtime`: generated `forge build --native hello.fg` binary when `libforge_lang.a` is available | ||
| 4. `aot-bytecode`: generated `forge build --aot hello.fg` binary when `libforge_lang.a` is available | ||
|
|
||
| Short-lived fixture: | ||
|
|
||
| ```forge | ||
| println("ok") | ||
| ``` | ||
|
|
||
| The harness must assert correctness on every run. A child process that exits nonzero, segfaults, times out, or prints unexpected output must fail the measurement instead of looking like a fast startup. | ||
|
|
||
| Use a small `println("ok")` fixture for every mode so the harness can assert stdout-based correctness. Avoid server startup, networking, shell builtins, or filesystem writes in the measured child program. | ||
|
|
||
| ## Implementation Units | ||
|
|
||
| ### U1. Startup Measurement Harness | ||
|
|
||
| Files: | ||
| - Create: `tools/startup_time.rs` or `tests/startup_time.rs` as a small Rust harness binary/test helper | ||
| - Modify: `Cargo.toml` only if using a cargo bench/bin target is necessary | ||
|
|
||
| Do **not** use Criterion for process startup measurement. Criterion is optimized for in-process function benchmarking and its warmup/statistical model is a poor fit for fork/exec wall time. | ||
|
|
||
| Add a custom wall-time harness (or a thin wrapper around `hyperfine` only if introducing that dependency/tool is cleaner) that: | ||
| - Locates the `forge` binary under test. | ||
| - Creates an isolated temp fixture directory. | ||
| - Writes `hello.fg`. | ||
| - Builds `hello.fgc`. | ||
| - Requires the caller/CI job to provide `FORGE_LIB_DIR` pointing at an existing `libforge_lang.a`. | ||
| - Builds native artifacts with `FORGE_LIB_DIR` set so standalone modes are actually measured. | ||
| - Measures process spawn-to-exit wall time for each mode using `std::process::Command` and `Instant`. | ||
| - Runs enough repetitions to report min/median/p95 or min/mean/p95. | ||
| - Asserts every child exits successfully and emits expected output where applicable. | ||
| - Times out child processes so hangs fail fast. | ||
|
|
||
| Harness output should be simple, line-oriented, and easy to paste into PRs, for example: | ||
|
|
||
| ```text | ||
| startup.source_run median=... | ||
| startup.bytecode_run median=... | ||
| startup.native_source_runtime median=... | ||
| startup.aot_bytecode median=... | ||
| ``` | ||
|
|
||
| ### U2. Report-Only CI Job | ||
|
|
||
| Files: | ||
| - Modify: `.github/workflows/ci.yml` | ||
|
|
||
| Add a startup benchmark job that: | ||
| - Builds the Forge binary in release mode. | ||
| - Builds `libforge_lang.a` explicitly. | ||
| - Sets `FORGE_LIB_DIR` to the directory containing `libforge_lang.a`. | ||
| - Runs the startup measurement harness. | ||
|
|
||
| Keep this report-only for now: | ||
| - The job should fail if the harness does not compile/run or any measured child fails/times out. | ||
| - It should not fail because the measured value is above 10ms yet. | ||
|
|
||
| Rationale: shared CI runners are noisy, so the first step is visibility into the measured numbers rather than a pass/fail threshold. | ||
|
|
||
| ### U3. Budget Documentation | ||
|
|
||
| Files: | ||
| - Create: `docs/performance/startup.md` or update an existing performance doc if one exists | ||
| - Modify: `CHANGELOG.md` | ||
|
|
||
| Document: | ||
| - Measurement modes and what each means. | ||
| - Why `<10ms` applies to standalone/native startup, not `forge run`. | ||
| - Current status: report-only startup harness exists; hard gate follows after optimization. | ||
| - Future hard-gate proposal: native startup p50/p95 budget once stable baseline is known. | ||
| - CI explicitly builds and measures the standalone native path; native modes must not be silently skipped. | ||
|
|
||
| ### U4. Local Developer Command | ||
|
|
||
| Files: | ||
| - Optional create: `scripts/measure_startup.sh` | ||
|
|
||
| Add a script only if it materially improves developer ergonomics by wrapping the Rust harness with the right release-build and `FORGE_LIB_DIR` setup. Avoid duplicating measurement logic between shell and Rust. | ||
|
|
||
| ## Risks | ||
|
|
||
| - Process startup benchmarks are noisy on GitHub-hosted runners. | ||
| - Harness setup must not accidentally measure build time. | ||
| - Native source-runtime binaries embed the interpreter and may not get close to `<10ms`; if so, the next item may require a bytecode/native runner fast path rather than optimizing the source-runtime path. | ||
| - Launcher-mode native binaries must be labeled separately from standalone source-runtime binaries; the roadmap target cares about standalone. | ||
| - Without storing historical baselines, CI output is visibility-only; this PR should not pretend to provide trend analysis yet. | ||
| - The native measurements require a working C compiler (`cc`) and static library; CI must install/use the available platform toolchain explicitly. | ||
|
|
||
| ## Verification | ||
|
|
||
| - `cargo fmt -- --check` | ||
| - `cargo test` | ||
| - `cargo clippy --all-targets -- -A clippy::approx_constant -A clippy::result_large_err -A clippy::only_used_in_recursion -A clippy::len_zero` | ||
| - The new startup measurement command/harness | ||
| - Existing Forge integration tests remain green. | ||
|
|
||
| ## Success Criteria | ||
|
|
||
| - Developers can run one command to see startup timings for source, bytecode, and available native modes. | ||
| - CI prints startup timings as benchmark output, so regressions can be spotted by comparing runs. | ||
| - The roadmap item remains unchecked, with a clear next optimization target based on measured data. |
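
To make the U1 requirements concrete, here is a minimal sketch of the two core pieces: a timed spawn that fails on timeout, nonzero exit, or wrong output, and a min/median/p95 summary. It is not the contents of `tools/startup_time.rs`; the names `time_child`, `summarize`, and `Summary`, the 100µs polling interval, and the nearest-rank percentile rule are all assumptions chosen for illustration.

```rust
use std::io::Read;
use std::process::{Command, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Min / median / p95 over one mode's samples (hypothetical type).
struct Summary {
    min: Duration,
    median: Duration,
    p95: Duration,
}

/// Run one child and return its spawn-to-exit wall time.
/// Any failure (spawn error, timeout, nonzero exit, wrong stdout) is an Err,
/// so a broken child can never be recorded as a fast startup.
fn time_child(
    program: &str,
    args: &[&str],
    expected_stdout: &str,
    timeout: Duration,
) -> Result<Duration, String> {
    let start = Instant::now();
    let mut child = Command::new(program)
        .args(args)
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .stderr(Stdio::null())
        .spawn()
        .map_err(|e| format!("spawn failed: {e}"))?;

    // Poll instead of blocking so a hung child is killed at the deadline.
    // The sleep adds some jitter to each sample; that is the cost of the timeout.
    let status = loop {
        match child.try_wait().map_err(|e| format!("wait failed: {e}"))? {
            Some(status) => break status,
            None if start.elapsed() >= timeout => {
                let _ = child.kill();
                let _ = child.wait();
                return Err(format!("timed out after {timeout:?}"));
            }
            None => thread::sleep(Duration::from_micros(100)),
        }
    };
    let elapsed = start.elapsed();

    if !status.success() {
        return Err(format!("child failed: {status}"));
    }
    // Reading after exit is safe only because the fixture prints a few bytes,
    // far below any pipe buffer size.
    let mut stdout = String::new();
    if let Some(mut pipe) = child.stdout.take() {
        pipe.read_to_string(&mut stdout)
            .map_err(|e| format!("read failed: {e}"))?;
    }
    if stdout.trim_end() != expected_stdout {
        return Err(format!("unexpected stdout: {stdout:?}"));
    }
    Ok(elapsed)
}

/// Summarize samples with nearest-rank percentiles.
/// For an even sample count the median is the lower middle value.
fn summarize(samples: &[Duration]) -> Option<Summary> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort();
    // Nearest rank: index of the ceil(p% * n)-th smallest sample.
    let rank = |p: usize| ((p * sorted.len() + 99) / 100).max(1) - 1;
    Some(Summary {
        min: sorted[0],
        median: sorted[rank(50)],
        p95: sorted[rank(95)],
    })
}
```

A caller would run a few discarded warmup iterations per mode, collect the remaining samples with `time_child`, and print one line per mode from `summarize`. Building the fixtures before the first `Instant::now()` keeps build time out of the measurement.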
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,32 @@ | ||
| # Startup Time Measurement | ||
|
|
||
| Forge's roadmap target of `<10ms` startup applies to standalone/native execution paths, not to `forge run app.fg`. | ||
|
|
||
| `forge run app.fg` intentionally does more work: starts the CLI, reads source, lexes, parses, typechecks, initializes the runtime, and executes. Native and bytecode paths can skip parts of that work and are the realistic target for sub-10ms startup. | ||
|
|
||
| ## Harness | ||
|
|
||
| Startup timing is measured by `tools/startup_time.rs`, a small Rust process-level harness. It measures wall time from parent process spawn to child process exit and verifies each child prints `ok`. | ||
|
|
||
| The harness measures: | ||
|
|
||
| - `startup.source_run`: `forge run hello.fg` | ||
| - `startup.bytecode_run`: `forge run hello.fgc` | ||
| - `startup.native_source_runtime`: standalone source-runtime binary from `forge build --native` | ||
| - `startup.aot_bytecode`: standalone bytecode binary from `forge build --aot` | ||
|
|
||
| The native modes require `FORGE_LIB_DIR` to point at a directory containing `libforge_lang.a`. | ||
|
|
||
| ## Local Run | ||
|
|
||
| ```bash | ||
| cargo build --release --lib --bin forge | ||
| rustc tools/startup_time.rs -O -o target/startup_time | ||
| FORGE_LIB_DIR=target/release ./target/startup_time --forge ./target/release/forge --warmups 2 --reps 20 | ||
| ``` | ||
|
|
||
| ## CI Status | ||
|
|
||
| CI runs this harness as report-only. The job fails if the harness fails to compile, if fixture builds fail, if any child process exits unsuccessfully, or if output is wrong. It does not yet fail because startup is above 10ms. | ||
|
|
||
| The hard `<10ms` gate should be added after we have stable baseline data and an optimization PR that actually reaches the native startup target. |
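
The difference between today's report-only behaviour and the later hard gate can be sketched as a single switch. This is an illustration only: `Verdict`, `judge`, `exit_code`, and the `enforce` flag are hypothetical names, and which statistic the gate would compare (p50 or p95) is still open in the plan.

```rust
use std::time::Duration;

/// Outcome of comparing one measured value against the startup budget.
#[derive(Debug, PartialEq)]
enum Verdict {
    WithinBudget,
    /// Over budget, but the job is report-only, so it is only printed.
    OverBudgetReported,
    /// Over budget with the hard gate enabled, so the job must fail.
    OverBudgetFailed,
}

/// The roadmap target is strictly `<10ms`, so a value equal to the budget
/// counts as over budget.
fn judge(measured: Duration, budget: Duration, enforce: bool) -> Verdict {
    if measured < budget {
        Verdict::WithinBudget
    } else if enforce {
        Verdict::OverBudgetFailed
    } else {
        Verdict::OverBudgetReported
    }
}

/// Process exit code for the CI step: nonzero only if an enforced budget failed.
fn exit_code(verdicts: &[Verdict]) -> i32 {
    if verdicts.iter().any(|v| *v == Verdict::OverBudgetFailed) {
        1
    } else {
        0
    }
}
```

With `enforce` set to `false`, a 12ms measurement is printed and the job still passes. Flipping it to `true` is the only change needed once a stable baseline exists. Child failures, timeouts, and wrong output stay fatal in both settings, as they are handled before any budget comparison.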
Add the PR reference link to this changelog entry.
This entry is missing the required `([#PR](link))` suffix.