Skip to content

feat(bench): use nanosecond resolution for benchmarks#3406

Open
bikallem wants to merge 1 commit intomoonbitlang:mainfrom
bikallem:bench-nanosecond-resolution
Open

feat(bench): use nanosecond resolution for benchmarks#3406
bikallem wants to merge 1 commit intomoonbitlang:mainfrom
bikallem:bench-nanosecond-resolution

Conversation

@bikallem
Copy link
Copy Markdown
Contributor

@bikallem bikallem commented Apr 1, 2026

Summary

Update the bench library to work with nanoseconds instead of microseconds, enabling moon bench to accurately measure sub-microsecond operations.

Changes

  • monotonic_clock_wasm.mbt: multiply elapsed seconds by 1e9 (was 1e6)
  • monotonic_clock_js.mbt: multiply elapsed milliseconds by 1e6 (was 1e3)
  • bench.mbt: rename iter_n_microsecondsiter_n_nanoseconds, update warmup threshold from 100_000 (100ms in µs) to 100_000_000 (100ms in ns)

The native backend (monotonic_clock_native.mbt) needs no change — it directly calls moonbit_monotonic_clock_stop via extern "C", which is updated in the runtime PR to return nanoseconds.

Merge order

These three PRs must land in this order:

  1. feat: return nanoseconds from monotonic_clock_stop moonbit-native-runtime#1 (land first) — runtime returns nanoseconds
  2. This PR (land second) — bench library uses ns thresholds and conversion factors
  3. feat(bench): display nanosecond resolution in bench output moon#1588 (land last) — display code adds ns tier to auto_select_unit

CI note: The native target bench test will fail until moonbitlang/moonbit-native-runtime#1 lands. The native backend's monotonic_clock_end is a direct extern "C" mapping to moonbit_monotonic_clock_stop, which still returns microseconds in CI. This causes the warmup loop to hang because the threshold (100_000_000 ns) is 1000x larger than the values the old runtime returns (~µs). Wasm, wasm-gc, and JS targets pass because their conversion is done in MoonBit code (updated in this PR).

Before / After

Before (µs floor):

ycbcr_to_rgb/0   0.01 µs ±  0.00 µs

After (ns precision):

ycbcr_to_rgb/0  14.30 ns ±  0.06 ns

Test plan

  • moon bench on native target shows ns-resolution values for fast benchmarks (requires runtime#1)
  • moon bench on wasm target shows correct values (wasm conversion factor updated)
  • moon bench on js target shows correct values (js conversion factor updated)
  • Warmup loop still calibrates batch sizes correctly (100ms target unchanged)

Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.

Open in Devin Review

@hackwaly
Copy link
Copy Markdown
Contributor

hackwaly commented Apr 2, 2026

It's just a format issue? I see no strong reason to change ms to ns.

@bikallem
Copy link
Copy Markdown
Contributor Author

bikallem commented Apr 2, 2026

It's just a format issue? I see no strong reason to change ms to ns.

not at all. It is the measurement resolution itself, i.e if the function/method being benchmarked measures in nanoseconds, the benchmark always gives 0.01 µs regardless of the actual nanosecond timing. Here is the before and after nanoseconds resolutions. The latter one is more precise, the first not so much if the actual timing is in nanoseconds.

Before (µs floor):
ycbcr_to_rgb/0 0.01 µs ± 0.00 µs

After (ns precision):
ycbcr_to_rgb/0 14.30 ns ± 0.06 ns

Update the bench library to work with nanoseconds instead of
microseconds, enabling accurate measurement of sub-microsecond
operations in `moon bench`.

Changes:
- monotonic_clock_wasm.mbt: multiply elapsed seconds by 1e9 (was 1e6)
- monotonic_clock_js.mbt: multiply elapsed ms by 1e6 (was 1e3)
- bench.mbt: rename iter_n_microseconds -> iter_n_nanoseconds, update
  warmup threshold from 100_000 (100ms in µs) to 100_000_000 (100ms
  in ns)

Note: the native backend (`monotonic_clock_native.mbt`) needs no
change since it directly calls `moonbit_monotonic_clock_stop` via
extern "C", which is updated in the runtime to return nanoseconds.

Coordinated with:
- moonbitlang/moonbit-native-runtime: moonbit_monotonic_clock_stop
  returns ns
- moonbitlang/moon: display code uses ns as base unit
@bikallem bikallem force-pushed the bench-nanosecond-resolution branch from 08facf5 to e81566b Compare April 2, 2026 10:13
Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 new potential issue.

View 4 additional findings in Devin Review.

Open in Devin Review

fn iter_count(name? : String, inner : () -> Unit, count : UInt) -> Summary {
let count = count.land(0x7FFFFFFF).reinterpret_as_int()
let threshold = 100000.0 // 100 ms
let threshold = 100000000.0 // 100 ms
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Native backend monotonic_clock_end not updated for microsecond-to-nanosecond conversion

The PR converts benchmark timing from microseconds to nanoseconds for JS and WASM backends, but the native/llvm backend (bench/monotonic_clock_native.mbt:23) was not updated. The moonbit_monotonic_clock_stop C extern presumably still returns microseconds (matching the old contract that JS and WASM also returned microseconds). Since bench/bench.mbt:28 now sets threshold = 100000000.0 (100ms in nanoseconds) but native's clock still returns microseconds, the calibration loop in iter_count will interpret μs values as ns — making the threshold effectively 100 seconds instead of 100ms. This causes the warm-up phase to run ~1000× longer than intended on native/llvm targets, and all reported timing statistics will be in microseconds while the code expects nanoseconds.

Prompt for agents
The PR updates the JS backend (monotonic_clock_js.mbt, multiplier 1000.0 -> 1000000.0) and WASM backend (monotonic_clock_wasm.mbt, multiplier 1000000.0 -> 1000000000.0) to return nanoseconds instead of microseconds, but the native/llvm backend in bench/monotonic_clock_native.mbt was not touched. The extern C function moonbit_monotonic_clock_stop presumably still returns microseconds. Either the MoonBit wrapper in monotonic_clock_native.mbt needs to multiply the result by 1000.0 to convert from microseconds to nanoseconds, or the C runtime function itself needs to be updated. Without this fix, benchmarks on native/llvm targets will have the calibration threshold off by 1000x (interpreted as 100 seconds instead of 100ms) and all timing values will be in the wrong unit.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

@hackwaly hackwaly requested a review from peter-jerry-ye April 2, 2026 11:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants