Skip to content

tests: proper synchronization for timing-dependent integration tests#181

Merged
borisbat merged 4 commits into
masterfrom
bbatkin/test-sync-gates
Jun 5, 2026
Merged

tests: proper synchronization for timing-dependent integration tests#181
borisbat merged 4 commits into
masterfrom
bbatkin/test-sync-gates

Conversation

@borisbat

@borisbat borisbat commented Jun 5, 2026

Copy link
Copy Markdown
Owner

Why

A full audit of all 131 integration tests, prompted by the slow-node flakiness/crash surfaced in #118 (re-enabling windows-latest exposed timing-dependent failures). The audit classified every test:

  • robust: 99
  • timing_dependent: 20
  • not_applicable (lint/compile/unit, no live host): 11

The 20 timing-dependent tests relied on the value-poll window incidentally absorbing a coroutine play, on wait_for_widget (registry existence) where paint was actually required, or on short (30/120-frame) timeouts that give only ~5–20 polls at the 100 ms cadence. Each races on a loaded CI node. This PR replaces those with explicit synchronization gates.

What

Three sync batches + one isolated lint-cleanup commit:

A — short-timeout bumps + double-click gate (high-severity, slow-node)

  • test_docking_basic: all 7 DockBuilder/layout convergence gates 30 → 300 frames.
  • test_inputs_color: two back-to-back click(PALETTE_GREEN) had no idle gate — the 2nd imgui_click coroutine replaced the 1st's in-flight release, under-counting click_count to 1. Now gated with wait_for_mouse_idle between/after. Real correctness bug, not just a tight ceiling.
  • test_focus_control, test_scroll_here_y, test_glfw_synth_keys: focus/scroll gates 120 → 300.

B — gate on paint, not registry existence

  • wait_for_widget → wait_for_render/wait_for_visible where bbox/payload/rendered is read off the gate snapshot: test_edit_tab_item, test_plot_getter (×2), test_snapshot_transform, test_snapshot_unrendered, test_user_control.
  • value/paint-gate upgrades: test_indexed_dynamic (collapse existence-gate + snapshot + equal into one wait_for_payload_value on the fresh default), test_collapsing_header_closable (re-open → wait_for_visible; the close-side absence gate stays widget_exists).

C — gate coroutine plays + strengthen predicates

  • wait_for_mouse_idle after drag, before the value poll: test_drag_drop, test_io_synth_drag, test_layout_helpers.
  • wait_for_key_idle after key_type, before the select-all chord: test_imgui_synth_ctrl_shortcuts.
  • key_stop/mouse_stop/mouse_pos status reads wrapped in a short convergence poll: test_glfw_synth_keys, test_glfw_synth_mouse.
  • predicate/paint correctness: test_edit_external_vectors (gate on the written 0.7/0.8/0.9 values, not mere field presence — the old predicate passed pre-write), test_popup_window (re-showable popup body → wait_for_visible; drop outdated comment), test_triggers (QUICK_SAVE.bboxwait_for_visible).

Lint cleanup (isolated 4th commit) — touching these files brought them into the lint gate's changed-set, surfacing pre-existing warnings: dropped 3 genuinely-unused require math (STYLE030) and underscore-prefixed the wait_until snapshot arg in blocks that poll via post_command and never read it (LINT013). No sync-logic change.

Not in this PR (deferred — design call)

The audit also found harness-level "workflow timing bug" candidates that match the #118 suspicion — these are not per-test and need a design decision, so they're held for a separate change:

  • with_recording_app fixed port 9090 vs with_imgui_app per-worker offset (Windows shared-loopback cross-test bleed if --worker-index ever fails to forward under isolated mode).
  • Shared 30 s ready/reset timeout under Defender + 5-module-per-spawn load (plausibly the actual windows-latest fastfail cause).

The deferred synchronous playwright command channel would structurally obviate Batch C / the input-coroutine class; Batches A/B are server-state convergence gates a sync input channel doesn't observe, so they're worth shipping now regardless.

Verification

  • Lint gate (utils/lint/main.das on all changed files): 20 files, 0 issues.
  • das-fmt --verify: 20 files verified.
  • compile_check clean on all 20.
  • Full local integration-suite run deferred — CI is the regression gate for the gate-convergence semantics.

🤖 Generated with Claude Code

borisbat and others added 4 commits June 5, 2026 11:17
…h A)

High-severity slow-node flake fixes from the integration-test sync audit.

- test_docking_basic: all 7 DockBuilder/layout convergence gates used a
  30-frame (0.5s, ~5-poll) budget. Multi-server-frame docking convergence
  races that on a loaded CI node; bump every wait_for_payload_value to the
  300-frame (5s) suite norm.
- test_inputs_color: two back-to-back click(PALETTE_GREEN) with no idle gate
  let the second imgui_click coroutine replace the first's in-flight release,
  under-counting click_count to 1. Gate on wait_for_mouse_idle between and
  after the clicks so both press+release pairs fully land. Real correctness
  bug, not just a tight ceiling.
- test_focus_control, test_scroll_here_y, test_glfw_synth_keys: focus/scroll
  convergence gates at 120 frames (2s) bumped to 300 to match the suite norm.

All effects are event-gated, so the larger ceilings cost nothing on a fast node.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
wait_for_widget returns on registry existence (registered at module init, or
rendered once before), so a snapshot read straight off it can predate the first
painted frame and carry stale/zero bbox/payload/rendered fields. Swap each such
startup gate to wait_for_render / wait_for_visible (registered AND painted this
frame), and upgrade two existence gates to value/paint gates:

- test_edit_tab_item, test_plot_getter (x2): bbox/kind/samples read off the gate
  snapshot -> wait_for_render.
- test_snapshot_transform, test_snapshot_unrendered: geometric/rendered-flag
  assertions on the gate snapshot -> wait_for_visible.
- test_user_control: the post-toggle "still rendering while control disabled"
  check used wait_for_widget, which returns on the stale registry entry and
  proves nothing about a new frame -> wait_for_render actually proves a frame
  was produced after the toggle.
- test_indexed_dynamic: re-added CHANNEL_VOL[1] fresh-default read collapsed
  from wait_for_widget + snapshot + equal into a single wait_for_payload_value
  on value==0.5, gating on the converged default instead of mere existence.
- test_collapsing_header_closable: re-open gate switched from widget_exists to
  wait_for_visible (widget_rendered); the close-side absence gate stays
  widget_exists (correct test for disappearance).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…h C)

drag/click/key plays post an imgui_mouse_play / imgui_key_type timeline and
return on POST success, not play completion; *_stop reads status synchronously
across a possible frame boundary. The downstream value poll usually absorbs the
play, but a starved render thread races it. Add explicit idle/convergence gates
between the action and the assert:

- test_drag_drop, test_io_synth_drag, test_layout_helpers: wait_for_mouse_idle
  after the drag, before the value poll.
- test_imgui_synth_ctrl_shortcuts: wait_for_key_idle after key_type, before the
  select-all chord, so the chord doesn't interleave the type timeline's tail
  (matches the gate already used in the sibling subtests).
- test_glfw_synth_keys, test_glfw_synth_mouse: key_stop/mouse_stop status read
  wrapped in a short wait_until poll until playing==false; mouse_pos ownership
  read gated on cursor_owned converging instead of a synchronous status read.

Predicate / paint correctness:
- test_edit_external_vectors: the round-trip gate only checked field presence,
  which is true before the write (SF3 already exposes a non-null float3). Now
  gates on the written component values (0.7/0.8/0.9), actually proving the set.
- test_popup_window: re-showable popup body gated with wait_for_visible (exists
  AND painted) instead of bare widget_exists; drop the outdated widget_rendered
  comment.
- test_triggers: QUICK_SAVE.bbox read off a bare snapshot now gates on
  wait_for_visible so the menu_item is painted in the snapshot frame.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…LINT013)

Touching these files brings them into the changed-set the lint gate checks, so
pre-existing warnings (a recent daslang surfaces them) now block the push. Both
are the lint's own canonical fixes, kept separate from the sync logic in A/B/C:

- STYLE030: drop the genuinely-unused `require math` from test_docking_basic,
  test_glfw_synth_keys, test_imgui_synth_ctrl_shortcuts (float2/int casts are
  builtins, not pulled through math). test_glfw_synth_mouse keeps its require
  math — it uses abs().
- LINT013: underscore-prefix the wait_until block's snapshot arg in the blocks
  that poll via post_command (key_status/mouse_status) and never read the
  snapshot. `options no_unused_block_arguments = false` only suppresses the
  compiler error, not the lint, which wants the `_` intent marker (same as the
  linq codegen). snap-reading blocks are left untouched.

Lint gate: 20 changed files, 0 issues.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens timing-dependent integration tests by replacing incidental “poll window absorbs async” behavior with explicit synchronization gates (render/visible checks, input-idle drains, and longer convergence timeouts), reducing flakiness on slower/loaded CI nodes (notably Windows).

Changes:

  • Switch several tests from existence-only gates (wait_for_widget) to paint/visibility gates (wait_for_render / wait_for_visible) where snapshots/bbox/rendered state are required.
  • Add explicit input-timeline synchronization (wait_for_mouse_idle, wait_for_key_idle) around clicks/drags/typing to avoid coroutine interleaving races.
  • Increase short convergence ceilings (e.g., 30/120 → 300 frames) and perform minor lint cleanup (drop unused require math, underscore unused block args).

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated no comments.

Show a summary per file
File Description
tests/integration/test_user_control.das Use render gate to assert UI is still painting while control is disabled.
tests/integration/test_triggers.das Gate on menu item visibility before asserting bbox-dependent snapshot fields.
tests/integration/test_snapshot_unrendered.das Use visibility gate for rendered widget snapshot assertions.
tests/integration/test_snapshot_transform.das Use visibility gate to ensure transformed widget is painted before reading geometry.
tests/integration/test_scroll_here_y.das Increase scroll convergence timeouts to reduce timing flakiness.
tests/integration/test_popup_window.das Use visibility gate for re-showable popup body instead of existence-only polling.
tests/integration/test_plot_getter.das Use render gates before reading plot kind/payload fields.
tests/integration/test_layout_helpers.das Wait for mouse idle after drag before polling value/geometry.
tests/integration/test_io_synth_drag.das Wait for mouse idle after drag timeline before value assertions.
tests/integration/test_inputs_color.das Add mouse-idle gates between clicks to prevent click coroutine replacement/races.
tests/integration/test_indexed_dynamic.das Gate directly on fresh-default payload value after re-add.
tests/integration/test_imgui_synth_ctrl_shortcuts.das Drop unused math; add key-idle gate before chord to avoid interleaving with key_type tail.
tests/integration/test_glfw_synth_mouse.das Avoid synchronous status reads by polling for convergence; lint _snap unused args.
tests/integration/test_glfw_synth_keys.das Drop unused math; poll for key status convergence (stop/press/release) and bump focus timeout.
tests/integration/test_focus_control.das Increase focus convergence timeouts.
tests/integration/test_edit_tab_item.das Use render gate before interacting with tab item state.
tests/integration/test_edit_external_vectors.das Strengthen predicate to gate on written component values, not field presence.
tests/integration/test_drag_drop.das Wait for mouse idle after drag gesture before polling acceptance.
tests/integration/test_docking_basic.das Drop unused math; bump multiple docking/layout convergence ceilings to 300 frames.
tests/integration/test_collapsing_header_closable.das Use visibility gate when re-opening to avoid stale registry existence.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@borisbat borisbat merged commit a04dc21 into master Jun 5, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants