tests: proper synchronization for timing-dependent integration tests#181
Merged
Conversation
…h A) High-severity slow-node flake fixes from the integration-test sync audit. - test_docking_basic: all 7 DockBuilder/layout convergence gates used a 30-frame (0.5s, ~5-poll) budget. Multi-server-frame docking convergence races that on a loaded CI node; bump every wait_for_payload_value to the 300-frame (5s) suite norm. - test_inputs_color: two back-to-back click(PALETTE_GREEN) with no idle gate let the second imgui_click coroutine replace the first's in-flight release, under-counting click_count to 1. Gate on wait_for_mouse_idle between and after the clicks so both press+release pairs fully land. Real correctness bug, not just a tight ceiling. - test_focus_control, test_scroll_here_y, test_glfw_synth_keys: focus/scroll convergence gates at 120 frames (2s) bumped to 300 to match the suite norm. All effects are event-gated, so the larger ceilings cost nothing on a fast node. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
wait_for_widget returns on registry existence (registered at module init, or rendered once before), so a snapshot read straight off it can predate the first painted frame and carry stale/zero bbox/payload/rendered fields. Swap each such startup gate to wait_for_render / wait_for_visible (registered AND painted this frame), and upgrade two existence gates to value/paint gates: - test_edit_tab_item, test_plot_getter (x2): bbox/kind/samples read off the gate snapshot -> wait_for_render. - test_snapshot_transform, test_snapshot_unrendered: geometric/rendered-flag assertions on the gate snapshot -> wait_for_visible. - test_user_control: the post-toggle "still rendering while control disabled" check used wait_for_widget, which returns on the stale registry entry and proves nothing about a new frame -> wait_for_render actually proves a frame was produced after the toggle. - test_indexed_dynamic: re-added CHANNEL_VOL[1] fresh-default read collapsed from wait_for_widget + snapshot + equal into a single wait_for_payload_value on value==0.5, gating on the converged default instead of mere existence. - test_collapsing_header_closable: re-open gate switched from widget_exists to wait_for_visible (widget_rendered); the close-side absence gate stays widget_exists (correct test for disappearance). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…h C) drag/click/key plays post an imgui_mouse_play / imgui_key_type timeline and return on POST success, not play completion; *_stop reads status synchronously across a possible frame boundary. The downstream value poll usually absorbs the play, but a starved render thread races it. Add explicit idle/convergence gates between the action and the assert: - test_drag_drop, test_io_synth_drag, test_layout_helpers: wait_for_mouse_idle after the drag, before the value poll. - test_imgui_synth_ctrl_shortcuts: wait_for_key_idle after key_type, before the select-all chord, so the chord doesn't interleave the type timeline's tail (matches the gate already used in the sibling subtests). - test_glfw_synth_keys, test_glfw_synth_mouse: key_stop/mouse_stop status read wrapped in a short wait_until poll until playing==false; mouse_pos ownership read gated on cursor_owned converging instead of a synchronous status read. Predicate / paint correctness: - test_edit_external_vectors: the round-trip gate only checked field presence, which is true before the write (SF3 already exposes a non-null float3). Now gates on the written component values (0.7/0.8/0.9), actually proving the set. - test_popup_window: re-showable popup body gated with wait_for_visible (exists AND painted) instead of bare widget_exists; drop the outdated widget_rendered comment. - test_triggers: QUICK_SAVE.bbox read off a bare snapshot now gates on wait_for_visible so the menu_item is painted in the snapshot frame. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…LINT013) Touching these files brings them into the changed-set the lint gate checks, so pre-existing warnings (a recent daslang surfaces them) now block the push. Both are the lint's own canonical fixes, kept separate from the sync logic in A/B/C: - STYLE030: drop the genuinely-unused `require math` from test_docking_basic, test_glfw_synth_keys, test_imgui_synth_ctrl_shortcuts (float2/int casts are builtins, not pulled through math). test_glfw_synth_mouse keeps its require math — it uses abs(). - LINT013: underscore-prefix the wait_until block's snapshot arg in the blocks that poll via post_command (key_status/mouse_status) and never read the snapshot. `options no_unused_block_arguments = false` only suppresses the compiler error, not the lint, which wants the `_` intent marker (same as the linq codegen). snap-reading blocks are left untouched. Lint gate: 20 changed files, 0 issues. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR hardens timing-dependent integration tests by replacing incidental “poll window absorbs async” behavior with explicit synchronization gates (render/visible checks, input-idle drains, and longer convergence timeouts), reducing flakiness on slower/loaded CI nodes (notably Windows).
Changes:
- Switch several tests from existence-only gates (
wait_for_widget) to paint/visibility gates (wait_for_render/wait_for_visible) where snapshots/bbox/rendered state are required. - Add explicit input-timeline synchronization (
wait_for_mouse_idle,wait_for_key_idle) around clicks/drags/typing to avoid coroutine interleaving races. - Increase short convergence ceilings (e.g., 30/120 → 300 frames) and perform minor lint cleanup (drop unused
require math, underscore unused block args).
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| tests/integration/test_user_control.das | Use render gate to assert UI is still painting while control is disabled. |
| tests/integration/test_triggers.das | Gate on menu item visibility before asserting bbox-dependent snapshot fields. |
| tests/integration/test_snapshot_unrendered.das | Use visibility gate for rendered widget snapshot assertions. |
| tests/integration/test_snapshot_transform.das | Use visibility gate to ensure transformed widget is painted before reading geometry. |
| tests/integration/test_scroll_here_y.das | Increase scroll convergence timeouts to reduce timing flakiness. |
| tests/integration/test_popup_window.das | Use visibility gate for re-showable popup body instead of existence-only polling. |
| tests/integration/test_plot_getter.das | Use render gates before reading plot kind/payload fields. |
| tests/integration/test_layout_helpers.das | Wait for mouse idle after drag before polling value/geometry. |
| tests/integration/test_io_synth_drag.das | Wait for mouse idle after drag timeline before value assertions. |
| tests/integration/test_inputs_color.das | Add mouse-idle gates between clicks to prevent click coroutine replacement/races. |
| tests/integration/test_indexed_dynamic.das | Gate directly on fresh-default payload value after re-add. |
| tests/integration/test_imgui_synth_ctrl_shortcuts.das | Drop unused math; add key-idle gate before chord to avoid interleaving with key_type tail. |
| tests/integration/test_glfw_synth_mouse.das | Avoid synchronous status reads by polling for convergence; lint _snap unused args. |
| tests/integration/test_glfw_synth_keys.das | Drop unused math; poll for key status convergence (stop/press/release) and bump focus timeout. |
| tests/integration/test_focus_control.das | Increase focus convergence timeouts. |
| tests/integration/test_edit_tab_item.das | Use render gate before interacting with tab item state. |
| tests/integration/test_edit_external_vectors.das | Strengthen predicate to gate on written component values, not field presence. |
| tests/integration/test_drag_drop.das | Wait for mouse idle after drag gesture before polling acceptance. |
| tests/integration/test_docking_basic.das | Drop unused math; bump multiple docking/layout convergence ceilings to 300 frames. |
| tests/integration/test_collapsing_header_closable.das | Use visibility gate when re-opening to avoid stale registry existence. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why
A full audit of all 131 integration tests, prompted by the slow-node flakiness/crash surfaced in #118 (re-enabling
windows-latestexposed timing-dependent failures). The audit classified every test:The 20 timing-dependent tests relied on the value-poll window incidentally absorbing a coroutine play, on
wait_for_widget(registry existence) where paint was actually required, or on short (30/120-frame) timeouts that give only ~5–20 polls at the 100 ms cadence. Each races on a loaded CI node. This PR replaces those with explicit synchronization gates.What
Three sync batches + one isolated lint-cleanup commit:
A — short-timeout bumps + double-click gate (high-severity, slow-node)
test_docking_basic: all 7 DockBuilder/layout convergence gates30 → 300frames.test_inputs_color: two back-to-backclick(PALETTE_GREEN)had no idle gate — the 2ndimgui_clickcoroutine replaced the 1st's in-flight release, under-countingclick_countto 1. Now gated withwait_for_mouse_idlebetween/after. Real correctness bug, not just a tight ceiling.test_focus_control,test_scroll_here_y,test_glfw_synth_keys: focus/scroll gates120 → 300.B — gate on paint, not registry existence
wait_for_widget → wait_for_render/wait_for_visiblewhere bbox/payload/renderedis read off the gate snapshot:test_edit_tab_item,test_plot_getter(×2),test_snapshot_transform,test_snapshot_unrendered,test_user_control.test_indexed_dynamic(collapse existence-gate + snapshot + equal into onewait_for_payload_valueon the fresh default),test_collapsing_header_closable(re-open →wait_for_visible; the close-side absence gate stayswidget_exists).C — gate coroutine plays + strengthen predicates
wait_for_mouse_idleafter drag, before the value poll:test_drag_drop,test_io_synth_drag,test_layout_helpers.wait_for_key_idleafterkey_type, before the select-all chord:test_imgui_synth_ctrl_shortcuts.key_stop/mouse_stop/mouse_posstatus reads wrapped in a short convergence poll:test_glfw_synth_keys,test_glfw_synth_mouse.test_edit_external_vectors(gate on the written 0.7/0.8/0.9 values, not mere field presence — the old predicate passed pre-write),test_popup_window(re-showable popup body →wait_for_visible; drop outdated comment),test_triggers(QUICK_SAVE.bbox→wait_for_visible).Lint cleanup (isolated 4th commit) — touching these files brought them into the lint gate's changed-set, surfacing pre-existing warnings: dropped 3 genuinely-unused
require math(STYLE030) and underscore-prefixed thewait_untilsnapshot arg in blocks that poll viapost_commandand never read it (LINT013). No sync-logic change.Not in this PR (deferred — design call)
The audit also found harness-level "workflow timing bug" candidates that match the #118 suspicion — these are not per-test and need a design decision, so they're held for a separate change:
with_recording_appfixed port 9090 vswith_imgui_appper-worker offset (Windows shared-loopback cross-test bleed if--worker-indexever fails to forward under isolated mode).windows-latestfastfail cause).The deferred synchronous playwright command channel would structurally obviate Batch C / the input-coroutine class; Batches A/B are server-state convergence gates a sync input channel doesn't observe, so they're worth shipping now regardless.
Verification
utils/lint/main.dason all changed files): 20 files, 0 issues.das-fmt --verify: 20 files verified.compile_checkclean on all 20.🤖 Generated with Claude Code