fix: add pre-loop event init and tail event drain for back-edge sync … by zhangstevenunity · Pull Request #444 · hw-native-sys/PTOAS

zhangstevenunity · 2026-04-04T12:29:04Z

The generated C++ Cube kernel hangs with --enable-insert-sync because:

Back-edge wait_flag ops hoisted to loop heads fire on iteration 0 against uninitialized event registers. This patch adds a pre-loop set_flag for each back-edge wait to initialize the event before the first iteration.
The tail sync (ptoas_auto_sync_tail with kBarrierAll) only emits pipe_barrier(PIPE_ALL) which does not drain pending event flag registers. This patch adds explicit wait_flag calls at the function tail to drain all back-edge event dependencies.

…428) The generated C++ Cube kernel hangs with --enable-insert-sync because: 1. Back-edge wait_flag ops hoisted to loop heads fire on iteration 0 against uninitialized event registers. This patch adds a pre-loop set_flag for each back-edge wait to initialize the event before the first iteration. 2. The tail sync (ptoas_auto_sync_tail with kBarrierAll) only emits pipe_barrier(PIPE_ALL) which does not drain pending event flag registers. This patch adds explicit wait_flag calls at the function tail to drain all back-edge event dependencies. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

gemini-code-assist

Code Review

This pull request implements back-edge synchronization handling to resolve issue #428, ensuring event registers are initialized before loops and drained at the function tail. Review feedback identifies a logic error in the event draining process, redundant code for setting insertion points, and opportunities to use more idiomatic C++ for comparison and deduplication.

gemini-code-assist · 2026-04-04T12:30:46Z

lib/PTO/Transforms/InsertSync/SyncCodegen.cpp

+    for (auto *sync : syncIR_.back()->pipeAfter) {
+      if (sync->uselessSync || sync->eventIds.empty())
+        continue;
+      if (sync->isSyncWaitType()) {


There appears to be a logic error here. To "drain" pending events at the function tail, the compiler should identify producer operations (set_flag) that have not been matched by a consumer (wait_flag). If a wait_flag is already present in pipeAfter, the event is already being drained. The comment on line 403 correctly states this should scan for set_flag ops, but the code checks for isSyncWaitType(). This should be changed to isSyncSetType() to correctly identify pending signals that require an explicit drain.

Suggested change

if (sync->isSyncWaitType()) {

if (sync->isSyncSetType()) {

gemini-code-assist · 2026-04-04T12:30:46Z

lib/PTO/Transforms/InsertSync/SyncCodegen.cpp

+    EmitTailEventDrain(rewriter, ret);
+
+    // Re-set insertion point before ret (after any drain ops we just emitted)
    rewriter.setInsertionPoint(ret);


This call to setInsertionPoint is redundant. The insertion point is already set to ret (before the return operation) at line 318. Creating operations via the rewriter in EmitTailEventDrain maintains the insertion point before ret, so there is no need to reset it here.

gemini-code-assist · 2026-04-04T12:30:46Z

lib/PTO/Transforms/InsertSync/SyncCodegen.cpp

+    bool operator<(const EventKey &o) const {
+      if (src != o.src) return static_cast<unsigned>(src) < static_cast<unsigned>(o.src);
+      if (dst != o.dst) return static_cast<unsigned>(dst) < static_cast<unsigned>(o.dst);
+      return eventId < o.eventId;
+    }


The comparison operator can be simplified using std::tie for better readability and maintainability. This avoids manual field-by-field comparison logic.

bool operator<(const EventKey &o) const { return std::tie(src, dst, eventId) < std::tie(o.src, o.dst, o.eventId); }

gemini-code-assist · 2026-04-04T12:30:46Z

lib/PTO/Transforms/InsertSync/SyncCodegen.cpp

+  auto addUnique = [&](PipelineType src, PipelineType dst, int eid) {
+    EventKey key{src, dst, eid};
+    for (auto &existing : drainEvents) {
+      if (existing == key)
+        return;
+    }
+    drainEvents.push_back(key);
+  };


Instead of a manual linear search for uniqueness in a lambda, you can simply collect all events and then use std::unique after the sorting step at line 421. This is more idiomatic and efficient as the number of events grows.

auto addUnique = [&](PipelineType src, PipelineType dst, int eid) { drainEvents.push_back({src, dst, eid}); };

gemini-code-assist · 2026-04-04T12:30:46Z

lib/PTO/Transforms/InsertSync/SyncCodegen.cpp

+    return;
+
+  // Sort for deterministic output.
+  llvm::sort(drainEvents);


If the addUnique logic is simplified to just push_back, you should perform the deduplication here after sorting.

Suggested change

llvm::sort(drainEvents);

llvm::sort(drainEvents);

drainEvents.erase(std::unique(drainEvents.begin(), drainEvents.end()), drainEvents.end());

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ff9be95a47

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-04T12:34:18Z

lib/PTO/Transforms/InsertSync/SyncCodegen.cpp

+    for (auto *waitSync : pair.second) {
+      if (waitSync->uselessSync || waitSync->eventIds.empty())
+        continue;
+      addUnique(waitSync->GetActualSrcPipe(), waitSync->GetActualDstPipe(),
+                waitSync->eventIds[0]);


Guard tail event drains by control-flow reachability

This unconditionally adds every collected loop-head event to the tail drain set, so AppendAutoSyncTailBarrierIfNeeded emits wait_flag for those events on all return paths. If a loop is conditionally skipped (e.g., loop inside an scf.if false branch), its pre-loop set_flag is never executed, but the function-tail drain wait still runs and can block on an uninitialized event, reintroducing hangs. The drain list needs to be filtered to events guaranteed to be initialized on the current return path (or emitted under matching control flow).

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-04T12:34:18Z

lib/PTO/Transforms/InsertSync/SyncCodegen.cpp

+      if (!sync->isSyncWaitType())
+        continue;
+      if (sync->eventIds.empty())
+        continue;


Restrict pre-loop init to backward-loop waits

This treats all waits hoisted to LOOP_BEGIN as back-edge waits and emits synthetic pre-loop set_flag for each, but MoveSyncState::PlanMoveOutWaitSync hoists any wait whose producer is outside the loop (not only loop-carried/back-edge cases). Initializing those ordinary waits here can satisfy them before the real producer runs and weaken required ordering. Filter candidates with an explicit back-edge marker (for example GetForEndIndex().has_value() / reallocated-loop-head-tail metadata) before adding them to preLoopInitWaits_.

Useful? React with 👍 / 👎.

gemini-code-assist bot reviewed Apr 4, 2026

View reviewed changes

chatgpt-codex-connector bot reviewed Apr 4, 2026

View reviewed changes

zhangstevenunity closed this Apr 5, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: add pre-loop event init and tail event drain for back-edge sync …#444

fix: add pre-loop event init and tail event drain for back-edge sync …#444
zhangstevenunity wants to merge 1 commit intomainfrom
fix/issue-428-sync-hang

zhangstevenunity commented Apr 4, 2026

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Apr 4, 2026

Uh oh!

gemini-code-assist bot Apr 4, 2026

Uh oh!

gemini-code-assist bot Apr 4, 2026

Uh oh!

gemini-code-assist bot Apr 4, 2026

Uh oh!

gemini-code-assist bot Apr 4, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Apr 4, 2026

Uh oh!

chatgpt-codex-connector bot Apr 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	llvm::sort(drainEvents);
	llvm::sort(drainEvents);
	drainEvents.erase(std::unique(drainEvents.begin(), drainEvents.end()), drainEvents.end());

Conversation

zhangstevenunity commented Apr 4, 2026

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Apr 4, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Apr 4, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Apr 4, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Apr 4, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Apr 4, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Apr 4, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Apr 4, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant