[FIX] Fix zero systolic array utilization during SDPA execution in TO…#220
Open
student-Jungmin wants to merge 1 commit into PSAL-POSTECH:feat/deepseek
This PR resolves a critical issue where TOGSim simulation results for SDPA kernels incorrectly reported zero systolic array (SA) utilization.
The root cause is in the printOperation() function in mlir/test/lib/Analysis/TestTileOperationGraph.cpp (around line 183). In the previous implementation, the MLIR analysis pass did not emit a loop kind attribute for certain affine.for structures. The TOGSim TileGraphParser relies on this attribute to decide how to traverse nested operations, so every sub-operation inside a loop without an explicit loop kind was dropped during Tile Operation Graph (TOG) generation, which is why the simulation reported zero SA utilization.
To fix this, I updated the MLIR template so that the generated code avoids this problem and matches what the parser in TOGSim/src/TileGraphParser.cc expects.
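For reviewers unfamiliar with the failure mode, here is a minimal sketch of how a missing loop kind can hide an entire subtree from a parser. It is not the actual TileGraphParser code: the `Node` struct, the function names, and the `"example_kind"` string are all hypothetical stand-ins chosen for illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a node in the emitted graph.
struct Node {
    std::string name;
    bool isLoop = false;
    std::optional<std::string> loopKind;  // empty models a missing attribute
    std::vector<Node> children;
};

// Collects leaf op names, descending only into loops that carry a loop kind.
// This models the assumed parser behavior, not the real implementation.
void collectOps(const Node& node, std::vector<std::string>& out) {
    if (!node.isLoop) {
        assert(node.children.empty() && "leaf ops carry no children");
        out.push_back(node.name);
        return;
    }
    if (!node.loopKind) {
        return;  // untyped loop: the whole subtree is skipped
    }
    for (const Node& child : node.children) {
        collectOps(child, out);
    }
}

std::vector<std::string> visibleOps(const Node& root) {
    std::vector<std::string> out;
    collectOps(root, out);
    return out;
}

// Builds outer_for { load; inner_for { matmul } }.
// emitInnerKind controls whether the inner loop gets a loop kind.
Node makeExample(bool emitInnerKind) {
    Node matmul;
    matmul.name = "matmul";

    Node inner;
    inner.name = "inner_for";
    inner.isLoop = true;
    if (emitInnerKind) {
        inner.loopKind = "example_kind";  // hypothetical attribute value
    }
    inner.children.push_back(matmul);

    Node load;
    load.name = "load";

    Node outer;
    outer.name = "outer_for";
    outer.isLoop = true;
    outer.loopKind = "example_kind";
    outer.children.push_back(load);
    outer.children.push_back(inner);
    return outer;
}
```

In this sketch, `visibleOps(makeExample(false))` sees only `load`, because the `matmul` nested in the untyped inner loop is never visited, while `visibleOps(makeExample(true))` sees both ops. If the dropped ops are the ones that would run on the systolic array, a utilization of zero is the expected symptom.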