refactor(shader): consolidate 20 inc*_stat methods into inc_fu_stat #341
Open
eunseo9311 wants to merge 1 commit into gpgpu-sim:dev from
All 20 functional-unit statistics methods in shader_core_ctx shared the same pattern: accumulate (active_count * latency) into a counter, optionally add inactive-lane overhead (SFU or non-SFU model), and optionally update m_active_exu_threads/warps.

Introduce a single inc_fu_stat(counter, active_count, latency, lane_model, update_exu) that captures all four variants:

  A) NON_SFU + exu update (7 methods: ialu, imul, imul24, fpalu, ...)
  B) SFU + exu update (10 methods: idiv, fpdiv, sqrt, sin, ...)
  C) NON_SFU + no exu (1 method: mem)
  D) NONE + no exu (2 methods: sfu, sp)

Each original method is retained as a one-line inline delegating to inc_fu_stat, so all call sites remain unchanged.

Net reduction: ~186 lines removed from shader.h.
No functional change: identical computation for all patterns.
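For readers without the diff open, the consolidated method described in the commit message might look roughly like the standalone sketch below. It is not the code from this PR: `fu_stats_sketch`, `kWarpSize`, the plain `double` counters, and both inactive-lane formulas are hypothetical stand-ins, and the sketch assumes the inactive-lane overhead is charged only when lanes are not clock-gated.

```cpp
#include <cassert>

// Hypothetical warp width used only by this sketch.
constexpr unsigned kWarpSize = 32;

// The three execution unit categories named in the PR.
enum class lane_model { SFU, NON_SFU, NONE };

// Stand-in for the relevant slice of shader_core_ctx and its stats object.
struct fu_stats_sketch {
  bool clock_gated_lanes = false;   // stands in for gpgpu_clock_gated_lanes
  double active_exu_threads = 0.0;  // stands in for m_active_exu_threads
  double active_exu_warps = 0.0;    // stands in for m_active_exu_warps
  double ialu = 0.0, sqrt_ = 0.0, mem = 0.0, sp = 0.0;

  // Placeholder overhead models (assumed; not the simulator's formulas).
  static double inactive_sfu(unsigned active, double latency) {
    return 0.75 * (kWarpSize - active) * latency;
  }
  static double inactive_nonsfu(unsigned active, double latency) {
    return 0.5 * (kWarpSize - active) * latency;
  }

  // One parameterized body covering patterns A through D.
  void inc_fu_stat(double &counter, unsigned active_count, double latency,
                   lane_model model, bool update_exu) {
    assert(active_count <= kWarpSize);
    counter += static_cast<double>(active_count) * latency;
    if (!clock_gated_lanes) {  // assumed gating direction
      if (model == lane_model::SFU) {
        counter += inactive_sfu(active_count, latency);
      } else if (model == lane_model::NON_SFU) {
        counter += inactive_nonsfu(active_count, latency);
      }
    }
    if (update_exu) {
      active_exu_threads += active_count;
      active_exu_warps += 1.0;
    }
  }

  // One-line delegating wrappers, one per pattern; call sites stay unchanged.
  void incialu_stat(unsigned a, double l) {  // pattern A
    inc_fu_stat(ialu, a, l, lane_model::NON_SFU, true);
  }
  void incsqrt_stat(unsigned a, double l) {  // pattern B
    inc_fu_stat(sqrt_, a, l, lane_model::SFU, true);
  }
  void incmem_stat(unsigned a, double l) {   // pattern C
    inc_fu_stat(mem, a, l, lane_model::NON_SFU, false);
  }
  void incsp_stat(unsigned a, double l) {    // pattern D
    inc_fu_stat(sp, a, l, lane_model::NONE, false);
  }
};
```

Passing the counter by reference keeps each wrapper to a single call; the real counters may be indexed or pointer-typed, in which case the wrapper would pass the appropriate element instead.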
Summary
Consolidate 20 near-identical power statistics methods in shader_core_ctx into a single parameterized implementation. No call sites were modified and simulation output is unchanged.

Motivation
shader_core_ctx contained 20 methods (incialu_stat, incfpalu_stat, incsqrt_stat, etc.) that all followed the same three-step pattern:

1. Accumulate active_count * latency into the unit's counter
2. Add inactive-lane overhead via inactive_lanes_accesses_sfu() or inactive_lanes_accesses_nonsfu() depending on the execution unit type, gated by gpgpu_clock_gated_lanes
3. Update m_active_exu_threads and m_active_exu_warps

The only differences across all 20 methods were which counter to write, which lane overhead function to call, and whether to update the EXU counters, which makes this a clear consolidation opportunity.
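To make the duplication concrete, here is a simplified sketch of what two of the pre-refactor methods could have looked like. It is an illustration rather than the removed code: `legacy_stats_sketch`, `kWarpSize`, the plain `double` fields, and the overhead formulas are hypothetical, and the sketch assumes overhead is added only when lanes are not clock-gated.

```cpp
#include <cassert>

// Hypothetical warp width used only by this sketch.
constexpr unsigned kWarpSize = 32;

struct legacy_stats_sketch {
  bool clock_gated_lanes = false;  // stands in for gpgpu_clock_gated_lanes
  double active_exu_threads = 0.0;
  double active_exu_warps = 0.0;
  double ialu = 0.0;
  double sqrt_ = 0.0;

  // Placeholder overhead models (assumed; not the simulator's formulas).
  static double inactive_sfu(unsigned active, double latency) {
    return 0.75 * (kWarpSize - active) * latency;
  }
  static double inactive_nonsfu(unsigned active, double latency) {
    return 0.5 * (kWarpSize - active) * latency;
  }

  void incialu_stat(unsigned active_count, double latency) {
    assert(active_count <= kWarpSize);
    ialu += static_cast<double>(active_count) * latency;  // step 1
    if (!clock_gated_lanes) {                             // step 2
      ialu += inactive_nonsfu(active_count, latency);
    }
    active_exu_threads += active_count;                   // step 3
    active_exu_warps += 1.0;
  }

  // Same body again; only the counter and the overhead helper differ.
  void incsqrt_stat(unsigned active_count, double latency) {
    assert(active_count <= kWarpSize);
    sqrt_ += static_cast<double>(active_count) * latency;  // step 1
    if (!clock_gated_lanes) {                              // step 2
      sqrt_ += inactive_sfu(active_count, latency);
    }
    active_exu_threads += active_count;                    // step 3
    active_exu_warps += 1.0;
  }
};
```

Multiplying this shape by 20 methods shows where the repeated lines came from: every copy carries its own gating branch and its own EXU update.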
Changes
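shader.h
- Added enum class lane_model { SFU, NON_SFU, NONE } to express the three execution unit categories
- Declared inc_fu_stat() and reduced each original method to a one-line inline that delegates to it

shader.cc
- Defined shader_core_ctx::inc_fu_stat()

Pattern classification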
All 20 methods were verified to fall into exactly four patterns:
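
| lane_model | update_exu | Methods |
|---|---|---|
| NON_SFU | true | incialu, incimul, incimul24, incfpalu, incfpmul, incdpalu, incdpmul (7) |
| SFU | true | incimul32, incidiv, incfpdiv, incdpdiv, incsqrt, inclog, incexp, incsin, inctensor, inctex (10) |
| NON_SFU | false | incmem (1) |
| NONE | false | incsfu, incsp (2) |

Stats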
double * across all 20 methods

Testing
Verified with Rodinia 2.0 hotspot on GTX1080Ti config. All m_num_*_acesses values in gpgpusim.*.log are identical before and after the change.
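A log comparison on one benchmark only covers the paths that benchmark exercises. As a possible complement, a unit-level check could sweep every active-lane count and both gating settings and compare a legacy-style computation against the consolidated one. The sketch below does this for a single SFU-pattern method; every name in it (`legacy_sqrt_delta`, `fu_delta`, `sqrt_paths_agree`, `kWarpSize`) and both overhead formulas are hypothetical, and the gating direction is assumed.

```cpp
#include <cassert>

// Hypothetical warp width used only by this sketch.
constexpr unsigned kWarpSize = 32;

enum class lane_model { SFU, NON_SFU, NONE };

// Placeholder overhead models (assumed; not the simulator's formulas).
double inactive_sfu(unsigned active, double latency) {
  return 0.75 * (kWarpSize - active) * latency;
}
double inactive_nonsfu(unsigned active, double latency) {
  return 0.5 * (kWarpSize - active) * latency;
}

// Legacy-style increment for one SFU-pattern counter.
double legacy_sqrt_delta(unsigned active, double latency, bool gated) {
  if (!gated) {
    return static_cast<double>(active) * latency + inactive_sfu(active, latency);
  }
  return static_cast<double>(active) * latency;
}

// Consolidated increment, parameterized by lane model.
double fu_delta(unsigned active, double latency, lane_model model, bool gated) {
  assert(active <= kWarpSize);
  double delta = static_cast<double>(active) * latency;
  if (!gated) {
    if (model == lane_model::SFU) {
      delta += inactive_sfu(active, latency);
    } else if (model == lane_model::NON_SFU) {
      delta += inactive_nonsfu(active, latency);
    }
  }
  return delta;
}

// Returns true when both paths agree for every swept input.
bool sqrt_paths_agree() {
  const double latencies[] = {1.0, 4.0, 19.0};
  const bool gating[] = {false, true};
  for (bool gated : gating) {
    for (unsigned active = 0; active <= kWarpSize; ++active) {
      for (double latency : latencies) {
        if (legacy_sqrt_delta(active, latency, gated) !=
            fu_delta(active, latency, lane_model::SFU, gated)) {
          return false;
        }
      }
    }
  }
  return true;
}
```

Exact floating-point comparison is deliberate here: a refactor that claims no functional change should reproduce the same values bit for bit, so any tolerance would hide a reordering of operations.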