Fixed GitHub Actions CI #345

Open
tdulcet wants to merge 10 commits into preda:master from tdulcet:master

Conversation

tdulcet (Contributor) commented Dec 14, 2025

CC: @N-Storm

gwoltman added 4 commits March 4, 2026 19:09
…ormance increase on TitanV.

Standardized LDS memory layout and bar() strategy.
Made a cleaner, common shufl routine to handle multiple lines using new constants SHUFL_BYTES_W and SHUFL_BYTES_H.
Reverse line routines overhauled to use LDS memory layout and bar() strategy.
Added L2STORE and LULOAD routines for nVidia.  Need to study which GPUs might benefit.
Deprecated BIGLIT=0.
RTX4xxx and RTX5xxx GPUs benefit from L2STORE and LULOAD.  Added support for those options officially.
Since FAST_BARRIER seems to now work on nVidia, the option is now tuned.

3 participants