This PR looks fine, but keep in mind my remarks about computing the shared memory pool size statically using PE instead of a bespoke pass. Also, why merge this into master, considering the plugin system isn't merged there yet and probably won't be for a while?
Sorry, I should have mentioned that up there: this is independent of the plugin and programming model stuff. It just adds a parameter to allocate a given amount of dynamic shared memory upon launch. There's a corresponding PR that updates the runtime accordingly. We will need this regardless of how we end up computing the size, and it is also useful as a standalone thing, so we thought it would be best to get this into main right away to minimize divergence.
force-pushed from 80e537d to 05587c5
force-pushed from 0509bd9 to 00bb7bd
force-pushed from c02028e to bb992d8
llvm::Value* AMDGPUCodeGen::emit_local_memory(llvm::IRBuilder<>& irbuilder, const Continuation* continuation) {
    return emit_local_memory_base_ptr(irbuilder, continuation);
}
Why is this identical function copy/pasted here and in nvvm? Since there are no alternative implementations, there's no reason to do this; just have the default impl in llvm.cpp be this one.
I just did it the same way it was done for reserve_shared(). It's not valid to use this function unless one of the GPU backends is being used. So the base implementation in llvm.cpp emits an error, and the GPU backends override that with a call to the actual implementation.
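To make the pattern concrete, here is a minimal, self-contained sketch of it. This is not the actual Thorin code: the class layout is simplified, the string return value stands in for `llvm::Value*`, and reporting the error via `std::logic_error` is an assumption made purely for illustration. `rejects_local_memory` is a hypothetical helper that exists only to exercise the sketch.

```cpp
#include <stdexcept>
#include <string>

// Simplified stand-in for the generic code generator (llvm.cpp in this PR).
class CodeGen {
public:
    virtual ~CodeGen() = default;

    // Default behaviour: local memory is only valid on a GPU backend,
    // so the generic backend reports an error (assumed here: an exception).
    virtual std::string emit_local_memory() {
        throw std::logic_error("local memory is only available on GPU backends");
    }

protected:
    // The actual implementation, written once and shared by the GPU backends.
    std::string emit_local_memory_base_ptr() {
        return "local_memory_base_ptr";
    }
};

// Each GPU backend opts in by forwarding to the shared implementation.
class NVVMCodeGen : public CodeGen {
public:
    std::string emit_local_memory() override {
        return emit_local_memory_base_ptr();
    }
};

class AMDGPUCodeGen : public CodeGen {
public:
    std::string emit_local_memory() override {
        return emit_local_memory_base_ptr();
    }
};

// Hypothetical helper: true if the backend refuses to emit local memory.
bool rejects_local_memory(CodeGen& cg) {
    try {
        cg.emit_local_memory();
        return false;
    } catch (const std::logic_error&) {
        return true;
    }
}
```

With this layout, a plain `CodeGen` rejects the request while `NVVMCodeGen` and `AMDGPUCodeGen` produce the same result. The duplicated overrides are the price of keeping the non-GPU default an error; making the shared implementation the default would remove the duplication but also silently accept the call on non-GPU backends.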
force-pushed from bb992d8 to ff2aaf6
This adds a parameter to allocate a given amount of dynamic shared memory upon kernel launch.
Corresponding runtime changes: AnyDSL/runtime#41