[SYCL][CUDA][Matrix] Adding test case for tf32#963
[SYCL][CUDA][Matrix] Adding test case for tf32#963pvchupin merged 2 commits intointel:intelfrom hdelan:tf32_joint_matrix
Conversation
|
|
||
| // Convert values if using tf32 | ||
| if constexpr (std::is_same<T3, precision::tf32>::value) { | ||
| for (auto i = 0; i < 4; ++i) { |
There was a problem hiding this comment.
I think it would be OK to merge this test as it is: then I can merge it into #975 and use the element wise interface to address your comment: adding a comment on where 4 comes from would then not be necessary.
There was a problem hiding this comment.
The test will have to change a lot after intel/llvm#5964
you are not using data anymore.
So you want to merge it first as it is and then modify it in 975?
There was a problem hiding this comment.
Basically it will be more simple if intel/llvm#5870 (and this corresponding test PR) is merged before intel/llvm#5964 instead of vice versa: Then the tf32 stuff can be updated in intel/llvm#5964: as you say a fair amount will change but it is straightforward for me to do it. @hdelan is on holiday atm.
There was a problem hiding this comment.
What bothers me is that this PR is what triggered work on wi_marray and slicing additions. So it does not seem right to merge it before the solution.
Is there a reason why you want to merge it before intel/llvm#5964?
There was a problem hiding this comment.
This PR doesn't change the way that element wise ops could be performed in current sycl tip, it is just adding a new free tf32 rounding function and tf32 joint_matrix implementation. I don't think there is anything implicitly wrong with it being merged before #5964. I think It is easier to merge this one first because @hdelan is currently on holiday anyway and I know how to adjust #5964 to take account if these tf32 changes easily. Probably intel/llvm#5964 will be ready to be merged then as well anyway by the sound of things.
There was a problem hiding this comment.
Is the PR that adds the actual tf32 class merged already?
There was a problem hiding this comment.
I see, we agreed on addressing the spec changes in a separate PR (intel/llvm#5870 (review))
In this case, this test can also be approved and merged. Thank you for the clarifications.
) Changing joint_matrix impl to use float as storage type instead of uint32_t for tf32. Test: intel/llvm-test-suite#963
Test for intel/llvm#5870