[SYCL][CUDA] joint_matrix required changes following #11215 (#11563)
aelovikov-intel merged 11 commits into intel:sycl
Conversation
Commits (all signed-off-by: JackAKirk <jack.kirk@codeplay.com>):
- Added new supported mma builtins where C/D types differ.
- … and test new cases.
- … const variables.
- … from tf32 device code check test.
```c++
        sg, sub_c, accC.template get_multi_ptr<access::decorated::yes>(),
        N, layout::row_major);

    // Round a, b to tf32
```
I think this file/directory needs an entry in the CODEOWNERS file. Please merge a separate PR with that change. Once done, I expect no review from the SYCL RT team will be required here.
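For context, a minimal sketch of the load pattern in the hunk above, inside a kernel body (the accessor name `accC` and the constants `M`/`N` follow the hunk; everything else is an assumption, not the exact test source):

```c++
using namespace sycl::ext::oneapi::experimental::matrix;

// Inside a parallel_for over an nd_range; sg is the sub-group.
auto sg = item.get_sub_group();

// Accumulator fragment; the M x N shape is assumed to match the test.
joint_matrix<sycl::sub_group, float, use::accumulator, M, N> sub_c;

// Load C through a decorated multi_ptr, row-major with leading dimension N,
// as in the hunk above.
joint_matrix_load(
    sg, sub_c, accC.template get_multi_ptr<sycl::access::decorated::yes>(),
    N, layout::row_major);
```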
```diff
     }

-    joint_matrix_mad(sg, sub_c, sub_a, sub_b, sub_c);
+    joint_matrix_mad(sg, sub_d, sub_a, sub_b, sub_c);
```
How many times will this actually run?
You want the return matrix to be "sub_c" so it can be input again, right?
Just once. I changed it to run once so that sub_d can be a different joint_matrix (with a different type from sub_c), in order to test the new cases I added to the support matrix / joint_matrix_tensorcores_sm70.cpp.
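Concretely, a minimal sketch of that pattern (the 16x16x16 shape and the half/float pairing follow the fp16 rows of the support matrix below; kernel setup and loads are omitted and assumed):

```c++
using namespace sycl::ext::oneapi::experimental::matrix;

// A and B fragments in fp16.
joint_matrix<sycl::sub_group, sycl::half, use::a, 16, 16, layout::row_major> sub_a;
joint_matrix<sycl::sub_group, sycl::half, use::b, 16, 16, layout::row_major> sub_b;
// C and D are both accumulators, but with *different* element types.
joint_matrix<sycl::sub_group, sycl::half, use::accumulator, 16, 16> sub_c;
joint_matrix<sycl::sub_group, float, use::accumulator, 16, 16> sub_d;

// D = A * B + C with D in fp32 and C in fp16: the new case this PR enables.
joint_matrix_mad(sg, sub_d, sub_a, sub_b, sub_c);
```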
```asciidoc
|16 |16 |16
|8 |32 |16
|32 |8 |16
.3+| `matrix_type::fp16` .3+| `matrix_type::fp16` .3+| `matrix_type::fp32`
```
BTW, where are the bfloat16 combinations?
They are in the same table at the bottom, between tf32 and fp64.
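For completeness, a hedged sketch of one bfloat16 combination from that part of the table (the 16x16x16 shape is an assumption; bf16 inputs accumulate in fp32 on Tensor Cores):

```c++
using namespace sycl::ext::oneapi::experimental::matrix;
using sycl::ext::oneapi::bfloat16;

joint_matrix<sycl::sub_group, bfloat16, use::a, 16, 16, layout::row_major> sub_a;
joint_matrix<sycl::sub_group, bfloat16, use::b, 16, 16, layout::row_major> sub_b;
joint_matrix<sycl::sub_group, float, use::accumulator, 16, 16> sub_c;

// D and C are the same fragment here, so the result can be fed back in.
joint_matrix_mad(sg, sub_c, sub_a, sub_b, sub_c);
```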
```diff
-      // Round a, b to tf32
-      for (auto i = 0; i < 4; ++i)
-        get_wi_data(sg, sub_a)[i] =
-            round_to_tf32(get_wi_data(sg, sub_a)[i]);
```
I've just realized that, to be consistent, round_to_tf32 should also have a device code check. I removed it here since it didn't have one, but it would be better to add it back via

```c++
auto round_lambda = [](auto &x) { x = round_to_tf32(x); };
joint_matrix_apply(sg, sub_a, round_lambda);
```

and add a check that it calls the correct nvvm builtin.
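For reference, a sketch of what that check might look like in the matrix-nvptx-tf32-test.cpp style (the intrinsic name below is an assumption and would need to be confirmed against the generated IR):

```c++
// CHECK: call {{.*}} @llvm.nvvm.f2tf32.rna(float {{.*}})  // assumed intrinsic name
auto round_lambda = [](auto &x) { x = round_to_tf32(x); };
joint_matrix_apply(sg, sub_a, round_lambda);
```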
npmiller left a comment:
matrix-nvptx-tf32-test.cpp changes LGTM
I think it should be "the types of C/D matrices differ".

I don't see any changes that require runtime-reviewers approval. @JackAKirk, should I merge this in?

Yes please!
As discussed in #11215 this patch:
- … `joint_matrix_cuda` (this change requires an upstream llvm patch: https://reviews.llvm.org/rGb781c7ab574f)
- … `get_wi_data()`

I also added back the cases that the change in the `joint_matrix_mad` interface allows: namely when the types of the C/D matrices differ. I correspondingly updated the tests to cover the new cases that are supported. I also updated the support matrix for CUDA in the spec doc with the newly supported combinations.
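Putting the pieces together, a hedged end-to-end sketch of the updated interface (buffer/accessor names, the nd_range, and the 16x16x16 fp16/fp32 combination are assumptions, not the actual test code):

```c++
using namespace sycl::ext::oneapi::experimental::matrix;

q.submit([&](sycl::handler &cgh) {
  sycl::accessor accA{bufA, cgh, sycl::read_only};
  sycl::accessor accB{bufB, cgh, sycl::read_only};
  sycl::accessor accC{bufC, cgh, sycl::read_only};
  sycl::accessor accD{bufD, cgh, sycl::write_only};
  cgh.parallel_for(ndrange, [=](sycl::nd_item<2> item) {
    auto sg = item.get_sub_group();
    joint_matrix<sycl::sub_group, sycl::half, use::a, 16, 16,
                 layout::row_major> sub_a;
    joint_matrix<sycl::sub_group, sycl::half, use::b, 16, 16,
                 layout::row_major> sub_b;
    joint_matrix<sycl::sub_group, sycl::half, use::accumulator, 16, 16> sub_c;
    joint_matrix<sycl::sub_group, float, use::accumulator, 16, 16> sub_d;

    joint_matrix_load(
        sg, sub_a,
        accA.template get_multi_ptr<sycl::access::decorated::yes>(), 16);
    joint_matrix_load(
        sg, sub_b,
        accB.template get_multi_ptr<sycl::access::decorated::yes>(), 16);
    joint_matrix_load(
        sg, sub_c,
        accC.template get_multi_ptr<sycl::access::decorated::yes>(), 16,
        layout::row_major);

    // The case this PR adds back: D (fp32) differs in type from C (fp16).
    joint_matrix_mad(sg, sub_d, sub_a, sub_b, sub_c);

    joint_matrix_store(
        sg, sub_d,
        accD.template get_multi_ptr<sycl::access::decorated::yes>(), 16,
        layout::row_major);
  });
});
```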