[GPU verification] Add support for automatic tolerance calculations with split-k #3673

@johannes-graner

Description

When using a split-k value above 1, it is not sufficient to count the total number of accumulations: K / split_k accumulations are performed in float, and a further split_k accumulations are performed in the output data type.

When the output data type is less precise than float, this leads to additional truncation errors, so the tolerance must be relaxed. The CPU-verification path shows how this calculation should be performed. The same calculation can be performed in the GPU-verification path by:

  1. Calling gpu_reduce_max to get the maximum tensor value
  2. Computing rtol, atol, rtol_split_k, atol_split_k as in the CPU path
  3. Calling gpu_verify with explicit tolerance values

However, it would be simpler if gpu_verify performed these calculations itself, with the user passing in the extra information needed (split_k, data types, etc.).
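For illustration, the two-stage tolerance calculation from step 2 might look roughly like the sketch below. This is not the project's actual CPU-path code: the function names (`relative_threshold`, `absolute_threshold`, `split_k_tolerances`), the epsilon table, the linear error model, the division of the maximum value by split_k in the first stage, and the choice to take the maximum of the two stages are all assumptions made for the example.

```python
import math

# Machine epsilon (2**-mantissa_bits) per data type. Illustrative table only.
EPS = {"fp32": 2.0**-23, "fp16": 2.0**-10, "bf16": 2.0**-7}


def relative_threshold(acc_type, out_type, num_accumulations):
    # Assumed error model: one rounding per accumulation in acc_type,
    # plus one final rounding when the result is stored as out_type.
    return num_accumulations * EPS[acc_type] + EPS[out_type]


def absolute_threshold(acc_type, out_type, max_value, num_accumulations):
    # Scale the relative bound by the largest magnitude expected in the tensor.
    return max_value * relative_threshold(acc_type, out_type, num_accumulations)


def split_k_tolerances(K, split_k, out_type, max_value, acc_type="fp32"):
    # Stage 1: each partial result accumulates about K / split_k products
    # in acc_type (float in the issue's description).
    n_partial = math.ceil(K / split_k)
    rtol = relative_threshold(acc_type, out_type, n_partial)
    atol = absolute_threshold(acc_type, out_type, max_value / split_k, n_partial)
    if split_k > 1:
        # Stage 2: the split_k partial results are accumulated in out_type.
        rtol_split_k = relative_threshold(out_type, out_type, split_k)
        atol_split_k = absolute_threshold(out_type, out_type, max_value, split_k)
        # Assumption: the looser of the two stages sets the final tolerance.
        rtol = max(rtol, rtol_split_k)
        atol = max(atol, atol_split_k)
    return rtol, atol


# max_value stands in for the result of the max-reduction in step 1.
print(split_k_tolerances(K=4096, split_k=1, out_type="bf16", max_value=100.0))
print(split_k_tolerances(K=4096, split_k=8, out_type="bf16", max_value=100.0))
```

Under this model, a bf16 output with split_k = 8 gets a noticeably looser tolerance than with split_k = 1, because the second stage dominates. For a float output in the same configuration, the first stage still sets the tolerance.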
