Unable to use flashmm on Volta (sm_70) GPUs #45

I built and installed the flashmm package for "compute_70" without any errors, but running the test fails with an illegal memory access:

$ python3 test_flash_mm.py 
max diff for mm block: tensor(2.3842e-05, device='cuda:0', grad_fn=<SelectBackward0>)
average diff for mm block: tensor(1.7822e-06, device='cuda:0', grad_fn=<MeanBackward0>)
Traceback (most recent call last):
  File "test_flash_mm.py", line 182, in <module>
    print("max diff:", diff[argmax_diff])
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

flashmm works on another machine whose GPU has compute capability 8.6, but it does not seem to work on sm_70.
Is this expected?

CONDA ENVIRONMENT:
PyTorch Version: 2.1.0+cu121
PyTorch CUDA version: 12.1
PyTorch arch_list: ['sm_50', 'sm_60', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90']
PyTorch CUDNN: True 8902
GPU PyTorch Logical Name 0 : Tesla V100S-PCIE-32GB
	Capability: (7, 0)
	Total memory: 34072559616
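
As a sanity check on the environment above, here is a minimal sketch that tests whether the device capability has a matching entry in PyTorch's arch list. The helper name `arch_supported` is my own, not part of PyTorch or flashmm, and the values are copied from the report above. It only confirms that the PyTorch build targets sm_70; it says nothing about which architectures the flashmm kernels themselves support.

```python
def arch_supported(capability, arch_list):
    """Return True if arch_list contains an sm_XY entry for this capability.

    Hypothetical helper for illustration; not part of PyTorch or flashmm.
    """
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list


# Values copied from the environment report above. On a live system they
# could instead come from torch.cuda.get_device_capability(0) and
# torch.cuda.get_arch_list().
capability = (7, 0)
arch_list = ['sm_50', 'sm_60', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90']

print("PyTorch build covers this GPU:", arch_supported(capability, arch_list))
```
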

Rerunning with CUDA_LAUNCH_BLOCKING=1 gives a more precise traceback, which points at the hyena_filter_fwd call:

$ CUDA_LAUNCH_BLOCKING=1 CUDA_VISIBLE_DEVICES=0 python3 test_flash_mm.py 
max diff for mm block: tensor(2.3842e-05, device='cuda:0', grad_fn=<SelectBackward0>)
average diff for mm block: tensor(1.7822e-06, device='cuda:0', grad_fn=<MeanBackward0>)
Traceback (most recent call last):
  File "test_flash_mm.py", line 176, in <module>
    out = fast_hyena_filter(
  File "test_flash_mm.py", line 112, in fast_hyena_filter
    k = hyena_filter_fwd(
RuntimeError: CUDA error: an illegal memory access was encountered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
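
The same localization can be done from inside the test script. Below is a rough sketch, assuming the variable is set before CUDA is initialized (setting it before importing torch is the safe choice). The wrapper `call_and_sync` is a hypothetical helper of mine, not a flashmm or PyTorch function; it forces pending GPU work to finish right after the suspect call so that a kernel fault is raised there rather than at a later, unrelated line.

```python
import os

# Set before CUDA is initialized so kernel launches run synchronously.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

try:
    import torch
except ImportError:  # lets the sketch run on machines without PyTorch
    torch = None


def call_and_sync(fn, *args, **kwargs):
    """Call fn, then wait for all queued CUDA work to complete.

    Hypothetical helper for illustration. If a kernel launched by fn
    faults, the error is raised by the synchronize call below instead of
    by some later operation.
    """
    out = fn(*args, **kwargs)
    if torch is not None and torch.cuda.is_available():
        torch.cuda.synchronize()
    return out
```

In the test script, the suspect call could then be wrapped as `call_and_sync(hyena_filter_fwd, ...)`, with the original arguments left unchanged.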

Many thanks for your help.
