cuda.core.system: Better checks for when we expect APIs to be unsupported #1510
Conversation
Auto-sync is disabled for ready-for-review pull requests in this repository. Workflows must be run manually.
Force-pushed from 7cddcb4 to 2df0c31.
Pull request overview
This PR improves test coverage for NVML APIs in cuda.core.system by implementing better checks for when APIs are expected to be unsupported based on device architecture. The changes replace blanket exception catching with architecture-aware validation using a new unsupported_before context manager.
Changes:
- Introduced an `unsupported_before` context manager to conditionally skip or assert on API support based on device architecture (a minimal usage sketch follows below)
- Renamed the `DeviceArchitecture` class to `DeviceArch` to align with the underlying NVML enum
- Simplified test logic by removing complex skip-reason collection and replacing it with the new context manager pattern
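A minimal usage sketch, assuming a `device` fixture and that the conftest exposes `unsupported_before` as a fixture; the attribute queried and the Volta threshold are illustrative, not the PR's actual code:

```python
from cuda.core import system


def test_example_query(device, unsupported_before):
    # Assumption for illustration: the queried attribute is documented as
    # unsupported before Volta. On older architectures the context manager
    # expects NotSupportedError and skips; on Volta and newer the call must
    # succeed rather than being silently skipped.
    with unsupported_before(device, system.DeviceArch.VOLTA):
        _ = device.example_attribute  # hypothetical attribute
```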
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 35 comments.
Show a summary per file
| File | Description |
|---|---|
| cuda_core/tests/system/conftest.py | Added unsupported_before context manager for architecture-aware test skipping |
| cuda_bindings/tests/nvml/conftest.py | Added unsupported_before context manager for NVML bindings tests |
| cuda_core/tests/system/test_system_device.py | Refactored tests to use unsupported_before and updated to use DeviceArch |
| cuda_bindings/tests/nvml/test_pynvml.py | Updated tests to use unsupported_before |
| cuda_bindings/tests/nvml/test_gpu.py | Updated tests to use unsupported_before |
| cuda_bindings/tests/nvml/test_compute_mode.py | Updated tests to use unsupported_before |
| cuda_core/cuda/core/system/_device.pyx | Renamed DeviceArchitecture to DeviceArch and moved functionality to use NVML enum directly |
| cuda_core/docs/source/api.rst | Updated documentation to reference DeviceArch instead of DeviceArchitecture |
Comments suppressed due to low confidence (1)
cuda_core/cuda/core/system/_device.pyx:1050
- The docstring still references the old class name "DeviceArchitecture" instead of the new "DeviceArch". The documentation should be updated to match the renamed class.
"""
return DeviceEvents(self._handle, events)
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
/ok to test
The following comment is attached to this diff hunk:

```python
    return pci_info


@contextmanager
```
The below came from Cursor (after discarding my `cuda.bindings.tests.helpers` idea I posted offline).
The idea is that you can give this to Cursor on your end to play with the options.
Suggestion: deduplicating `unsupported_before`
I noticed we now have two nearly identical `unsupported_before` context managers:
- `cuda_bindings/tests/nvml/conftest.py`
- `cuda_core/tests/system/conftest.py`
The logic is the same, but they use different API surfaces:
| Aspect | nvml conftest | cuda.core conftest |
|---|---|---|
| Get arch | `nvml.device_get_architecture(device)` | `device.arch` |
| Get name | `nvml.device_get_name(device)` | `device.name` |
| Arch enum | `nvml.DeviceArch` | `system.DeviceArch` |
| Exception | `nvml.NotSupportedError` | `system.NotSupportedError` |
Options to consider
Option 1: Factory pattern in `cuda_python_test_helpers` (recommended)

Keep `cuda_python_test_helpers` CUDA-agnostic but share the logic via dependency injection:
```python
# cuda_python_test_helpers/__init__.py
from contextlib import contextmanager


def make_unsupported_before(*, get_arch, get_name, arch_enum, not_supported_error):
    """Factory to create an unsupported_before context manager with injected dependencies."""

    @contextmanager
    def unsupported_before(device, expected_device_arch):
        device_arch = get_arch(device)
        if isinstance(expected_device_arch, arch_enum):
            expected_device_arch_int = int(expected_device_arch)
        elif expected_device_arch == "FERMI":
            expected_device_arch_int = 1
        else:
            expected_device_arch_int = 0
        if (
            expected_device_arch is None
            or expected_device_arch == "HAS_INFOROM"
            or device_arch == arch_enum.UNKNOWN
        ):
            try:
                yield
            except not_supported_error:
                import pytest

                pytest.skip(
                    f"Unsupported call for device architecture {arch_enum(device_arch).name} "
                    f"on device '{get_name(device)}'"
                )
        elif int(device_arch) < expected_device_arch_int:
            import pytest

            with pytest.raises(not_supported_error):
                yield
            pytest.skip(f"Unsupported before {expected_device_arch.name}, got {get_name(device)}")
        else:
            yield

    return unsupported_before
```

Then in each conftest:
```python
# cuda_bindings/tests/nvml/conftest.py
from cuda_python_test_helpers import make_unsupported_before

from cuda.bindings import _nvml as nvml

unsupported_before = make_unsupported_before(
    get_arch=nvml.device_get_architecture,
    get_name=nvml.device_get_name,
    arch_enum=nvml.DeviceArch,
    not_supported_error=nvml.NotSupportedError,
)
```
from cuda_python_test_helpers import make_unsupported_before
from cuda.core import system
unsupported_before = make_unsupported_before(
get_arch=lambda d: d.arch,
get_name=lambda d: d.name,
arch_enum=system.DeviceArch,
not_supported_error=system.NotSupportedError,
)Pros: Single source of truth for the logic, cuda_python_test_helpers stays CUDA-agnostic.
Cons: Slightly more indirection.
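For completeness, a hypothetical test consuming the factory-built manager; this assumes the conftest exposes `unsupported_before` as a pytest fixture and that `device` yields an NVML device handle (the test body is illustrative):

```python
from cuda.bindings import _nvml as nvml


def test_device_architecture(device, unsupported_before):
    # KEPLER is the oldest named architecture in the NVML enum, so on real
    # hardware this should always take the "supported" path (illustrative).
    with unsupported_before(device, nvml.DeviceArch.KEPLER):
        arch = nvml.device_get_architecture(device)
        assert arch != nvml.DeviceArch.UNKNOWN
```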
Option 2: Let `cuda_python_test_helpers` depend on `cuda-bindings`

Provide a single NVML-based implementation; `cuda.core` tests would extract device handles.
Pros: Simpler API.
Cons: Reverses the direction of commit 6afdd5c; ties test helpers to CUDA packages.
Option 3: Accept the duplication
~40 lines duplicated in 2 files. The implementations use different abstraction layers, so some might argue they're legitimately different.
Pros: No new abstractions.
Cons: Bug fixes / enhancements need to be applied twice.
I'd lean toward Option 1 since it keeps the CUDA-agnostic design while eliminating the duplication. Happy to pair on this if you want to explore it further.
Prior to this PR, unit tests in `cuda.core.system` would always skip the test if an API returned a `NotSupportedError`. This meant that a test could be skipped even when the documentation suggests it should in fact work. This PR makes such a skip acceptable only when the API is documented as unsupported for the device's architecture.

There is some nuance, though. Some of these APIs still fail as not supported even when the documentation says they should work, possibly because some other aspect of the device (not just the architecture) makes them unsupported. Unfortunately, the NVML headers don't offer any clues there, but at least now we know where all of these cases are (they are noted with `unsupported_before(device, None)`), and we can follow up by filing documentation bugs with NVML and refining the skip criteria even further.

This also removes the behavior where some tests collected a different skip reason for each device and reported all of them. That added a lot of complexity for little benefit -- it is probably very rare for a system to have two devices of different architectures, especially on systems we would use for testing -- so I took all of that out to simplify things.
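To make the `unsupported_before(device, None)` convention concrete, a hedged sketch (the API shown is a stand-in, not necessarily one of the affected calls):

```python
from cuda.bindings import _nvml as nvml


def test_query_with_unknown_support_criteria(device, unsupported_before):
    # device_get_name stands in for an affected API; None tells
    # unsupported_before to skip (rather than fail) if NotSupportedError is
    # raised, since we cannot yet predict support from the architecture alone.
    with unsupported_before(device, None):
        _ = nvml.device_get_name(device)
```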