Flaky test from leaked backend state in _specialize_hlo #27

@vgene

Problem

_specialize_hlo in trace.py sets a module-level global (_current_backend) to "hlo" and later resets it to None. If an exception occurs between those two steps (during argument binding, kernel execution, or output marking), set_backend(None) never runs and the global stays "hlo".

When running tests in parallel with pytest-xdist, this poisons all subsequent Op dispatches on the same worker. Tests that call kernel functions directly (expecting CPU execution) get HLO tracing instead, producing NKIPyTensorRef objects where numpy arrays are expected.
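The failure mode can be sketched with stand-ins. This is a minimal simulation, not the code in trace.py: `dispatch`, `specialize_hlo_unsafe`, and `failing_kernel` are hypothetical names, and the tuple returned on the "hlo" path only stands in for an NKIPyTensorRef.

```python
# Stand-in for the module-level state described above (assumed shape).
_current_backend = None


def set_backend(name):
    """Set the process-wide backend; None is taken to mean CPU execution."""
    global _current_backend
    _current_backend = name


def dispatch(x):
    """Hypothetical Op dispatch that picks a path from the global backend."""
    if _current_backend == "hlo":
        return ("traced", x)  # stands in for an NKIPyTensorRef
    return x * 2              # stands in for direct CPU execution


def specialize_hlo_unsafe(kernel, *args):
    """Set/reset pattern with no protection against exceptions."""
    set_backend("hlo")
    result = kernel(*args)    # an exception here skips the reset below
    set_backend(None)
    return result


def failing_kernel(x):
    raise ValueError("simulated failure during tracing")


try:
    specialize_hlo_unsafe(failing_kernel, 1)
except ValueError:
    pass

leaked_backend = _current_backend  # still "hlo": the reset never ran
poisoned_result = dispatch(3)      # later direct call takes the "hlo" path
```

Because the global lives for the lifetime of the process, every later test scheduled on the same xdist worker sees the leaked value, which would explain why the failure depends on test ordering.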

Reproduction

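  • Appears intermittently during pytest runs; whether it triggers depends on which tests share a pytest-xdist worker with a failing _specialize_hlo call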

Fix

Ensure set_backend(None) always runs, either via try/finally or by turning the backend set/reset into a context manager.
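One possible shape for the context-manager variant is sketched below. It is self-contained and uses stand-ins: `backend_scope`, `get_backend`, and `specialize_hlo_safe` are hypothetical names, and the real _specialize_hlo signature is not shown in this issue. As an assumption beyond the issue text, the sketch restores the previous backend rather than hard-coding None, which gives the same result in the non-nested case.

```python
from contextlib import contextmanager

# Stand-in for the module-level state in trace.py (assumed shape).
_current_backend = None


def set_backend(name):
    global _current_backend
    _current_backend = name


def get_backend():
    return _current_backend


@contextmanager
def backend_scope(name):
    """Temporarily switch the backend and restore it even on error."""
    previous = get_backend()
    set_backend(name)
    try:
        yield
    finally:
        # Runs on normal exit and when the body raises.
        set_backend(previous)


def specialize_hlo_safe(kernel, *args):
    """Hypothetical replacement for the unprotected set/reset pair."""
    with backend_scope("hlo"):
        return kernel(*args)


def failing_kernel(x):
    raise ValueError("simulated failure during tracing")


try:
    specialize_hlo_safe(failing_kernel, 1)
except ValueError:
    pass

backend_after_failure = get_backend()  # None: the reset ran despite the error
```

A plain try/finally inside _specialize_hlo would achieve the same guarantee; the context manager mainly helps if other call sites also switch the backend.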

Labels

bug
