test#1

Draft
cketcham2333 wants to merge 11 commits into main from hololink_bridge_relay_bp

@cketcham2333
Owner

No description provided.

cketcham2333 and others added 11 commits April 2, 2026 13:08
…ript

build_qec.sh: Build HSB with Hololink patches and cuda-quantum realtime with hololink tools enabled; fail if DOCA/Holoscan prerequisites missing.
unittests/CMakeLists.txt: Add test_realtime_qldpc_graph_decoding target; gate utils/ on CUDAQX_QEC_ENABLE_HOLOLINK_TOOLS; link libcudaq-realtime-host-dispatch for DT_RUNPATH.
New files: bridge tool, playback tool, orchestration script, CI test, decoder config, syndrome data.

Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>

build_qec.sh: add gnupg to apt-get install so install_dev_prerequisites.sh can import the DOCA GPG key.
CMakeLists.txt: broaden find_library search for libcudaq and libnvqir (remove NO_DEFAULT_PATH, add CUDAQ_DIR fallback), wrap with generator expressions to avoid linker errors when not found.

Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>
…can install

The cudaqx CI container has Mellanox OFED pre-installed, which conflicts with doca-all. Install only libdoca-sdk-gpunetio-dev and holoscan-cuda directly. Strip unused HSB operators to avoid configure failures from missing deps. Drop nvcomp (not needed by gpu_roce_transceiver).

Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>

Use curl for DOCA GPG key (drop gnupg dep), install only libdoca-sdk-gpunetio-dev, add cuda-nvrtc-dev for hololink_core, add Holoscan dpkg --force-depends fallback, strip unused HSB operators, upgrade cmake via pip for Holoscan SDK >= 3.30.4 requirement.

Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>
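
The "upgrade cmake via pip" step in the commit above implies a version check before upgrading. A minimal sketch of such a check follows. It is not the code from `build_qec.sh`: the helper names `version_ge` and `needs_cmake_upgrade` are hypothetical, and it assumes the 3.30.4 figure in the commit message is the minimum cmake version and that GNU `sort -V` is available in the container.

```shell
#!/bin/sh
# Hypothetical helper (not from build_qec.sh): succeed if version $1 >= version $2.
# Assumes GNU coreutils `sort -V` (version sort) is available.
version_ge() {
    # The smaller of the two versions sorts first; $1 >= $2 exactly when
    # that smallest entry is $2.
    lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
    [ "$lowest" = "$2" ]
}

# Hypothetical helper: succeed if the installed version is below the required one.
needs_cmake_upgrade() {
    installed="$1"
    required="$2"
    ! version_ge "$installed" "$required"
}
```

A caller might read the installed version with `cmake --version | awk 'NR==1 {print $3}'` and run `pip install --upgrade cmake` only when `needs_cmake_upgrade "$installed" 3.30.4` succeeds, which avoids touching an already sufficient cmake.
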
Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>
…overy crash

Broadened find_library search paths for libcudaq and libnvqir in
utils/CMakeLists.txt (removed NO_DEFAULT_PATH, added CUDAQ_DIR
fallback) and wrapped their link references with generator expressions
to handle environments where the full CUDA-Q SDK is not installed.
This mirrors the same fix already applied to the sibling
unittests/CMakeLists.txt.

Added DISCOVERY_MODE PRE_TEST to gtest_discover_tests for
test_realtime_qldpc_graph_decoding. The default POST_BUILD mode
executes the binary at build time to enumerate tests, which crashes
with "undefined symbol: __quantum__qis__y__ctl" because
libcudaq-qec.so's quantum runtime dependencies cannot be resolved
until the proper test environment is set up.

Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>

The nvrtc package version was derived from just the CUDA major version
(e.g. "13"), causing apt to install cuda-nvrtc-dev-13-2 (latest 13.x)
into /usr/local/cuda-13.2/ while the container has CUDA 13.0. CMake's
FindCUDAToolkit could not find CUDA::nvrtc in the 13.0 toolkit path,
breaking the HSB hololink_core build. Now extracts the full version
(e.g. "13.0") so the correct cuda-nvrtc-dev-13-0 package is installed.

Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>
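
The nvrtc version fix described in the commit above can be sketched as follows. This is an illustration, not the code from `build_qec.sh`: the function name `nvrtc_pkg_for` and the way the version string is passed in are assumptions. Only the package naming pattern (`cuda-nvrtc-dev-13-0` for CUDA 13.0) comes from the commit message.

```shell
#!/bin/sh
# Hypothetical helper (not from build_qec.sh): map a CUDA version string to the
# matching cuda-nvrtc-dev package name, using both major and minor components.
nvrtc_pkg_for() {
    version="$1"
    case "$version" in
        *.*) ;;  # has at least major.minor
        *)
            # A bare major version (e.g. "13") is the case that caused the bug,
            # so refuse it instead of letting apt pick the latest 13.x package.
            echo "error: need major.minor CUDA version, got '$version'" >&2
            return 1
            ;;
    esac
    major="${version%%.*}"   # text before the first dot
    rest="${version#*.}"     # text after the first dot
    minor="${rest%%.*}"      # drop any patch component
    echo "cuda-nvrtc-dev-${major}-${minor}"
}
```

With this sketch, `nvrtc_pkg_for 13.0` prints `cuda-nvrtc-dev-13-0`, and a bare `13` is rejected rather than silently resolving to a newer minor release.
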
…PU (NVIDIA#473)

This PR resolves cudaErrorInvalidResourceHandle / torch.AcceleratorError
when running GQE on GPU (e.g. GPT2LMHeadModel(...).to("cuda")) by
initializing PyTorch’s CUDA stack and performing a minimal GPU
allocation before building the transformer, instead of relying on
`conftest.py` preloads and bundled libcudart symlinks.

This should fix https://nvbugspro.nvidia.com/bug/5801752

Signed-off-by: vedika-saravanan <vsaravanan@nvidia.com>
…test

gtest_discover_tests executes the binary to enumerate test cases,
which crashes with "undefined symbol: __quantum__qis__y__ctl" because
libcudaq-qec.so's CUDA-Q quantum runtime dependencies are unresolvable
in CI environments that lack the full CUDA-Q SDK. DISCOVERY_MODE
PRE_TEST only defers this crash from build time to ctest time.
Switching to add_test avoids executing the binary during discovery
entirely. The test will fail at runtime if the quantum runtime or
QLDPC decoder plugin is not present, which is expected.

Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>

The test dynamically loads the nv-qldpc-decoder plugin and depends on
CUDA-Q quantum runtime symbols (libnvqir) at runtime. Neither is
available in CI, so running the binary always crashes. Keep the build
dependency to verify compilation but do not register it as a ctest.

Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>

2 participants