…ript
build_qec.sh: Build HSB with Hololink patches and cuda-quantum realtime with hololink tools enabled; fail if DOCA/Holoscan prerequisites missing.
unittests/CMakeLists.txt: Add test_realtime_qldpc_graph_decoding target; gate utils/ on CUDAQX_QEC_ENABLE_HOLOLINK_TOOLS; link libcudaq-realtime-host-dispatch for DT_RUNPATH.
New files: bridge tool, playback tool, orchestration script, CI test, decoder config, syndrome data.
Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>
build_qec.sh: add gnupg to apt-get install so install_dev_prerequisites.sh can import the DOCA GPG key.
CMakeLists.txt: broaden find_library search for libcudaq and libnvqir (remove NO_DEFAULT_PATH, add CUDAQ_DIR fallback), wrap with generator expressions to avoid linker errors when not found.
Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>
…can install
The cudaqx CI container has Mellanox OFED pre-installed, which conflicts with doca-all. Install only libdoca-sdk-gpunetio-dev and holoscan-cuda directly. Strip unused HSB operators to avoid configure failures from missing deps. Drop nvcomp (not needed by gpu_roce_transceiver).
Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>
Use curl for DOCA GPG key (drop gnupg dep), install only libdoca-sdk-gpunetio-dev, add cuda-nvrtc-dev for hololink_core, add Holoscan dpkg --force-depends fallback, strip unused HSB operators, upgrade cmake via pip for Holoscan SDK >= 3.30.4 requirement.
Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>
Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>
…overy crash
Broadened find_library search paths for libcudaq and libnvqir in utils/CMakeLists.txt (removed NO_DEFAULT_PATH, added CUDAQ_DIR fallback) and wrapped their link references with generator expressions to handle environments where the full CUDA-Q SDK is not installed. This mirrors the same fix already applied to the sibling unittests/CMakeLists.txt.
Added DISCOVERY_MODE PRE_TEST to gtest_discover_tests for test_realtime_qldpc_graph_decoding. The default POST_BUILD mode executes the binary at build time to enumerate tests, which crashes with "undefined symbol: __quantum__qis__y__ctl" because libcudaq-qec.so's quantum runtime dependencies cannot be resolved until the proper test environment is set up.
Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>
The nvrtc package version was derived from just the CUDA major version (e.g. "13"), causing apt to install cuda-nvrtc-dev-13-2 (latest 13.x) into /usr/local/cuda-13.2/ while the container has CUDA 13.0. CMake's FindCUDAToolkit could not find CUDA::nvrtc in the 13.0 toolkit path, breaking the HSB hololink_core build. Now extracts the full version (e.g. "13.0") so the correct cuda-nvrtc-dev-13-0 package is installed.
Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>
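The version-to-package mapping described in this commit can be sketched as follows. This is a minimal illustration, not the code from build_qec.sh: the helper name `nvrtc_pkg_for` is hypothetical, and it assumes the CUDA version string is passed in with at least a major and a minor component (how the script obtains that string is not shown in this log).

```shell
# Hypothetical helper (not from the PR): map a CUDA version string such as
# "13.0" or "13.0.88" to the matching apt package name, e.g. cuda-nvrtc-dev-13-0.
# Assumes the input has at least "major.minor".
nvrtc_pkg_for() {
    full="$1"               # e.g. "13.0.88"
    major="${full%%.*}"     # text before the first dot: "13"
    rest="${full#*.}"       # text after the first dot: "0.88"
    minor="${rest%%.*}"     # text before the next dot: "0"
    printf 'cuda-nvrtc-dev-%s-%s\n' "$major" "$minor"
}
```

Calling `nvrtc_pkg_for 13.0` prints `cuda-nvrtc-dev-13-0`, so apt is asked for the package that installs into the toolkit path the container already uses rather than the newest 13.x one. A bare major version such as "13" is outside this sketch's assumptions and is not handled.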
…PU (NVIDIA#473)
This PR resolves cudaErrorInvalidResourceHandle / torch.AcceleratorError when running GQE on GPU (e.g. GPT2LMHeadModel(...).to("cuda")) by initializing PyTorch’s CUDA stack and performing a minimal GPU allocation before building the transformer, instead of relying on `conftest.py` preloads and bundled libcudart symlinks.
This should fix https://nvbugspro.nvidia.com/bug/5801752
Signed-off-by: vedika-saravanan <vsaravanan@nvidia.com>
…test
gtest_discover_tests executes the binary to enumerate test cases, which crashes with "undefined symbol: __quantum__qis__y__ctl" because libcudaq-qec.so's CUDA-Q quantum runtime dependencies are unresolvable in CI environments that lack the full CUDA-Q SDK. DISCOVERY_MODE PRE_TEST only defers this crash from build time to ctest time. Switching to add_test avoids executing the binary during discovery entirely. The test will fail at runtime if the quantum runtime or QLDPC decoder plugin is not present, which is expected.
Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>
The test dynamically loads the nv-qldpc-decoder plugin and depends on CUDA-Q quantum runtime symbols (libnvqir) at runtime. Neither is available in CI, so running the binary always crashes. Keep the build dependency to verify compilation but do not register it as a ctest.
Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>