fix: propagate CuTe DSL runtime link requirements for static libraries #103

Open
francismelon wants to merge 1 commit into NVIDIA:main from francismelon:fix/cutedsl-static-link-propagation

@francismelon

What does this PR do?

Type of change: Bug fix

Overview:
This PR fixes CuTe DSL link propagation for static library targets.

Before this change, cute_dsl_setup() did not propagate all of the CuTe DSL runtime link requirements when the target type was STATIC_LIBRARY. Because a static library is not itself linked, downstream executables did not inherit the required link dependencies and options, and could fail at the final link step.

This change updates the static-library path to propagate:

  • the CuTe DSL static archive
  • the CuTe DSL cudart shim
  • the CUDA driver library
  • the _cudaLaunchKernelEx wrap linker option for CUDA < 12.8

The existing private-link behavior for non-static targets is preserved.
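
For illustration, the propagation rule described above could be sketched in CMake as follows. This is a minimal sketch under stated assumptions, not the actual contents of cmake/CuteDsl.cmake: the helper name example_link_cutedsl, the variables CUTEDSL_ARCHIVE and CUTEDSL_CUDART_SHIM, and the exact spelling of the wrap flag are hypothetical. CUDA::cuda_driver is the imported target provided by CMake's FindCUDAToolkit module, and CUDA_CTK_VERSION is the variable passed on the configure command line for this build.

```cmake
# Hypothetical helper for illustration only; the real logic lives in
# cute_dsl_setup() and may differ.
# Assumptions:
#   - find_package(CUDAToolkit) has already been called, so that
#     CUDA::cuda_driver exists.
#   - CUTEDSL_ARCHIVE and CUTEDSL_CUDART_SHIM are placeholder variables that
#     hold the CuTe DSL static archive and the cudart shim, respectively.
function(example_link_cutedsl target)
  get_target_property(_type ${target} TYPE)

  if(_type STREQUAL "STATIC_LIBRARY")
    # A static library is archived, not linked, so the requirements must be
    # visible to whatever finally links it.
    set(_scope PUBLIC)
  else()
    # Shared libraries and executables are linked directly; keep the
    # existing private behavior.
    set(_scope PRIVATE)
  endif()

  target_link_libraries(${target} ${_scope}
    ${CUTEDSL_ARCHIVE}
    ${CUTEDSL_CUDART_SHIM}
    CUDA::cuda_driver)

  # Assumed flag spelling: GNU ld's --wrap applied to _cudaLaunchKernelEx,
  # needed only for CUDA toolkits older than 12.8.
  if(DEFINED CUDA_CTK_VERSION AND CUDA_CTK_VERSION VERSION_LESS 12.8)
    target_link_options(${target} ${_scope}
      "LINKER:--wrap=_cudaLaunchKernelEx")
  endif()
endfunction()
```

With PUBLIC scope on a static library, CMake records these entries as usage requirements, so an executable that links the library inherits them at its own link step.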

Usage

This is a build-system fix only. No user-facing API or runtime usage change is introduced.

🚀 Pull Request Checklist

Thank you for contributing to TensorRT Edge-LLM! Before we review your pull request, please make sure the following items are complete.
Please also refer to the Contributor guidelines.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit.
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing.

📄 Documentation

  • Updated any necessary documentation

⚙️ Compatibility

  • The change is backward compatible

Additional Information

Reproduction environment

The issue was reproduced in a cross-build for Jetson Orin configured with:

  • -DCMAKE_BUILD_TYPE=Release
  • -DTRT_PACKAGE_DIR=/usr
  • -DCMAKE_TOOLCHAIN_FILE=cmake/aarch64_linux_toolchain.cmake
  • -DEMBEDDED_TARGET=jetson-orin
  • -DCUDA_CTK_VERSION=12.6
  • -DENABLE_CUTE_DSL=ALL

Failure before this fix

Before this change, the build failed at the final executable link step for llm_inference with unresolved CuTe DSL CUDA runtime symbols:

/usr/bin/ld: ../../../cpp/kernels/cuteDSLArtifact/aarch64/sm_87/libcutedsl_aarch64.a(CudaDialectRuntime.c.o): in function `_cudaLibraryLoadData':
(.text+0x0): undefined reference to `cudaLibraryLoadData'
/usr/bin/ld: ../../../cpp/kernels/cuteDSLArtifact/aarch64/sm_87/libcutedsl_aarch64.a(CudaDialectRuntime.c.o): in function `_cudaLibraryUnload':
(.text+0x8): undefined reference to `cudaLibraryUnload'
/usr/bin/ld: ../../../cpp/kernels/cuteDSLArtifact/aarch64/sm_87/libcutedsl_aarch64.a(CudaDialectRuntime.c.o): in function `_cudaLibraryGetKernel':
(.text+0x10): undefined reference to `cudaLibraryGetKernel'
/usr/bin/ld: ../../../cpp/kernels/cuteDSLArtifact/aarch64/sm_87/libcutedsl_aarch64.a(CudaDialectRuntime.c.o): in function `_cudaKernelSetAttributeForDevice':
(.text+0x20): undefined reference to `cudaKernelSetAttributeForDevice'
collect2: error: ld returned 1 exit status
make[2]: *** [examples/llm/CMakeFiles/llm_inference.dir/build.make:131: examples/llm/llm_inference] Error 1
make[1]: *** [CMakeFiles/Makefile2:389: examples/llm/CMakeFiles/llm_inference.dir/all] Error 2
make: *** [Makefile:91: all] Error 2

Validation after this fix

Validation performed locally:

  • pre-commit run --files cmake/CuteDsl.cmake
  • Configure succeeded with:
    • cmake .. -DCMAKE_BUILD_TYPE=Release -DTRT_PACKAGE_DIR=/usr -DCMAKE_TOOLCHAIN_FILE=cmake/aarch64_linux_toolchain.cmake -DEMBEDDED_TARGET=jetson-orin -DCUDA_CTK_VERSION=12.6 -DENABLE_CUTE_DSL=ALL
  • Build succeeded
  • Runtime binary startup check succeeded (./examples/llm/llm_inference --help printed its usage text)

Relevant output after this fix:

[100%] Linking CXX shared library ../libNvInfer_edgellm_plugin.so
[100%] Built target NvInfer_edgellm_plugin
./examples/llm/llm_inference --help
Usage: ./examples/llm/llm_inference [--help] [--engineDir=<path to engine directory>] [--multimodalEngineDir=<path to multimodal engine directory>] [--inputFile=<path to input file>] [--outputFile=<path to output file>] [--dumpProfile] [--profileOutputFile=<path to profile output file>] [--warmup=<number>] [--debug] [--dumpOutput] [--batchSize=<number>] [--maxGenerateLength=<number>] [--specDecode] [--specDraftTopK=<number>] [--specDraftStep=<number>] [--specVerifySize=<number>]

Risk

Low. The change only adjusts link visibility and propagation for CuTe DSL-related dependencies and link options for static-library targets, while preserving the existing behavior for non-static targets.

Signed-off-by: francismelon <qq1650190803@gmail.com>
francismelon requested a review from a team on June 7, 2026 at 09:17