feat: add runtime cache API for TensorRT-RTX #4180
Open
tp5uiuc wants to merge 3 commits into pytorch:main
Conversation
tp5uiuc
commented
Apr 10, 2026
dryrun: bool = _defaults.DRYRUN,
hardware_compatible: bool = _defaults.HARDWARE_COMPATIBLE,
timing_cache_path: str = _defaults.TIMING_CACHE_PATH,
runtime_cache_path: str = _defaults.RUNTIME_CACHE_PATH,
Contributor
Author
Runtime cache is a JIT-time API: it may not make much sense for cross_compile_for_windows and convert_exported_program_to_serialized_trt_engine. I have added it to the interface as a common API for entry points into Torch-TRT, but I can add it to unsupported_settings.
Collaborator
Agreed, a JIT-time cache doesn't make sense for those entry points.
Let's add it to unsupported_settings for now; even if we want this feature in the future, we can add it back.
Contributor
Author
Great, thanks for the feedback Lan 🙏
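The unsupported_settings handling discussed above could be sketched roughly as follows. Note that `UNSUPPORTED_AOT_SETTINGS` and `filter_compilation_options` are hypothetical names for illustration, not the actual Torch-TensorRT internals, and the real implementation may differ:

```python
import warnings

# Hypothetical: the real Torch-TensorRT internals may use different names.
UNSUPPORTED_AOT_SETTINGS = {"runtime_cache_path"}


def filter_compilation_options(entry_point: str, **kwargs):
    """Drop JIT-only settings from AOT entry points, warning the caller."""
    filtered = {}
    for name, value in kwargs.items():
        if name in UNSUPPORTED_AOT_SETTINGS:
            warnings.warn(
                f"{name} is a JIT-time setting and is ignored by {entry_point}"
            )
            continue
        filtered[name] = value
    return filtered
```

Dropping the key (rather than raising an error) lets a dataclass default fill in harmlessly on the AOT paths while the JIT path keeps full support.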
Add runtime cache support for TensorRT-RTX JIT compilation results, replacing the timing cache, which is not used by RTX (no autotuning).

Changes:
- Skip timing cache creation/saving for TensorRT-RTX in _TRTInterpreter
- Add RUNTIME_CACHE_PATH default and runtime_cache_path setting
- Wire up IRuntimeCache in PythonTorchTensorRTModule (setup, load, save)
- Persist runtime cache to disk with filelock for concurrent access safety
- Thread runtime_cache_path through all compile functions
- Add unit tests (12 tests) and E2E model tests (6 tests)
- Update docstrings and RST documentation

Fixes pytorch#3817

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
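The filelock-guarded persistence mentioned above might look roughly like this. The helper names `save_cache_blob`/`load_cache_blob` are illustrative, not the PR's actual functions, and the write-to-temp-then-rename step is an added assumption for crash safety:

```python
import os
import tempfile
from typing import Optional

from filelock import FileLock  # third-party: pip install filelock


def save_cache_blob(cache_path: str, blob: bytes) -> None:
    """Persist serialized cache bytes, guarding against concurrent writers."""
    with FileLock(cache_path + ".lock"):
        # Write to a temp file in the same directory, then atomically
        # replace, so readers never observe a partially written cache.
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(cache_path) or ".")
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(blob)
            os.replace(tmp, cache_path)
        finally:
            if os.path.exists(tmp):
                os.remove(tmp)


def load_cache_blob(cache_path: str) -> Optional[bytes]:
    """Read the cache if present; None means 'start with an empty cache'."""
    with FileLock(cache_path + ".lock"):
        if os.path.exists(cache_path):
            with open(cache_path, "rb") as f:
                return f.read()
    return None
```

The lock file lives next to the cache file, so multiple processes pointing at the same runtime_cache_path serialize their reads and writes.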
Version provided by upstream torch; no pin needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

runtime_cache_path is a JIT-time API for TensorRT-RTX that only applies at inference time via PythonTorchTensorRTModule. Remove it from compilation_options in cross_compile_for_windows and convert_exported_program_to_serialized_trt_engine (with a warning), letting the dataclass default fill in harmlessly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from 3893fa4 to 3be6032
Description
Add runtime cache support for TensorRT-RTX JIT compilation results, replacing the timing cache which is not used by RTX (no autotuning).
TensorRT-RTX uses JIT compilation at inference time. The runtime cache (IRuntimeCache) stores these compilation results so that kernels and execution graphs are not recompiled on subsequent runs. This is analogous to the timing cache but operates at inference time rather than build time.

Fixes #3817
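As a rough, self-contained illustration of that lifecycle (load at setup, reuse at inference, save at teardown), with a plain dict standing in for the real IRuntimeCache; every name below is hypothetical:

```python
import json
import os


class RuntimeCacheModule:
    """Toy stand-in for a module that owns a JIT runtime cache."""

    def __init__(self, cache_path: str):
        self.cache_path = cache_path
        self.cache = {}  # stand-in for IRuntimeCache
        if os.path.exists(cache_path):
            with open(cache_path) as f:
                self.cache = json.load(f)  # prior JIT results survive restarts

    def run(self, key: str) -> str:
        # First call pays the JIT cost; later calls (and later processes
        # that loaded the saved cache) hit the cache instead.
        if key not in self.cache:
            self.cache[key] = f"compiled({key})"  # stand-in for JIT compile
        return self.cache[key]

    def save(self) -> None:
        # The PR saves on module destruction; an explicit save() is used
        # here to avoid relying on __del__ ordering in a sketch.
        with open(self.cache_path, "w") as f:
            json.dump(self.cache, f)
```

A second process constructing the module with the same path starts with the saved entries and skips recompilation, which is the benefit the runtime cache provides.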
Changes
- Skip _create_timing_cache() and _save_timing_cache() when ENABLED_FEATURES.tensorrt_rtx is True (timing cache is a no-op in TRT-RTX)
- runtime_cache_path setting: new RUNTIME_CACHE_PATH default and runtime_cache_path field in CompilationSettings, threaded through all compile functions
- IRuntimeCache in PythonTorchTensorRTModule: create RuntimeConfig with runtime cache on engine setup, load from disk if available, save on module destruction
- filelock for concurrent access safety when multiple processes share the same cache file

Type of change
Checklist: