[None][fix] Enable LoRA in EAGLE3 speculative decoding #13005

Draft
Funatiq wants to merge 2 commits into NVIDIA:main from Funatiq:dev/fix/eagle_lora

Conversation

@Funatiq
Collaborator

@Funatiq Funatiq commented Apr 13, 2026

@coderabbitai summary

Description

  • Handle optional PEFT cache manager in AdapterSlotManager and update weight pointers in CudaGraphLoraParams.
  • Add unit tests for CudaGraphLoraParams and AdapterSlotManager to validate behavior when PEFT cache manager is missing.
  • Add integration test for LoRA in EAGLE3 speculative decoding with and without CUDA graph.
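
For readers unfamiliar with the pattern in the first bullet, here is a minimal sketch of a slot manager that tolerates a missing cache manager. It is a toy illustration only: `ToySlotManager`, `PeftCacheLike`, `is_cached`, and `assign` are hypothetical names invented for this example, and the behavior shown is an assumption, not the actual `AdapterSlotManager` code changed in this PR.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Protocol


class PeftCacheLike(Protocol):
    """Hypothetical stand-in for a PEFT cache manager (not the real API)."""

    def is_cached(self, adapter_id: int) -> bool: ...


@dataclass
class ToySlotManager:
    """Toy slot manager whose cache dependency is optional."""

    num_slots: int
    peft_cache: Optional[PeftCacheLike] = None
    slots: Dict[int, int] = field(default_factory=dict)  # adapter id -> slot

    def assign(self, adapter_id: Optional[int]) -> Optional[int]:
        # Requests without an adapter never occupy a slot.
        if adapter_id is None:
            return None
        # Consult the cache only when one was provided. With no cache
        # manager, skip the check instead of dereferencing None.
        if self.peft_cache is not None and not self.peft_cache.is_cached(adapter_id):
            self.slots.pop(adapter_id, None)  # drop any stale assignment
            return None
        if adapter_id in self.slots:
            return self.slots[adapter_id]
        if len(self.slots) >= self.num_slots:
            raise RuntimeError("no free adapter slot")
        used = set(self.slots.values())
        slot = next(i for i in range(self.num_slots) if i not in used)
        self.slots[adapter_id] = slot
        return slot
```

In this sketch, constructing `ToySlotManager(num_slots=2)` without a cache still hands out slots, while a cache that reports an adapter as absent causes `assign` to return `None`.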

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

@Funatiq
Collaborator Author

Funatiq commented Apr 13, 2026

/bot run

@Funatiq Funatiq changed the title from "[fix] Enable LoRA in EAGLE3 speculative decoding" to "[None][fix] Enable LoRA in EAGLE3 speculative decoding" Apr 13, 2026
@tensorrt-cicd
Collaborator

PR_Github #43093 [ run ] triggered by Bot. Commit: bca258d Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #43093 [ run ] completed with state SUCCESS. Commit: bca258d
/LLM/main/L0_MergeRequest_PR pipeline #33732 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

Funatiq added 2 commits April 13, 2026 21:41
- Handle optional PEFT cache manager in AdapterSlotManager and update weight pointers in CudaGraphLoraParams.
- Add unit tests for CudaGraphLoraParams and AdapterSlotManager to validate behavior when PEFT cache manager is missing.
- Add integration test for LoRA in EAGLE3 speculative decoding with and without CUDA graph.

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
@Funatiq Funatiq force-pushed the dev/fix/eagle_lora branch from bca258d to a7e50d9 April 13, 2026 19:41
@Funatiq
Collaborator Author

Funatiq commented Apr 13, 2026

/bot run

@tensorrt-cicd
Collaborator

PR_Github #43108 [ run ] triggered by Bot. Commit: a7e50d9 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #43108 [ run ] completed with state SUCCESS. Commit: a7e50d9
/LLM/main/L0_MergeRequest_PR pipeline #33744 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation
