[None][test] Add sync_qa_tests Jenkins script and sync-qa-tests skill#13028
[None][test] Add sync_qa_tests Jenkins script and sync-qa-tests skill#13028xinhe-nv wants to merge 4 commits intoNVIDIA:mainfrom
Conversation
6c51318 to
4d142b1
Compare
|
@coderabbitai review the code |
|
✅ Actions performedReview triggered.
|
📝 WalkthroughWalkthroughThis PR introduces QA test list management infrastructure for TensorRT-LLM. It adds documentation describing a synchronization process, a Python script that extracts and filters pytest node IDs from YAML sources by backend and path criteria, and updates curated test lists to reflect new model variants and selection strategies. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (7)
jenkins/scripts/sync_qa_tests.py (5)
2-2: Copyright year inconsistency.Other files in this PR use
2025-2026for the copyright year range (e.g.,.coderabbit.yaml), but this file only shows2026. Consider aligning for consistency.📝 Suggested fix
-# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/sync_qa_tests.py` at line 2, Update the SPDX header in sync_qa_tests.py so the copyright year range matches the rest of the PR (use "2025-2026" instead of "2026"); locate the top-of-file SPDX comment line (the string starting with "# SPDX-FileCopyrightText: Copyright (c) ...") and change the year portion to the range "2025-2026" to align with files like .coderabbit.yaml.
15-15: Unnecessaryfrom __future__ import annotations.Per the project's Python 3.10+ requirement, the
from __future__ import annotationsimport is not needed—PEP 604 union syntax (str | None) and other modern type hints work natively.Removing it is optional but reduces noise. Based on learnings: "In TensorRT-LLM (Python requires >=3.10), you can use Python 3.10+ features throughout the codebase, and you do not need to add
from __future__ import annotations."📝 Suggested fix
-from __future__ import annotations - import argparse🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/sync_qa_tests.py` at line 15, Remove the unnecessary future import at the top of sync_qa_tests.py: delete the line "from __future__ import annotations" since the project targets Python 3.10+ and PEP 604/modern type hints are supported natively; ensure no other code changes are needed because annotations can remain as-is.
146-149: Comparison may produce unexpected results when list was already sorted.The condition
all_lines == existing_linescompares a newly sorted list againstexisting_lines, which preserves the original file order. If the file was already sorted and had no missing entries, this comparison would correctly detect no changes. However, if the file was unsorted but complete, this would incorrectly trigger a rewrite just to sort.This behavior seems intentional (ensuring the file is always sorted), but the "Nothing to change" message could be misleading since it only prints when both conditions are met.
Consider clarifying the intent or adjusting the message:
📝 Suggested clarification
all_lines = sorted(set(existing_lines) | set(missing)) if not missing and all_lines == existing_lines: - print("Nothing to change.") + print("Already up to date (no missing entries, already sorted).") return 0🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/sync_qa_tests.py` around lines 146 - 149, The current check uses all_lines = sorted(set(existing_lines) | set(missing)) and prints "Nothing to change." only when not missing and all_lines == existing_lines, which conflates “no missing entries” with “already sorted”; update logic or messaging in sync_qa_tests.py so intent is clear: either (A) treat sorting as a change by comparing sets only (e.g., compare set(existing_lines) vs set(all_lines)) and keep the message as-is, or (B) preserve existing behavior but change the message to indicate both “no missing entries and file already sorted” (referencing variables all_lines, existing_lines, missing and the surrounding check) so the output isn’t misleading. Ensure whichever option you pick updates the conditional and the printed string accordingly.
305-305: Consider usingparser.error()instead ofAssertionError.Since
required=Trueis set on the subparsers, this line should never be reached. However, if it somehow is,parser.error()would provide a more user-friendly error message thanAssertionError.That said, this is defensive code for an unreachable path, so keeping
AssertionErroris acceptable.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@jenkins/scripts/sync_qa_tests.py` at line 305, Replace the defensive raise AssertionError(f"unknown command: {args.command}") with a user-friendly parser.error call: locate the unreachable fallback where args.command is handled and replace the AssertionError with parser.error(f"unknown command: {args.command}") so argparse prints a proper usage message; ensure you use the existing parser variable and preserve the same error text for consistency.
62-96: YAML parsing uses regex—consider robustness for edge cases.The
extract_accuracy_disaggregated_from_ymlfunction parses YAML using regex line-by-line rather than a proper YAML parser. This works for the current test-db format but could break if:
- YAML uses multi-line strings or flow syntax
- Indentation varies unexpectedly
- Comments appear in unusual positions
For the current structured test-db files, this approach is pragmatic and avoids adding a
pyyamldependency. If test-db format evolves, consider revisiting..claude/skills/sync-qa-tests/SKILL.md (2)
61-61: Minor: Awkward phrasing with "only" repetition.The phrase "today the file is gen_only disagg only" reads awkwardly due to "only" appearing twice.
Consider rephrasing to: "today the file covers gen_only disagg entries exclusively"
📝 Suggested fix
- Other prefixes inside the bracket (`disagg-e2e-`, `aggr-...`, `aggr-ctx_only-`, etc.) exist in some DB files; **do not** auto-merge non–gen_only disagg lines into `llm_perf_multinode.yml` unless perf/QA owners explicitly want them in this union (today the file is **gen_only** disagg only). + Other prefixes inside the bracket (`disagg-e2e-`, `aggr-...`, `aggr-ctx_only-`, etc.) exist in some DB files; **do not** auto-merge non–gen_only disagg lines into `llm_perf_multinode.yml` unless perf/QA owners explicitly want them in this union (today the file covers **gen_only** disagg entries exclusively).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/skills/sync-qa-tests/SKILL.md at line 61, Replace the awkward phrase "today the file is **gen_only** disagg only" in SKILL.md with clearer wording such as "today the file covers **gen_only** disagg entries exclusively" to avoid duplicate "only" and improve readability; locate the exact sentence containing "Other prefixes... do not auto-merge non–gen_only disagg lines into `llm_perf_multinode.yml` unless perf/QA owners explicitly want them in this union (today the file is **gen_only** disagg only)" and update it to the suggested rephrasing.
70-70: Minor: Remove leading space inside code span.The inline code
` TIMEOUT (NNN)`has a leading space which triggers markdownlint MD038.📝 Suggested fix
- Preserve the same **` TIMEOUT (NNN)`** suffix as in test-db when present (commonly `120`). Keep list style consistent with existing entries (` - perf/...`). + Preserve the same **`TIMEOUT (NNN)`** suffix as in test-db when present (commonly `120`). Keep list style consistent with existing entries (` - perf/...`).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.claude/skills/sync-qa-tests/SKILL.md at line 70, The inline code span contains a leading space ("` TIMEOUT (NNN)`") which triggers markdownlint MD038; update the code span to remove the leading space so it reads "`TIMEOUT (NNN)`" and ensure the surrounding list entry style remains consistent with existing entries (e.g., " - perf/...")—locate the offending text by searching for the code span ` TIMEOUT (NNN)` in SKILL.md and replace it accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/integration/test_lists/qa/llm_perf_multinode.yml`:
- Around line 17-18: Update the QA node IDs to match the actual parametrization
naming: replace occurrences of "disagg-gen_only-gb200_<model>" with
"disagg_upload-gen_only-gb200_<model>" (e.g., for models deepseek-r1,
deepseek-v32, kimi-k2-thinking, gpt-oss-120b, qwen3-235b) so the QA entries
match the test parametrization keys used by the
l0_gb200_multi_nodes_perf_sanity* files; verify each listed node ID string now
uses "disagg_upload-gen_only-gb200_" exactly.
---
Nitpick comments:
In @.claude/skills/sync-qa-tests/SKILL.md:
- Line 61: Replace the awkward phrase "today the file is **gen_only** disagg
only" in SKILL.md with clearer wording such as "today the file covers
**gen_only** disagg entries exclusively" to avoid duplicate "only" and improve
readability; locate the exact sentence containing "Other prefixes... do not
auto-merge non–gen_only disagg lines into `llm_perf_multinode.yml` unless
perf/QA owners explicitly want them in this union (today the file is
**gen_only** disagg only)" and update it to the suggested rephrasing.
- Line 70: The inline code span contains a leading space ("` TIMEOUT (NNN)`")
which triggers markdownlint MD038; update the code span to remove the leading
space so it reads "`TIMEOUT (NNN)`" and ensure the surrounding list entry style
remains consistent with existing entries (e.g., " - perf/...")—locate the
offending text by searching for the code span ` TIMEOUT (NNN)` in SKILL.md and
replace it accordingly.
In `@jenkins/scripts/sync_qa_tests.py`:
- Line 2: Update the SPDX header in sync_qa_tests.py so the copyright year range
matches the rest of the PR (use "2025-2026" instead of "2026"); locate the
top-of-file SPDX comment line (the string starting with "#
SPDX-FileCopyrightText: Copyright (c) ...") and change the year portion to the
range "2025-2026" to align with files like .coderabbit.yaml.
- Line 15: Remove the unnecessary future import at the top of sync_qa_tests.py:
delete the line "from __future__ import annotations" since the project targets
Python 3.10+ and PEP 604/modern type hints are supported natively; ensure no
other code changes are needed because annotations can remain as-is.
- Around line 146-149: The current check uses all_lines =
sorted(set(existing_lines) | set(missing)) and prints "Nothing to change." only
when not missing and all_lines == existing_lines, which conflates “no missing
entries” with “already sorted”; update logic or messaging in sync_qa_tests.py so
intent is clear: either (A) treat sorting as a change by comparing sets only
(e.g., compare set(existing_lines) vs set(all_lines)) and keep the message
as-is, or (B) preserve existing behavior but change the message to indicate both
“no missing entries and file already sorted” (referencing variables all_lines,
existing_lines, missing and the surrounding check) so the output isn’t
misleading. Ensure whichever option you pick updates the conditional and the
printed string accordingly.
- Line 305: Replace the defensive raise AssertionError(f"unknown command:
{args.command}") with a user-friendly parser.error call: locate the unreachable
fallback where args.command is handled and replace the AssertionError with
parser.error(f"unknown command: {args.command}") so argparse prints a proper
usage message; ensure you use the existing parser variable and preserve the same
error text for consistency.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 173b8db8-5944-4c20-b8d3-c6b54eb65e2c
📒 Files selected for processing (7)
.claude/skills/sync-qa-tests/SKILL.md.coderabbit.yamljenkins/scripts/sync_qa_tests.pytests/integration/test_lists/qa/llm_function_core.txttests/integration/test_lists/qa/llm_function_core_sanity.txttests/integration/test_lists/qa/llm_perf_multinode.ymltests/integration/test_lists/test-db/l0_a10.yml
💤 Files with no reviewable changes (1)
- tests/integration/test_lists/test-db/l0_a10.yml
|
/bot run --stage-list "" |
|
PR_Github #43408 [ run ] triggered by Bot. Commit: |
|
PR_Github #43408 [ run ] completed with state
|
|
/bot run --stage-list "" |
|
PR_Github #43470 [ run ] triggered by Bot. Commit: |
|
PR_Github #43470 [ run ] completed with state
|
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
/bot run --stage-list "" |
|
PR_Github #43607 [ run ] triggered by Bot. Commit: |
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
PR_Github #43607 [ run ] completed with state |
Add test_openai_consistent_chat and test_ptp_quickstart_advanced_multi_gpus which are referenced in waives.txt but were missing from the QA list. Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
Made-with: Cursor
Summary by CodeRabbit
New Features
Documentation
Chores
Description
Test Coverage
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment
/bot help.