
Pr/llm kernel assistant #348

Open
gujialiang123 wants to merge 5 commits into main from pr/llm-kernel-assistant

Conversation

@gujialiang123
Collaborator

Summary
Adds an optional AI kernel debug assistant to the visualizer: users can run an initial trace-aware analysis, then ask follow-up questions grounded in recorded ops, source snippets, and a compact compute trace. Configuration follows the rest of the stack (local JSON, env, and programmatic setup) without putting secrets in the UI.

What’s included
Backend (Flask)

- GET/POST /api/llm/config — inspect / merge effective settings (no API key echoed).
- POST /api/llm/analyze — first-pass bug-oriented summary over trace + code context.
- POST /api/llm/chat — follow-up Q&A after analysis is ready.
- Supporting endpoints: records snapshot, single record, prompt template preview.

Client & config

- OpenAI-compatible Chat Completions client (llm_utils.py).
- Layered config: defaults → llm_config.local.json → optional setup_llm(config_path=...) → env (TRITON_VIZ_LLM_*, OPENAI_API_KEY) → setup_llm(**kwargs) / POST /api/llm/config.
- Package exports: setup_llm, setup_llm_from_file, clear_llm_setup, LLM_SETUP_KEYS.
- CLI: optional --llm-api-key / --llm-base-url before launch.
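The layered precedence can be pictured as a simple "later layers win" dict merge. The sketch below is purely illustrative — `merge_llm_config`, the layer variable names, and the skip-empty rule are assumptions, not the actual triton_viz implementation:

```python
def merge_llm_config(*layers):
    """Merge config layers; later layers win, but None/"" never overrides."""
    merged = {}
    for layer in layers:
        for key, value in (layer or {}).items():
            if value not in (None, ""):
                merged[key] = value
    return merged

# Hypothetical layers, lowest precedence first (mirrors the list above).
DEFAULTS = {"base_url": "https://api.openai.com/v1", "model": "default-model"}
local_json = {"model": "model-from-local-json"}
env = {"api_key": "sk-from-env"}
runtime_kwargs = {"model": "model-from-setup-llm"}

effective = merge_llm_config(DEFAULTS, local_json, env, runtime_kwargs)
# The runtime model wins, while the env api_key and default base_url survive.
```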
Frontend

- Floating AI Assistant panel in index.html: Start analysis, chat input, English copy only for LLM UI strings.

Prompts

- visualizer/prompts/system_default.md — English system instructions focused on concrete kernel debugging.

Examples

- examples/LLMtest/ — small intentionally buggy kernels + README for smoke-testing the assistant.

Repo hygiene

- .gitignore entries for local LLM config and optional debug log (e.g. llm_config.local.json, llm_chat_debug.jsonl).
How to try it

1. Configure API access (e.g. llm_config.local.json from llm_config.example.json, or triton_viz.setup_llm(...) before launch(), or env vars).
2. Run a traced script and triton_viz.launch(...).
3. Open AI Assistant → Start analysis → then ask questions.

See examples/LLMtest/README.md for minimal runnable examples.
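The steps above, in code form (an illustrative pseudocode sketch; the exact signatures of `setup_llm`, `trace`, and `launch` may differ — only `@triton_viz.trace(client=Tracer())` is confirmed by the examples in this PR):

```python
import triton_viz
from triton_viz.clients import Tracer  # import path assumed

# 1. Configure API access programmatically
#    (alternatively: llm_config.local.json or TRITON_VIZ_LLM_* env vars).
triton_viz.setup_llm(api_key="sk-...", model="...")

# 2. Trace a kernel and run it once so records are captured.
@triton_viz.trace(client=Tracer())
@triton.jit
def my_kernel(...):
    ...

# 3. Launch the visualizer, then in the UI:
#    AI Assistant → Start analysis → ask follow-up questions.
triton_viz.launch()
```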

Notes for reviewers

- No secrets in git: example config only; real keys stay local / env / runtime setup.
- POST /api/llm/config does not accept config_path (file path is Python-only via setup_llm).
- LLM UI and prompt templates are English-only by design.

Gujiang Liang added 2 commits March 20, 2026 00:30
- Flask routes: /api/llm/records, record, prompt, chat, analyze
- Record store, OpenAI-compatible client, prompt templates
- index.html: AI Assistant panel and wiring
- Ignore llm_config.local.json and llm_chat_debug.jsonl
- examples: fix @triton_viz.trace(client=Tracer()) for flip/matmul/histogram
- examples/LLMtest: small buggy kernels for assistant smoke tests

Made-with: Cursor
- Export setup_llm, setup_llm_from_file, clear_llm_setup, LLM_SETUP_KEYS
- CLI: optional --llm-api-key / --llm-base-url before launch
- Extended POST/GET /api/llm/config; layered config with config_path
- LLM chat panel strings in English
- LLMtest: inline _LL_CONFIG_PATH / _LL_API_KEY, drop preflight helper

Made-with: Cursor
@github-actions

github-actions bot commented Mar 20, 2026

Sanitizer Performance Benchmark

| Benchmark | main (min) | PR (min) | Change |
| --- | --- | --- | --- |
| gemm | 0.184s | 0.183s | -0.6% |
| gemm_oob | 0.193s | 0.191s | -0.7% |
| indirect_load | 0.294s | 0.295s | +0.2% |
| nested_loop | 0.375s | 0.371s | -1.1% |
| block_pointer_loop_advance | 0.187s | 0.187s | -0.1% |
| liger_jsd | 0.150s | 0.150s | -0.2% |
| flaggems_layernorm | 0.461s | 0.458s | -0.6% |
| swiglu | 0.184s | 0.184s | -0.1% |
| cross_entropy | 0.172s | 0.172s | -0.3% |
| fused_linear_jsd | 0.228s | 0.225s | -0.9% |
| **Total** | 2.429s | 2.416s | -0.5% |

Iterations: 1 warmup + 20 measured

@mark14wu
Collaborator

@codex review


@chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1aaa45d380


@mark14wu
Collaborator

Code review

Found 6 issues:

  1. Hardcoded developer filesystem path in example file. _LL_CONFIG_PATH is set to /home/jgu7/work/triton-viz/... instead of "" like the other two example files. This makes the example non-functional for all other users.

# Optional: visualizer LLM — set one or both before running.
_LL_CONFIG_PATH = "/home/jgu7/work/triton-viz/triton_viz/visualizer/llm_config.local.json" # e.g. "/path/to/llm.json" (same shape as llm_config.example.json)
_LL_API_KEY = "" # e.g. "sk-..." if the key is not in that file

  2. DEFAULT_MODEL = "gpt-5-mini" is not a real OpenAI model name. This will cause every default LLM call to fail with a model-not-found error. Likely should be "gpt-4o-mini" or another valid model.

DEFAULT_BASE_URL = "https://api.openai.com/v1"
DEFAULT_MODEL = "gpt-5-mini"
PROMPTS_DIR = os.path.join(os.path.dirname(__file__), "prompts")

  3. activeOpUuid property does not exist on the active block object. window.__tritonVizActiveBlock is an OpWorkspaceBlock instance which has no activeOpUuid field. As a result, window.__tritonVizCurrentOp is always undefined, and context-aware chat always receives uuid: null, defeating the "current selected op" feature.

const active = window.__tritonVizActiveBlock;
if (active && active.activeOpUuid) {
    window.__tritonVizCurrentOp = active.activeOpUuid;
}

  4. JSON truncation at character boundaries produces malformed JSON. text[:LLM_SYS_CONTEXT_MAX_CHARS] (and similar) slices the serialized JSON string at an arbitrary character offset, producing syntactically invalid JSON that is sent to the LLM as context. The truncation should happen on the data structure before serialization, not on the serialized string.

if len(text) > LLM_SYS_CONTEXT_MAX_CHARS:
    text = text[:LLM_SYS_CONTEXT_MAX_CHARS]
return "Kernel run records summary (JSON, truncated due to size limit): " + text
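A sketch of the suggested fix — truncating the data before serialization so the context stays valid JSON. The record shape, the helper name `summarize_records`, and the limit value here are illustrative, not the PR's actual code:

```python
import json

LLM_SYS_CONTEXT_MAX_CHARS = 8000  # illustrative budget

def summarize_records(records):
    """Drop whole records from the tail until the serialized JSON fits,
    so the string handed to the LLM always parses."""
    kept = list(records)
    text = json.dumps(kept)
    while kept and len(text) > LLM_SYS_CONTEXT_MAX_CHARS:
        kept.pop()  # remove one complete record, re-serialize, re-check
        text = json.dumps(kept)
    return "Kernel run records summary (JSON, truncated due to size limit): " + text
```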

  5. Unauthenticated POST /api/llm/config endpoint accepts api_key. There is no authentication or CSRF protection. When share=True, the visualizer is network-accessible, allowing any reachable host to overwrite the in-memory API key.

            "model": cfg.model,
            "timeout_sec": cfg.timeout_sec,
            "max_tokens": cfg.max_tokens,
            "debug_log_enabled": cfg.debug_log_enabled,
            "llm_setup_file": setup_basename,
            "allowed_keys": sorted(LLM_SETUP_KEYS),
        }
    )

@app.route("/api/llm/config", methods=["POST"])
def post_llm_config():
    """
    Merge JSON fields into the in-memory LLM setup (same as ``triton_viz.setup_llm``).
    Allowed keys: same as ``LLM_SETUP_KEYS`` (see GET response). Does **not** accept
    ``config_path`` (use ``setup_llm(config_path=...)`` in Python before ``launch``).
    Pass ``null`` or ``\"\"`` for string fields to clear that patch entry.
    """
    data = request.json or {}
    payload = {k: data[k] for k in data if k in LLM_SETUP_KEYS}
    if not payload:
        return (
            jsonify(
                {
                    "error": f"Provide at least one of: {sorted(LLM_SETUP_KEYS)}",

  6. debug_log_path accepted via unauthenticated HTTP endpoint. The POST /api/llm/config endpoint also accepts debug_log_path, which calls os.makedirs and open() on the user-supplied path. This allows unauthenticated callers to create directories and write files at arbitrary filesystem locations.

        return value.strip().lower() in {"1", "true", "yes", "on"}
    return False

def _resolve_debug_log_path(path_value: Any) -> str:
    text = str(path_value or "").strip()
    if not text:
        return os.path.join(os.path.dirname(__file__), DEFAULT_DEBUG_LOG_NAME)
    if os.path.isabs(text):
        return text
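One way to close the arbitrary-write hole is to confine the resolved path to a fixed base directory by keeping only the filename. A sketch under stated assumptions — `BASE_DIR`, the helper name, and the escape check are illustrative, not the PR's code:

```python
import os

# Pin all debug logs under the module's own directory (illustrative choice).
BASE_DIR = os.path.realpath(os.path.dirname(os.path.abspath(__file__)))
DEFAULT_DEBUG_LOG_NAME = "llm_chat_debug.jsonl"

def resolve_debug_log_path(path_value):
    """Keep only the basename of the user-supplied value and resolve it
    under BASE_DIR; reject anything that still escapes the directory."""
    name = os.path.basename(str(path_value or "").strip()) or DEFAULT_DEBUG_LOG_NAME
    resolved = os.path.realpath(os.path.join(BASE_DIR, name))
    if os.path.dirname(resolved) != BASE_DIR:
        raise ValueError("debug_log_path escapes the allowed directory")
    return resolved
```

With this shape, a payload like `"../../etc/passwd"` collapses to a file named `passwd` inside `BASE_DIR` instead of touching `/etc`.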

🤖 Generated with Claude Code

