feat: add LLM wrappers and code examination tools for backport judgment #12
Pull request overview
This PR (Part 1/2) introduces the building blocks for an AI-driven “judge” that decides whether an upstream kernel patch likely needs backporting. It adds LLM prompting plus code-search and code-view tools that run against a downstream kernel repo.
Changes:
- Added LangChain tool wrappers to locate symbols and view code in a target kernel at a given git ref.
- Added system/user prompt templates to drive conservative YES/NO backport judgments.
- Added a CLI/helper entry point intended to run the LLM-based judgment for a given commit and repo paths.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| src/prejudge/judge_tools.py | Adds locate_symbol / view_code LangChain tools backed by git grep and git show. |
| src/prejudge/judge_prompt.py | Defines system and user prompts for conservative backport judgment. |
| src/prejudge/judge_llm.py | Adds a CLI + judge_with_llm wrapper intended to invoke a JudgeAgent for LLM analysis. |
| # Use git grep to find the symbol | ||
| result = subprocess.run( | ||
| ["git", "grep", "-n", f"\\b{symbol}\\b", ref], | ||
| cwd=project_path, | ||
| capture_output=True, | ||
| text=True, | ||
| timeout=30, | ||
| ) |
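git grep is passed a pattern wrapped in \b...\b, but \b is not part of POSIX basic regular expressions, which git grep uses by default. It works only where the underlying regex library supports it as an extension, so the search is not portable and can miss real matches. The symbol is also interpolated unescaped, so any regex metacharacters in it change the pattern. Prefer git grep -w (and/or escape the symbol) instead of relying on \b here.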
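A minimal sketch of the `-w` approach, assuming the symbol should be searched as a literal string. The helper name `build_grep_command` is hypothetical and not part of the PR. `-w` restricts matches to word boundaries, `-F` treats the pattern as a fixed string so no escaping is needed, and `-e` marks the pattern explicitly.

```python
from typing import List


def build_grep_command(symbol: str, ref: str) -> List[str]:
    """Build a git grep argv that matches `symbol` as a whole word at `ref`.

    Hypothetical helper for illustration; not taken from the PR.
    """
    return [
        "git", "grep",
        "-n",          # prefix each match with its line number
        "-w",          # match only at word boundaries
        "-F",          # treat the pattern as a fixed string, not a regex
        "-e", symbol,  # -e keeps a symbol starting with '-' from being read as an option
        ref,
    ]
```

The resulting list can be passed to `subprocess.run` in place of the original argv; the `cwd`, `capture_output`, `text`, and `timeout` arguments would stay unchanged.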
| if result.returncode != 0: | ||
| # Symbol not found | ||
| return f"The symbol '{symbol}' was NOT FOUND in the target kernel at ref {ref}." |
The return-code handling treats any non-zero git grep exit as “not found”, but git grep uses different non-zero codes for errors (e.g., invalid ref/repo, IO errors). This can incorrectly conclude a symbol is absent when the command actually failed; differentiate the "not found" case (typically 1) from real errors and surface errors separately.
Suggested change
| - if result.returncode != 0: | |
| - # Symbol not found | |
| - return f"The symbol '{symbol}' was NOT FOUND in the target kernel at ref {ref}." | |
| + if result.returncode == 1: | |
| + # Symbol not found | |
| + return f"The symbol '{symbol}' was NOT FOUND in the target kernel at ref {ref}." | |
| + if result.returncode != 0: | |
| + error_output = (result.stderr or result.stdout).strip() | |
| + logger.error( | |
| + f"git grep failed for symbol '{symbol}' at ref '{ref}' " | |
| + f"with return code {result.returncode}: {error_output}" | |
| + ) | |
| + return ( | |
| + f"Error searching for symbol '{symbol}' at ref {ref}: " | |
| + f"git grep failed with return code {result.returncode}" | |
| + + (f": {error_output}" if error_output else "") | |
| + ) |
| Returns: | ||
| A string listing all locations where the symbol is found, in the format: | ||
| "file_path:line_number" for each occurrence | ||
| """ |
The docstring says the tool returns locations in the format file_path:line_number, but git grep -n run against a ref outputs ref:file:line:matching_line. Either update the docstring or parse and reformat the output to match the documented contract so the agent can consume it reliably.
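One way to make the output match the documented contract is to post-process it. The sketch below is an assumption about how that could look, not the PR's code: `format_locations` is a hypothetical name, and it assumes file paths contain no `:` characters.

```python
from typing import List


def format_locations(grep_output: str, ref: str) -> List[str]:
    """Reduce `git grep -n <pattern> <ref>` output to "file_path:line_number" entries.

    Hypothetical helper for illustration; assumes paths contain no ':'.
    """
    prefix = f"{ref}:"
    locations = []
    for line in grep_output.splitlines():
        # Searching a tree-ish makes git prefix each match with "<ref>:".
        if line.startswith(prefix):
            line = line[len(prefix):]
        # The remaining shape is "path:line:content".
        parts = line.split(":", 2)
        if len(parts) == 3 and parts[1].isdigit():
            locations.append(f"{parts[0]}:{parts[1]}")
    return locations
```

Lines that do not fit the expected shape are skipped rather than raising, which keeps the tool output predictable for the agent.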
| # Limit the window size | ||
| end_line = start_line + 500 |
The line-window limiting logic is off by one: if you want to cap at 500 lines, setting end_line = start_line + 500 yields 501 lines inclusive. Adjust the bound (or clarify the intent) so the cap matches what the comment describes.
Suggested change
| - # Limit the window size | |
| - end_line = start_line + 500 | |
| + # Limit the window size to 500 lines inclusive | |
| + end_line = start_line + 499 |
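The inclusive-range arithmetic can be isolated in a small helper so the cap is stated once. This is a sketch under the assumption that line numbers are 1-based and both bounds are inclusive; `clamp_window` and `MAX_WINDOW_LINES` are hypothetical names.

```python
from typing import Tuple

MAX_WINDOW_LINES = 500  # cap taken from the comment in the reviewed code


def clamp_window(
    start_line: int, end_line: int, max_lines: int = MAX_WINDOW_LINES
) -> Tuple[int, int]:
    """Clamp an inclusive, 1-based line range to at most `max_lines` lines.

    Hypothetical helper for illustration; not taken from the PR.
    """
    start_line = max(1, start_line)
    # An inclusive range [start, end] holds end - start + 1 lines,
    # so the last allowed line is start + max_lines - 1.
    last_allowed = start_line + max_lines - 1
    end_line = max(start_line, min(end_line, last_allowed))
    return start_line, end_line
```

For example, `clamp_window(100, 10_000)` yields `(100, 599)`, which spans exactly 500 lines.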
| if str(_src_path) not in sys.path: | ||
| sys.path.insert(0, str(_src_path)) | ||
| | ||
| from prejudge.judge_agent import JudgeAgent |
from prejudge.judge_agent import JudgeAgent will fail at runtime because src/prejudge/judge_agent.py (or an equivalent module providing JudgeAgent) is not present in this PR or elsewhere in src/. Either add the missing module/class or change this import to the correct existing entry point.
| import sys | ||
| from pathlib import Path | ||
| | ||
| | ||
| def judge_with_llm( | ||
| commit_id: str, src_project_path: str, target_project_path: str | ||
| ) -> bool: | ||
| """ | ||
| Judge if a patch needs backporting using LLM analysis. | ||
| | ||
| This function uses an LLM agent to analyze the patch and determine | ||
| if the vulnerable code exists in the target kernel. | ||
| | ||
| Args: | ||
| commit_id: The upstream commit hash | ||
| src_project_path: Path to source kernel repository | ||
| target_project_path: Path to target kernel repository | ||
| | ||
| Returns: | ||
| True if patch needs backporting, False otherwise | ||
| | ||
| Raises: | ||
| ValueError: If paths are invalid | ||
| RuntimeError: If LLM analysis fails | ||
| """ | ||
| # Import with proper path handling | ||
| import sys | ||
| from pathlib import Path | ||
| | ||
| # Add src directory to path if needed | ||
| _src_path = Path(__file__).parent.parent | ||
| if str(_src_path) not in sys.path: | ||
| sys.path.insert(0, str(_src_path)) |
This module imports Path at top-level but then re-imports Path inside judge_with_llm, leaving the top-level import unused. CI runs pylint (unused-import is enabled), so this will likely lower the score; remove the unused import and avoid re-importing sys/Path inside the function.
| True if patch needs backporting, False otherwise | ||
| | ||
| Raises: | ||
| ValueError: If paths are invalid | ||
| RuntimeError: If LLM analysis fails |
The docstring documents RuntimeError being raised when LLM analysis fails, but the implementation catches all exceptions and returns True instead. Update the docstring (and/or exception behavior) so callers have an accurate contract.
Suggested change
| - True if patch needs backporting, False otherwise | |
| - Raises: | |
| - ValueError: If paths are invalid | |
| - RuntimeError: If LLM analysis fails | |
| + True if patch needs backporting, or if LLM analysis fails and the | |
| + function conservatively treats the patch as needing backporting; | |
| + False otherwise. | |
| + Raises: | |
| + ValueError: If paths are invalid |
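The contract in the suggested docstring can be sketched as follows. This is an illustration only: `judge_conservatively` and its `analyze` callback are hypothetical stand-ins for the PR's agent call, and it assumes path validation happens before the broad exception handler so that `ValueError` still reaches the caller.

```python
import logging
from pathlib import Path
from typing import Callable

logger = logging.getLogger(__name__)


def judge_conservatively(
    analyze: Callable[[], bool], src_project_path: str, target_project_path: str
) -> bool:
    """Return the analysis verdict, or True if the analysis itself fails.

    Hypothetical helper for illustration; `analyze` stands in for the LLM call.

    Raises:
        ValueError: If either repository path is not an existing directory.
    """
    # Validate outside the try block so invalid paths are not swallowed.
    for path in (src_project_path, target_project_path):
        if not Path(path).is_dir():
            raise ValueError(f"Not a directory: {path}")
    try:
        return analyze()
    except Exception as exc:  # broad on purpose: any failure means "backport"
        logger.warning("LLM analysis failed, assuming backport is needed: %s", exc)
        return True
```

Keeping the fallback in one place makes it clear to callers that a `True` result can mean either a positive verdict or a failed analysis.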
This PR is Part 1 of 2 for introducing an AI-driven patch backporting judgment system.
It establishes the foundational LLM wrappers, prompt templates, and code examination tools required for the LangChain agent.
By separating these core utilities from the main pipeline integration, this PR allows for focused review on the prompt engineering and LLM interaction interfaces.