feat: add config judge logic to find whether in downstream kernel #9
Conversation
Pull request overview
Adds a new “prejudge” utility under src/prejudge/ to analyze a kernel commit’s patch, infer related CONFIG_ options, and decide whether at least one required config is enabled in any bundled downstream architecture config.
Changes:
- Added `PrejudgeControllerCLI` to `git show` a commit, extract required configs, and print `true`/`false` based on downstream arch config enablement.
- Added `judge_config.py` patch analyzer to infer `CONFIG_` symbols from Makefile/source preprocessor context.
- Added downstream config snapshot for `sw_64` under `src/prejudge/config_data/`.
Reviewed changes
Copilot reviewed 3 out of 8 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| `src/prejudge/prejudge.py` | New CLI/controller that fetches commit patch, runs config inference, and checks inferred configs against per-arch config snapshots. |
| `src/prejudge/judge_config.py` | New patch parsing + (intended) Kconfig dependency analyzer to infer required CONFIG symbols from patch changes. |
| `src/prejudge/config_data/sw_64` | Adds sw_64 downstream config snapshot used by the downstream enablement check. |
```python
def analyze_commit(self, commit_id: str) -> Dict[str, Set[str]]:
    """
    Analyze a commit and return all judgment results
    Returns a dict with keys like 'config', 'others', etc.
    """
    # Get patch from commit
    patch_content = self.get_patch_from_commit(commit_id)
    if not patch_content:
        return {}

    return patch_content
```
analyze_commit() is annotated/documented to return a Dict[str, Set[str]], but it actually returns the patch content as a str (and {} on failure). This mixes return types and will break callers that rely on the declared API; consider making this method return str (empty string on failure) or returning a structured dict consistently.
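A minimal sketch of the str-returning variant (the class and the `get_patch_from_commit` stub below are illustrative stand-ins, not the PR's actual code):

```python
from typing import Optional

class PrejudgeControllerSketch:
    """Illustrative stand-in for the PR's controller class."""

    def get_patch_from_commit(self, commit_id: str) -> Optional[str]:
        # Stub: the real method shells out to `git show <commit_id>`.
        return f"diff --git (patch for {commit_id})" if commit_id else None

    def analyze_commit(self, commit_id: str) -> str:
        """Return the patch content, or an empty string on failure."""
        patch_content = self.get_patch_from_commit(commit_id)
        if not patch_content:
            return ""  # same type on failure, so callers can rely on str
        return patch_content
```

With a consistent `str` return, callers never need to branch on whether they got a dict or raw patch text.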
```python
try:
    # Save patch to temp file
    temp_patch = self.save_patch_to_temp(patch_content)

    # Analyze
    analyzer = PatchConfigAnalyzer(str(self.kernel_dir))
    configs = analyzer.analyze_patch(str(temp_patch))

    # Clean up temp file
    temp_patch.unlink(missing_ok=True)

    # Format as CONFIG_XXX=y
    return {f"{config}=y" for config in configs}

except Exception:
    return set()
```
Temporary patch file cleanup is not guaranteed: if analyzer.analyze_patch() raises, the function returns from the except and leaves the temp file behind. Use a try/finally (or context manager) so the temp file is always removed, even on errors.
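A sketch of the try/finally shape (the inner `analyze` function is a hypothetical stand-in for `PatchConfigAnalyzer.analyze_patch`; only the cleanup pattern is the point):

```python
import tempfile
from pathlib import Path
from typing import Set

def judge_configs(patch_content: str) -> Set[str]:
    """Save the patch to a temp file, analyze it, and always clean up."""

    def analyze(patch_path: str) -> Set[str]:
        # Hypothetical stand-in for PatchConfigAnalyzer.analyze_patch()
        return {"CONFIG_NET"}

    tmp = tempfile.NamedTemporaryFile(mode="w", suffix=".patch", delete=False)
    temp_patch = Path(tmp.name)
    try:
        tmp.write(patch_content)
        tmp.close()
        configs = analyze(str(temp_patch))
        return {f"{config}=y" for config in configs}
    finally:
        # Runs even when analyze() raises, so the temp file never leaks
        temp_patch.unlink(missing_ok=True)
```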
```python
    Judge required CONFIG options for the patch
    Returns set of CONFIG_XXX=y strings
    """
    from judge_config import PatchConfigAnalyzer
```
Importing PatchConfigAnalyzer via from judge_config import PatchConfigAnalyzer only works when this file is executed as a script from its directory. If src.prejudge.prejudge is imported/executed as a module (e.g., python -m src.prejudge.prejudge), this import will fail; consider using a package-relative import with a fallback to the script-style import.
```diff
-from judge_config import PatchConfigAnalyzer
+try:
+    from .judge_config import PatchConfigAnalyzer
+except ImportError:
+    from judge_config import PatchConfigAnalyzer
```
```python
# No patch content, return True
print("Error: Could not retrieve patch content. Please check the commit ID and repository.")
return
```
analyze_and_report() claims the output format is true/false, but on failure it prints a human-readable error string and returns without printing a boolean. If this output is parsed by automation, emit the boolean consistently (and send errors to stderr / nonzero exit code instead).
```diff
-# No patch content, return True
-print("Error: Could not retrieve patch content. Please check the commit ID and repository.")
-return
+print(
+    "Error: Could not retrieve patch content. Please check the commit ID and repository.",
+    file=sys.stderr
+)
+print("false")
+sys.exit(1)
```
```python
# If config_data doesn't exist, assume True
return True
```
When config_data/ is missing, check_config_in_arch_configs() currently returns True (treating unknown as enabled). This can create false positives for the downstream-kernel check; consider returning False or surfacing an error so the caller can decide how to handle missing config data.
```diff
-# If config_data doesn't exist, assume True
-return True
+# If config_data doesn't exist, we cannot confirm the CONFIG is enabled
+return False
```
```python
for symbol_name in config_symbols:
    symbol = self.kconf.syms.get(symbol_name)

    if not symbol:
        # Symbol not found in Kconfig
        results[symbol_name] = set()
        continue
```
KconfigAnalyzer.analyze_config_dependencies() looks up symbols via self.kconf.syms.get(symbol_name), but kconfiglib symbol keys are typically bare names (e.g., FOO), not CONFIG_FOO. As written, lookups will fail and dependencies will always be empty. Strip the CONFIG_ prefix for kconfiglib access and normalize extracted deps back to CONFIG_... when returning results.
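A minimal sketch of the prefix handling (a plain dict stands in for `kconf.syms` here; in real kconfiglib the values are `Symbol` objects, but the keying by bare names is the same):

```python
def lookup_symbol(syms: dict, symbol_name: str):
    """Look up a symbol in a kconfiglib-style table keyed by bare names."""
    # kconfiglib keys symbols as "FOO", not "CONFIG_FOO"
    if symbol_name.startswith("CONFIG_"):
        symbol_name = symbol_name[len("CONFIG_"):]
    return syms.get(symbol_name)

def to_config_name(bare_name: str) -> str:
    """Normalize an extracted bare dependency back to CONFIG_* form."""
    if bare_name.startswith("CONFIG_"):
        return bare_name
    return f"CONFIG_{bare_name}"
```

Stripping on lookup and re-prefixing on output keeps the module's public results in the `CONFIG_*` form the rest of the tool expects.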
```python
# Convert to string and extract CONFIG_ references
# kconfiglib expressions can be complex, so we use string representation
expr_str = str(expr)

# Match CONFIG_FOO patterns
pattern = r'\b(CONFIG_[A-Z0-9_]+)\b'
configs = set(re.findall(pattern, expr_str))
```
_extract_configs_from_expr() searches for CONFIG_... tokens in str(expr), but kconfiglib expression strings generally contain bare symbols (e.g., NET && INET), so this will miss dependencies. Consider walking the expression tree or extracting bare symbols and converting them to CONFIG_... before returning.
```diff
-# Convert to string and extract CONFIG_ references
-# kconfiglib expressions can be complex, so we use string representation
-expr_str = str(expr)
-# Match CONFIG_FOO patterns
-pattern = r'\b(CONFIG_[A-Z0-9_]+)\b'
-configs = set(re.findall(pattern, expr_str))
+configs = set()
+ignored_tokens = {
+    "y", "n", "m",
+    "&&", "||", "!", "=", "!=", "(", ")"
+}
+
+def add_symbol_name(name: str) -> None:
+    if not name or name in ignored_tokens:
+        return
+    if name.startswith("CONFIG_"):
+        configs.add(name)
+    elif re.fullmatch(r"[A-Z0-9_]+", name):
+        configs.add(f"CONFIG_{name}")
+
+def walk(node) -> None:
+    if node is None:
+        return
+    if isinstance(node, kconfiglib.Symbol):
+        add_symbol_name(getattr(node, "name", ""))
+        return
+    if isinstance(node, (tuple, list)):
+        for child in node:
+            walk(child)
+        return
+    # Fallback for expression fragments that are represented as strings
+    # or for kconfiglib internals we do not explicitly handle above.
+    expr_str = str(node)
+    for match in re.findall(r'\bCONFIG_[A-Z0-9_]+\b', expr_str):
+        add_symbol_name(match)
+    # kconfiglib expression strings typically use bare symbol names
+    # such as "NET && INET", so convert those to CONFIG_* names.
+    for token in re.findall(r'\b[A-Z][A-Z0-9_]*\b', expr_str):
+        add_symbol_name(token)
+
+walk(expr)
```
```python
def __init__(self, kernel_dir: str):
    self.kernel_dir = kernel_dir
    self.source_analyzer = SourceAnalyzer(kernel_dir)
    self.kconfig_analyzer = KconfigAnalyzer(kernel_dir)

def analyze_patch(self, patch_file: str) -> Set[str]:
    """
    Analyze a patch file and return all required CONFIG options.
    Returns empty set if any error occurs or no configs found.
    """
    try:
        # Read patch
        patch_content = Path(patch_file).read_text()
    except Exception:
        return set()

    try:
        # Parse patch to get modified lines
        parser = PatchParser(patch_content)
        files_changed = parser.parse()

        if not files_changed:
            return set()

        # Extract CONFIG conditions from modified lines
        all_config_conditions = set()

        for source_file, line_numbers in files_changed.items():
            # Only analyze C source files
            if not (source_file.endswith('.c') or source_file.endswith('.h')):
                continue

            # Try to get CONFIG from Makefile (most accurate)
            makefile_configs = self.source_analyzer.extract_config_from_makefile(source_file)
            all_config_conditions.update(makefile_configs)

            # If Makefile didn't give us the answer, try source code analysis
            if not makefile_configs:
                configs = self.source_analyzer.extract_config_conditions(
                    source_file, line_numbers
                )
                all_config_conditions.update(configs)

        return all_config_conditions
    except Exception:
```
PatchConfigAnalyzer initializes self.kconfig_analyzer, but analyze_patch() never uses it, so Kconfig dependencies are never applied despite the module/docstrings describing dependency analysis. Either remove the unused analyzer or call it (e.g., expand all_config_conditions via get_all_required_configs()) so results match the intended behavior.
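One way the wiring could look (a toy stub stands in for `KconfigAnalyzer`, and `get_all_required_configs` is an assumed expansion hook, not a method shown in the PR):

```python
from typing import Dict, Set

class StubKconfigAnalyzer:
    """Toy stand-in for KconfigAnalyzer with a fixed dependency table."""

    DEPS: Dict[str, Set[str]] = {"CONFIG_INET": {"CONFIG_NET"}}

    def get_all_required_configs(self, configs: Set[str]) -> Set[str]:
        # Expand each directly inferred config with its Kconfig deps
        expanded = set(configs)
        for config in configs:
            expanded |= self.DEPS.get(config, set())
        return expanded

def analyze_patch_with_deps(direct_configs: Set[str]) -> Set[str]:
    """Use the analyzer that __init__ creates instead of dropping it."""
    kconfig_analyzer = StubKconfigAnalyzer()
    return kconfig_analyzer.get_all_required_configs(direct_configs)
```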
```python
elif ('-y +=' in line or '-y +=' in line.replace(' ', '') or
      '-objs +=' in line or '-objs +=' in line.replace(' ', '')):
```
The composite-object detection tries to handle both spaced and unspaced forms by checking '-y +=' in line.replace(' ', ''), but after removing spaces the token becomes '-y+=' (same for -objs+=). As written, these line.replace(' ', '') checks will never match; update them to test the correct no-space forms.
```diff
-elif ('-y +=' in line or '-y +=' in line.replace(' ', '') or
-      '-objs +=' in line or '-objs +=' in line.replace(' ', '')):
+elif ('-y +=' in line or '-y+=' in line.replace(' ', '') or
+      '-objs +=' in line or '-objs+=' in line.replace(' ', '')):
```
```python
class PreprocessorConditionTracker:
    """Track preprocessor conditional nesting in C code"""

    def __init__(self):
        self.condition_stack: List[str] = []
        self.line_to_conditions: Dict[int, List[str]] = defaultdict(list)

    def process_line(self, line: str, source_line: int = None):
        """Process a preprocessed line, tracking conditionals"""
        line = line.strip()

        # Handle conditional directives
        if line.startswith("#if"):
            self.condition_stack.append(line)
        elif line.startswith("#elif"):
            if self.condition_stack:
                self.condition_stack.pop()
            self.condition_stack.append(line.replace("elif", "if"))
        elif line.startswith("#else"):
            if self.condition_stack:
                last = self.condition_stack.pop()
                # Negate the last condition for else branch
                cond = last[3:].strip()  # Remove "#if "
                self.condition_stack.append(f"!({cond})")
        elif line.startswith("#endif"):
            if self.condition_stack:
                self.condition_stack.pop()

        # Track source line mapping
        elif source_line is not None:
            if line.startswith("#line"):
                match = re.search(r'#line (\d+)', line)
                if match:
                    source_line = int(match.group(1))
            else:
                # Associate current conditions with this source line
                if self.condition_stack:
                    self.line_to_conditions[source_line] = self.condition_stack.copy()
                source_line += 1

    def get_conditions_for_line(self, line_num: int) -> List[str]:
        """Get all preprocessor conditions for a given source line"""
        return self.line_to_conditions.get(line_num, [])
```
PreprocessorConditionTracker is defined but never referenced anywhere in this module. If it’s not needed, removing it will reduce maintenance surface; if it is intended to be used, wire it into the source analysis path so it’s exercised.
```diff
-class PreprocessorConditionTracker:
-    """Track preprocessor conditional nesting in C code"""
-
-    def __init__(self):
-        self.condition_stack: List[str] = []
-        self.line_to_conditions: Dict[int, List[str]] = defaultdict(list)
-
-    def process_line(self, line: str, source_line: int = None):
-        """Process a preprocessed line, tracking conditionals"""
-        line = line.strip()
-
-        # Handle conditional directives
-        if line.startswith("#if"):
-            self.condition_stack.append(line)
-        elif line.startswith("#elif"):
-            if self.condition_stack:
-                self.condition_stack.pop()
-            self.condition_stack.append(line.replace("elif", "if"))
-        elif line.startswith("#else"):
-            if self.condition_stack:
-                last = self.condition_stack.pop()
-                # Negate the last condition for else branch
-                cond = last[3:].strip()  # Remove "#if "
-                self.condition_stack.append(f"!({cond})")
-        elif line.startswith("#endif"):
-            if self.condition_stack:
-                self.condition_stack.pop()
-
-        # Track source line mapping
-        elif source_line is not None:
-            if line.startswith("#line"):
-                match = re.search(r'#line (\d+)', line)
-                if match:
-                    source_line = int(match.group(1))
-            else:
-                # Associate current conditions with this source line
-                if self.condition_stack:
-                    self.line_to_conditions[source_line] = self.condition_stack.copy()
-                source_line += 1
-
-    def get_conditions_for_line(self, line_num: int) -> List[str]:
-        """Get all preprocessor conditions for a given source line"""
-        return self.line_to_conditions.get(line_num, [])
```
No description provided.