fix(meta_analyzer): store None-end fallback key so static findings are not dropped by Shrotriya-lalit · Pull Request #94 · NVIDIA/SkillSpector

Shrotriya-lalit · 2026-06-18T10:24:10Z

Problem

LLMMetaAnalyzer.apply_filter() stored LLM-confirmed findings keyed by:

(file, rule_id, start_line, end_line)

Static analyzers almost never populate end_line — it defaults to None.
The LLM, however, fills in end_line for every finding it confirms.

This key mismatch meant every static finding with end_line=None was silently
dropped after LLM enrichment, even when the LLM had correctly confirmed it.
In practice, the meta-analyzer produced zero filtered findings for entire files.

Root Cause

# Before fix — only exact key stored:
confirmed_granular[(file, rule_id, int(start), int(end))] = enrichment
# Static finding has end_line=None → no match → finding dropped

Fix

When storing a granular LLM confirmation with an explicit end_line, also store
a fallback key with end_line=None via setdefault:

confirmed_granular[(file, rule_id, int(start), exact_end)] = enrichment
confirmed_granular.setdefault(
    (file, rule_id, int(start), None), enrichment
)

setdefault is used so that if the LLM also reports a finding at the same
start_line with end_line=None, its enrichment takes precedence. Lookup order
is unchanged: exact key → start-only key → coarse key.

Tests

test_static_end_line_none_confirmed_when_llm_returns_end_line — reproduces the bug exactly
test_static_end_line_none_multiple_same_rule — multiple findings same rule, different start lines
test_end_line_used_when_provided — regression: exact end_line match still works

Closes #67

Checklist

make lint passes
Tests added for new behaviour (including bug reproduction test)
DCO sign-off on commit (git commit -s)

rng1995

Correct fix for the dropped-findings bug. Approving.

Static analyzers almost always leave end_line=None, while the LLM fills it in, so the exact (file, rule, start, end) key never matched and confirmed findings were silently dropped. Storing the extra (file, rule, start, None) fallback via setdefault (so an explicit LLM end_line=None still takes precedence) is the minimal, correct change, and the lookup order is unchanged. Good that the reproduction test, the multi-finding test, and the updated test_end_line_used_when_provided all pin the new behavior — and I confirmed rejected findings (is_vulnerability=False) are never stored, so they stay rejected.

Minor / optional (non-blocking): the None-end fallback means two static findings at the same (file, rule, start_line) with different end_lines will now both be confirmed if the LLM confirms that start_line — a slight over-match. For a security tool that's the safe direction (avoids false negatives), just noting the behavior.

Shrotriya-lalit · 2026-06-23T08:56:54Z

Thanks for the detailed review @rng1995!

On the over-match observation: you're right that two static findings at the same (file, rule, start_line) with different end_line values will both be confirmed via the None fallback when the LLM confirms that start_line. This is an intentional trade-off — for a security scanner, a false negative (silently dropping a real finding) is worse than a slight over-match on an edge case that only arises when the same rule fires twice at the same line with different end boundaries, which is uncommon in practice.

If the over-match becomes a problem, a follow-up could track confirmed end_lines per (file, rule, start_line) and only apply the None fallback when no exact end_line match exists. Happy to do that as a separate PR if the maintainers prefer tighter semantics.

rng1995 · 2026-06-23T12:04:38Z

Please resolve the merge conflicts and do separate PR for tighter semantics

Shrotriya-lalit · 2026-06-23T16:21:10Z

Conflicts resolved and pushed.

The merge also brought in the upstream's cleaner confirmed_by_start approach (separate dict, only applied when f.end_line is None) which directly addresses the over-match concern — a static finding with an explicit end_line that differs from the LLM's end_line is now correctly dropped rather than over-matched. This is cleaner than the setdefault fallback in the original PR and implements the "tighter semantics" mentioned in your review.

Updated test_end_line_used_when_provided to use a real end_line=None Finding (the test class's _make_finding helper converts None→line via or, so we construct the Finding directly) and assert the three-way behaviour: exact match confirmed, end_line=None confirmed via confirmed_by_start, different explicit end_line dropped.

880 tests pass, ruff clean.

rng1995 · 2026-06-24T06:17:23Z

I still see a conflict arising due to PRs being merged. Sorry, but can you resolve it for the last time? Same as your other PR: #95

…d_line=None Static analyzers emit findings with end_line=None while the LLM always fills in an explicit end_line. Before the confirmed_by_start fix (issue NVIDIA#67), these findings were silently dropped because the exact (file, rule, start, end) key never matched. Add two tests that pin the correct behaviour: - test_static_finding_with_none_end_line_confirmed_by_start: core issue NVIDIA#67 scenario — static finding with end_line=None is confirmed when the LLM reports the same start_line with an explicit end_line. - test_static_findings_at_different_lines_only_confirmed_kept: two findings at different start_lines; LLM denies one — only the confirmed finding survives apply_filter. Signed-off-by: Lalit Shrotriya <shrotriya.lalit@outlook.com>

Shrotriya-lalit · 2026-06-24T06:59:11Z

Conflicts resolved. Rebased onto current main and re-applied the change cleanly.

The previous branch had stale commits predating the suppression floor, batch isolation, and fallback filter work — applying them would have regressed those features. Rebuilt from current main.

What changed: Added two regression tests to TestLLMMetaAnalyzerApplyFilter that specifically cover the issue #67 scenario (static finding with end_line=None confirmed by LLM with an explicit end_line) — a case the existing test_end_line_used_when_provided does not exercise since both its findings have explicit end_lines:

test_static_finding_with_none_end_line_confirmed_by_start — static finding with end_line=None, LLM returns same start_line with explicit end_line → finding must not be dropped (the confirmed_by_start fallback).
test_static_findings_at_different_lines_only_confirmed_kept — two findings at different lines, LLM confirms one and denies the other → only the confirmed finding survives.

Note on production code: The confirmed_by_start fix was independently implemented by PR #70 (JiayingHuang). This PR now contributes the regression tests that pin that behaviour for the common static-analyzer case where end_line=None.

117 tests in test_llm_analyzer_base.py pass, ruff clean.

rng1995

[Automated SkillSpector Review]

Re-review after rebase — re-confirming approval.

The production fix for the dropped-findings bug now lives on main via the cleaner confirmed_by_start approach (which also resolves the slight over-match I flagged originally, since it only applies the start-only fallback when f.end_line is None). This branch's remaining net diff is purely two well-constructed regression tests for issue #67 (static end_line=None confirmed by an LLM end_line is kept; and a confirmed-vs-denied selectivity case). mergeable_state: clean. No concerns.

rng1995 approved these changes Jun 21, 2026

View reviewed changes

Shrotriya-lalit force-pushed the fix/issue-67-apply-filter-end-line branch from 45d160b to 6bdf960 Compare June 24, 2026 06:58

rng1995 merged commit 36e8058 into NVIDIA:main Jun 24, 2026

rng1995 approved these changes Jun 24, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(meta_analyzer): store None-end fallback key so static findings are not dropped#94

fix(meta_analyzer): store None-end fallback key so static findings are not dropped#94
rng1995 merged 1 commit into
NVIDIA:mainfrom
Shrotriya-lalit:fix/issue-67-apply-filter-end-line

Shrotriya-lalit commented Jun 18, 2026

Uh oh!

rng1995 left a comment

Uh oh!

Shrotriya-lalit commented Jun 23, 2026

Uh oh!

rng1995 commented Jun 23, 2026

Uh oh!

Shrotriya-lalit commented Jun 23, 2026

Uh oh!

rng1995 commented Jun 24, 2026

Uh oh!

Shrotriya-lalit commented Jun 24, 2026

Uh oh!

rng1995 left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

Shrotriya-lalit commented Jun 18, 2026

Problem

Root Cause

Fix

Tests

Checklist

Uh oh!

rng1995 left a comment

Choose a reason for hiding this comment

Uh oh!

Shrotriya-lalit commented Jun 23, 2026

Uh oh!

rng1995 commented Jun 23, 2026

Uh oh!

Shrotriya-lalit commented Jun 23, 2026

Uh oh!

rng1995 commented Jun 24, 2026

Uh oh!

Shrotriya-lalit commented Jun 24, 2026

Uh oh!

rng1995 left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants