fix(meta_analyzer): store None-end fallback key so static findings are not dropped #94
rng1995
left a comment
Correct fix for the dropped-findings bug. Approving.
Static analyzers almost always leave end_line=None, while the LLM fills it in, so the exact (file, rule, start, end) key never matched and confirmed findings were silently dropped. Storing the extra (file, rule, start, None) fallback via setdefault (so an explicit LLM end_line=None still takes precedence) is the minimal, correct change, and the lookup order is unchanged. Good that the reproduction test, the multi-finding test, and the updated test_end_line_used_when_provided all pin the new behavior — and I confirmed rejected findings (is_vulnerability=False) are never stored, so they stay rejected.
Minor / optional (non-blocking): the None-end fallback means two static findings at the same (file, rule, start_line) with different end_lines will now both be confirmed if the LLM confirms that start_line — a slight over-match. For a security tool that's the safe direction (avoids false negatives), just noting the behavior.
|
Thanks for the detailed review @rng1995! On the over-match observation: you're right that two static findings at the same If the over-match becomes a problem, a follow-up could track confirmed end_lines per |
|
Please resolve the merge conflicts and open a separate PR for the tighter semantics. |
|
Conflicts resolved and pushed. The merge also brought in the upstream's cleaner Updated 880 tests pass, ruff clean. |
|
I still see a conflict, caused by other PRs being merged in the meantime. Sorry, but could you resolve it one last time? Same as your other PR: #95 |
…d_line=None

Static analyzers emit findings with end_line=None while the LLM always fills in an explicit end_line. Before the confirmed_by_start fix (issue NVIDIA#67), these findings were silently dropped because the exact (file, rule, start, end) key never matched.

Add two tests that pin the correct behaviour:
- test_static_finding_with_none_end_line_confirmed_by_start: core issue NVIDIA#67 scenario; a static finding with end_line=None is confirmed when the LLM reports the same start_line with an explicit end_line.
- test_static_findings_at_different_lines_only_confirmed_kept: two findings at different start_lines; the LLM denies one, and only the confirmed finding survives apply_filter.

Signed-off-by: Lalit Shrotriya <shrotriya.lalit@outlook.com>
45d160b to
6bdf960
|
Conflicts resolved. Rebased onto current main and re-applied the change cleanly. The previous branch had stale commits predating the suppression floor, batch isolation, and fallback filter work — applying them would have regressed those features. Rebuilt from current main. What changed: Added two regression tests to
Note on production code: The 117 tests in |
rng1995
left a comment
[Automated SkillSpector Review]
Re-review after rebase — re-confirming approval.
The production fix for the dropped-findings bug now lives on main via the cleaner confirmed_by_start approach (which also resolves the slight over-match I flagged originally, since it only applies the start-only fallback when f.end_line is None). This branch's remaining net diff is purely two well-constructed regression tests for issue #67 (static end_line=None confirmed by an LLM end_line is kept; and a confirmed-vs-denied selectivity case). mergeable_state: clean. No concerns.
Problem
LLMMetaAnalyzer.apply_filter() stored LLM-confirmed findings keyed by the exact (file, rule, start, end) tuple.

Static analyzers almost never populate end_line; it defaults to None. The LLM, however, fills in end_line for every finding it confirms. This key mismatch meant every static finding with end_line=None was silently dropped after LLM enrichment, even when the LLM had correctly confirmed it. In practice, the meta-analyzer produced zero filtered findings for entire files.
Root Cause
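A minimal sketch of the mismatch, assuming findings are keyed by plain tuples in a dict. The file name, rule name, and dict layout are hypothetical stand-ins, not the project's actual code:

```python
from typing import Optional

# Hypothetical key shape: (file, rule, start_line, end_line).
Key = tuple[str, str, int, Optional[int]]

# The LLM confirms the finding and fills in an explicit end_line.
confirmed: dict[Key, dict] = {
    ("app.py", "sql-injection", 10, 12): {"is_vulnerability": True},
}

# The static analyzer reports the same finding but leaves end_line unset.
static_key: Key = ("app.py", "sql-injection", 10, None)

# The exact-key lookup misses, so the confirmed finding would be filtered out.
print(static_key in confirmed)  # False
```

Tuple keys compare element by element, so 12 versus None in the last position is enough to make the lookup fail.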
Fix
When storing a granular LLM confirmation with an explicit end_line, also store a fallback key with end_line=None via setdefault.

setdefault is used so that if the LLM also reports a finding at the same start_line with end_line=None, its enrichment takes precedence. Lookup order is unchanged: exact key → start-only key → coarse key.
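The storage and lookup order described here can be sketched as follows. This is an illustration under assumptions, not the PR's diff: store_confirmation, lookup, the dict layout, and the coarse (file, rule) key are all hypothetical names and shapes.

```python
from typing import Optional

Key = tuple[str, str, int, Optional[int]]


def store_confirmation(confirmed: dict, file: str, rule: str,
                       start: int, end: Optional[int], enrichment: dict) -> None:
    """Record an LLM confirmation under its exact key, plus a None-end fallback."""
    confirmed[(file, rule, start, end)] = enrichment
    if end is not None:
        # setdefault never overwrites, so an entry the LLM already stored with
        # end_line=None keeps its own enrichment. A later explicit
        # end_line=None confirmation replaces the fallback through the plain
        # assignment above.
        confirmed.setdefault((file, rule, start, None), enrichment)


def lookup(confirmed: dict, coarse: dict, file: str, rule: str,
           start: int, end: Optional[int]) -> Optional[dict]:
    """Try the exact key, then the start-only key, then the coarse key."""
    for key in ((file, rule, start, end), (file, rule, start, None)):
        if key in confirmed:
            return confirmed[key]
    # Assumed coarse key shape: (file, rule).
    return coarse.get((file, rule))


confirmed: dict = {}
store_confirmation(confirmed, "app.py", "sql-injection", 10, 12, {"source": "llm"})

# A static finding with end_line=None now resolves through the fallback key.
hit = lookup(confirmed, {}, "app.py", "sql-injection", 10, None)
# A finding at a start_line the LLM did not confirm still misses.
miss = lookup(confirmed, {}, "app.py", "sql-injection", 99, None)
```

In this sketch a static finding at the same start_line with a different explicit end_line also resolves through the fallback, which is the slight over-match raised in review.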
Tests
- test_static_end_line_none_confirmed_when_llm_returns_end_line: reproduces the bug exactly
- test_static_end_line_none_multiple_same_rule: multiple findings, same rule, different start lines
- test_end_line_used_when_provided: regression; exact end_line match still works

Closes #67
Checklist
- make lint passes
- commits signed off (git commit -s)