reset error patterns at the start of each LogAnalyzer.analyze call #14
base: main
@@ -85,7 +85,6 @@ def parse(self, line: str) -> Optional[ParsedEntry]:
         groups = match.groupdict()

         timestamp = self._parse_timestamp(groups.get("timestamp", ""))
-        level = detect_log_level(line)  # Check full line for level

         metadata = {}
         if groups.get("host"):
@@ -94,11 +93,12 @@ def parse(self, line: str) -> Optional[ParsedEntry]:
             metadata["process"] = groups["process"]
             source = groups["process"]
         else:
             # Use host as source when no process name
             source = groups.get("host")
         if groups.get("pid"):
             metadata["pid"] = groups["pid"]

+        level = detect_log_level(groups.get("message", ""))

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Verify CLI level filtering behavior with the parser change
# Check how _parse_file uses level filtering before parsing
ast-grep --pattern $'def _parse_file($$$):
$$$
if include_levels:
$$$
$$$
parsed = parser.parse($$$)
$$$'
# Check if there are any tests that validate level filtering consistency
rg -n -A5 -B5 'include_levels' --glob '*test*.py'

Repository: HrachShah/log-analyzer-cli
Length of output: 52

🏁 Script executed:

# Check CLI code structure around level filtering
fd 'cli.py' -x cat -n {} | head -250
# Also check syslog parser around line 100
fd 'syslog.py' -x cat -n {} | grep -A 10 -B 10 'level = detect_log_level'

Repository: HrachShah/log-analyzer-cli
Length of output: 10564

🏁 Script executed:

# Find the _parse_file function in cli.py
rg -n '_parse_file' --glob '*.py' -A 30
# Also check if detect_log_level is being used elsewhere in cli.py
rg -n 'detect_log_level' --glob '*.py'

Repository: HrachShah/log-analyzer-cli
Length of output: 4722

🏁 Script executed:

# Get the complete _parse_file function to understand the flow
sed -n '186,250p' src/log_analyzer_cli/cli.py

Repository: HrachShah/log-analyzer-cli
Length of output: 1341

🏁 Script executed:

# Check for tests that validate the level filtering behavior
rg -n 'error-db|hostname.*error|include_levels' tests/ --glob '*.py' -A 3 -B 3
# Also check what detect_log_level does
sed -n '121,150p' src/log_analyzer_cli/utils.py

Repository: HrachShah/log-analyzer-cli
Length of output: 863

🏁 Script executed:

# Check if there are any tests showing the hostname/error issue being fixed
rg -n 'error-db|hostname' tests/ --glob '*.py' -A 5 -B 5
# Check if UNKNOWN levels are converted to INFO or other defaults anywhere
rg -n 'UNKNOWN' src/ --glob '*.py' -B 2 -A 2
# List test files to understand test coverage
find tests/ -name '*.py' -type f | head -20

Repository: HrachShah/log-analyzer-cli
Length of output: 3642

🏁 Script executed:

# Check for tests that specifically validate the level filtering consistency
rg -n 'def test.*level' tests/ --glob '*.py' -A 10
# Check the syslog test to see if it validates the message-only level detection
sed -n '34,60p' tests/test_parsers.py

Repository: HrachShah/log-analyzer-cli
Length of output: 3822

The parser and CLI-level filtering have mismatched level detection logic that breaks filtering consistency. The CLI filters entries using the raw line, while the parser now detects the level from the message only. A line whose hostname or process name contains a level keyword can therefore pass the CLI filter even though its parsed entry carries a different level.

This contradicts the PR's goal of preventing false positives from hostnames and process names, and it breaks the semantic contract of the level filter: users expect filtered results to have the filtered level. The CLI should filter using the parsed entry's level, not the raw line.
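
As a rough illustration of filtering on the parsed entry's level rather than on the raw line, the sketch below uses hypothetical stand-ins: the trimmed `ParsedEntry`, the `MessageOnlyParser` class, and the `filter_by_parsed_level` helper are assumptions made for this example and are not taken from the repository. Only the names `parser.parse` and `include_levels` come from the scripts above.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Set


@dataclass
class ParsedEntry:
    """Hypothetical, trimmed-down stand-in for the project's ParsedEntry."""
    raw: str
    level: str


class MessageOnlyParser:
    """Hypothetical parser that derives the level from the message part only.

    Assumes a simplified "host process: message" layout for illustration.
    """

    def parse(self, line: str) -> Optional[ParsedEntry]:
        _, sep, message = line.partition(": ")
        if not sep:
            return None  # line does not match the assumed layout
        level = "ERROR" if "error" in message.lower() else "INFO"
        return ParsedEntry(raw=line, level=level)


def filter_by_parsed_level(
    lines: Iterable[str],
    parser: MessageOnlyParser,
    include_levels: Set[str],
) -> Iterator[ParsedEntry]:
    """Parse first, then filter on the level the parser assigned."""
    wanted = {name.upper() for name in include_levels}
    for line in lines:
        entry = parser.parse(line)
        if entry is None:
            continue
        if entry.level in wanted:
            yield entry


lines = [
    "error-db app: started",            # "error" appears only in the hostname
    "web-1 app: error opening socket",  # "error" appears in the message
]
errors = list(filter_by_parsed_level(lines, MessageOnlyParser(), {"error"}))
```

With this ordering, the filter and the parser share one source of truth for the level, so the `error-db` line is not returned for an `error` filter. A raw-line check would have let it through.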

         return ParsedEntry(
             raw=line,
             timestamp=timestamp,
issue (bug_risk): Restricting log-level detection to only the message body may miss levels present in the prefix or structured fields.

Previously, `detect_log_level` received the full raw line, so it could pick up level markers in prefixes (e.g., `ERROR`, `WARN`, numeric priorities) outside the `message` field. With the new call using only `groups["message"]`, those indicators will be ignored in formats where the level isn't inside `message`, which may regress detection for some syslog variants. Consider calling `detect_log_level` on `message` first and falling back to the full `line` when `message` is empty or the result is unknown, to retain robustness while favoring the new behavior.
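
The message-first fallback suggested in this comment could look roughly like the sketch below. The `detect_log_level` here is a simplified, hypothetical stand-in (a keyword regex that returns the string `"UNKNOWN"` when nothing matches), and `detect_level_with_fallback` is a name invented for this example. The project's real helper in `utils.py` may behave differently.

```python
import re

# Hypothetical stand-in for the project's detect_log_level: a plain keyword
# search that returns "UNKNOWN" when no level keyword is present.
_LEVEL_RE = re.compile(
    r"\b(CRITICAL|ERROR|WARNING|WARN|INFO|DEBUG)\b", re.IGNORECASE
)


def detect_log_level(text: str) -> str:
    match = _LEVEL_RE.search(text)
    if match is None:
        return "UNKNOWN"
    level = match.group(1).upper()
    return "WARNING" if level == "WARN" else level


def detect_level_with_fallback(message: str, line: str) -> str:
    """Prefer the message body; fall back to the full line if that fails."""
    level = detect_log_level(message) if message else "UNKNOWN"
    if level == "UNKNOWN":
        # Nothing usable in the message, so look at the prefix as well.
        level = detect_log_level(line)
    return level


# The level marker sits in the prefix, outside the message field.
prefix_level = detect_level_with_fallback(
    "disk almost full", "WARN host app: disk almost full"
)
# A level keyword inside the hostname is picked up by the fallback too.
hostname_level = detect_level_with_fallback(
    "started", "Jan 1 error-db app: started"
)
```

One trade-off is visible in the second call: whenever the message has no level, the fallback scans the full line, which can bring back the hostname false positive the PR sets out to remove. Limiting the fallback to known prefix fields, instead of the whole line, would be one way to narrow that.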