parse Apache-style timestamps with a numeric timezone offset #19
HrachShah wants to merge 1 commit into
The Apache combined log format always emits timestamps in the form '01/Jan/2025:12:00:00 -0700' (day/Mon/year:hour:minute:second, followed by a sign and an hhmm offset). The formats list in _try_parse_datetime only had the offset-less '%d/%b/%Y:%H:%M:%S' form, so any Apache log line whose timestamp included the numeric offset returned None from _try_parse_datetime. That cascaded to parse_timestamp returning None, which made the apache parser's _parse_timestamp fall back to the same offset-less format inside the parser and silently drop the timezone information.

As a result, --start-time / --end-time filtering on a log file in the canonical Apache combined format with offsets either applied the window incorrectly (because the time zones disagreed silently) or produced no entries at all (because the parser-level fallback used ts_str.split()[0], which left the offset in the string and made the follow-up strptime call fail with the colon in the offset).

Add a '%d/%b/%Y:%H:%M:%S %z' entry to the formats list so the offset-aware Apache timestamp is parsed in one shot, and add tests/test_utils.py with three pin cases (positive offset, negative offset, offset-less still works) so the next refactor of the format list doesn't drop the offset form.
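To make the failure mode concrete, here is a minimal sketch using only the standard library's datetime.strptime, not the project's own helpers. It shows that the offset-less format rejects a timestamp with a trailing offset, while the '%z' variant parses it. The variable names are illustrative, and the month abbreviation assumes an English/C locale.

```python
from datetime import datetime, timedelta

ts = "01/Jan/2025:12:00:00 -0700"

# The offset-less format cannot consume the trailing " -0700",
# so strptime raises ValueError (unconverted data remains).
try:
    datetime.strptime(ts, "%d/%b/%Y:%H:%M:%S")
    offsetless_ok = True
except ValueError:
    offsetless_ok = False

# The offset-aware format parses the whole string in one call
# and yields a timezone-aware datetime.
parsed = datetime.strptime(ts, "%d/%b/%Y:%H:%M:%S %z")
```

After this runs, offsetless_ok is False and parsed carries a UTC offset of minus seven hours.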
Reviewer's Guide
Adds support for Apache-style timestamps with numeric timezone offsets to the internal timestamp parser and introduces tests to cover both offset and non-offset variants.
Sequence diagram for updated _try_parse_datetime parsing Apache timestamps with timezone offset
sequenceDiagram
participant Caller
participant Utils as _try_parse_datetime
Caller->>Utils: _try_parse_datetime("01/Jan/2025:12:00:00 -0700")
activate Utils
Utils->>Utils: datetime.strptime(ts_str, "%Y-%m-%dT%H:%M:%S")
Utils-->>Utils: ValueError
Utils->>Utils: datetime.strptime(ts_str, "%Y-%m-%dT%H:%M:%S%z")
Utils-->>Utils: ValueError
Utils->>Utils: datetime.strptime(ts_str, "%d/%b/%Y:%H:%M:%S")
Utils-->>Utils: ValueError
Utils->>Utils: datetime.strptime(ts_str, "%d/%b/%Y:%H:%M:%S %z")
Utils-->>Caller: timezone-aware datetime
deactivate Utils
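The try-each-format flow in the diagram can be sketched roughly as follows. This is an illustration, not the code in log_analyzer_cli/utils.py: the function name try_parse_datetime_sketch and the FORMATS constant are hypothetical, and the list contains only the four formats that appear in the diagram, while the real list may hold more.

```python
from datetime import datetime
from typing import Optional

# Formats in the order shown in the sequence diagram (assumed, possibly incomplete).
FORMATS = [
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%dT%H:%M:%S%z",
    "%d/%b/%Y:%H:%M:%S",
    "%d/%b/%Y:%H:%M:%S %z",  # the entry this PR adds
]


def try_parse_datetime_sketch(ts_str: str) -> Optional[datetime]:
    """Return the first successful parse, or None if no format matches."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(ts_str, fmt)
        except ValueError:
            # This format did not match; try the next one.
            continue
    return None


aware = try_parse_datetime_sketch("01/Jan/2025:12:00:00 -0700")
naive = try_parse_datetime_sketch("01/Jan/2025:12:00:00")
missing = try_parse_datetime_sketch("not a timestamp")
```

Because strptime requires the whole string to match, the offset-less and offset-aware Apache formats do not shadow each other, so their relative order in the list should not matter.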
Adds a missing datetime format to the timestamp parser in
log_analyzer_cli/utils.py so that Apache combined log lines with
a numeric timezone offset ("01/Jan/2025:12:00:00 -0700") can be
parsed as timezone-aware datetimes instead of returning None.
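As a rough illustration of why timezone-aware results matter for time-window filtering, the sketch below uses only the standard library. It shows that aware datetimes compare by instant across offsets, while ordering a naive datetime against an aware one raises TypeError. How the CLI actually compares --start-time / --end-time values is not shown in this PR, so treat this as an assumption about the general behaviour rather than the tool's code.

```python
from datetime import datetime

fmt = "%d/%b/%Y:%H:%M:%S %z"

# Two aware timestamps that name the same instant in different zones.
a = datetime.strptime("01/Jan/2025:12:00:00 -0700", fmt)
b = datetime.strptime("01/Jan/2025:19:00:00 +0000", fmt)
same_instant = a == b  # aware datetimes compare by instant, not wall clock

# A naive timestamp, as produced when the offset is dropped.
naive = datetime.strptime("01/Jan/2025:12:00:00", "%d/%b/%Y:%H:%M:%S")

# Ordering a naive datetime against an aware one is an error in Python.
try:
    naive < a
    mixed_comparison_ok = True
except TypeError:
    mixed_comparison_ok = False
```

Here same_instant is True and mixed_comparison_ok is False, which is why silently losing the offset can skew or break a time-window filter.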
Summary by Sourcery
Extend timestamp parsing to handle Apache-style log entries with numeric timezone offsets while preserving support for existing formats.
New Features:
Tests: