Skip to content

fix: handle non-UTF-8 bytes in forkstat post-processor#8

Merged
k-rister merged 1 commit into
mainfrom
fix-post-process-encoding
Mar 30, 2026
Merged

fix: handle non-UTF-8 bytes in forkstat post-processor#8
k-rister merged 1 commit into
mainfrom
fix-post-process-encoding

Conversation

@k-rister
Copy link
Copy Markdown
Contributor

Summary

  • Fix UnicodeDecodeError crash when forkstat output contains non-UTF-8 box-drawing characters
  • Add errors='replace' to lzma.open() so undecodable bytes are substituted rather than crashing
  • Safe because the regex parser only matches ASCII timestamp and event fields

Test plan

  • Verify forkstat post-processing completes without error
  • Verify metric-data files are produced correctly

🤖 Generated with Claude Code

Forkstat output contains box-drawing characters that may not be
valid UTF-8 on all systems. Add errors='replace' to lzma.open()
so undecodable bytes are substituted rather than crashing. The
regex parser only matches ASCII timestamp and event fields, so
replaced characters in other columns are harmlessly skipped.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@k-rister k-rister merged commit 2bf1713 into main Mar 30, 2026
31 checks passed
@github-project-automation github-project-automation Bot moved this from In Progress to Done in Crucible Tracking Mar 30, 2026
@k-rister k-rister deleted the fix-post-process-encoding branch March 30, 2026 19:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants