[BUG] Windows: parse_git_diff_ranges crashes with UnicodeDecodeError (cp1252) → NoneType.splitlines #266

@kkrassmann

Summary

On Windows, code-review-graph analyze (used by the Claude Code Stop hook) crashes when the git diff output contains non-cp1252 bytes. Two layered bugs are involved:

  1. subprocess.run(...) in parse_git_diff_ranges decodes stdout with the default Windows codepage (cp1252), which raises UnicodeDecodeError on any byte that cp1252 leaves undefined (e.g. 0x9d, which appears in the UTF-8 encoding of some characters in source files or .planning/ docs).
  2. Because that decode raises inside the reader thread, result.stdout ends up as None, and the next line calls .splitlines() on it, crashing with AttributeError: 'NoneType' object has no attribute 'splitlines'.

This is the same class of bug as #148 / #239 (missing encoding="utf-8"), but in changes.py instead of skills.py / incremental.py.
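The first bug can be seen without git or Windows. The snippet below is a minimal sketch that decodes the UTF-8 bytes of a right double quotation mark (U+201D, chosen here only as an illustration because its encoding ends in 0x9d) with cp1252 and with utf-8:

```python
# A right double quotation mark encodes to three bytes in UTF-8.
sample = "\u201d"
raw = sample.encode("utf-8")      # b'\xe2\x80\x9d'

try:
    raw.decode("cp1252")
    outcome = "decoded"
except UnicodeDecodeError as exc:
    # exc.start is the index of the first byte the codec could not map.
    outcome = f"failed on byte {raw[exc.start]:#04x}"

print(outcome)
print(raw.decode("utf-8") == sample)
```

The cp1252 decode fails on the trailing 0x9d, while the utf-8 decode round-trips the character.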

Environment

  • OS: Windows 11 Pro 10.0.26200
  • Python: 3.13
  • Package: code-review-graph (latest)
  • Trigger: Stop hook in Claude Code running code-review-graph analyze-changes

Actual Traceback

Exception in thread Thread-3 (_readerthread):
Traceback (most recent call last):
  File "...\threading.py", line 1044, in _bootstrap_inner
    self.run()
  File "...\threading.py", line 995, in run
    self._target(*self._args, **self._kwargs)
  File "...\subprocess.py", line 1615, in _readerthread
    buffer.append(fh.read())
  File "...\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 608670: character maps to <undefined>

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "...\code-review-graph.exe\__main__.py", line 6, in <module>
    sys.exit(main())
  File "...\code_review_graph\cli.py", line 591, in main
    result = analyze_changes(
        store,
        ...
        base=base,
    )
  File "...\code_review_graph\changes.py", line 226, in analyze_changes
    changed_ranges = parse_git_diff_ranges(repo_root, base)
  File "...\code_review_graph\changes.py", line 63, in parse_git_diff_ranges
    return _parse_unified_diff(result.stdout)
  File "...\code_review_graph\changes.py", line 79, in _parse_unified_diff
    for line in diff_text.splitlines():
                ^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'splitlines'

Root Cause

code_review_graph/changes.py around line 63 runs git diff via subprocess.run without passing encoding="utf-8". On Windows, when text=True is set without an explicit encoding, subprocess decodes the output with the locale's preferred encoding, i.e. the ANSI codepage (cp1252 in Western European locales), so any byte above 0x7F that cp1252 leaves undefined (e.g. 0x9d) raises UnicodeDecodeError inside _readerthread.

Because the decode exception happens on a worker thread, it is not propagated to the caller. The reader's buffer simply stays empty, so the caller sees result.stdout is None and feeds that into _parse_unified_diff, which calls .splitlines() on it. Hence the confusing second traceback.
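The hand-off from the failed decode to the None result can be modelled in a few lines. This is a simplified sketch of the mechanism, not the actual subprocess source; simulate_reader and its internals are hypothetical names used only for illustration:

```python
import threading

def simulate_reader(raw, encoding):
    """Simplified model: a worker thread decodes output into a shared buffer."""
    buffer = []          # filled by the worker only if decoding succeeds
    thread_errors = []   # exception types reported via threading.excepthook

    def reader():
        # If decode() raises, append() never runs and the buffer stays empty.
        buffer.append(raw.decode(encoding))

    previous_hook = threading.excepthook
    threading.excepthook = lambda args: thread_errors.append(args.exc_type)
    try:
        worker = threading.Thread(target=reader)
        worker.start()
        worker.join()    # returns normally even though the worker raised
    finally:
        threading.excepthook = previous_hook

    # An empty buffer is reported to the caller as None.
    stdout = buffer[0] if buffer else None
    return stdout, thread_errors

stdout, errors = simulate_reader(b"\xe2\x80\x9d", "cp1252")
print(stdout, errors)
```

In this model the caller never sees the UnicodeDecodeError; it only sees a None result, which matches the pair of tracebacks above.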

Suggested Fix

Two-part fix in code_review_graph/changes.py:

 result = subprocess.run(
     ["git", "diff", "--unified=0", base, "HEAD"],
     cwd=repo_root,
     capture_output=True,
     text=True,
+    encoding="utf-8",
+    errors="replace",
     check=False,
 )
+if result.stdout is None:
+    return []
 return _parse_unified_diff(result.stdout)
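The effect of the two added arguments can be checked in isolation. The sketch below uses a hypothetical helper, run_text, and a child Python process (instead of git) that writes a valid UTF-8 sequence followed by an invalid byte; it assumes sys.executable points at a usable interpreter:

```python
import subprocess
import sys

def run_text(cmd, cwd=None):
    """Run cmd and return its stdout as str; bad bytes become U+FFFD."""
    result = subprocess.run(
        cmd,
        cwd=cwd,
        capture_output=True,
        text=True,
        encoding="utf-8",    # do not depend on the platform codepage
        errors="replace",    # never raise on undecodable bytes
        check=False,
    )
    # Defensive fallback mirroring the None guard in the diff above.
    return result.stdout or ""

# Child process stands in for git: it emits U+201D in UTF-8, then a stray 0xff.
CHILD = r"import sys; sys.stdout.buffer.write(b'\xe2\x80\x9d\xff\n')"
output = run_text([sys.executable, "-c", CHILD])
print(repr(output))
```

With errors="replace" the stray byte is turned into the replacement character rather than aborting the read, so the parser always receives a string.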

Impact

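On Windows, this completely breaks the Stop hook integration for any repo whose diff contains a UTF-8 sequence with a byte that cp1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D). That includes common characters such as the right curly quote ” (E2 80 9D), many Chinese characters, and accented capitals like Á or Í. Other non-ASCII text (most German and French letters, or an em-dash, E2 80 94) decodes without raising but comes out as mojibake. The tool is effectively unusable as a Claude Code hook on Windows today.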

Verification

Reproduces deterministically on Windows 11 / Python 3.13 against any repo whose git diff <base> HEAD output contains byte 0x9d (or any byte that is undefined in cp1252).
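To find out whether a given piece of text would trigger the crash, the set of bytes that cp1252 leaves undefined can be computed from Python's own codec. This is a small sketch; both function names are made up for illustration:

```python
def cp1252_undefined_bytes():
    """Return every byte value that Python's cp1252 codec refuses to decode."""
    undefined = []
    for value in range(256):
        try:
            bytes([value]).decode("cp1252")
        except UnicodeDecodeError:
            undefined.append(value)
    return undefined

def crashes_under_cp1252(text):
    """True if the UTF-8 encoding of text contains a cp1252-undefined byte."""
    bad = set(cp1252_undefined_bytes())
    return any(byte in bad for byte in text.encode("utf-8"))

print([hex(b) for b in cp1252_undefined_bytes()])
print(crashes_under_cp1252("\u201d"))   # right double quotation mark
print(crashes_under_cp1252("\u2014"))   # em dash
```

A diff containing any character for which crashes_under_cp1252 returns True should reproduce the failure on an affected Windows setup.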

Happy to send a PR if useful.
