Summary
While investigating v2.3.1 test output on Windows, I found three distinct bugs that all reproduce on origin/main with no modifications. They are not flakiness — each is a deterministic cross-platform bug with a clear root cause.
All three are currently masked by the 8 failing tests on Windows, which are easy to dismiss as "platform noise" but actually reveal real product bugs.
Bug 1 — get_data_dir() writes non-UTF-8 .gitignore on Windows
File: code_review_graph/incremental.py:128-138 (in get_data_dir())
inner_gitignore.write_text(
"# Auto-generated by code-review-graph — do not commit database files.\n"
"# The graph.db contains absolute paths and code structure metadata.\n"
"*\n"
)
write_text() is called without an encoding= argument. The string literal contains an em-dash (—, U+2014). On Windows, Path.write_text uses the system default codepage (cp1252 in most locales), which encodes U+2014 as byte 0x97. When anything later reads this file as UTF-8 — including get_data_dir's own test test_default_uses_repo_subdir at tests/test_incremental.py:194 — it raises:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x97 in position 38: invalid start byte
Reproducible on: Windows 11 / Python 3.10.10 / v2.3.1, clean checkout of origin/main. No modifications required.
Impact: Any Windows user running code-review-graph build generates a .gitignore that cannot be parsed by standard UTF-8 tooling (including the tool's own tests and any editor set to UTF-8 strict mode). Unix systems are unaffected because they default to UTF-8.
Fix: Add encoding="utf-8" to the write_text() call. (Same fix that get_data_dir's caller _ensure_repo_gitignore at line 170 already applies — it uses encoding="utf-8", but this sibling function was missed.)
Bug 2 — Databricks notebook auto-detection fails on CRLF line endings
File: code_review_graph/parser.py:391-394 (in parse_bytes())
if language == "python" and source.startswith(
b"# Databricks notebook source\n",
):
return self._parse_databricks_py_notebook(path, source)
The check hard-codes \n. On Windows, git checkouts default to core.autocrlf=true, which converts the Databricks fixture's line endings to \r\n on checkout. The file on disk starts with b"# Databricks notebook source\r\n", so source.startswith(b"# Databricks notebook source\n") returns False, the file is parsed as regular Python, and all Databricks-specific handling is bypassed:
notebook_format metadata is never set on the File node
# MAGIC %sql cells are not parsed as SQL (no IMPORTS_FROM edges for table references)
# MAGIC %md cells are not skipped
- Per-cell
cell_index metadata is missing from function nodes
Reproducible on: Windows 11 / v2.3.1, clean checkout, confirmed with a single python -c statement reading tests/fixtures/sample_databricks_export.py — the file starts with b'# Databricks notebook source\r\n'.
Impact: Every one of the 4 TestDatabricksPyNotebook tests fails on Windows:
test_detects_databricks_header — no notebook_format
test_extracts_sql_tables — missing bronze.events, silver.users, etc.
test_skips_magic_md_cells — extra function from MD cells that should be skipped
test_cell_index_tracking — cell_index metadata missing
Fix: Parse the first line robustly, stripping any trailing \r:
first_line_end = source.find(b"\n")
first_line = source[:first_line_end].rstrip(b"\r") if first_line_end != -1 else source
if language == "python" and first_line == b"# Databricks notebook source":
return self._parse_databricks_py_notebook(path, source)
This matches both \n and \r\n endings.
Bug 3 — Stale FastMCP API in async regression guard test
File: tests/test_main.py:68-71 (in TestLongRunningToolsAreAsync::test_heavy_tools_are_coroutines)
async def test_heavy_tools_are_coroutines(self):
tools = await crg_main.mcp.get_tools()
FastMCP does not expose a get_tools() method. The available API in fastmcp>=2.14.0 (the version pinned in pyproject.toml) is list_tools(). The test dies at AttributeError: 'FastMCP' object has no attribute 'get_tools'. Did you mean: 'list_tools'?.
This is not a cosmetic issue — this test is the regression guard for #46/#136 (v2.3.1 Windows async fix). The PR description for #231 explicitly says:
There are regression tests (test_heavy_tools_are_coroutines + test_heavy_tool_source_uses_to_thread) that will fail at CI collection time if anyone converts one of the 5 tools back to sync in a future refactor, so we shouldn't hit this class of bug again.
But the test has been failing at runtime on every platform since it was written, which means the regression guard is not actually guarding anything. A future refactor that converts build_or_update_graph_tool back to sync would not be caught by CI because this test is already red.
Reproducible on: every platform with the pinned fastmcp>=2.14.0. Not Windows-specific.
Fix: Replace get_tools() with list_tools(). The rest of the test logic (verifying asyncio.iscoroutinefunction(tool.fn) for the heavy 5 tools) is correct.
Follow-up PR
A single PR will fix all three bugs in one focused commit. Each fix is small, each has a direct regression test, and each is verifiable on unchanged origin/main before and after.
Note: there are 2 additional failing tests on Windows (test_returns_none_without_git, test_falls_back_to_start in TestFindRepoRoot / TestFindProjectRoot) that are true test environmental flakiness — they assume no ancestor of tmp_path has a .git directory, which is false on any Windows user whose home directory contains a git repo (e.g. dotfiles). Those are intentionally not addressed here because the fix requires either a product API change (stop_at parameter) or invasive monkeypatching — worth a separate discussion.
Summary
While investigating v2.3.1 test output on Windows, I found three distinct bugs that all reproduce on
origin/mainwith no modifications. They are not flakiness — each is a deterministic cross-platform bug with a clear root cause.All three are currently masked by the 8 failing tests on Windows, which are easy to dismiss as "platform noise" but actually reveal real product bugs.
Bug 1 —
get_data_dir()writes non-UTF-8.gitignoreon WindowsFile:
code_review_graph/incremental.py:128-138(inget_data_dir())write_text()is called without anencoding=argument. The string literal contains an em-dash (—, U+2014). On Windows,Path.write_textuses the system default codepage (cp1252 in most locales), which encodes U+2014 as byte0x97. When anything later reads this file as UTF-8 — includingget_data_dir's own testtest_default_uses_repo_subdirattests/test_incremental.py:194— it raises:Reproducible on: Windows 11 / Python 3.10.10 / v2.3.1, clean checkout of
origin/main. No modifications required.Impact: Any Windows user running
code-review-graph buildgenerates a.gitignorethat cannot be parsed by standard UTF-8 tooling (including the tool's own tests and any editor set to UTF-8 strict mode). Unix systems are unaffected because they default to UTF-8.Fix: Add
encoding="utf-8"to thewrite_text()call. (Same fix thatget_data_dir's caller_ensure_repo_gitignoreat line 170 already applies — it usesencoding="utf-8", but this sibling function was missed.)Bug 2 — Databricks notebook auto-detection fails on CRLF line endings
File:
code_review_graph/parser.py:391-394(inparse_bytes())The check hard-codes
\n. On Windows, git checkouts default tocore.autocrlf=true, which converts the Databricks fixture's line endings to\r\non checkout. The file on disk starts withb"# Databricks notebook source\r\n", sosource.startswith(b"# Databricks notebook source\n")returnsFalse, the file is parsed as regular Python, and all Databricks-specific handling is bypassed:notebook_formatmetadata is never set on the File node# MAGIC %sqlcells are not parsed as SQL (noIMPORTS_FROMedges for table references)# MAGIC %mdcells are not skippedcell_indexmetadata is missing from function nodesReproducible on: Windows 11 / v2.3.1, clean checkout, confirmed with a single
python -cstatement readingtests/fixtures/sample_databricks_export.py— the file starts withb'# Databricks notebook source\r\n'.Impact: Every one of the 4
TestDatabricksPyNotebooktests fails on Windows:test_detects_databricks_header— nonotebook_formattest_extracts_sql_tables— missingbronze.events,silver.users, etc.test_skips_magic_md_cells— extra function from MD cells that should be skippedtest_cell_index_tracking—cell_indexmetadata missingFix: Parse the first line robustly, stripping any trailing
\r:This matches both
\nand\r\nendings.Bug 3 — Stale FastMCP API in async regression guard test
File:
tests/test_main.py:68-71(inTestLongRunningToolsAreAsync::test_heavy_tools_are_coroutines)FastMCPdoes not expose aget_tools()method. The available API infastmcp>=2.14.0(the version pinned inpyproject.toml) islist_tools(). The test dies atAttributeError: 'FastMCP' object has no attribute 'get_tools'. Did you mean: 'list_tools'?.This is not a cosmetic issue — this test is the regression guard for #46/#136 (v2.3.1 Windows async fix). The PR description for #231 explicitly says:
But the test has been failing at runtime on every platform since it was written, which means the regression guard is not actually guarding anything. A future refactor that converts
build_or_update_graph_toolback to sync would not be caught by CI because this test is already red.Reproducible on: every platform with the pinned
fastmcp>=2.14.0. Not Windows-specific.Fix: Replace
get_tools()withlist_tools(). The rest of the test logic (verifyingasyncio.iscoroutinefunction(tool.fn)for the heavy 5 tools) is correct.Follow-up PR
A single PR will fix all three bugs in one focused commit. Each fix is small, each has a direct regression test, and each is verifiable on unchanged
origin/mainbefore and after.Note: there are 2 additional failing tests on Windows (
test_returns_none_without_git,test_falls_back_to_startinTestFindRepoRoot/TestFindProjectRoot) that are true test environmental flakiness — they assume no ancestor oftmp_pathhas a.gitdirectory, which is false on any Windows user whose home directory contains a git repo (e.g. dotfiles). Those are intentionally not addressed here because the fix requires either a product API change (stop_atparameter) or invasive monkeypatching — worth a separate discussion.