Skip to content

Test flakiness: find_repo_root walks above test tmp_path and finds ancestor .git #241

@azizur100389

Description

@azizur100389

Summary

Two tests in tests/test_incremental.py fail on any developer machine where an ancestor of pytest's tmp_path (e.g. the user's home directory) contains a .git repository:

  • TestFindRepoRoot::test_returns_none_without_git
  • TestFindProjectRoot::test_falls_back_to_start

Reproduction

On Windows 11, pytest creates tmp_path under C:/Users/<username>/AppData/Local/Temp/pytest-of-<username>/.... If the user has run git init anywhere under C:/Users/<username>/ (for example, a dotfiles repo at ~/.git — very common on developer machines), then find_repo_root() walking up from tmp_path/no_git finds C:/Users/<username>/.git and returns the home directory instead of None.

Verified on clean origin/main:

tests\test_incremental.py:40: AssertionError
  find_repo_root(WindowsPath('C:/Users/erazi/AppData/Local/Temp/pytest-of-erazi/.../no_git'))
  returned WindowsPath('C:/Users/erazi') instead of None

The exact same failure mode affects Linux users who have a git-initialized home directory (e.g. chezmoi, yadm, or a bare git init ~).

Root cause

find_repo_root() in code_review_graph/incremental.py:74 has no way to bound its ancestor walk:

def find_repo_root(start: Path | None = None) -> Optional[Path]:
    current = start or Path.cwd()
    while current != current.parent:
        if (current / ".git").exists():
            return current
        current = current.parent
    ...

It walks all the way to the filesystem root. This is the correct behavior in production (a nested script inside a git repo should find that repo), but it makes the function un-testable in environments where the tmp path is inside a git-initialized ancestor.

Fix proposal

Add an optional stop_at: Path | None = None parameter to both find_repo_root() and find_project_root(). When provided, the walk halts after examining stop_at and returns None (for find_repo_root) or the start path (for find_project_root) rather than crossing above the boundary.

def find_repo_root(
    start: Path | None = None,
    stop_at: Path | None = None,
) -> Optional[Path]:
    current = start or Path.cwd()
    while current != current.parent:
        if (current / ".git").exists():
            return current
        if stop_at is not None and current == stop_at:
            return None
        current = current.parent
    ...

Semantics:

  • stop_at=None (default): existing behavior, walk all the way to the filesystem root
  • stop_at=<some path>: the walk examines stop_at for .git and then stops; it never crosses above stop_at

This is:

  • Fully backward-compatible — all 7 production callers pass no stop_at
  • Testable — the 2 flaky tests can pass stop_at=tmp_path to bound the walk
  • Useful beyond tests — any caller that wants to bound the ancestor walk (embedded environments, multi-repo orchestrators, CI containers with bind-mounted volumes) can use the same mechanism

Not a product bug

Production behavior is correct. This is a test-only improvement to make find_repo_root env-independent, plus a small product API addition that's useful in its own right.

A PR implementing this with 2 test fixes + 1 new positive regression test for the stop_at parameter will follow.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions