Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions .archgate/adrs/ARCH-005-testing-standards.md
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,8 @@ Use Bun's built-in test runner (`bun test`) for all tests. Test files go in `tes
- Use descriptive test names that explain the expected behavior
- **When mocking `fetch` in tests, assign directly to `globalThis.fetch`** — use `globalThis.fetch = mockFn as unknown as typeof fetch`. Restore in `afterEach` via `mock.restore()` (from `bun:test`) or by reassigning the original reference before the test.
- **Wrap `spyOn` / `mockImplementation` calls in `try/finally` to guarantee `mockRestore()` runs** — when `expect()` assertions fail, they throw immediately, skipping any `mockRestore()` that follows. The un-restored spy leaks into subsequent tests, causing false positives or false negatives. Pattern: `const spy = spyOn(...).mockImplementation(() => {}); try { /* assertions */ } finally { spy.mockRestore(); }`. Alternatively, create and restore spies in `beforeEach`/`afterEach` hooks instead of inline.
- **Make large production thresholds injectable so tests can use a small value** — When production code only acts past a large numeric threshold (e.g., `SCOPE_FILE_WARN_THRESHOLD = 1000` in `src/engine/git-files.ts`), add an optional parameter that defaults to the module constant (e.g., `resolveScopedFiles(root, globs, { fileWarnThreshold })`). Tests inject a tiny value such as `5` and create only a handful of files to exercise the same code path. This keeps the test fast and deterministic on every platform instead of materializing thousands of fixture files.
- **Only ever raise a per-test timeout override above the global, never below it** — the project global is `bun test --timeout 60000` (60s). A per-test `}, 30_000` override makes that test _more_ likely to time out than the default. Use a per-test override solely to grant a genuinely slow test more time than 60s; never set one shorter than the global.

### Don't

Expand All @@ -59,6 +61,7 @@ Use Bun's built-in test runner (`bun test`) for all tests. Test files go in `tes
- Don't import test utilities from `node:test` — use Bun's built-in `bun:test` module
- **Don't use `mock.module("node:fetch", ...)` to intercept HTTP fetch calls** — in Bun, the runtime fetch is `globalThis.fetch` and `mock.module` targeting `node:fetch` does not intercept it. The mock silently has no effect: the real network is hit, making tests non-deterministic and dependent on external services. Assign `globalThis.fetch` directly instead (see Do's above).
- **Don't place `mockRestore()` after assertions without `try/finally` protection** — if the assertion throws, the spy is never restored and contaminates subsequent tests. This is especially dangerous when spying on globals like `console.warn` or `globalThis.fetch`, where a leaked spy silently suppresses or redirects output for every test that follows.
- **Don't materialize large filesystem fixtures (1000+ files, plus `git add .`) just to cross a production threshold** — on Windows CI runners that filesystem work is slow enough to exceed the per-test timeout, killing the staging subprocess mid-run (`git add . failed (exit 143)`, where 143 = 128 + SIGTERM) and producing a flaky failure that does not reproduce locally. Inject a small threshold instead (see Do's above). This was the root cause of a recurring Windows smoke-test flake on the `resolveScopedFiles` warning test.

## Implementation Pattern

Expand Down Expand Up @@ -213,6 +216,7 @@ Code reviewers MUST verify:
5. Tests that call `git commit` on a temp repo configure `user.email` and `user.name` locally before committing
6. Tests that mock HTTP fetch assign `globalThis.fetch` directly — no `mock.module("node:fetch", ...)` usage
7. Tests that use `spyOn` or `mockImplementation` inline (not in `beforeEach`/`afterEach`) wrap the spy lifecycle in `try/finally` to guarantee `mockRestore()` runs even when assertions fail
8. Tests that exercise a numeric production threshold inject a small threshold value rather than generating fixtures large enough to trip the production default, and no per-test timeout override is shorter than the global `--timeout 60000`

## References

Expand Down
3 changes: 3 additions & 0 deletions .claude/agent-memory/archgate-developer/MEMORY.md
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,9 @@ Skipping steps 2 or 3 is a workflow violation. The user should NEVER have to inv

## Patterns & Fixes

- **`oxfmt` formats markdown too — run `bun run format` after editing any `.md`, including ADRs** — `format:check` (part of `bun run validate` and the CI "Lint, Test & Check" job) runs `oxfmt --check .` over ALL files, not just `.ts`. It normalizes markdown (e.g., emphasis `*word*` → `_word_`). The `adr-author` skill writes ADR markdown but does NOT auto-format it, so ADR edits frequently fail CI `format:check` even when the TS changes are clean. Always run `bun run format` before committing ADR/markdown edits. Tripped CI on PR #372 (ARCH-005 edit).
- **Make large production thresholds injectable rather than building giant test fixtures** — Tests that must cross a big numeric threshold in production code (e.g., the 1000-file `SCOPE_FILE_WARN_THRESHOLD` in `src/engine/git-files.ts`) should NOT create 1000+ real files + `git add .` to trip it. On Windows CI runners that filesystem work is slow enough to exceed the per-test timeout — `git add .` gets SIGTERM'd mid-run (`git add . failed (exit 143)`, 143 = 128+SIGTERM) and the whole suite fails flakily. Fix: add an optional override (e.g., `fileWarnThreshold?: number` defaulting to the module constant) to the function under test, and have the test inject a tiny value (5) so it only needs ~6 files. Caused the `git-files > resolveScopedFiles > warns when file scope exceeds threshold` flake on PR #366's Windows smoke test. Also captured in ARCH-005.
- **Never set a per-test timeout override SHORTER than the global `--timeout`** — CI runs `bun test --timeout 60000` (60s). A per-test `}, 30_000` override makes that test MORE likely to time out than the default, the opposite of the usual intent. Only override the per-test timeout to raise it above the global, never to lower it. The 30s override on the git-files threshold test was the proximate cause of its Windows flake.
- **YAML double-quoted strings require escaped backslashes for Windows paths in tests** — YAML interprets `\` as an escape character inside double-quoted strings. Writing `cwd: "E:\project"` silently corrupts the parsed value because `\p` is not a valid escape sequence. Fix: use `JSON.stringify(path)` to produce properly escaped YAML values (e.g., `cwd: ${JSON.stringify(cwd)}`). JSON and YAML double-quoted strings share the same escape syntax. Encountered in Copilot CLI session-context tests (`workspace.yaml` with Windows paths).
- **`git commit` in temp repos requires local identity** — CI runners have no global `user.email`/`user.name` configured. Any test that runs `git commit` on a temp repo MUST call `git config user.email` and `git config user.name` locally after `git init`. Fails with a cryptic `ShellPromise` error in CI; passes locally. Also captured in ARCH-005 Do's.
- **Never use `bunx prettier` directly** — Always use `bun run format` (to fix) or `bun run format:check` (to verify). Using `bunx prettier` can fail or use a different version than the project's devDependency. The same applies to all dev tools: prefer `bun run <script>` over `bunx <tool>` when a package.json script exists.
Expand Down
11 changes: 9 additions & 2 deletions src/engine/git-files.ts
Original file line number Diff line number Diff line change
Expand Up @@ -68,12 +68,19 @@ export function getGitTrackedFiles(
export async function resolveScopedFiles(
projectRoot: string,
adrFileGlobs?: string[],
options?: { respectGitignore?: boolean; adrId?: string }
options?: {
respectGitignore?: boolean;
adrId?: string;
/** Override the file-count warning threshold (defaults to SCOPE_FILE_WARN_THRESHOLD). */
fileWarnThreshold?: number;
}
): Promise<string[]> {
const hasExplicitFiles = (adrFileGlobs?.length ?? 0) > 0;
const patterns = hasExplicitFiles ? adrFileGlobs! : ["**/*"];
const respectGitignore = options?.respectGitignore !== false;
const label = options?.adrId ? `ADR ${options.adrId}` : "resolveScopedFiles";
const fileWarnThreshold =
options?.fileWarnThreshold ?? SCOPE_FILE_WARN_THRESHOLD;

if (!respectGitignore && !hasExplicitFiles) {
logWarn(
Expand Down Expand Up @@ -105,7 +112,7 @@ export async function resolveScopedFiles(

const all = [...new Set(results.flat())].sort();

if (all.length > SCOPE_FILE_WARN_THRESHOLD || scanMs > SCOPE_SCAN_WARN_MS) {
if (all.length > fileWarnThreshold || scanMs > SCOPE_SCAN_WARN_MS) {
logWarn(
`${label}: Resolved ${all.length} files from patterns: ${patterns.join(", ")} (scan took ${Math.round(scanMs)}ms). Consider narrowing the \`files\` patterns in the ADR frontmatter to improve performance.`
);
Expand Down
11 changes: 8 additions & 3 deletions tests/engine/git-files.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -383,7 +383,11 @@ describe("git-files", () => {
test("warns when file scope exceeds threshold", async () => {
await git(["init"], tempDir);
mkdirSync(join(tempDir, "src"), { recursive: true });
const fileCount = SCOPE_FILE_WARN_THRESHOLD + 1;
// Use a tiny injected threshold so we only need a handful of files —
// creating 1000+ real files + `git add .` is slow enough on Windows
// runners to trip the per-test timeout (SIGTERM mid-`git add`).
const fileWarnThreshold = 5;
const fileCount = fileWarnThreshold + 1;
for (let i = 0; i < fileCount; i++) {
writeFileSync(
join(tempDir, "src", `file-${String(i).padStart(4, "0")}.ts`),
Expand All @@ -395,8 +399,9 @@ describe("git-files", () => {
try {
const files = await resolveScopedFiles(tempDir, ["src/**/*.ts"], {
adrId: "SCOPE-001",
fileWarnThreshold,
});
expect(files.length).toBeGreaterThan(SCOPE_FILE_WARN_THRESHOLD);
expect(files.length).toBeGreaterThan(fileWarnThreshold);
const warnCalls = warnSpy.mock.calls.map((args) => args.join(" "));
expect(
warnCalls.some(
Expand All @@ -410,7 +415,7 @@ describe("git-files", () => {
} finally {
warnSpy.mockRestore();
}
}, 30_000);
});

test("does not warn when file scope is within threshold", async () => {
await git(["init"], tempDir);
Expand Down
Loading