fix: enforce 3000-file cap on _auto_setup fallback path (#34) #86

Merged: Wolfvin merged 2 commits into main from fix/auto-setup-fallback-cap-34 on Jun 29, 2026

@Wolfvin Wolfvin commented Jun 29, 2026

Summary

Fixes #34.

_auto_setup in scripts/codelens.py runs the scan in a subprocess with --max-files 3000 as a timeout guard (documented in its own docstring). But when that subprocess failed or exited non-zero, the fallback called cmd_scan(workspace, incremental=False) with no cap and no timeout. On huge repos (tens of thousands of files) auto-setup could therefore hang indefinitely, while the result hint still claimed "Auto-setup capped at 3000 files", which was false on the fallback path.

Worse, while investigating I found that commands/scan.add_args did not register --max-files at all, so argparse rejected the subprocess command with unrecognized arguments: --max-files 3000 (exit 2) every time. The "main" subprocess path was effectively dead code, and the uncapped fallback was the only path actually exercised.
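
The argparse behavior described above can be reproduced in isolation. This is a minimal sketch with a toy parser, not the project's real add_args; build_parser and try_parse are hypothetical names introduced only for illustration. It shows that an unregistered --max-files makes parse_args exit with status 2, and that registering the option makes the same command line parse.

```python
import argparse


def build_parser(register_max_files: bool) -> argparse.ArgumentParser:
    """Build a toy 'scan' parser (hypothetical stand-in for commands/scan.add_args)."""
    parser = argparse.ArgumentParser(prog="scan")
    parser.add_argument("workspace")
    if register_max_files:
        # Registering the option is what makes the subprocess command line valid.
        parser.add_argument("--max-files", type=int, default=None)
    return parser


def try_parse(parser: argparse.ArgumentParser, argv):
    """Return (exit_code, namespace); exit_code is None when parsing succeeds."""
    try:
        return None, parser.parse_args(argv)
    except SystemExit as exc:
        # argparse reports usage errors on stderr and exits with status 2.
        return exc.code, None


argv = ["/tmp/ws", "--max-files", "3000"]
before_code, _ = try_parse(build_parser(register_max_files=False), argv)
after_code, namespace = try_parse(build_parser(register_max_files=True), argv)
```

Because the exit happens during argument parsing, the scan itself never starts, which is consistent with the subprocess path failing every time.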

Fix

The 3000-file cap now applies consistently to both the subprocess path and the in-process fallback, so no code path runs unprotected.

Files changed

  • scripts/commands/scan.py

    • Added --max-files argument to add_args so the subprocess path actually accepts it (previously rejected by argparse → dead code).
    • Added max_files: Optional[int] = None parameter to cmd_scan.
    • Added _cap_discovered_files(files, max_files) helper that truncates per-category file lists so the total ≤ max_files. Applied after discover_files, before parsing — os.walk cost is unchanged but parsing/registry-build cost is bounded.
    • Wired execute() to pass args.max_files through to cmd_scan.
  • scripts/codelens.py (_auto_setup + main())

    • The in-process fallback now calls cmd_scan(workspace, incremental=False, max_files=_AUTO_SETUP_MAX_FILES) — same 3000-file cap as the subprocess path.
    • Tracks a fallback flag (True iff the in-process fallback was taken).
    • Computes capped = total_files >= _AUTO_SETUP_MAX_FILES consistently across both paths.
    • Returns both capped and fallback in the _auto_setup() result dict.
    • main() propagates capped and fallback from _auto_setup()'s return value into auto_setup_info, which becomes result["_auto_setup"], so MCP clients / agents can tell which path produced the registry and whether the cap was hit (explicitly requested in issue #34).
    • Switched __import__("subprocess") to a normal import subprocess (KISS, no dead code).
    • Updated docstring to reflect actual behavior.
  • tests/test_cli.py — new TestAutoSetupFallbackCap class with 3 regression tests (see below).
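
For readers who want a feel for the cap helper listed above, here is a minimal sketch of truncating per-category file lists to a total budget. It is not the merged _cap_discovered_files. It assumes discover_files yields a dict mapping category name to a list of paths, and that categories are consumed in dict order; both are assumptions, since the PR text only says the lists are truncated so the total stays within max_files.

```python
from typing import Dict, List, Optional


def cap_discovered_files(
    files: Dict[str, List[str]], max_files: Optional[int]
) -> Dict[str, List[str]]:
    """Truncate per-category lists so the combined total is <= max_files.

    Hypothetical sketch: None means "no cap"; categories are consumed in
    dict order, so earlier categories keep their files first.
    """
    if max_files is None:
        return files
    remaining = max(max_files, 0)
    capped: Dict[str, List[str]] = {}
    for category, paths in files.items():
        capped[category] = paths[:remaining]
        remaining -= len(capped[category])
    return capped


# Example: a budget of 4 keeps all 3 frontend files and 1 backend file.
discovered = {
    "frontend": ["a.js", "b.js", "c.js"],
    "backend": ["x.py", "y.py"],
}
capped = cap_discovered_files(discovered, 4)
```

Applying the cap after discovery, as the PR describes, leaves the directory walk untouched and bounds only the work done per file.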

Constraints satisfied

  • Fallback passes max_files=_AUTO_SETUP_MAX_FILES to cmd_scan, same as the subprocess main path.
  • result["_auto_setup"]["capped"] and result["_auto_setup"]["fallback"] flags exposed for MCP clients.
  • KISS — single cap helper, single cap enforcement point in cmd_scan.
  • No dead code — the subprocess path now actually works (--max-files is a registered arg), instead of silently failing every time.
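
The control flow these constraints describe can be sketched as follows. This is an illustration under stated assumptions, not the merged _auto_setup: the function name auto_setup_sketch, the injected scan_in_process and run parameters, the command line, the 60-second timeout, and the total_files key in the scan output are all hypothetical. Only the cap value, the cmd_scan keyword arguments, and the capped / fallback semantics come from the PR text.

```python
import json
import subprocess
import sys
from typing import Any, Callable, Dict

_AUTO_SETUP_MAX_FILES = 3000  # cap value named in the PR


def auto_setup_sketch(
    workspace: str,
    scan_in_process: Callable[..., Dict[str, Any]],  # stand-in for cmd_scan
    run: Callable[..., Any] = subprocess.run,        # injectable for tests
    timeout_s: float = 60.0,                         # hypothetical timeout
) -> Dict[str, Any]:
    """Try the capped subprocess scan first; fall back in-process with the same cap."""
    cmd = [
        sys.executable, "scripts/codelens.py", "scan", workspace,  # assumed CLI shape
        "--max-files", str(_AUTO_SETUP_MAX_FILES),
    ]
    fallback = False
    try:
        proc = run(cmd, capture_output=True, text=True, timeout=timeout_s, check=True)
        scan_result = json.loads(proc.stdout)  # assumes a JSON object on stdout
    except (subprocess.SubprocessError, OSError, ValueError):
        # Non-zero exit, timeout, launch failure, or unparsable output.
        fallback = True
        scan_result = scan_in_process(
            workspace, incremental=False, max_files=_AUTO_SETUP_MAX_FILES
        )
    total_files = int(scan_result.get("total_files", 0))
    return {
        "total_files": total_files,
        "capped": total_files >= _AUTO_SETUP_MAX_FILES,  # same rule on both paths
        "fallback": fallback,
    }
```

Injecting run and scan_in_process keeps the sketch testable without spawning a process; the real code may wire these differently.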

Definition of Done

1. Regression test: subprocess fails -> fallback calls cmd_scan with max_files=3000

tests/test_cli.py::TestAutoSetupFallbackCap::test_fallback_passes_max_files_cap

Monkeypatches subprocess.run to raise SubprocessError, spies on cmd_scan to capture the max_files kwarg, calls codelens._auto_setup(ws), and asserts captured["max_files"] == 3000 (not None).
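
The spy pattern described above can be sketched in a self-contained form. The cmd_scan and auto_setup functions below are simplified stand-ins, not the project's code, and the sketch uses unittest.mock rather than pytest's monkeypatch fixture so it runs without a test runner. The real test patches the codelens module instead.

```python
import subprocess
from unittest import mock

MAX_FILES = 3000  # stand-in for _AUTO_SETUP_MAX_FILES


def cmd_scan(workspace, incremental=True, max_files=None):
    """Stand-in for the real cmd_scan; the test replaces it with a spy."""
    return {"total_files": 0}


def auto_setup(workspace):
    """Stand-in for codelens._auto_setup: subprocess first, capped fallback second."""
    try:
        subprocess.run(["scan-placeholder"], check=True)  # never really runs in the test
        return {"fallback": False}
    except (subprocess.SubprocessError, OSError):
        cmd_scan(workspace, incremental=False, max_files=MAX_FILES)
        return {"fallback": True}


def test_fallback_passes_max_files_cap():
    captured = {}

    def spy(workspace, **kwargs):
        # Record the keyword arguments the fallback used.
        captured.update(kwargs)
        return {"total_files": 0}

    forced = subprocess.SubprocessError("forced failure")
    with mock.patch("subprocess.run", side_effect=forced), \
         mock.patch.dict(globals(), {"cmd_scan": spy}):
        result = auto_setup("/tmp/ws")

    assert result["fallback"] is True
    assert captured["max_files"] == MAX_FILES  # the cap reached the fallback
    assert captured["incremental"] is False
```

Asserting on the captured keyword argument, rather than on scan output, pins the regression to the exact call that was previously uncapped.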

2. Test verifies result["_auto_setup"] contains capped=True and fallback=True when fallback is taken

tests/test_cli.py::TestAutoSetupFallbackCap::test_fallback_sets_capped_and_fallback_flags

Builds a workspace with 3001 .py files, forces subprocess.run to raise (-> fallback path), drives the full codelens.main() flow in-process with sys.argv = ["codelens.py", "list", ws, "--format", "json"], captures stdout via capsys, parses the JSON, and asserts:

  • result["_auto_setup"]["fallback"] is True
  • result["_auto_setup"]["capped"] is True

(Plus a third sanity test, test_main_path_no_fallback_when_subprocess_succeeds, that confirms the main path still works with fallback=False / capped=False for a small workspace, and that the flags are always present in the schema.)
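
As an illustration of how a client might consume the two flags, the sketch below turns a parsed result into a short status message. The function name and the message wording are hypothetical; only the result["_auto_setup"]["capped"] and ["fallback"] keys come from this PR. Treating a capped registry as possibly partial is an inference from the truncation described in this PR, not a documented guarantee.

```python
import json
from typing import Any, Dict


def describe_auto_setup(result: Dict[str, Any]) -> str:
    """Summarize which path built the registry and whether the cap was hit."""
    info = result.get("_auto_setup")
    if info is None:
        return "auto-setup did not run"
    path = "in-process fallback" if info.get("fallback") else "subprocess"
    if info.get("capped"):
        return f"registry built via {path}; file cap reached, registry may be partial"
    return f"registry built via {path}; file cap not reached"


# Example payload shaped like the JSON the second regression test parses.
payload = json.loads('{"_auto_setup": {"capped": true, "fallback": true}}')
message = describe_auto_setup(payload)
```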

3. Full test suite

PYTHONPATH=scripts python3 -m pytest tests/ -v

The new tests all pass and introduce zero regressions — the same set of pre-existing environmental failures (missing tree-sitter / SQLite graph_model dependencies in this sandbox) occurs identically before and after this PR. Diff of the FAILED lists is empty.

New tests (verbose)

$ PYTHONPATH=scripts python3 -m pytest tests/test_cli.py::TestAutoSetupFallbackCap -v -o "addopts="
============================= test session starts ==============================
platform linux -- Python 3.12.13, pytest-9.0.2, pluggy-1.6.0 -- /home/z/.venv/bin/python3
cachedir: .pytest_cache
rootdir: /home/z/my-project/repos/CodeLens
configfile: pytest.ini
plugins: Faker-40.1.2, metadata-3.1.1, asyncio-1.3.0, ddtrace-4.2.2, cov-7.0.0, json-report-1.5.0, anyio-4.13.0
asyncio: mode=Mode.STRICT, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collecting ... collected 3 items

tests/test_cli.py::TestAutoSetupFallbackCap::test_fallback_passes_max_files_cap PASSED [ 33%]
tests/test_cli.py::TestAutoSetupFallbackCap::test_fallback_sets_capped_and_fallback_flags PASSED [ 66%]
tests/test_cli.py::TestAutoSetupFallbackCap::test_main_path_no_fallback_when_subprocess_succeeds PASSED [100%]

============================== 3 passed in 1.62s ===============================

Full tests/test_cli.py (verbose)

$ PYTHONPATH=scripts python3 -m pytest tests/test_cli.py -v -o "addopts="
============================= test session starts ==============================
platform linux -- Python 3.12.13, pytest-9.0.2, pluggy-1.6.0 -- /home/z/.venv/bin/python3
collecting ... collected 21 items

tests/test_cli.py::TestCmdInit::test_init_creates_codelens_dir PASSED    [  4%]
tests/test_cli.py::TestCmdInit::test_init_creates_config PASSED          [  9%]
tests/test_cli.py::TestCmdScan::test_scan_workspace PASSED               [ 14%]
tests/test_cli.py::TestCmdScan::test_scan_creates_registry PASSED        [ 19%]
tests/test_cli.py::TestCmdScan::test_scan_finds_classes PASSED           [ 23%]
tests/test_cli.py::TestCmdScan::test_scan_finds_ids PASSED               [ 28%]
tests/test_cli.py::TestCmdQuery::test_query_existing_id PASSED           [ 33%]
tests/test_cli.py::TestCmdQuery::test_query_existing_class PASSED        [ 38%]
tests/test_cli.py::TestCmdQuery::test_query_nonexistent PASSED           [ 42%]
tests/test_cli.py::TestCmdQuery::test_query_backend_function PASSED      [ 47%]
tests/test_cli.py::TestCmdQuery::test_query_auto_detect_domain PASSED    [ 52%]
tests/test_cli.py::TestCmdList::test_list_all PASSED                     [ 57%]
tests/test_cli.py::TestCmdList::test_list_dead PASSED                    [ 61%]
tests/test_cli.py::TestCmdList::test_list_frontend_only PASSED           [ 66%]
tests/test_cli.py::TestCmdList::test_list_backend_only PASSED            [ 71%]
tests/test_cli.py::TestCheckCommandArgs::test_check_accepts_positional_workspace PASSED [ 76%]
tests/test_cli.py::TestCheckCommandArgs::test_check_workspace_optional PASSED          [ 80%]
tests/test_cli.py::TestCheckCommandArgs::test_check_full_cli_invocation_with_positional PASSED [ 85%]
tests/test_cli.py::TestAutoSetupFallbackCap::test_fallback_passes_max_files_cap PASSED [ 90%]
tests/test_cli.py::TestAutoSetupFallbackCap::test_fallback_sets_capped_and_fallback_flags PASSED [ 95%]
tests/test_cli.py::TestAutoSetupFallbackCap::test_main_path_no_fallback_when_subprocess_succeeds PASSED [100%]

============================== 21 passed in 2.57s ==============================

Full suite summary (excluding tests/test_integration.py which requires a live network/grammar setup)

$ PYTHONPATH=scripts python3 -m pytest tests/ --ignore=tests/test_integration.py --tb=no -q
...
================= 37 failed, 804 passed, 14 skipped in 15.14s ==================

The 37 failures are pre-existing and environmental (this sandbox lacks tree-sitter grammars and the SQLite graph_model extension that the test_graph_model / test_graph_incremental / test_architecture / test_hybrid_type_resolver suites require). Verified by stashing this PR's changes and re-running on origin/main:

$ git stash && PYTHONPATH=scripts python3 -m pytest tests/ --ignore=tests/test_integration.py --tb=no -q
...
# same 37 failures, 801 passed, 14 skipped
$ git stash pop

diff <(baseline FAILED list) <(after-change FAILED list) is empty — zero regressions. The +3 in passed count (801 -> 804) is exactly the 3 new TestAutoSetupFallbackCap tests.
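
The FAILED-list comparison above can also be done in a few lines of Python. This is a sketch, not the command actually used for this PR: it assumes the two pytest runs were saved as text and that each failure appears in the short summary as a line starting with "FAILED " followed by the test id. Test ids containing spaces (some parametrized ids) would be cut short by this pattern.

```python
import re
from typing import Set

_FAILED_LINE = re.compile(r"^FAILED (\S+)")


def failed_ids(pytest_output: str) -> Set[str]:
    """Collect test ids from 'FAILED <id> ...' short-summary lines."""
    ids = set()
    for line in pytest_output.splitlines():
        match = _FAILED_LINE.match(line)
        if match:
            ids.add(match.group(1))
    return ids


def regressions(baseline_output: str, after_output: str) -> Set[str]:
    """Tests that fail after the change but did not fail on the baseline."""
    return failed_ids(after_output) - failed_ids(baseline_output)


# Illustrative outputs (made-up test ids, not taken from the real runs).
baseline = "FAILED tests/test_a.py::test_x - ImportError\n1 failed, 2 passed"
after = "FAILED tests/test_a.py::test_x - ImportError\n1 failed, 5 passed"
new_failures = regressions(baseline, after)
```

An empty result means every failure after the change was already failing on the baseline.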

Notes

  • Did not update skill.json version (no new command/engine added — bug fix only, per CONTRIBUTING.md "Update skill.json version if adding new commands").
  • Did not update CHANGELOG.md (top-level CHANGELOG.md is for release notes; the maintainers' release process bundles changes per release).
  • PAT used to push was temporary and is not embedded in any committed file.

Issue #34: _auto_setup's subprocess scan passes --max-files 3000 on the
CLI, but commands/scan.add_args did not register --max-files, so argparse
rejected it (exit 2) every time. The fallback cmd_scan(workspace,
incremental=False) was therefore ALWAYS taken — with no cap and no
timeout — while the result hint still claimed 'Auto-setup capped at
3000 files'. On huge repos this could hang indefinitely.

Fix:
- Add max_files: Optional[int] param to cmd_scan + _cap_discovered_files
  helper that truncates per-category file lists so total <= max_files.
- Register --max-files in commands/scan.add_args so the subprocess path
  actually works (previously dead code).
- Rework _auto_setup so the in-process fallback calls
  cmd_scan(..., max_files=_AUTO_SETUP_MAX_FILES) — same cap as the
  subprocess path.
- Surface capped and fallback flags on result['_auto_setup'] so MCP
  clients/agents can tell which path produced the registry and whether
  the cap was hit (explicitly requested in issue #34).

Tests (tests/test_cli.py::TestAutoSetupFallbackCap):
- test_fallback_passes_max_files_cap: monkeypatch subprocess.run to
  raise, spy on cmd_scan, assert max_files=3000 is passed.
- test_fallback_sets_capped_and_fallback_flags: 3001-file workspace +
  forced fallback, drive full codelens.main() flow, assert
  result['_auto_setup']['capped'] is True and ['fallback'] is True.
- test_main_path_no_fallback_when_subprocess_succeeds: sanity guard
  that the main path still works (fallback=False, capped=False for a
  small workspace) and the flags are always present in the schema.

@Wolfvin Wolfvin merged commit d371f55 into main Jun 29, 2026
0 of 6 checks passed
@Wolfvin Wolfvin deleted the fix/auto-setup-fallback-cap-34 branch June 29, 2026 17:13

Development

Successfully merging this pull request may close these issues.

[BUG-06] Auto-setup fallback ignores --max-files cap — defeats timeout protection
