Commit bb02217

fix: disable debug=True / Werkzeug debugger exposure (closes #9) (#20)
* fix: gate Flask debug / Werkzeug debugger behind opt-in flag (closes #9)

The Werkzeug debugger is a documented remote-code-execution primitive. app.py was hard-coding `debug=True`, which exposed RCE to anyone who could reach the listening port: a misconfigured `--host`, an SSH tunnel, or a careless reverse proxy was enough.

- Remove the `debug=True` literal from app.py.
- Default debug OFF. Opt in via either the `--debug` CLI flag or the `FLASK_DEBUG=1` env var (truthy = "1" / "true" / "yes", case-insensitive, whitespace-tolerant).
- Print a stderr WARNING when debug is enabled, naming the RCE risk and reminding the operator to bind only to loopback.
- Gate the auto-reloader on the same flag.

Live-tested all four matrix cells (default off / --debug / FLASK_DEBUG=1 / FLASK_DEBUG=0). Bogus paths under debug-off return a plain Flask 404, not the Werkzeug debugger console.

Helper `resolve_debug_flag(env_value, cli_flag)` lives in `utils/debug_flag.py` so it can be unit-tested without importing Flask (matching the existing test convention in tests/test_cli_args.py).

Regression coverage in tests/test_cli_args.py adds 8 cases:
- default-off, env-truthy, env-falsey, CLI override
- argparse `--debug` default + explicit
- source-level guard that fails if `debug=True` is reintroduced

* test: AST-walk the debug=True regression guard (CodeRabbit on PR #10)

Old guard: `self.assertNotIn("debug=True", src)`, a substring match. That misses cosmetic variants like `debug = True` (with spaces), multi-line `debug=\n True`, or any other form that produces the same runtime semantics. CodeRabbit correctly flagged it as evadable.

Replaced with an `ast.walk(tree)` over the parsed app.py: find any `ast.Call` whose keywords contain `debug=True` as a literal Constant. This catches every cosmetic variant of the keyword form, because formatting does not survive parsing. The failure message includes the offending line number(s) and the rationale (issue #9), so a future CI break is immediately debuggable.

Verified by injecting `debug = True` (with spaces, the form the old check missed) into app.py:
- Old check: would have passed (false negative).
- New check: failed with `[136]` and the issue-#9 message.

Then reverted the injection; the test passes again. 42/42 tests still pass on the actual app.py.

* review: address PR #20 nits - broaden debug=True guard + FLASK_DEBUG note

- AST guard now handles ast.NameConstant (Py3.7) and the `**{"debug": True}` dict-spread bypass; helper extracted for unit testing.
- README: opt-in note for the Werkzeug debugger, including that FLASK_ENV=development is NOT consulted (only FLASK_DEBUG=1).
- Replace em dashes in app.py comments with ASCII to silence GitHub's non-ASCII banner on review.
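The difference between the two guards described above can be reproduced in isolation. The following is a minimal sketch, not the commit's test code: `SOURCE`, `substring_guard_passes`, and `ast_guard_offenders` are hypothetical names introduced for illustration, and the snippet only covers the direct-keyword shape.

```python
import ast

# Hypothetical one-line source using the spaced form the old check missed.
SOURCE = 'app.run(host="127.0.0.1", debug = True)\n'


def substring_guard_passes(src):
    """Old-style guard: passes when the exact text 'debug=True' is absent."""
    return "debug=True" not in src


def ast_guard_offenders(src):
    """Line numbers of calls that pass a literal True as the `debug` keyword."""
    offenders = []
    for node in ast.walk(ast.parse(src)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            # `.value is True` matches only the literal, not names or attributes.
            if kw.arg == "debug" and getattr(kw.value, "value", None) is True:
                offenders.append(kw.value.lineno)
    return offenders


print(substring_guard_passes(SOURCE))  # True: the substring check is fooled
print(ast_guard_offenders(SOURCE))     # [1]: the AST walk still flags the call
```

Because the parser discards whitespace and line breaks around `=`, the AST form is insensitive to formatting, which is the property the substring match lacked.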
1 parent cad215f commit bb02217

4 files changed

Lines changed: 201 additions & 3 deletions

README.md

Lines changed: 2 additions & 0 deletions
@@ -69,6 +69,8 @@ python app.py
 
 Open <http://localhost:3000> in your browser.
 
+The Werkzeug debugger is **off by default** and must be opted in explicitly via the `--debug` flag or by setting `FLASK_DEBUG=1`. (Note: `FLASK_ENV=development` is **not** consulted - only `FLASK_DEBUG` is. See issue #9 for the rationale.)
+
 ## Tests
 
 Run the full suite from the repository root (install `requirements.txt` first):

app.py

Lines changed: 26 additions & 3 deletions
@@ -1,15 +1,18 @@
 """
-Cursor Chat Browser Python Edition
+Cursor Chat Browser - Python Edition
 A Flask web application for browsing and managing chat histories
 from the Cursor editor's AI chat feature.
 """
 
+import os
 import sys
 from datetime import datetime
 from pathlib import Path
 
 from flask import Flask, render_template, send_from_directory
 
+from utils.debug_flag import resolve_debug_flag
+
 from api.workspaces import bp as workspaces_bp
 from api.composers import bp as composers_bp
 from api.logs import bp as logs_bp
@@ -101,6 +104,13 @@ def favicon():
         help="Path to exclusion rules file (sensitive projects/chats are omitted). "
         "If omitted, uses ~/.cursor-chat-browser/exclusion-rules.txt if present.",
     )
+    parser.add_argument(
+        "--debug",
+        action="store_true",
+        help="Enable Flask debug mode and the Werkzeug debugger. "
+        "DANGEROUS: allows remote code execution if the port is exposed. "
+        "Off by default; can also be enabled via FLASK_DEBUG=1.",
+    )
     args = parser.parse_args()
 
     if args.base_dir:
@@ -109,10 +119,23 @@ def favicon():
 
     app = create_app(exclusion_rules_path=args.exclude_rules)
     print(f"Cursor Chat Browser (Python) running at http://{args.host}:{args.port}")
+
+    debug_enabled = resolve_debug_flag(os.environ.get("FLASK_DEBUG"), args.debug)
+    if debug_enabled:
+        # Print the warning to stderr so it's visible even when stdout is
+        # piped/redirected. The Werkzeug debugger is a remote-code-execution
+        # primitive - anyone reaching the host:port can hijack the process.
+        print(
+            "WARNING: Flask debug mode ENABLED. The Werkzeug debugger allows "
+            "arbitrary code execution by anyone who can reach this server. "
+            "Bind only to 127.0.0.1 and never expose to untrusted networks.",
+            file=sys.stderr,
+        )
+
     # Disable reloader on Windows to avoid a socket conflict with Flask's stat reloader.
     app.run(
         host=args.host,
         port=args.port,
-        debug=True,
-        use_reloader=(sys.platform != "win32"),
+        debug=debug_enabled,
+        use_reloader=debug_enabled and (sys.platform != "win32"),
     )

tests/test_cli_args.py

Lines changed: 152 additions & 0 deletions
@@ -9,6 +9,7 @@
     python -m unittest tests.test_cli_args -v
 """
 
+import ast
 import sys
 import os
 import unittest
@@ -43,6 +44,7 @@ def _build_app_parser():
     parser.add_argument("--base-dir", default=None)
     parser.add_argument("--exclude-rules", "-e", default=None,
                         metavar="PATH", dest="exclude_rules")
+    parser.add_argument("--debug", action="store_true")
     return parser
 
 
@@ -246,5 +248,155 @@ def test_export_py_has_since_choices(self):
         self.assertIn('choices=["all", "last"]', src)
 
 
+# ---------------------------------------------------------------------------
+# Werkzeug debugger gating (security): debug must be off by default,
+# opt-in via --debug or FLASK_DEBUG=1. Regression for the Critical
+# `debug=True` exposure that was hard-coded in app.py.
+# ---------------------------------------------------------------------------
+
+class TestDebugFlagGating(unittest.TestCase):
+
+    # -- _resolve_debug_flag helper ------------------------------------------
+
+    def setUp(self):
+        # Import from the standalone utility module so the test does not pull
+        # Flask into scope (the rest of this file deliberately avoids Flask).
+        from utils.debug_flag import resolve_debug_flag
+        self._resolve = resolve_debug_flag
+
+    def test_debug_off_when_env_unset_and_no_cli(self):
+        self.assertFalse(self._resolve(None, False))
+
+    def test_debug_off_when_env_empty_string(self):
+        self.assertFalse(self._resolve("", False))
+
+    def test_debug_off_for_explicit_falsey_env_values(self):
+        for v in ("0", "false", "False", "no", "off", "anything-not-truthy"):
+            with self.subTest(env=v):
+                self.assertFalse(self._resolve(v, False))
+
+    def test_debug_on_for_truthy_env_values(self):
+        for v in ("1", "true", "True", "TRUE", "yes", "YES", " 1 "):
+            with self.subTest(env=v):
+                self.assertTrue(self._resolve(v, False))
+
+    def test_cli_flag_overrides_env(self):
+        # Even with FLASK_DEBUG explicitly off, --debug should turn it on.
+        self.assertTrue(self._resolve("0", True))
+        self.assertTrue(self._resolve(None, True))
+
+    # -- argparse: --debug flag ----------------------------------------------
+
+    def test_app_parser_debug_default_false(self):
+        opts = _build_app_parser().parse_args([])
+        self.assertFalse(opts.debug)
+
+    def test_app_parser_debug_explicit(self):
+        opts = _build_app_parser().parse_args(["--debug"])
+        self.assertTrue(opts.debug)
+
+    # -- source-level guard: app.py must NOT carry a literal debug=True -------
+    # AST-walk so cosmetic variations (`debug = True`, multi-line formatting,
+    # leading whitespace, etc.) cannot bypass the guard. A regression that
+    # reintroduces the literal in any form fails this test with the offending
+    # line number(s).
+
+    def test_app_py_does_not_hardcode_debug_true(self):
+        app_path = os.path.join(REPO_ROOT, "app.py")
+        with open(app_path, "r", encoding="utf-8") as f:
+            tree = ast.parse(f.read(), filename=app_path)
+
+        offenders = _find_debug_true_offenders(tree)
+        self.assertEqual(
+            offenders, [],
+            "Found a literal `debug=True` keyword argument in app.py at "
+            "line(s) %s. The Werkzeug debugger must be opt-in via the "
+            "--debug flag or FLASK_DEBUG env var (see issue #9), never "
+            "hard-coded." % offenders,
+        )
+
+
+class FindDebugTrueOffendersTests(unittest.TestCase):
+    """Unit tests for the AST-walk helper itself, so the regression guard
+    above keeps catching what we expect across Python AST shape changes.
+
+    Covers:
+      - direct keyword `f(debug=True)` (ast.Constant on 3.8+, ast.NameConstant on 3.7)
+      - dict-spread `f(**{"debug": True})` bypass
+      - benign shapes that should NOT trip the guard (False, variable, attribute)
+    """
+
+    def _find(self, src):
+        return _find_debug_true_offenders(ast.parse(src))
+
+    def test_simple_keyword_literal(self):
+        self.assertEqual(self._find("app.run(debug=True)"), [1])
+
+    def test_keyword_false_not_flagged(self):
+        self.assertEqual(self._find("app.run(debug=False)"), [])
+
+    def test_keyword_variable_not_flagged(self):
+        # Out of scope per PR review - only literals are tracked.
+        self.assertEqual(self._find("flag = True\napp.run(debug=flag)"), [])
+
+    def test_keyword_attribute_not_flagged(self):
+        self.assertEqual(self._find("app.run(debug=cfg.debug_on)"), [])
+
+    def test_dict_spread_literal(self):
+        # Determined-bypass shape: kwargs come in via **dict literal.
+        offenders = self._find("app.run(**{'debug': True})")
+        self.assertEqual(len(offenders), 1)
+
+    def test_dict_spread_false_not_flagged(self):
+        self.assertEqual(self._find("app.run(**{'debug': False})"), [])
+
+    def test_dict_spread_other_key_not_flagged(self):
+        self.assertEqual(self._find("app.run(**{'foo': True})"), [])
+
+
+# ---------------------------------------------------------------------------
+# AST helper (module-level so it's testable in isolation)
+# ---------------------------------------------------------------------------
+
+def _find_debug_true_offenders(tree):
+    """Return line numbers of any literal `debug=True` (or `**{"debug": True}`)
+    on a Call node in the AST.
+
+    Cross-version safe: works with both ast.Constant (3.8+) and the legacy
+    ast.NameConstant shape (3.7) by reading `.value` attribute-style rather
+    than narrowing to a specific node class. Only literal True is flagged;
+    `debug=variable` and `debug=mod.attr` are out of scope.
+    """
+    offenders = []
+    for node in ast.walk(tree):
+        if not isinstance(node, ast.Call):
+            continue
+        for kw in node.keywords:
+            # Shape 1: direct keyword - f(debug=True)
+            if kw.arg == "debug" and _is_literal_true(kw.value):
+                offenders.append(kw.lineno)
+                continue
+            # Shape 2: dict-spread - f(**{"debug": True})
+            if kw.arg is None and isinstance(kw.value, ast.Dict):
+                for k, v in zip(kw.value.keys, kw.value.values):
+                    if _is_str_literal(k, "debug") and _is_literal_true(v):
+                        offenders.append(getattr(v, "lineno", kw.lineno))
+    return offenders
+
+
+def _is_literal_true(node):
+    """True only when *node* is the literal True (ast.Constant on 3.8+,
+    ast.NameConstant on 3.7). Excludes variables/attributes via the strict
+    `is True` identity check on `.value`."""
+    return getattr(node, "value", None) is True
+
+
+def _is_str_literal(node, expected):
+    """True when *node* is a string literal equal to *expected* (handles
+    ast.Constant on 3.8+ and ast.Str on 3.7)."""
+    val = getattr(node, "value", getattr(node, "s", None))
+    return isinstance(val, str) and val == expected
+
+
 if __name__ == "__main__":
     unittest.main()
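One portability detail in the helper may be worth double-checking. To my understanding, `ast.keyword` nodes only gained `lineno` in Python 3.9, so reading `kw.lineno` could raise `AttributeError` on the 3.7/3.8 interpreters the docstring mentions. The sketch below is a hedged variant, not the commit's code: it takes the line number from the value expression and falls back to the enclosing call. `find_offenders_portable` is a hypothetical name, and the two small helpers are re-declared only to keep the snippet self-contained.

```python
import ast


def _is_literal_true(node):
    """True only for the literal True (a Constant/NameConstant node)."""
    return getattr(node, "value", None) is True


def _is_str_literal(node, expected):
    """True when node is a string literal equal to expected."""
    val = getattr(node, "value", None)
    if val is None:
        val = getattr(node, "s", None)  # legacy ast.Str shape on 3.7
    return isinstance(val, str) and val == expected


def find_offenders_portable(tree):
    """Hypothetical variant of the guard that never reads `kw.lineno`."""
    offenders = []
    for call in ast.walk(tree):
        if not isinstance(call, ast.Call):
            continue
        for kw in call.keywords:
            # Expression nodes carry `lineno`; fall back to the call's line.
            line = getattr(kw.value, "lineno", call.lineno)
            if kw.arg == "debug" and _is_literal_true(kw.value):
                offenders.append(line)
            elif kw.arg is None and isinstance(kw.value, ast.Dict):
                for k, v in zip(kw.value.keys, kw.value.values):
                    if _is_str_literal(k, "debug") and _is_literal_true(v):
                        offenders.append(getattr(v, "lineno", line))
    return offenders


print(find_offenders_portable(ast.parse("app.run(debug=True)")))         # [1]
print(find_offenders_portable(ast.parse("app.run(**{'debug': True})")))  # [1]
print(find_offenders_portable(ast.parse("app.run(debug=flag)")))         # []
```

If the project only targets Python 3.9 or newer, the helper as committed should behave the same and this variant is unnecessary.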

utils/debug_flag.py

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+"""Resolution of the Flask debug / Werkzeug debugger flag.
+
+Lives in `utils/` so it can be unit-tested without importing Flask
+(which the test suite intentionally avoids — see tests/test_cli_args.py).
+"""
+
+
+def resolve_debug_flag(env_value, cli_flag):
+    """Return True iff Flask debug / Werkzeug debugger should be enabled.
+
+    Off by default. The Werkzeug debugger lets a remote attacker execute
+    arbitrary Python in the server process, so debug mode must be opt-in
+    and never the default. Enabled only when:
+      - the operator explicitly passes --debug on the command line, or
+      - FLASK_DEBUG is set to a truthy value ("1", "true", "yes").
+    """
+    if cli_flag:
+        return True
+    if env_value is None:
+        return False
+    return env_value.strip().lower() in ("1", "true", "yes")
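The four matrix cells named in the commit message can be walked through without Flask. This is a rough sketch under stated assumptions: `resolve_debug_flag` is copied from the diff above, while `run_kwargs` and its `platform` parameter are hypothetical stand-ins for the wiring in app.py, not functions from the repository.

```python
import argparse


def resolve_debug_flag(env_value, cli_flag):
    # Copied from utils/debug_flag.py so the sketch is self-contained.
    if cli_flag:
        return True
    if env_value is None:
        return False
    return env_value.strip().lower() in ("1", "true", "yes")


def run_kwargs(argv, environ, platform="linux"):
    """Hypothetical helper: the debug-related kwargs app.py hands to app.run()."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--debug", action="store_true")
    args = parser.parse_args(argv)
    debug_enabled = resolve_debug_flag(environ.get("FLASK_DEBUG"), args.debug)
    return {
        "debug": debug_enabled,
        # The reloader is gated on the same flag and stays off on Windows.
        "use_reloader": debug_enabled and platform != "win32",
    }


cells = [
    ("default", [], {}),
    ("--debug", ["--debug"], {}),
    ("FLASK_DEBUG=1", [], {"FLASK_DEBUG": "1"}),
    ("FLASK_DEBUG=0", [], {"FLASK_DEBUG": "0"}),
]
for label, argv, environ in cells:
    print(label, run_kwargs(argv, environ))
```

Only the `--debug` and `FLASK_DEBUG=1` cells should report `debug` as true. Passing `environ` as a plain dict instead of reading `os.environ` is a choice made here so the cells can be exercised without mutating the process environment.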
