Skip to content

Commit 95d3140

Browse files
timon0305clean6378-max-itbradjin8
authored
fix: validate /api/set-workspace path — reject traversal, symlink escape, non-existent (closes #15) (#22)
* fix: validate /api/set-workspace path — reject traversal, symlink escape, non-existent (closes #15) POST /api/set-workspace accepted any string in body.path, ran tilde expansion, and stored it in a module-global override consumed by every subsequent file lookup. Anyone reaching the endpoint (including a hostile JS payload — see XSS issue #11) could repoint the app at /etc, /, ~/.ssh, or any other directory; the dashboard would then serve files from there as if they were Cursor chat data. Symlink-based escape (e.g. /tmp/cursor-link → /) bypassed any naive startswith-style check. New utils/path_validation.py with validate_workspace_path(raw): - Expands ~/. - os.path.realpath() — collapses `..` AND resolves symlinks in one step. Both classes of escape become equivalent to the canonical real path on disk; downstream marker check then operates on the truth. - Rejects with WorkspacePathError if the path doesn't exist, isn't a directory, or contains no immediate subdirectory with a state.vscdb marker (the same heuristic /api/validate-path already uses). - Returns the canonical real path so the override is stored in canonical form, not whatever the caller sent. api/config_api.py:set_workspace now calls the validator and returns 400 { error: "<reason>" } on rejection (was silently 200), and stores the canonical path on success: 200 { success: true, path: "<canonical>" }. Lives outside api/ so the test suite can import without Flask in scope (tests/test_cli_args.py convention). Tests: tests/test_workspace_path_validation.py — 11 cases covering: - happy path with marker file present - canonical path returned (`..` collapsed) - empty / whitespace / non-string rejected - non-existent / file-not-directory rejected - directory without Cursor markers rejected - traversal lands outside workspace → rejected on markers - symlink → / rejected on markers (POSIX-only) - symlink → real workspace canonicalised + accepted (POSIX-only) Full suite: 148/148 OK (was 137; +11 new). Live HTTP smoke against the running app verified all 9 documented behaviours (200 with canonical path on accept; 400 with the documented reason on each reject). * fix: harden /api/set-workspace handler + traversal test (CodeRabbit on PR #16) Two CodeRabbit follow-ups: 1. api/config_api.py:set_workspace — when get_json(silent=True) returned a non-dict (JSON array, string, number, null), the truthy fallback `or {}` was bypassed and `body.get("path", "")` raised AttributeError, which the outer Exception handler mis-reported as a 500 server error. Added isinstance(body, dict) guard that returns 400 { error: "request body must be a JSON object" } directly. Diverged from CodeRabbit's literal patch in one way: they had it raise WorkspacePathError("path is required"), but the actual problem here is a malformed body shape — the error message should match the cause so the client can fix it. 2. tests/test_workspace_path_validation.py — the traversal test escaped to /tmp itself, which is shared and could be flipped by any other test / process creating <something>/state.vscdb there. Pinned the escape target to an isolated root inside self.tmp. Also added 4 API-layer regression tests (TestSetWorkspaceApi) using Flask test_client: JSON array / string / number return 400 (not 500), plus a sanity end-to-end with a valid {path} body returning 200 and the canonical realpath. Full suite: 152/152 OK (was 148; +4 new API tests). * fix(set-workspace): wrap override persistence to keep 500 as JSON (CodeRabbit on PR #22) validate_workspace_path() failures were already returning structured JSON, but set_workspace_path_override(canonical) was outside the try block — a persistence failure would have surfaced as Flask's HTML 500 page instead of {"error": "..."}. Wraps the call in its own try/except so the response shape stays structured for any consumer parsing JSON. * test(set-workspace): segment-aware `..` assert + reset override after API tests Two test-hygiene follow-ups from a structured re-review of PR #22; no production code changes. 1. tests/test_workspace_path_validation.py — test_returns_canonical_path_collapsing_dotdot The canonical-path assertion was a substring check against `..` on the raw realpath string. That would spuriously fail if the OS-supplied tempdir name ever embedded `..` in a folder component — substring presence is the wrong primitive for the invariant we actually care about, which is "no `..` *segment* in the canonical path." Switched to `Path(result).parts`, which handles `\` vs `/` natively and asserts on path components. 2. tests/test_workspace_path_validation.py — TestSetWorkspaceApi.setUp The 200-path test mutates the module-global _workspace_path_override in utils/workspace_path.py via the API, and the tempdir it then points at is rmtree'd by the existing cleanup. Without an explicit reset, the global stays pinned at a now-deleted path across tests. Added addCleanup(set_workspace_path_override, None) so any future sibling test inspecting the override sees a clean None. Full suite: 152/152 OK (skipped=2 POSIX-only symlink tests on Windows). No behaviour change; ReadLints clean. Co-authored-by: Cursor <cursoragent@cursor.com> * Align /api/validate-path with validate_workspace_path (PR #22) - POST /api/validate-path now uses the same realpath + marker checks as set-workspace; returns canonical path and structured errors on failure. - README documents WORKSPACE_PATH as trusted-operator tilde expansion only. - Config page shows server error text when validation fails. - Docstrings + symlink-test CI note; TOCTOU comment after realpath. Co-authored-by: Cursor <cursoragent@cursor.com> * validate-path: reject non-object JSON before body.get Avoid AttributeError on truthy JSON scalars/arrays (same class as PR #16). Return valid:false + workspaceCount:0 to match validation error shape. --------- Co-authored-by: yu-med <clean6378@gmail.com> Co-authored-by: Brad <headit74@hotmail.com>
1 parent bb02217 commit 95d3140

6 files changed

Lines changed: 383 additions & 17 deletions

File tree

README.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -138,6 +138,8 @@ The application automatically detects your Cursor workspace storage location:
138138

139139
To override, set the `WORKSPACE_PATH` environment variable or use the Configuration page in the web UI.
140140

141+
Paths submitted through **`POST /api/set-workspace`** (and **`POST /api/validate-path`**) are validated the same way: canonical resolution (`realpath`), directory checks, and Cursor workspace markers (`state.vscdb` under immediate subdirectories). The **`WORKSPACE_PATH`** environment variable is only tilde-expanded — it is a **trusted-operator** escape hatch for automation and known-good paths, not a substitute for those API checks when untrusted input matters.
142+
141143
Cursor CLI agent sessions are read from `~/.cursor/chats/` (the default path used by the `cursor agent` CLI). Override with the `CLI_CHATS_PATH` environment variable.
142144

143145
## Project Structure

api/config_api.py

Lines changed: 43 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,7 @@
1212

1313
from flask import Blueprint, jsonify, request
1414

15-
from utils.path_helpers import expand_tilde_path
15+
from utils.path_validation import WorkspacePathError, validate_workspace_path
1616
from utils.workspace_path import set_workspace_path_override
1717

1818
bp = Blueprint("config_api", __name__)
@@ -50,23 +50,34 @@ def detect_environment():
5050

5151
@bp.route("/api/validate-path", methods=["POST"])
5252
def validate_path():
53+
"""Same path rules as POST /api/set-workspace: realpath, markers (issue #15)."""
5354
try:
5455
body = request.get_json(silent=True) or {}
55-
workspace_path = body.get("path", "")
56-
expanded = expand_tilde_path(workspace_path)
57-
58-
if not os.path.isdir(expanded):
59-
return jsonify({"valid": False, "error": "Path does not exist"})
56+
if not isinstance(body, dict):
57+
return jsonify(
58+
{"valid": False, "error": "invalid JSON body", "workspaceCount": 0}
59+
)
60+
raw = body.get("path", "")
61+
try:
62+
canonical = validate_workspace_path(raw)
63+
except WorkspacePathError as e:
64+
return jsonify({"valid": False, "error": str(e), "workspaceCount": 0})
6065

6166
workspace_count = 0
62-
for name in os.listdir(expanded):
63-
full = os.path.join(expanded, name)
67+
for name in os.listdir(canonical):
68+
full = os.path.join(canonical, name)
6469
if os.path.isdir(full):
6570
db = os.path.join(full, "state.vscdb")
6671
if os.path.isfile(db):
6772
workspace_count += 1
6873

69-
return jsonify({"valid": workspace_count > 0, "workspaceCount": workspace_count})
74+
return jsonify(
75+
{
76+
"valid": workspace_count > 0,
77+
"workspaceCount": workspace_count,
78+
"path": canonical,
79+
}
80+
)
7081

7182
except Exception as e:
7283
print(f"Validation error: {e}")
@@ -75,14 +86,31 @@ def validate_path():
7586

7687
@bp.route("/api/set-workspace", methods=["POST"])
7788
def set_workspace():
89+
# Reject non-dict JSON bodies (array / string / number / null). Without
90+
# this, get_json returns the value directly, the truthy fallback `or {}`
91+
# is bypassed, and `body.get("path", "")` raises AttributeError — which
92+
# the outer Exception handler then mis-reports as a 500 server error
93+
# instead of a 400 client error. (CodeRabbit on PR #16.)
94+
body = request.get_json(silent=True)
95+
if not isinstance(body, dict):
96+
return jsonify({"error": "request body must be a JSON object"}), 400
97+
raw = body.get("path", "")
98+
# Validate the supplied path BEFORE storing the override (issue #15).
99+
# validate_workspace_path collapses `..` traversal AND resolves symlinks
100+
# via realpath, then enforces that the canonical target is an existing
101+
# directory containing Cursor workspace markers. Returns the canonical
102+
# path so we store that, not whatever the caller sent.
78103
try:
79-
body = request.get_json(silent=True) or {}
80-
path = body.get("path", "")
81-
expanded = expand_tilde_path(path)
82-
set_workspace_path_override(expanded)
83-
return jsonify({"success": True})
84-
except Exception:
104+
canonical = validate_workspace_path(raw)
105+
except WorkspacePathError as e:
106+
return jsonify({"error": str(e)}), 400
107+
except Exception: # noqa: BLE001 — only here as a fallback
108+
return jsonify({"error": "Failed to validate workspace path"}), 500
109+
try:
110+
set_workspace_path_override(canonical)
111+
except Exception: # noqa: BLE001 — keep the response shape structured JSON
85112
return jsonify({"error": "Failed to set workspace path"}), 500
113+
return jsonify({"success": True, "path": canonical})
86114

87115

88116
@bp.route("/api/get-username")

templates/config.html

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -96,7 +96,7 @@ <h1>Configuration</h1>
9696
setTimeout(() => { window.location.href = '/'; }, 1000);
9797
} else {
9898
statusEl.className = 'alert alert-danger';
99-
statusEl.textContent = 'No workspaces found in the specified location';
99+
statusEl.textContent = data.error || 'No workspaces found in the specified location';
100100
statusEl.style.display = 'block';
101101
}
102102
} catch (e) {
Lines changed: 243 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,243 @@
1+
"""
2+
Regression tests for issue #15 — /api/set-workspace path validation.
3+
4+
Exercises validate_workspace_path() directly. Imports from utils/ to avoid
5+
pulling Flask into scope (tests/test_cli_args.py convention).
6+
7+
Run:
8+
python -m unittest tests.test_workspace_path_validation -v
9+
"""
10+
11+
from __future__ import annotations
12+
13+
import os
14+
import shutil
15+
import sys
16+
import tempfile
17+
import unittest
18+
from pathlib import Path
19+
20+
REPO_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
21+
sys.path.insert(0, REPO_ROOT)
22+
23+
from utils.path_validation import WorkspacePathError, validate_workspace_path
24+
25+
26+
def _make_cursor_workspace_dir(parent: str, name: str = "real-storage") -> str:
27+
"""Create a directory that looks like a Cursor workspaceStorage dir.
28+
29+
Layout:
30+
<parent>/<name>/
31+
ws-001/state.vscdb ← marker file the validator looks for
32+
"""
33+
storage = os.path.join(parent, name)
34+
ws = os.path.join(storage, "ws-001")
35+
os.makedirs(ws)
36+
with open(os.path.join(ws, "state.vscdb"), "wb") as f:
37+
f.write(b"")
38+
return storage
39+
40+
41+
class TestValidateWorkspacePath(unittest.TestCase):
42+
43+
def setUp(self):
44+
self.tmp = tempfile.mkdtemp(prefix="cursor-validate-test-")
45+
self.addCleanup(shutil.rmtree, self.tmp, ignore_errors=True)
46+
47+
# ─── Happy path ────────────────────────────────────────────────
48+
49+
def test_accepts_directory_with_cursor_marker(self):
50+
storage = _make_cursor_workspace_dir(self.tmp)
51+
result = validate_workspace_path(storage)
52+
self.assertEqual(result, os.path.realpath(storage))
53+
54+
def test_returns_canonical_path_collapsing_dotdot(self):
55+
# /tmp/<x>/real-storage/../real-storage → /tmp/<x>/real-storage
56+
storage = _make_cursor_workspace_dir(self.tmp)
57+
traversal_input = os.path.join(storage, "..", os.path.basename(storage))
58+
result = validate_workspace_path(traversal_input)
59+
self.assertEqual(result, os.path.realpath(storage))
60+
# Assert no `..` *segment* in the canonical path (vs. a substring check
61+
# on the raw string, which would spuriously fail if the OS-supplied
62+
# tempdir name ever embedded `..` in a folder name).
63+
self.assertNotIn(os.pardir, Path(result).parts)
64+
65+
# ─── Hard rejects ──────────────────────────────────────────────
66+
67+
def test_rejects_empty_string(self):
68+
with self.assertRaises(WorkspacePathError) as ctx:
69+
validate_workspace_path("")
70+
self.assertIn("required", str(ctx.exception))
71+
72+
def test_rejects_whitespace_only(self):
73+
with self.assertRaises(WorkspacePathError):
74+
validate_workspace_path(" \t ")
75+
76+
def test_rejects_non_string(self):
77+
with self.assertRaises(WorkspacePathError):
78+
validate_workspace_path(None) # type: ignore[arg-type]
79+
80+
def test_rejects_non_existent_path(self):
81+
bogus = os.path.join(self.tmp, "does-not-exist", "anywhere")
82+
with self.assertRaises(WorkspacePathError) as ctx:
83+
validate_workspace_path(bogus)
84+
self.assertIn("does not exist", str(ctx.exception))
85+
86+
def test_rejects_file_not_directory(self):
87+
f = os.path.join(self.tmp, "regular-file")
88+
with open(f, "w") as h:
89+
h.write("not a directory")
90+
with self.assertRaises(WorkspacePathError) as ctx:
91+
validate_workspace_path(f)
92+
self.assertIn("not a directory", str(ctx.exception))
93+
94+
def test_rejects_directory_without_cursor_markers(self):
95+
# Existing directory but no state.vscdb anywhere — common case for
96+
# a user pointing at /tmp, /etc, /, ~/.ssh, etc.
97+
plain = os.path.join(self.tmp, "plain-dir")
98+
os.makedirs(os.path.join(plain, "subdir"))
99+
with self.assertRaises(WorkspacePathError) as ctx:
100+
validate_workspace_path(plain)
101+
self.assertIn("Cursor workspaceStorage", str(ctx.exception))
102+
103+
# ─── Path-traversal class ──────────────────────────────────────
104+
105+
def test_traversal_into_non_workspace_is_rejected(self):
106+
# Keep traversal target inside this test's own temp tree — escaping
107+
# to /tmp itself would be non-deterministic (any other test or
108+
# process creating a `state.vscdb` under /tmp/<dir>/state.vscdb
109+
# would flip this test's outcome).
110+
#
111+
# <self.tmp>/isolated-root/storage/../.. → <self.tmp>/isolated-root
112+
# which contains no state.vscdb under any subdir → reject on markers.
113+
isolated_root = os.path.join(self.tmp, "isolated-root")
114+
os.makedirs(isolated_root)
115+
storage = _make_cursor_workspace_dir(isolated_root)
116+
escape = os.path.join(storage, "..", "..")
117+
with self.assertRaises(WorkspacePathError):
118+
validate_workspace_path(escape)
119+
120+
# ─── Symlink-escape class ──────────────────────────────────────
121+
# POSIX-only; CI runs tests on ubuntu-latest so these still run in CI.
122+
123+
@unittest.skipIf(sys.platform == "win32", "POSIX symlinks only")
124+
def test_symlink_to_non_workspace_is_rejected(self):
125+
# A symlink that points to / (no Cursor markers) is rejected because
126+
# realpath() resolves to the real target before the marker check.
127+
link = os.path.join(self.tmp, "evil-link")
128+
os.symlink("/", link)
129+
with self.assertRaises(WorkspacePathError) as ctx:
130+
validate_workspace_path(link)
131+
self.assertIn("Cursor workspaceStorage", str(ctx.exception))
132+
133+
@unittest.skipIf(sys.platform == "win32", "POSIX symlinks only")
134+
def test_symlink_to_real_workspace_is_canonicalised_and_accepted(self):
135+
# Symlink → real Cursor storage. Accepted, but the canonical path
136+
# returned is the realpath (the storage dir), NOT the symlink path.
137+
storage = _make_cursor_workspace_dir(self.tmp)
138+
link = os.path.join(self.tmp, "good-link")
139+
os.symlink(storage, link)
140+
result = validate_workspace_path(link)
141+
self.assertEqual(result, os.path.realpath(storage))
142+
self.assertNotEqual(result, link)
143+
144+
145+
class TestSetWorkspaceApi(unittest.TestCase):
146+
"""API-layer regressions for POST /api/set-workspace.
147+
148+
The validator helper has its own coverage above; these cases exist to
149+
pin behaviour the API handler owns (request body shape handling,
150+
HTTP status mapping). Notably the non-dict-body case which used to
151+
surface as a 500 instead of a 400 — see CodeRabbit on PR #16.
152+
"""
153+
154+
def setUp(self):
155+
from flask import Flask
156+
from api.config_api import bp as config_bp
157+
from utils.workspace_path import set_workspace_path_override
158+
159+
self.tmp = tempfile.mkdtemp(prefix="cursor-validate-api-test-")
160+
self.addCleanup(shutil.rmtree, self.tmp, ignore_errors=True)
161+
# Reset the module-global workspace override after each test. The
162+
# 200-path test below mutates it via the API and the tempdir is then
163+
# rmtree'd by the cleanup above — without this, a future sibling test
164+
# inspecting the override would see a stale, now-deleted path.
165+
self.addCleanup(set_workspace_path_override, None)
166+
167+
app = Flask(__name__)
168+
app.config["TESTING"] = True
169+
app.register_blueprint(config_bp)
170+
self.client = app.test_client()
171+
172+
def test_non_dict_json_array_returns_400_not_500(self):
173+
# Regression: a JSON array body (truthy, non-dict) used to trip
174+
# AttributeError on body.get(...) and surface as a 500.
175+
resp = self.client.post(
176+
"/api/set-workspace",
177+
data="[]",
178+
content_type="application/json",
179+
)
180+
self.assertEqual(resp.status_code, 400)
181+
self.assertIn("error", resp.get_json())
182+
183+
def test_non_dict_json_string_returns_400(self):
184+
resp = self.client.post(
185+
"/api/set-workspace",
186+
data='"some string"',
187+
content_type="application/json",
188+
)
189+
self.assertEqual(resp.status_code, 400)
190+
191+
def test_non_dict_json_number_returns_400(self):
192+
resp = self.client.post(
193+
"/api/set-workspace",
194+
data="42",
195+
content_type="application/json",
196+
)
197+
self.assertEqual(resp.status_code, 400)
198+
199+
def test_dict_with_valid_path_returns_200_with_canonical(self):
200+
storage = _make_cursor_workspace_dir(self.tmp)
201+
resp = self.client.post(
202+
"/api/set-workspace",
203+
json={"path": storage},
204+
)
205+
self.assertEqual(resp.status_code, 200)
206+
body = resp.get_json()
207+
self.assertTrue(body["success"])
208+
self.assertEqual(body["path"], os.path.realpath(storage))
209+
210+
def test_validate_path_returns_canonical_and_count(self):
211+
storage = _make_cursor_workspace_dir(self.tmp)
212+
resp = self.client.post("/api/validate-path", json={"path": storage})
213+
self.assertEqual(resp.status_code, 200)
214+
data = resp.get_json()
215+
self.assertTrue(data["valid"])
216+
self.assertGreaterEqual(data["workspaceCount"], 1)
217+
self.assertEqual(data["path"], os.path.realpath(storage))
218+
219+
def test_validate_path_invalid_returns_error(self):
220+
plain = os.path.join(self.tmp, "no-markers")
221+
os.makedirs(plain)
222+
resp = self.client.post("/api/validate-path", json={"path": plain})
223+
self.assertEqual(resp.status_code, 200)
224+
data = resp.get_json()
225+
self.assertFalse(data["valid"])
226+
self.assertIn("error", data)
227+
228+
def test_validate_path_non_dict_json_returns_structured_error(self):
229+
# Mirror set_workspace: truthy non-dict JSON must not reach body.get.
230+
resp = self.client.post(
231+
"/api/validate-path",
232+
data='"not an object"',
233+
content_type="application/json",
234+
)
235+
self.assertEqual(resp.status_code, 200)
236+
data = resp.get_json()
237+
self.assertFalse(data["valid"])
238+
self.assertEqual(data["error"], "invalid JSON body")
239+
self.assertEqual(data["workspaceCount"], 0)
240+
241+
242+
if __name__ == "__main__":
243+
unittest.main()

0 commit comments

Comments
 (0)