Improve staged pipeline reliability and runtime gating #215
Conversation
…g gates over legacy heuristics

Staged mode detection via spec_frozen + wiring_validation presence. Consistency score blends wiring_validation results (0.7) with legacy heuristics (0.3). Drops the 'deterministic fallback scaffold detected' blocker in staged mode. Relaxed pass criteria: wiring_passed + match_rate >= 70 + runnability >= 70. Adds a provenance summary (LLM vs deterministic file counts) to eval output. 8 new tests, 158 related tests pass, 0 regressions.
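The blended score and relaxed pass gate described above can be sketched as follows (function names are hypothetical; the 0.7/0.3 weights and 70 thresholds come from the commit message):

```python
def blended_consistency(wiring_score: float, legacy_score: float) -> float:
    # Staged mode: weight wiring_validation results 0.7, legacy heuristics 0.3.
    return 0.7 * wiring_score + 0.3 * legacy_score


def staged_pass(wiring_passed: bool, match_rate: float, runnability: float) -> bool:
    # Relaxed pass criteria: wiring must pass, and both match_rate and
    # runnability must be at least 70.
    return wiring_passed and match_rate >= 70 and runnability >= 70
```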
…events

- llm.py: use_responses_api for gpt-5.3-codex/gpt-5.4 via model_endpoint_type(); max_retries via LangChain's built-in retry; remove fallback model switching
- per_file_code_generator: inject wiring_validation.repair_instructions and build_errors into the LLM prompt on retry; skip already-generated files unless they caused the build failure
- sse.py: register all staged pipeline nodes (api_contract_generator through deploy_gate) in NODE_EVENTS for dashboard visibility
- pipeline_runtime.py: emit structured SSE events for spec_freeze, backend_gen, frontend_gen, contract_validation, runtime_validation, and deploy_gate results
…rompt

Sort frontend specs so non-page files generate first, then extract the actual exports (default/named), Props interface signatures, and api-client functions from already_generated + foundation files before generating page.tsx. Inject exact import statements with props signatures and a CRITICAL IMPORT RULE into the LLM system message to prevent references to non-existent exports.
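A rough sketch of the export-extraction step (the regexes and function name here are assumptions for illustration, not the project's actual implementation):

```python
import re


def extract_exports(source: str) -> dict:
    """Collect default and named exports from a TS/TSX file body."""
    # Default export: `export default function Page` or `export default Page`.
    default = re.search(r"export\s+default\s+(?:function\s+)?(\w+)", source)
    # Named exports: `export const/function/class Name`.
    named = re.findall(r"export\s+(?:const|function|class)\s+(\w+)", source)
    return {"default": default.group(1) if default else None, "named": named}
```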
PARALLEL:
- frontend_generator_node: tier-based asyncio.gather for component specs (up to VIBEDEPLOY_MAX_PARALLEL_LLM=4 concurrent LLM calls)
- page.tsx always generated last, after all components complete

FILE-LEVEL REPAIR:
- build_validator extracts failing file paths + a frontend_only_failure flag
- route_after_build_staged: frontend-only failures route to the new frontend_file_repairer node instead of a full backend+frontend rerun
- frontend_file_repairer_node: regenerates only the specific files identified in build errors, then re-runs build_validator

JSX TRUNCATION GUARD:
- _has_truncated_jsx: detects unclosed tags / truncated files
- _generate_file_with_llm: retries up to 3x on truncation detection

STATE:
- build_failing_files: list of failing file paths
- build_frontend_only_failure: bool for routing decision
- build_errors_full: full error text (first 3000 chars)
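The FILE-LEVEL REPAIR routing above can be illustrated as a pure decision function (a simplified sketch; node names follow the commit message, while the success target and the attempt cap of 3 are assumptions):

```python
def route_after_build_staged(state: dict) -> str:
    """Pick the next node after build validation in staged mode."""
    if state.get("build_passed"):
        return "code_evaluator"  # assumed success target
    frontend_only = state.get("build_frontend_only_failure", False)
    failing = state.get("build_failing_files", [])
    attempts = state.get("build_attempt_count", 0)
    if frontend_only and failing and attempts <= 3:
        # Repair only the files named in the build errors.
        return "frontend_file_repairer"
    # Otherwise fall back to a full backend+frontend regeneration.
    return "backend_generator"
```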
Add concise generation directive on truncation retry: shorter names, fewer comments, properly closed JSX, explicit end-of-file requirements.
…n max_tokens limit

Root cause: LangChain ChatOpenAI(max_tokens=12000) passes max_output_tokens=12000 to the Responses API, truncating large components mid-file.

Fix: _generate_file_via_responses_api() calls openai.AsyncOpenAI.responses.create() directly without max_output_tokens. Uses reasoning={'effort':'medium'} per official code-generation best practices. The LangChain path remains as a fallback.

Model config: gpt-5.4 for frontend (official recommendation for one-shot UI gen).
The tag-counting approach (open `<` count minus close `</` minus self-close `/>`) was producing false positives, since JSX nesting naturally has many opens. Replace it with a last-line-ending check: a file is considered complete if its last line ends with `;`, `}`, `)`, `>`, or `/>`, or is a comment. This matches actual truncation patterns far more accurately.
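A minimal sketch of the last-line-ending check (simplified; the real detector may handle more endings and comment styles):

```python
def has_truncated_jsx(source: str) -> bool:
    """Heuristic: a complete TSX file's last non-empty line ends with a
    closing token or is a comment; anything else suggests truncation."""
    lines = [ln.strip() for ln in source.splitlines() if ln.strip()]
    if not lines:
        return True
    last = lines[-1]
    if last.startswith("//") or last.endswith("*/"):
        return False  # trailing comment lines are fine
    # endswith(">") also covers "/>"
    return not last.endswith((";", "}", ")", ">"))
```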
…enerate_file_with_llm

The Responses API path handles retries internally. The LangChain fallback now uses ainvoke_with_retry(max_attempts=3) directly, without the truncation re-detection loop that could cause infinite cycling.
All non-page frontend files already had stable templates: Hero, WorkspacePanel, StatePanel, InsightPanel, CollectionPanel, the generic component fallback, and the api-client and config/style files. Use deterministic generation for all component/api/config/style files and keep the LLM only for page.tsx. This removes the main truncation hotspot and cuts most frontend LLM calls down to one.
- build_validator: infer src/app/page.tsx for prerender '/' and src_app_page errors
- frontend_file_repairer: when repairing page.tsx after a build failure, use the deterministic _page_template instead of another LLM attempt

This keeps the fast staged pipeline while ensuring page-level build/runtime errors converge to a known-good template instead of looping through more fragile LLM retries.
Previously, session.completed always used the council scoring.decision, which stays NO-GO when skip_council=true. For staged local-first runs that successfully pass code_eval + build + local_runtime + deploy_gate + deployer(local_running), the final meeting result should be GO. Also fall back the score to code_eval.match_rate when a council score is absent.
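The verdict/score fallback can be sketched as follows (a hypothetical helper mirroring the behavior described above, not the actual pipeline_runtime.py code):

```python
def final_verdict(scoring: dict, code_eval: dict, pipeline_succeeded: bool) -> tuple:
    """Resolve the session verdict and score when the council may be skipped."""
    decision = (scoring or {}).get("decision")
    # A staged run that passed every gate is a GO even if the council was skipped.
    verdict = "GO" if pipeline_succeeded else (decision or "NO-GO")
    score = (scoring or {}).get("final_score")
    if score is None and isinstance(code_eval.get("match_rate"), (int, float)):
        # No council score: fall back to the code-eval match rate.
        score = code_eval["match_rate"]
    return verdict, score
```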
When showcase apps exist, the dashboard snapshot was replacing the full meeting list with only showcase-matched deployed apps, which hid successful local_running runs such as staged verification threads. Keep unmatched local_running/local_error meetings in the reconciled results, and add limit query support to /dashboard/results and /dashboard/brainstorms.
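The reconciliation rule can be sketched as a pure function (field names are assumptions for illustration):

```python
def reconcile_results(meetings: list, showcase_ids: set) -> list:
    """Merge dashboard results: showcase-matched apps plus any local runs
    that would otherwise be hidden by the showcase filter."""
    matched = [m for m in meetings if m["id"] in showcase_ids]
    local = [
        m for m in meetings
        if m["id"] not in showcase_ids
        and m.get("status") in ("local_running", "local_error")
    ]
    return matched + local
```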
- contract_validator now applies include_router(prefix=...) when comparing FastAPI decorators against OpenAPI paths
- the deterministic backend route template strips /api from decorators, because main.py already mounts the router at prefix=/api
- local_runtime_validator now calls contract POST endpoints (/api/plan, /api/insights, etc.) instead of only checking /health

This closes the gap where a build could pass and local runtime could still report success even though the generated API routes were actually 404ing.
Route/service/api backend files are contract-driven and already have stable templates. Generating them via LLM caused path drift (/api/api/...) and runtime instability. Use deterministic generation for backend files so runtime/API validation can converge reliably. Keep LLM only for page.tsx.
Walkthrough

Introduces a frontend file-level repair node and extracts failing files during build validation to add a targeted repair path; unifies retry handling and Responses API routing for LLM calls; and extends the staged pipeline logic and event streaming.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant BuildValidator
    participant FileRepairer
    participant BackendGenerator
    participant LLMProvider
    Client->>BuildValidator: request (build validation)
    BuildValidator->>BuildValidator: extract failing files
    alt frontend_only failures & attempts <= 3
        BuildValidator->>FileRepairer: route failing files
        FileRepairer->>FileRepairer: map files → specs
        FileRepairer->>LLMProvider: request (repair, use_responses_api?, max_retries)
        LLMProvider-->>FileRepairer: generated code / failure
        FileRepairer->>FileRepairer: deterministic fallback (if needed)
        FileRepairer-->>Client: return repaired files
    else other failures or max attempts
        BuildValidator->>BackendGenerator: route to backend generator
        BackendGenerator->>LLMProvider: request (generation, max_retries)
        LLMProvider-->>BackendGenerator: generated code
        BackendGenerator-->>Client: return generated files
    end
```
```mermaid
sequenceDiagram
    participant Caller
    participant GetLLM
    participant ProviderRegistry
    participant OpenAIAPI
    participant DOInference
    Caller->>GetLLM: request (get_llm, model, max_retries, use_responses_api)
    GetLLM->>GetLLM: routing decision
    alt registry available
        GetLLM->>ProviderRegistry: request (timeout, max_retries)
        ProviderRegistry->>OpenAIAPI: delegate (timeout, max_retries)
        OpenAIAPI-->>ProviderRegistry: response
        ProviderRegistry-->>GetLLM: LLM instance
    else use_responses_api path
        GetLLM->>OpenAIAPI: direct call (use_responses_api, max_retries)
        OpenAIAPI-->>GetLLM: LLM instance
    else DO Inference
        GetLLM->>DOInference: call (max_retries, use_responses_api)
        DOInference-->>GetLLM: LLM instance
    end
    GetLLM-->>Caller: return LLM instance
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
Summary of Changes

This pull request significantly refines the staged pipeline's robustness and feedback mechanisms. It introduces targeted repair capabilities for frontend build issues, enhances the reliability of LLM-driven code generation through retries and parallel processing, and improves the accuracy of both contract and runtime validations. The changes also provide more granular observability into the pipeline's execution and better integration of local development runs into the dashboard, ultimately leading to a more dependable and transparent development workflow.
Code Review
This pull request introduces significant improvements to the staged pipeline's reliability and observability. Key changes include adding a file-level repair loop for frontend build failures, making code generation more deterministic, and enhancing runtime validation. The introduction of a frontend_file_repairer node and more granular SSE events for the staged pipeline are excellent additions for robustness and monitoring. My review identified a few areas for improvement, including a duplicate regular expression, a redundant conditional check, some dead code, and a bug where available component exports are not being added to the LLM prompt, which could impact code generation quality.
```python
sig = props_map.get(n, "")
props_note = f" // props: {sig}" if sig else ""
export_lines.append(f' import {{ {n} }} from "{module}";{props_note}')
```

The export_lines variable is calculated to list available component exports, but it's never used. It seems this information was intended to be added to the prompt to guide the LLM in generating correct import statements for page components. Without it, the LLM might generate incorrect imports. You should add this information to the prompt to improve generation quality.

```python
if export_lines:
    prompt += "\n\n## Available Component Exports\n" + "\n".join(export_lines)
```
```python
re.compile(r"\./?(src/[^\s:>]+\.[a-z]{2,4})"),
re.compile(r"\./?(src/[^\s:>]+\.[a-z]{2,4})"),
```

This list contains a duplicate regular expression pattern for extracting file paths. Line 43 is identical to line 42. Removing the duplicate will make the code cleaner and prevent potential maintenance issues.

```python
re.compile(r"\./?(src/[^\s:>]+\.[a-z]{2,4})"),
```
```python
for route in routes:
    path = str(route.get("path") or "")
    method = str(route.get("method") or "GET")
    if path.startswith("/") and path.startswith("/api"):
```

(The `path.startswith("/")` check here is the redundant conditional mentioned in the review summary: any path that starts with `/api` necessarily starts with `/`.)
```python
async def _generate_tier_parallel(
    specs: list,
    context: dict,
    code_store: dict,
    warnings: list,
    file_type_filter: set[str],
    is_frontend: bool,
) -> None:
    model_key = "code_gen_frontend" if is_frontend else "code_gen_backend"
    model = MODEL_CONFIG.get(model_key, MODEL_CONFIG["code_gen"])
    semaphore = asyncio.Semaphore(_MAX_PARALLEL_LLM)

    async def _generate_one(spec) -> tuple[str, dict[str, str]]:
        async with semaphore:
            if _use_llm_per_file_generation() and spec.file_type in file_type_filter:
                if not llm_credentials_available(model):
                    warnings.append(f"per_file_llm_unavailable:{model}")
                    return spec.path, _generate_file_from_spec(spec, context)
                try:
                    content = await _generate_file_with_llm(spec, context)
                    route = llm_auth_route_for_model(model) or "unknown"
                    target = "frontend" if is_frontend else "backend"
                    warnings.append(f"per_file_{target}_llm_used:{model}:{route}")
                    return spec.path, {spec.path: content}
                except Exception as exc:
                    target = "frontend" if is_frontend else "backend"
                    logger.warning("[PER_FILE_LLM] %s fallback for %s: %s", target, spec.path, str(exc)[:200])
                    warnings.append(f"per_file_{target}_llm_fallback:{spec.path}")
                    return spec.path, _generate_file_from_spec(spec, context)
            else:
                try:
                    return spec.path, _generate_file_from_spec(spec, context)
                except Exception:
                    return spec.path, _generate_file_from_spec(spec, context)

    results = await asyncio.gather(*[_generate_one(spec) for spec in specs])
    for _, generated in results:
        code_store.update(generated)
        context["already_generated"].update(generated)
```

(The trailing `try/except` that falls back to the same call is the dead code flagged in the review summary.)
Revert /zero-prompt/start to emit a short SSE stream again so route tests and consumers receive zp.session.start consistently. Update the web client to accept both SSE and JSON payloads when starting a session, and skip launching the ZP background pipeline during test-mode requests to keep route tests deterministic.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
agent/llm.py (1)
435-440: ⚠️ Potential issue | 🟠 Major

The max_retries parameter is missing on the registry path.

The direct OpenAI/DO paths (llm.py 452, 465) pass max_retries=effective_retries, but the registry path (llm.py 435) only passes timeout. Even though effective_retries is computed, it is not included in the registry.get_llm call. Neither openai_adapter nor anthropic_adapter accepts max_retries via **kwargs, so when a call routes through the registry, the retry policy is not applied and behavior differs per model.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@agent/llm.py` around lines 435 - 440, The registry path is missing the max_retries parameter so retry policies aren't applied; update the registry.get_llm call to pass max_retries=effective_retries (i.e., change registry.get_llm(canonical, temperature=temperature, max_tokens=effective_max_tokens, timeout=effective_timeout) to include max_retries=effective_retries) so that effective_retries computed earlier is honored for registry-routed LLMs; this aligns behavior with the direct OpenAI/DO branches and ensures openai_adapter and anthropic_adapter receive the retry setting via kwargs.

agent/server.py (1)
1248-1286:⚠️ Potential issue | 🔴 Critical
Changing the /zero-prompt/start response to SSE breaks session startup in the current web client.

web/src/hooks/use-zero-prompt.ts:183-230 passes the startSession() result to normalizeSession() as a JSON session object, and lines 20-56 of the same file assume an object with session_id/cards/status. Returning a StreamingResponse here breaks session initialization, and since this stream closes right after the two start events, it is no substitute for /zero-prompt/events. It is safer to keep the existing JSON payload as the default response and, if streaming is needed, split it out into a separate SSE endpoint (one already exists) or an Accept: text/event-stream branch.

💡 Minimal fix

```diff
 @app.post("/api/zero-prompt/start")
 @app.post("/zero-prompt/start")
 async def zero_prompt_start(request: ZPStartRequest):
     orch = _get_zp_orchestrator()
     session, _start_event = orch.create_session(goal=request.goal)
     session_id = session.session_id
     goal = request.goal or 5
     if not _test_api_enabled():
         asyncio.create_task(_run_zp_pipeline(orch, session_id, goal))
     push_zp_event(
         {"type": "zp.session.start", "session_id": session_id, "goal_go_cards": goal, "session_status": session.status}
     )
     push_zp_event({"type": "zp.pipeline.started", "session_id": session_id, "goal": goal})
-    async def event_stream() -> AsyncGenerator[str, None]:
-        yield _sse(
-            "zp.session.start",
-            {
-                "type": "zp.session.start",
-                "session_id": session_id,
-                "goal_go_cards": goal,
-                "session_status": session.status,
-            },
-        )
-        yield _sse(
-            "zp.pipeline.started",
-            {"type": "zp.pipeline.started", "session_id": session_id, "goal": goal},
-        )
-
-    return StreamingResponse(
-        event_stream(),
-        media_type="text/event-stream",
-        headers={
-            "Cache-Control": "no-cache",
-            "Connection": "keep-alive",
-            "X-Accel-Buffering": "no",
-        },
-    )
+    return session.model_dump()
```
🧹 Nitpick comments (1)
web/src/lib/zero-prompt-api.ts (1)
7-7: Missing error handling and validation for JSON parsing

If the payload starts with "{" but is not valid JSON, JSON.parse throws an unclear error. The parsed object is also never validated for the required fields (session_id, status, cards).

🛡️ Suggested error handling and validation

```diff
-  if (trimmed.startsWith("{")) return JSON.parse(trimmed) as ZPSession;
+  if (trimmed.startsWith("{")) {
+    try {
+      const parsed = JSON.parse(trimmed);
+      if (!parsed.session_id || !parsed.status) {
+        throw new Error("Invalid session response: missing required fields");
+      }
+      return parsed as ZPSession;
+    } catch (e) {
+      throw new Error(`Failed to parse session JSON: ${e instanceof Error ? e.message : String(e)}`);
+    }
+  }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/src/lib/zero-prompt-api.ts` at line 7, The current quick-parse branch "if (trimmed.startsWith("{")) return JSON.parse(trimmed) as ZPSession;" can throw on invalid JSON and doesn't validate required fields; wrap the parse in a try/catch to catch JSON.parse errors and return/throw a controlled error, then validate the parsed object (the ZPSession) contains the required keys (session_id, status, cards) with expected types/structure before returning it; reference the local variable trimmed, the ZPSession type and JSON.parse when adding the try/catch and add explicit checks for session_id, status and cards, returning a clear error or fallback when validation fails.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@agent/nodes/build_validator.py`:
- Around line 395-399: The new top-level build state keys (build_errors_full,
build_failing_files, build_frontend_only_failure) are only being set on failure
paths, leaving stale values on success/skip paths; update the state emission in
the function that prepares the build-state fragment (the block that currently
sets "build_errors_full": combined_stderr[:3000], "build_failing_files":
failing_paths, "build_frontend_only_failure": frontend_only_failure, and
"build_attempt_count") to explicitly overwrite those keys with empty/falsey
values on non-failure branches (e.g., set build_errors_full="",
build_failing_files=[], build_frontend_only_failure=False) so reducer-friendly
merges won't retain old data; keep build_attempt_count logic as-is and ensure
the same fragment shape is returned for all outcomes so merge_dicts/reducer
merges behave predictably.
In `@agent/nodes/contract_validator.py`:
- Around line 61-76: The current apply_router_prefixes function incorrectly
special-cases paths starting with "/api"; instead, remove the hardcoded "/api"
check and treat a route as already prefixed if its path starts with any of the
provided prefixes (normalize prefixes to begin with "/" and compare against path
prefixes), so that routes already containing one of the configured prefixes are
left unchanged; ensure prefix normalization (leading slash, trim trailing slash
for comparisons) and keep the rest of the expansion logic (building full_path
from prefix + path) and the final return behavior (expanded if non-empty,
otherwise original routes).
In `@agent/nodes/local_runtime_validator.py`:
- Around line 103-114: The POST validator is sending a fixed payload and not
honoring OpenAPI path parameters or requestBody schema, causing false failures;
update the loop that iterates over paths (the block using `for endpoint, methods
in list(paths.items())[:3]:`) to: detect and substitute any path parameters in
`endpoint` with dummy safe values (e.g., "test" or "1"); inspect the OpenAPI
operation object for "requestBody" and the schema for "application/json" (or
form data) and construct a minimal valid payload matching required properties
instead of the fixed `{"query":"test","preferences":"test"}` (and keep the
special-case for "insight" only if that matches the schema); then call
`_http_json` with the constructed URL and payload and keep appending failures to
`errors` as before (`post_ok, post_detail = await asyncio.to_thread(_http_json,
...)`) so the validator skips or correctly tests endpoints with path params and
proper request schemas.
In `@agent/nodes/per_file_code_generator.py`:
- Around line 298-320: The code inside the target == "frontend" &&
_is_page_file(spec.path) block builds import hints in export_lines from
available_exports but never injects them into the prompt/context, so the import
constraint is not applied; after building export_lines (use the
available_exports, defaults, named, props_map logic already present), join them
into a single string (with a brief header comment) and append or merge that
string into the same prompt/context payload used later to generate the page (the
variable that holds the prompt or the context passed to the generator), ensuring
the import hints are included when _is_page_file(spec.path) is true.
- Around line 1195-1198: The _is_page_file function misclassifies files because
it checks for substrings; change it to inspect only the final filename (use
os.path.basename or Path(path).name) and return True only when the basename
exactly equals "page.tsx" or "page.ts" (after normalizing slashes), so files
like "homepage.tsx" won't be treated as page files; update the _is_page_file
implementation to use the basename comparison accordingly.
In `@agent/pipeline_runtime.py`:
- Around line 203-205: The current fallback uses "if not score" which treats 0
as missing; change the logic around the score variable so you only fallback when
final_score is absent or None (not when it is 0). Retrieve score from scoring
via scoring.get("final_score") and then replace the "if not score" check with a
strict None/type check (e.g., "if score is None and
isinstance(code_eval_result.get('match_rate'), (int, float))") before assigning
the match_rate; update references to scoring, score, and code_eval_result
accordingly.
- Around line 199-201: The current logic unconditionally sets verdict = "GO"
whenever pipeline_succeeded is true, which can override a failing/hard-gate
decision; change it to first map the raw decision (use
verdict_map.get(decision_raw, "NO-GO") into a local variable like
mapped_verdict) and only set verdict = "GO" if pipeline_succeeded is true AND
mapped_verdict == "GO" (otherwise keep mapped_verdict). Update the code around
verdict_map, decision_raw and pipeline_succeeded so hard-gate or scoring
failures (scoring.decision / decision_raw) cannot be overridden by
pipeline_succeeded.
In `@web/src/lib/zero-prompt-api.ts`:
- Around line 9-22: Wrap the JSON.parse call inside the loop (the invocation
using JSON.parse(line.slice(6))) with a try-catch so malformed SSE lines are
skipped/logged instead of throwing; continue the loop on parse failure. Also
update the ZPSession interface to include goal_go_cards: number (and remove
build_queue and active_build from the returned object or add them to the
interface only if they will be used) so the returned object shape matches the
ZPSession type used elsewhere (see use-zero-prompt.ts for consumers).
---
Outside diff comments:
In `@agent/llm.py`:
- Around line 435-440: The registry path is missing the max_retries parameter so
retry policies aren't applied; update the registry.get_llm call to pass
max_retries=effective_retries (i.e., change registry.get_llm(canonical,
temperature=temperature, max_tokens=effective_max_tokens,
timeout=effective_timeout) to include max_retries=effective_retries) so that
effective_retries computed earlier is honored for registry-routed LLMs; this
aligns behavior with the direct OpenAI/DO branches and ensures openai_adapter
and anthropic_adapter receive the retry setting via kwargs.
---
Nitpick comments:
In `@web/src/lib/zero-prompt-api.ts`:
- Line 7: The current quick-parse branch "if (trimmed.startsWith("{")) return
JSON.parse(trimmed) as ZPSession;" can throw on invalid JSON and doesn't
validate required fields; wrap the parse in a try/catch to catch JSON.parse
errors and return/throw a controlled error, then validate the parsed object (the
ZPSession) contains the required keys (session_id, status, cards) with expected
types/structure before returning it; reference the local variable trimmed, the
ZPSession type and JSON.parse when adding the try/catch and add explicit checks
for session_id, status and cards, returning a clear error or fallback when
validation fails.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 83fab7fd-9b16-43d1-b046-b2d68433d8c1
📒 Files selected for processing (13)
agent/graph.py, agent/llm.py, agent/nodes/build_validator.py, agent/nodes/code_evaluator.py, agent/nodes/contract_validator.py, agent/nodes/local_runtime_validator.py, agent/nodes/per_file_code_generator.py, agent/pipeline_runtime.py, agent/server.py, agent/sse.py, agent/state.py, agent/tests/test_code_evaluator.py, web/src/lib/zero-prompt-api.ts
```python
def _extract_failing_file_paths(error_text: str) -> list[str]:
    paths: list[str] = []
    patterns = [
        re.compile(r"\./?(src/[^\s:>]+\.[a-z]{2,4})"),
        re.compile(r"\./?(src/[^\s:>]+\.[a-z]{2,4})"),
        re.compile(r"Module not found.*['\"](@/[^'\"]+)['\"]"),
        re.compile(r"Export\s+\w+\s+doesn't exist.*['\"](@/[^'\"]+)['\"]"),
    ]
    for pattern in patterns:
        for match in pattern.finditer(error_text):
            p = match.group(1)
            if p not in paths:
                paths.append(p)
    lowered = error_text.lower()
    if 'prerendering page "/"' in lowered or "src_app_page" in lowered:
        if "src/app/page.tsx" not in paths:
            paths.append("src/app/page.tsx")
    return paths[:5]
```
build_failing_files fails to hand common frontend-only failures to the repair path.

The @/… aliases extracted here are not normalized to repo paths (src/...), and the design-error check keeps only the basename, so failing_paths can end up empty or hold unmatchable values. Yet agent/nodes/per_file_code_generator.py:620-647 looks up specs from this list, and agent/graph.py:88-99 routes to frontend_file_repairer only when build_failing_files is non-empty. As a result, cases like Module not found '@/…' or design-only failures bypass targeted repair entirely. Alias→repo-path normalization and preserving the original filepath in the design check are needed.

Also applies to: 367-399
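The alias→repo-path normalization this comment asks for could look like the sketch below (it assumes the standard Next.js `@/` → `src/` tsconfig alias and a `.tsx` default extension, neither of which is confirmed by the PR):

```python
def normalize_failing_path(raw: str) -> str:
    """Map a '@/…' import alias to a repo-relative source path."""
    if raw.startswith("@/"):
        path = "src/" + raw[2:]
        # Build errors for aliased imports usually omit the extension.
        if "." not in path.rsplit("/", 1)[-1]:
            path += ".tsx"
        return path
    return raw
```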
```python
"build_errors_full": combined_stderr[:3000],
"build_repair_prompt": repair_prompt,
"build_attempt_count": state.get("build_attempt_count", 0) + 1,
"build_failing_files": failing_paths,
"build_frontend_only_failure": frontend_only_failure,
```
Writing the new top-level build state only on failure leaves stale values behind.

If build_errors_full, build_failing_files, and build_frontend_only_failure are only populated here, the previous values survive the merge on the next success/skip/other-failure run. Subsequent routing and the dashboard could then keep reading a stale frontend-only failure, so explicitly overwrite these keys with empty values on non-failure paths as well. As per coding guidelines: return state fragments keyed for reducer-friendly merges; graph state uses merge_dicts for council_analysis and scoring.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@agent/nodes/build_validator.py` around lines 395 - 399, The new top-level
build state keys (build_errors_full, build_failing_files,
build_frontend_only_failure) are only being set on failure paths, leaving stale
values on success/skip paths; update the state emission in the function that
prepares the build-state fragment (the block that currently sets
"build_errors_full": combined_stderr[:3000], "build_failing_files":
failing_paths, "build_frontend_only_failure": frontend_only_failure, and
"build_attempt_count") to explicitly overwrite those keys with empty/falsey
values on non-failure branches (e.g., set build_errors_full="",
build_failing_files=[], build_frontend_only_failure=False) so reducer-friendly
merges won't retain old data; keep build_attempt_count logic as-is and ensure
the same fragment shape is returned for all outcomes so merge_dicts/reducer
merges behave predictably.
```python
def apply_router_prefixes(routes: list[dict], prefixes: list[str]) -> list[dict]:
    if not prefixes:
        return routes
    expanded: list[dict] = []
    for route in routes:
        path = str(route.get("path") or "")
        method = str(route.get("method") or "GET")
        if path.startswith("/") and path.startswith("/api"):
            expanded.append({"method": method, "path": path})
            continue
        for prefix in prefixes:
            if not prefix.startswith("/"):
                prefix = "/" + prefix
            full_path = prefix.rstrip("/") + (path if path.startswith("/") else f"/{path}")
            expanded.append({"method": method, "path": full_path})
    return expanded or routes
```
The hardcoded /api exception makes the prefix handling fragile.

Because the prefix-bypass decision is tied to the fixed string /api, contracts that use a different prefix can end up with incorrectly matched endpoints.

🔧 Suggested fix
```diff
 def apply_router_prefixes(routes: list[dict], prefixes: list[str]) -> list[dict]:
     if not prefixes:
         return routes
+    normalized_prefixes = [p if p.startswith("/") else f"/{p}" for p in prefixes]
     expanded: list[dict] = []
     for route in routes:
         path = str(route.get("path") or "")
         method = str(route.get("method") or "GET")
-        if path.startswith("/") and path.startswith("/api"):
+        if any(path == p or path.startswith(f"{p.rstrip('/')}/") for p in normalized_prefixes):
             expanded.append({"method": method, "path": path})
             continue
-        for prefix in prefixes:
-            if not prefix.startswith("/"):
-                prefix = "/" + prefix
+        for prefix in normalized_prefixes:
             full_path = prefix.rstrip("/") + (path if path.startswith("/") else f"/{path}")
             expanded.append({"method": method, "path": full_path})
     return expanded or routes
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@agent/nodes/contract_validator.py` around lines 61 - 76, The current
apply_router_prefixes function incorrectly special-cases paths starting with
"/api"; instead, remove the hardcoded "/api" check and treat a route as already
prefixed if its path starts with any of the provided prefixes (normalize
prefixes to begin with "/" and compare against path prefixes), so that routes
already containing one of the configured prefixes are left unchanged; ensure
prefix normalization (leading slash, trim trailing slash for comparisons) and
keep the rest of the expansion logic (building full_path from prefix + path) and
the final return behavior (expanded if non-empty, otherwise original routes).
```python
for endpoint, methods in list(paths.items())[:3]:
    if not isinstance(methods, dict):
        continue
    if "post" in methods:
        payload = {"query": "test", "preferences": "test"}
        if "insight" in endpoint.lower():
            payload = {"selection": "test", "context": "test"}
        post_ok, post_detail = await asyncio.to_thread(
            _http_json, f"http://127.0.0.1:{port}{endpoint}", payload
        )
        if not post_ok:
            errors.append(f"backend_endpoint_failed:{endpoint}:{post_detail}")
```
OpenAPI-based POST validation ignores path parameters and request schemas, so it can falsely fail valid APIs.
Because every POST currently receives the same fixed payload, endpoints whose URLs contain path parameters or whose request schemas differ can produce spurious backend_endpoint_failed errors.
🔧 Suggested fix
- for endpoint, methods in list(paths.items())[:3]:
+ for endpoint, methods in list(paths.items())[:3]:
+ # Endpoints with path parameters are likely to produce false positives when called without sample-value mapping
+ if "{" in endpoint or "}" in endpoint:
+ continue
if not isinstance(methods, dict):
continue
- if "post" in methods:
- payload = {"query": "test", "preferences": "test"}
- if "insight" in endpoint.lower():
- payload = {"selection": "test", "context": "test"}
+ post_op = methods.get("post")
+ if isinstance(post_op, dict):
+ # TODO: read the requestBody schema and build the payload from required fields
+ payload = {"query": "test", "preferences": "test"}
+ if "insight" in endpoint.lower():
+ payload = {"selection": "test", "context": "test"}
post_ok, post_detail = await asyncio.to_thread(
_http_json, f"http://127.0.0.1:{port}{endpoint}", payload
)
if not post_ok:
errors.append(f"backend_endpoint_failed:{endpoint}:{post_detail}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@agent/nodes/local_runtime_validator.py` around lines 103 - 114, The POST
validator is sending a fixed payload and not honoring OpenAPI path parameters or
requestBody schema, causing false failures; update the loop that iterates over
paths (the block using `for endpoint, methods in list(paths.items())[:3]:`) to:
detect and substitute any path parameters in `endpoint` with dummy safe values
(e.g., "test" or "1"); inspect the OpenAPI operation object for "requestBody"
and the schema for "application/json" (or form data) and construct a minimal
valid payload matching required properties instead of the fixed
`{"query":"test","preferences":"test"}` (and keep the special-case for "insight"
only if that matches the schema); then call `_http_json` with the constructed
URL and payload and keep appending failures to `errors` as before (`post_ok,
post_detail = await asyncio.to_thread(_http_json, ...)`) so the validator skips
or correctly tests endpoints with path params and proper request schemas.
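The payload-construction step the prompt describes could look like this minimal sketch (a hypothetical helper; the dummy defaults per type are assumptions, not the project's actual code):

```python
def build_minimal_payload(operation: dict) -> dict:
    """Build a minimal JSON payload from an OpenAPI operation's requestBody schema."""
    defaults = {"string": "test", "integer": 1, "number": 1.0,
                "boolean": True, "array": [], "object": {}}
    schema = (
        operation.get("requestBody", {})
        .get("content", {})
        .get("application/json", {})
        .get("schema", {})
    )
    props = schema.get("properties", {})
    # Fall back to all declared properties when "required" is absent.
    required = schema.get("required", list(props))
    # Fill only required fields with type-appropriate dummy values.
    return {name: defaults.get(props.get(name, {}).get("type", "string"), "test")
            for name in required}
```

An operation without a JSON requestBody yields an empty payload, which is also a reasonable signal to skip the POST probe entirely.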
if target == "frontend" and _is_page_file(spec.path):
    available_exports = context.get("available_exports") or {}
    if available_exports:
        export_lines = []
        for file_path, info in sorted(available_exports.items()):
            module = (
                "@/" + file_path.replace("src/", "", 1).rsplit(".", 1)[0]
                if file_path.startswith("src/")
                else file_path.rsplit(".", 1)[0]
            )
            defaults = info.get("default") or []
            named = info.get("named") or []
            props_map = info.get("props") or {}
            if defaults:
                sig = props_map.get(defaults[0], "")
                props_note = f" // props: {sig}" if sig else " // default export"
                export_lines.append(f' import {defaults[0]} from "{module}";{props_note}')
            if named:
                for n in named:
                    sig = props_map.get(n, "")
                    props_note = f" // props: {sig}" if sig else ""
                    export_lines.append(f' import {{ {n} }} from "{module}";{props_note}')
available_exports is collected but never injected into the prompt, so the import constraint never actually takes effect.
export_lines is built but not appended to the prompt, so the intended guard against broken imports during page generation is not applied.
🔧 Suggested fix
if target == "frontend" and _is_page_file(spec.path):
available_exports = context.get("available_exports") or {}
if available_exports:
export_lines = []
for file_path, info in sorted(available_exports.items()):
module = (
"@/" + file_path.replace("src/", "", 1).rsplit(".", 1)[0]
if file_path.startswith("src/")
else file_path.rsplit(".", 1)[0]
)
defaults = info.get("default") or []
named = info.get("named") or []
props_map = info.get("props") or {}
if defaults:
sig = props_map.get(defaults[0], "")
props_note = f" // props: {sig}" if sig else " // default export"
export_lines.append(f' import {defaults[0]} from "{module}";{props_note}')
if named:
for n in named:
sig = props_map.get(n, "")
props_note = f" // props: {sig}" if sig else ""
export_lines.append(f' import {{ {n} }} from "{module}";{props_note}')
+ if export_lines:
+        prompt = f"{prompt}\n\n## Available Component Exports\n" + "\n".join(export_lines)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@agent/nodes/per_file_code_generator.py` around lines 298 - 320, The code
inside the target == "frontend" && _is_page_file(spec.path) block builds import
hints in export_lines from available_exports but never injects them into the
prompt/context, so the import constraint is not applied; after building
export_lines (use the available_exports, defaults, named, props_map logic
already present), join them into a single string (with a brief header comment)
and append or merge that string into the same prompt/context payload used later
to generate the page (the variable that holds the prompt or the context passed
to the generator), ensuring the import hints are included when
_is_page_file(spec.path) is true.
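As a self-contained illustration of the hint-building step, here is a sketch that assumes the available_exports shape shown above (a hypothetical helper, not the project's exact code):

```python
def build_import_hints(available_exports: dict) -> str:
    """Render exact import statements for components the page is allowed to use."""
    lines = []
    for file_path, info in sorted(available_exports.items()):
        # Map "src/components/Card.tsx" -> "@/components/Card".
        module = ("@/" + file_path.replace("src/", "", 1).rsplit(".", 1)[0]
                  if file_path.startswith("src/") else file_path.rsplit(".", 1)[0])
        for name in info.get("default") or []:
            lines.append(f'import {name} from "{module}";')
        for name in info.get("named") or []:
            lines.append(f'import {{ {name} }} from "{module}";')
    if not lines:
        return ""
    # The caller appends this section to the LLM prompt so the constraint is applied.
    return "## Available Component Exports\n" + "\n".join(lines)
```

Returning an empty string when there are no exports lets the caller append the section unconditionally without polluting the prompt.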
def _is_page_file(path: str) -> bool:
    normalized = path.replace("\\", "/")
    return "page.tsx" in normalized or "page.ts" in normalized
Page-file detection is substring-based and yields false positives.
The current logic can classify a file like homepage.tsx as a page, so the wrong generation/repair path may be chosen.
🔧 Suggested fix
def _is_page_file(path: str) -> bool:
- normalized = path.replace("\\", "/")
- return "page.tsx" in normalized or "page.ts" in normalized
+ normalized = path.replace("\\", "/")
+ file_name = Path(normalized).name.lower()
+    return file_name in {"page.tsx", "page.ts"}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@agent/nodes/per_file_code_generator.py` around lines 1195 - 1198, The
_is_page_file function misclassifies files because it checks for substrings;
change it to inspect only the final filename (use os.path.basename or
Path(path).name) and return True only when the basename exactly equals
"page.tsx" or "page.ts" (after normalizing slashes), so files like
"homepage.tsx" won't be treated as page files; update the _is_page_file
implementation to use the basename comparison accordingly.
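The basename-based check reads, as a runnable sketch mirroring the suggested fix above:

```python
from pathlib import Path

def is_page_file(path: str) -> bool:
    """True only when the file itself is named page.tsx or page.ts."""
    # Normalize Windows separators, then compare only the final filename.
    file_name = Path(path.replace("\\", "/")).name.lower()
    return file_name in {"page.tsx", "page.ts"}
```

Comparing the basename instead of a substring is what keeps `homepage.tsx` out of the page-generation and repair paths.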
verdict = verdict_map.get(decision_raw, "NO-GO")
if pipeline_succeeded:
    verdict = "GO"
The final verdict can remain GO even when a hard gate fails.
Currently only the success case is overwritten to GO, and the failure case uses scoring.decision as-is. Because of this, the session verdict can end up GO even though a gate failed.
🔧 Suggested fix
- verdict = verdict_map.get(decision_raw, "NO-GO")
- if pipeline_succeeded:
- verdict = "GO"
+ verdict = "GO" if pipeline_succeeded else "NO-GO"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@agent/pipeline_runtime.py` around lines 199 - 201, The current logic
unconditionally sets verdict = "GO" whenever pipeline_succeeded is true, which
can override a failing/hard-gate decision; change it to first map the raw
decision (use verdict_map.get(decision_raw, "NO-GO") into a local variable like
mapped_verdict) and only set verdict = "GO" if pipeline_succeeded is true AND
mapped_verdict == "GO" (otherwise keep mapped_verdict). Update the code around
verdict_map, decision_raw and pipeline_succeeded so hard-gate or scoring
failures (scoring.decision / decision_raw) cannot be overridden by
pipeline_succeeded.
score = scoring.get("final_score", 0)
if not score and isinstance(code_eval_result.get("match_rate"), (int, float)):
    score = code_eval_result.get("match_rate", 0)
The score fallback condition can overwrite a valid 0.
`if not score` treats final_score == 0 as missing. It is safer to branch on whether the value exists and has a numeric type.
🔧 Suggested fix
- score = scoring.get("final_score", 0)
- if not score and isinstance(code_eval_result.get("match_rate"), (int, float)):
- score = code_eval_result.get("match_rate", 0)
+ score = scoring.get("final_score")
+ if not isinstance(score, (int, float)):
+    score = code_eval_result.get("match_rate", 0) if isinstance(code_eval_result.get("match_rate"), (int, float)) else 0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@agent/pipeline_runtime.py` around lines 203 - 205, The current fallback uses
"if not score" which treats 0 as missing; change the logic around the score
variable so you only fallback when final_score is absent or None (not when it is
0). Retrieve score from scoring via scoring.get("final_score") and then replace
the "if not score" check with a strict None/type check (e.g., "if score is None
and isinstance(code_eval_result.get('match_rate'), (int, float))") before
assigning the match_rate; update references to scoring, score, and
code_eval_result accordingly.
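A tolerant fallback along the lines the prompt suggests can be written as a small helper (a sketch; the dict shapes are assumed from the snippet above):

```python
def resolve_score(scoring: dict, code_eval_result: dict) -> float:
    """Prefer final_score when it is a real number (0 included); else fall back to match_rate."""
    score = scoring.get("final_score")
    if isinstance(score, (int, float)):
        return score
    match_rate = code_eval_result.get("match_rate")
    # Only use match_rate when it is numeric; otherwise default to 0.
    return match_rate if isinstance(match_rate, (int, float)) else 0
```

The key difference from the original is that a legitimate `final_score` of 0 is returned as-is rather than being treated as missing.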
for (const line of trimmed.split(/\r?\n/)) {
  if (!line.startsWith("data: ")) continue;
  const payload = JSON.parse(line.slice(6));
  if (payload.type === "zp.session.start") {
    return {
      session_id: String(payload.session_id),
      status: String(payload.session_status || "exploring"),
      goal_go_cards: Number(payload.goal_go_cards || 0),
      cards: [],
      build_queue: [],
      active_build: null,
    } as ZPSession;
  }
}
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Inspect the ZPSession interface definition
echo "=== ZPSession TypeScript interface ==="
cat web/src/types/zero-prompt.ts | head -50
echo ""
echo "=== Usage of goal_go_cards, build_queue, active_build in codebase ==="
rg -n "goal_go_cards|build_queue|active_build" web/src/ --type ts

Repository: Two-Weeks-Team/vibeDeploy
Length of output: 2739
Missing JSON parse error handling and a ZPSession interface type mismatch
There are two real issues:
- JSON.parse(line.slice(6)) on Line 11 raises an unhandled exception on malformed JSON.
- The returned object's goal_go_cards, build_queue, and active_build fields are not defined on the ZPSession interface. However, goal_go_cards is actually used in use-zero-prompt.ts, so the interface should be updated.
♻️ Suggested improvements
- Add a try-catch around JSON parsing:
- const payload = JSON.parse(line.slice(6));
+ let payload;
+ try {
+ payload = JSON.parse(line.slice(6));
+ } catch {
+ continue;
+ }
- Update the ZPSession interface to add the goal_go_cards field (reflecting actual usage):
export interface ZPSession {
session_id: string;
status: "exploring" | "paused" | "completed";
goal_go_cards: number;
cards: ZPCard[];
}
build_queue and active_build are currently unused, so either remove them or add them to the interface later if needed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@web/src/lib/zero-prompt-api.ts` around lines 9 - 22, Wrap the JSON.parse call
inside the loop (the invocation using JSON.parse(line.slice(6))) with a
try-catch so malformed SSE lines are skipped/logged instead of throwing;
continue the loop on parse failure. Also update the ZPSession interface to
include goal_go_cards: number (and remove build_queue and active_build from the
returned object or add them to the interface only if they will be used) so the
returned object shape matches the ZPSession type used elsewhere (see
use-zero-prompt.ts for consumers).
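The skip-on-parse-failure behavior generalizes to any SSE consumer; here is a minimal sketch (written in Python for illustration, since the actual client is TypeScript; names are hypothetical):

```python
import json

def parse_sse_events(raw: str) -> list[dict]:
    """Extract JSON payloads from 'data: ' lines, skipping malformed ones."""
    events = []
    for line in raw.splitlines():
        if not line.startswith("data: "):
            continue
        try:
            events.append(json.loads(line[6:]))
        except json.JSONDecodeError:
            # Malformed SSE line: skip instead of aborting the whole stream.
            continue
    return events
```

One corrupted event then costs a single dropped message rather than an unhandled exception in the stream reader.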
Keep /zero-prompt/start on JSON session bootstrap for the current merge-ref contract and update route tests to match the actual API shape used in CI. Retain the test-mode guard so zero-prompt route tests stay deterministic and do not launch the background exploration pipeline. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
agent/server.py (1)
1256-1261: ⚠️ Potential issue | 🟡 Minor
In test mode, the zp.pipeline.started event should be emitted conditionally for consistency. It is currently published even when pipeline execution is skipped. In test mode it is clearer to omit the event (or emit a separate skipped event).
🔧 Suggested patch
- if not _test_api_enabled():
-     asyncio.create_task(_run_zp_pipeline(orch, session_id, goal))
+ if not _test_api_enabled():
+     asyncio.create_task(_run_zp_pipeline(orch, session_id, goal))
+     push_zp_event({"type": "zp.pipeline.started", "session_id": session_id, "goal": goal})
  push_zp_event(
      {"type": "zp.session.start", "session_id": session_id, "goal_go_cards": goal, "session_status": session.status}
  )
- push_zp_event({"type": "zp.pipeline.started", "session_id": session_id, "goal": goal})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@agent/server.py` around lines 1256 - 1261, The code currently always pushes the "zp.pipeline.started" event even when pipeline execution is skipped in test mode; update the logic around _test_api_enabled(), _run_zp_pipeline, and push_zp_event so that when _test_api_enabled() is true you either omit emitting "zp.pipeline.started" or emit a distinct "zp.pipeline.skipped" event instead; specifically, move or conditionally call push_zp_event({"type": "zp.pipeline.started", ...}) only in the branch where you create the asyncio task (the path that invokes _run_zp_pipeline), or add an else that pushes {"type":"zp.pipeline.skipped", "session_id": session_id, "goal": goal} to make the state explicit.
🧹 Nitpick comments (3)
agent/tests/test_zp_routes.py (1)
18-20: Validate the response shape for the /api path too, to catch route-parity regressions.
Currently only session_id is checked, which can miss a response-schema mismatch between /zero-prompt/start and /api/zero-prompt/start.
🔧 Suggested patch
  async def test_zp_start_api_prefix(app_client):
      resp = await app_client.post("/api/zero-prompt/start", json={})
      assert resp.status_code == 200
      body = resp.json()
      assert body["session_id"]
+     assert body["goal_go_cards"] == 5
+     assert body["status"] == "exploring"
+     assert body["build_queue"] == []
+     assert body["active_build"] is None

As per coding guidelines: Keep route parity between /api/... and local bare paths when changing request/response shapes.
/api/...and local bare paths when changing request/response shapes.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@agent/tests/test_zp_routes.py` around lines 18 - 20, the test only asserts body["session_id"] for one endpoint, which misses schema differences between /zero-prompt/start and /api/zero-prompt/start; update agent/tests/test_zp_routes.py to call both endpoints (e.g., '/zero-prompt/start' and '/api/zero-prompt/start'), parse each resp.json() into variables (e.g., body and api_body) and assert full shape parity — either compare key sets and required fields (including "session_id") or assert body == api_body — so the test fails on any response-schema regression between the two routes.
agent/server.py (2)
326-330: Local deployment URL detection can miss some fields.
Currently only localUrl/local_url are checked, but depending on the stored format only local_frontend_url/local_backend_url may be populated, so local entries can be dropped.
🔧 Suggested patch
- local_url = str(deployment.get("localUrl") or deployment.get("local_url") or "").strip()
+ local_url = str(
+     deployment.get("localUrl")
+     or deployment.get("local_url")
+     or deployment.get("local_frontend_url")
+     or deployment.get("local_backend_url")
+     or ""
+ ).strip()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@agent/server.py` around lines 326 - 330, The local deployment detection only checks deployment.get("localUrl") and deployment.get("local_url"), which misses cases where only deployment keys like "local_frontend_url" or "local_backend_url" are set; update the logic around the deployment, status, and local_url variables in server.py (the block that defines deployment = dict(...), status = ..., local_url = ...) to consider any of these keys as indicating a local URL (e.g., treat local_frontend_url or local_backend_url as valid local_url values) before deciding to append meeting to reconciled so meetings with those fields are not missed.
1269-1270: Using the session object directly instead of hardcoding response fields reduces drift.
create_session's defaults and the response values could diverge in the future, so it is safer to return the actual session values.
🔧 Suggested patch
- "build_queue": [],
- "active_build": None,
+ "build_queue": list(session.build_queue),
+ "active_build": session.active_build,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@agent/server.py` around lines 1269 - 1270, do not hardcode values like "build_queue": [] and "active_build": None in the response; return the actual session object created by create_session instead. Find the response-building block in the server code (see create_session and the session variable), serialize the session (e.g., session.to_dict() or pydantic .dict()/JSON serialization) so the current session state is included in the response as-is, and filter sensitive fields where needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@agent/tests/test_zp_routes.py`:
- Around line 53-57: The test currently allows resp.status_code == 200 which is
too permissive; update the assertions in the test for queue_build to require a
422 response and validate the error payload shape: assert resp.status_code ==
422 and then parse resp.json() and assert that body["type"] == "zp.action.error"
and that body["detail"] is one of the expected error identifiers (e.g.,
"session_not_found", "card_not_found", "card_not_go_ready"); ensure you remove
the {200, 422} set check and the conditional branch so the test fails on
unexpected 200 responses.
---
Outside diff comments:
In `@agent/server.py`:
- Around line 1256-1261: The code currently always pushes the
"zp.pipeline.started" event even when pipeline execution is skipped in test
mode; update the logic around _test_api_enabled(), _run_zp_pipeline, and
push_zp_event so that when _test_api_enabled() is true you either omit emitting
"zp.pipeline.started" or emit a distinct "zp.pipeline.skipped" event instead;
specifically, move or conditionally call push_zp_event({"type":
"zp.pipeline.started", ...}) only in the branch where you create the asyncio
task (the path that invokes _run_zp_pipeline), or add an else that pushes
{"type":"zp.pipeline.skipped", "session_id": session_id, "goal": goal} to make
the state explicit.
---
Nitpick comments:
In `@agent/server.py`:
- Around line 326-330: The local deployment detection only checks
deployment.get("localUrl") and deployment.get("local_url"), which misses cases
where only deployment keys like "local_frontend_url" or "local_backend_url" are
set; update the logic around the deployment, status, and local_url variables in
server.py (the block that defines deployment = dict(...), status = ...,
local_url = ...) to consider any of these keys as indicating a local URL (e.g.,
treat local_frontend_url or local_backend_url as valid local_url values) before
deciding to append meeting to reconciled so meetings with those fields are not
missed.
- Around line 1269-1270: Do not hardcode values like "build_queue": [] and
"active_build": None in the response; change it to return the actual session
object created by create_session. Find the response-building block in the
server code (see create_session and the session variable), serialize the
session (e.g., session.to_dict() or pydantic .dict()/JSON serialization) so
the current session state is included in the response as-is, and filter
sensitive fields where needed.
In `@agent/tests/test_zp_routes.py`:
- Around line 18-20: The test only asserts body["session_id"] for one endpoint,
which misses schema differences between /zero-prompt/start and
/api/zero-prompt/start; update agent/tests/test_zp_routes.py to call both
endpoints (e.g., '/zero-prompt/start' and '/api/zero-prompt/start'), parse each
resp.json() into variables (e.g., body and api_body) and assert full shape
parity — either compare key sets and required fields (including "session_id") or
assert body == api_body — so the test fails on any response-schema regression
between the two routes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d6da19de-7a3e-46e4-926b-46047a3555c3
📒 Files selected for processing (2)
- agent/server.py
- agent/tests/test_zp_routes.py
assert resp.status_code in {200, 422}
if resp.status_code == 200:
    body = resp.json()
    assert body["type"] == "zp.action.error"
    assert body["error"] in {"session_not_found", "card_not_found", "card_not_go_ready"}
Allowing 200 in the error case makes the test too loose.
In queue_build, the server converts zp.action.error into a 422 (server handler lines 1367-1368), so allowing 200 hides contract regressions. This case should pin the 422 status and the detail value explicitly.
🔧 Suggested patch
- assert resp.status_code in {200, 422}
- if resp.status_code == 200:
- body = resp.json()
- assert body["type"] == "zp.action.error"
- assert body["error"] in {"session_not_found", "card_not_found", "card_not_go_ready"}
+ assert resp.status_code == 422
+ body = resp.json()
+ assert body["detail"] == "card_not_found"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@agent/tests/test_zp_routes.py` around lines 53 - 57, The test currently
allows resp.status_code == 200 which is too permissive; update the assertions in
the test for queue_build to require a 422 response and validate the error
payload shape: assert resp.status_code == 422 and then parse resp.json() and
assert that body["type"] == "zp.action.error" and that body["detail"] is one of
the expected error identifiers (e.g., "session_not_found", "card_not_found",
"card_not_go_ready"); ensure you remove the {200, 422} set check and the
conditional branch so the test fails on unexpected 200 responses.
Summary
- GO results instead of stale NO-GO verdicts
- page.tsx plus targeted repair paths
Verification
- cd agent && python -m pytest tests/test_staged_pipeline_nodes.py tests/test_contract_validator.py tests/test_build_error_feedback.py tests/test_code_evaluator.py -q --tb=short
- cd agent && python -m pytest tests/test_runtime_config.py tests/test_store.py -q --tb=short
- CONTRACT 2/2 -> CODE_EVAL 92.4 -> BUILD PASS -> RUNTIME PASS -> DEPLOY_GATE PASS -> SESSION verdict=GO
- backend/api/plan 200, backend/api/insights 200
Notes
- docs/20260318/nutriplan-reliability-proposal.md kept out of this PR intentionally
Summary by CodeRabbit
Release notes
New features
Improvements
Tests