Sprint/sprint 16 debug intelligence upgrade #36
Merged
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
This PR introduces a "verdict-first" diagnostics layer for React-Sentinel's MCP tools: high-level investigation tools (diagnose_excess_renders, find_memo_breaks, diagnose_runtime_bug, attribute_render), runtime-evidence-based verification tools (verify_hypothesis, verify_fix and aliases), an adversarial-timing race-finder, a new managed-Chromium browser mode, and updated tool descriptions/capability catalog/docs that steer agents toward runtime tools over grep.
Changes:
- New diagnostic protocol (DiagnosticVerdict) plus verdict wrappers around existing render/async/race/hydration tools and new investigation.ts orchestrators.
- New tools: start_debug_replay, validate_user_flow, find_race_conditions, patch_and_validate, verify_hypothesis/test_runtime_hypothesis, verify_fix/verify_runtime_fix, plus diagnose/find/attribute investigation tools.
- New managed-Chromium browser mode (--browser-mode) with doctor reporting, expanded render-monitor cause attribution (prop_diff, context_change, provider_value_recreated, parent_render, …), capability/docs updates and tool-selection guide.
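As a rough illustration of the verdict-first shape, the sketch below models a DiagnosticVerdict with the fields the PR description lists (machine-readable verdict, summary, evidence, confidence, raw data). The confidence levels, the `HotspotRaw` type, the `toVerdict` helper, and the render threshold are all assumptions made for this example and are not taken from src/diagnostics/protocol.ts.

```typescript
// Hypothetical sketch: field names follow the PR description, exact types are assumed.
type DiagnosticConfidence = "low" | "medium" | "high"; // assumed levels

interface DiagnosticVerdict<TVerdict extends string, TRaw> {
  verdict: TVerdict; // machine-readable outcome, read first by the agent
  summary: string; // one-line human-readable explanation
  evidence: string[]; // runtime observations that support the verdict
  confidence: DiagnosticConfidence;
  rawData: TRaw; // original payload, preserved for follow-up inspection
}

// Hypothetical raw payload for a single render hotspot.
interface HotspotRaw {
  componentName: string;
  renderCount: number;
}

// Wraps a raw payload in a verdict; the threshold of 50 renders is an arbitrary assumption.
function toVerdict(
  raw: HotspotRaw,
  threshold = 50,
): DiagnosticVerdict<"render_loop" | "no_issue", HotspotRaw> {
  const looping = raw.renderCount > threshold;
  return {
    verdict: looping ? "render_loop" : "no_issue",
    summary: looping
      ? `${raw.componentName} rendered ${raw.renderCount} times.`
      : `${raw.componentName} looks stable.`,
    evidence: [`renderCount=${raw.renderCount}`, `threshold=${threshold}`],
    confidence: looping ? "high" : "medium",
    rawData: raw,
  };
}
```

A caller can branch on `verdict` alone and only open `rawData` when the summary is not enough, which is the ordering that "verdict-first" implies.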
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/capabilities.ts | Adds tool-selection guide, new investigation/verify capabilities and alias tools. |
| src/tools/diagnostics.ts | Wraps existing tools with verdict creators; adds diagnose_excess_renders, find_memo_breaks, attribute_render, diagnose_runtime_bug. |
| src/tools/patch.ts | Adds patch_and_validate, verify_hypothesis/alias, verify_fix/alias with shared verdict builders. |
| src/tools/interaction.ts | Adds validate_user_flow and find_race_conditions with adversarial timing + minimization. |
| src/tools/network.ts | Updates get_network_events description for verdict-first guidance. |
| src/tools/browser.ts | Adds start_debug_replay alias and refreshed descriptions. |
| src/diagnostics/protocol.ts | Adds DiagnosticVerdict/DiagnosticConfidence and renames render-cause types. |
| src/diagnostics/verdict.ts | New verdict wrappers for hotspots/async/race/hydration. |
| src/diagnostics/investigation.ts | New high-level diagnosis orchestrators. |
| src/diagnostics/render-monitor.ts | Captures parent/context info and reclassifies probable causes. |
| src/diagnostics/react-runtime.ts | Refactors component name resolution via shared getTypeDisplayName. |
| src/browser/protocol.ts | Adds managed session info and BrowserModePreference. |
| src/browser/index.ts | Implements managed Chromium launch/teardown and routing. |
| src/index.ts | Wires --browser-mode CLI flag and doctor reporting. |
| scripts/e2e-smoke.ts | Adds readVerdictRawData helper and loosens hotspot/async/race/hydration assertions. |
| docs/* | New tool-selection-guide.md, browser-modes.md and updated checklist. |
Comment on lines +286 to 293
| const infiniteLoopHotspot = renderHotspots.hotspots.find((entry) => entry.componentName === "InfiniteLoopScenario"); | ||
| assert(renderHotspots.hotspots.length >= 1, "get_render_hotspots returned no hotspots."); | ||
| assert( | ||
| renderHotspots.hotspots.some( | ||
| (entry) => | ||
| entry.componentName === "InfiniteLoopScenario" && | ||
| ["unstable_state", "unstable_hook_value", "unstable_props", "repeated_effect"].includes( | ||
| entry.probableCause.type | ||
| ) | ||
| ), | ||
| "get_render_hotspots did not flag InfiniteLoopScenario with a probable cause." | ||
| infiniteLoopHotspot | ||
| ? infiniteLoopHotspot.probableCause.summary.trim().length > 0 | ||
| : renderHotspots.hotspots.some((entry) => entry.probableCause.summary.trim().length > 0), | ||
| "get_render_hotspots did not return a readable probable cause." | ||
| ); |
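The diff above interleaves the old and new assertion, so the relaxed check can be hard to read. A minimal standalone sketch of the new behaviour, assuming hotspots carry only the fields visible in the diff (`componentName`, `probableCause.type`, `probableCause.summary`), might look like this. The helper name `hasReadableProbableCause` is hypothetical.

```typescript
// Types reduced to the fields that appear in the diff; anything else is omitted.
interface ProbableCause {
  type: string;
  summary: string;
}

interface Hotspot {
  componentName: string;
  probableCause: ProbableCause;
}

// Hypothetical helper mirroring the relaxed assertion: if the preferred component
// is present, its summary must be non-empty; otherwise any hotspot with a
// non-empty summary is accepted.
function hasReadableProbableCause(hotspots: Hotspot[], preferred: string): boolean {
  const isReadable = (entry: Hotspot): boolean =>
    entry.probableCause.summary.trim().length > 0;
  const target = hotspots.find((entry) => entry.componentName === preferred);
  return target ? isReadable(target) : hotspots.some(isReadable);
}
```

Compared with the removed check, this no longer pins the cause to a fixed list of `probableCause.type` values, so it keeps passing when cause types are renamed, at the cost of not verifying which cause was reported.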
Comment on lines +137 to +151
| }): DiagnosticVerdict<VerificationVerdict, FixVerificationRawData> { | ||
| const baselineFailures = countAssertionFailures(seed.baseline.assertions); | ||
| const patchedFailures = countAssertionFailures(seed.patched.report.assertions); | ||
| const regressionFailureCount = seed.regressionAssertions.length === 0 | ||
| ? 0 | ||
| : seed.patched.report.assertions | ||
| .slice(-seed.regressionAssertions.length) | ||
| .filter((result) => !result.pass).length; | ||
| const verdict: VerificationVerdict = | ||
| baselineFailures > 0 && patchedFailures === 0 && regressionFailureCount === 0 | ||
| ? "CONFIRMED" | ||
| : patchedFailures >= baselineFailures | ||
| ? "REFUTED" | ||
| : "PARTIAL"; | ||
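To make the verdict rules in this excerpt easier to reason about, here is a self-contained sketch of the same classification. `classifyFix`, `countFailures`, and the reduced `AssertionResult` type are hypothetical stand-ins; like the excerpt, the sketch assumes regression assertions are appended at the end of the patched results.

```typescript
type VerificationVerdict = "CONFIRMED" | "REFUTED" | "PARTIAL";

// Reduced to the single field the excerpt reads.
interface AssertionResult {
  pass: boolean;
}

const countFailures = (results: AssertionResult[]): number =>
  results.filter((result) => !result.pass).length;

// Hypothetical standalone version of the verdict logic shown in the excerpt.
function classifyFix(
  baseline: AssertionResult[],
  patched: AssertionResult[],
  regressionCount: number,
): VerificationVerdict {
  const baselineFailures = countFailures(baseline);
  const patchedFailures = countFailures(patched);
  // The zero guard matters: slice(-0) equals slice(0) and would return every result.
  const regressionFailures =
    regressionCount === 0 ? 0 : countFailures(patched.slice(-regressionCount));
  if (baselineFailures > 0 && patchedFailures === 0 && regressionFailures === 0) {
    return "CONFIRMED";
  }
  return patchedFailures >= baselineFailures ? "REFUTED" : "PARTIAL";
}
```

Two properties follow from these rules. A scenario whose baseline already passes is classified as REFUTED, because zero patched failures is not fewer than zero baseline failures. And since the regression results are a subset of the patched results, the regression condition can only hold when `patchedFailures === 0` already does.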
Comment on lines +679 to +777
| server.tool( | ||
| "verify_runtime_fix", | ||
| [ | ||
| "Action-oriented alias for verify_fix that checks whether a candidate runtime fix actually resolves the bug.", | ||
| "Prefer this when the agent is phrasing the task as 'verify the fix before touching source'.", | ||
| ].join(" "), | ||
| { | ||
| fixDescription: z.string().min(3).max(500).describe("Short description of the fix that the runtime patch is supposed to validate."), | ||
| patch: runtimePatchSchema.describe("Runtime patch payload for the replay sandbox."), | ||
| url: z.string().url().optional().describe("Optional URL to open in the replay browser before the scenario runs."), | ||
| steps: z.array(replayStepSchema).min(1).describe("Ordered replay steps to execute before assertions."), | ||
| assertions: z.array(assertionSchema).min(1).describe("Assertions that should pass after the fix is applied."), | ||
| regressionAssertions: z.array(assertionSchema).optional().default([]).describe("Optional guard assertions that should remain true before and after the patch."), | ||
| headless: z.boolean().optional().describe("Override the replay browser mode for this verification."), | ||
| waitUntil: replayWaitUntilSchema.describe("Navigation readiness event when url is provided."), | ||
| timeoutMs: z.number().int().min(1).max(120_000).optional().default(10_000).describe("Navigation timeout in milliseconds when url is provided."), | ||
| continueOnError: z.boolean().optional().default(false).describe("Keep executing later steps after a step failure."), | ||
| waitMs: z.number().int().min(0).max(60_000).optional().default(500).describe("Wait time in milliseconds before running assertions."), | ||
| cleanup: z.enum(["keep", "reload", "reset_session"]).optional().default("reset_session").describe("How to clean the replay sandbox after patch verification."), | ||
| reopenUrl: z.string().url().optional().describe("Optional clean URL to reopen after cleanup when using reload or reset_session."), | ||
| }, | ||
| async ({ fixDescription, patch, url, steps, assertions, regressionAssertions, headless, waitUntil, timeoutMs, continueOnError, waitMs, cleanup, reopenUrl }): Promise<ToolResponse> => { | ||
| try { | ||
| const combinedAssertions = [...(assertions as Assertion[]), ...(regressionAssertions as Assertion[])]; | ||
| const baseline = await browserManager.runValidationScenario(steps, combinedAssertions, { | ||
| url, | ||
| headless, | ||
| waitUntil, | ||
| timeoutMs, | ||
| resetSession: true, | ||
| continueOnError, | ||
| waitMs, | ||
| }); | ||
| if ("error" in baseline) return err(baseline.error); | ||
| | |
| const applyResult = await browserManager.applyRuntimePatch(patch as RuntimePatch, { | ||
| url, | ||
| headless, | ||
| waitUntil, | ||
| timeoutMs, | ||
| resetSession: true, | ||
| }); | ||
| if ("error" in applyResult) return err(applyResult.error); | ||
| | |
| const patchedReport = await browserManager.runValidationScenario(steps, combinedAssertions, { | ||
| headless, | ||
| continueOnError, | ||
| waitMs, | ||
| }); | ||
| if ("error" in patchedReport) { | ||
| if (cleanup !== "keep") { | ||
| const cleanupResult = await browserManager.resetRuntimePatches({ | ||
| strategy: cleanup as "reload" | "reset_session", | ||
| waitUntil, | ||
| timeoutMs, | ||
| headless, | ||
| reopenUrl, | ||
| }); | ||
| if ("error" in cleanupResult) { | ||
| return err(`${patchedReport.error} Cleanup after patch verification also failed: ${cleanupResult.error}.`); | ||
| } | ||
| } | ||
| return err(patchedReport.error); | ||
| } | ||
| | |
| const patched: PatchedValidationScenarioResponse = { | ||
| verdict: patchedReport.success ? "patch_validated" : "patch_failed", | ||
| apply: applyResult, | ||
| report: patchedReport, | ||
| }; | ||
| | |
| if (cleanup !== "keep") { | ||
| const cleanupResult = await browserManager.resetRuntimePatches({ | ||
| strategy: cleanup as "reload" | "reset_session", | ||
| waitUntil, | ||
| timeoutMs, | ||
| headless, | ||
| reopenUrl, | ||
| }); | ||
| if ("error" in cleanupResult) return err(cleanupResult.error); | ||
| patched.cleanup = cleanupResult; | ||
| } | ||
| | |
| const response = createFixVerdict({ | ||
| fixDescription, | ||
| baseline, | ||
| patched, | ||
| regressionAssertions: regressionAssertions as Assertion[], | ||
| }); | ||
| | |
| return ok({ | ||
| ...response, | ||
| reportMarkdown: buildFixVerificationMarkdown(fixDescription, response.verdict, baseline, patched, regressionAssertions as Assertion[]), | ||
| }); | ||
| } catch (error) { | ||
| return err(`verify_runtime_fix failed unexpectedly: ${String(error)}`); | ||
| } | ||
| } | ||
| ); |
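The handler above repeats the same cleanup call on both the failure and the success path. One possible way to share that logic is sketched below. `runThenCleanup`, `isFailure`, `Failure`, and `CleanupReport` are hypothetical names, and the `reset` callback stands in for browserManager.resetRuntimePatches, whose real signature is not reproduced here.

```typescript
interface Failure {
  error: string;
}

type CleanupStrategy = "keep" | "reload" | "reset_session";

// Assumed minimal shape of a successful cleanup result.
interface CleanupReport {
  strategy: string;
}

const isFailure = (value: unknown): value is Failure =>
  typeof value === "object" && value !== null && "error" in value;

// Runs a scenario, then applies the cleanup strategy exactly once, combining
// error messages the same way the handler above does.
async function runThenCleanup<T extends object>(
  run: () => Promise<T | Failure>,
  strategy: CleanupStrategy,
  reset: (strategy: "reload" | "reset_session") => Promise<CleanupReport | Failure>,
): Promise<{ report: T; cleanup?: CleanupReport } | Failure> {
  const report = await run();
  const cleanup = strategy === "keep" ? undefined : await reset(strategy);
  if (isFailure(report)) {
    return isFailure(cleanup)
      ? { error: `${report.error} Cleanup after patch verification also failed: ${cleanup.error}.` }
      : report;
  }
  if (isFailure(cleanup)) return cleanup;
  return cleanup ? { report, cleanup } : { report };
}
```

Under these assumptions the handler would call the helper once and then build the verdict from `report`, so the alias and the base tool could share one code path instead of duplicating the cleanup branch.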
Owner, Author:
@copilot apply changes based on the comments in this thread
…edup Agent-Logs-Url: https://github.com/edgarbnt/ReactSentinel/sessions/6bcec9f4-64c6-453f-a101-1e067fcf3d78 Co-authored-by: edgarbnt <146716791+edgarbnt@users.noreply.github.com>
Agent-Logs-Url: https://github.com/edgarbnt/ReactSentinel/sessions/6bcec9f4-64c6-453f-a101-1e067fcf3d78 Co-authored-by: edgarbnt <146716791+edgarbnt@users.noreply.github.com>
This pull request introduces a new "verdict-first" diagnostics system for React runtime investigations, focusing on actionable, high-level diagnoses before presenting raw data. It adds new diagnostic capabilities and supporting types, and implements core logic for three new high-signal diagnostic tools: excess render detection, memo break analysis, and runtime bug triage.
New Diagnostic Capabilities and APIs:
- Adds new tools to capabilityCatalog, including diagnose_excess_renders, find_memo_breaks, diagnose_runtime_bug, find_race_conditions, verify_hypothesis, and verify_fix. These tools are grouped under a new investigation_tools capability for streamlined, verdict-first investigations. (src/capabilities.ts)

Type System and Protocol Enhancements:
- Adds the DiagnosticVerdict interface and supporting types (e.g., DiagnosticConfidence) to standardize verdict-first diagnostic responses, encapsulating a machine-readable verdict, summary, evidence, confidence, and raw data. (src/diagnostics/protocol.ts)

Core Diagnostic Logic Implementations:
- Adds three diagnosis functions in src/diagnostics/investigation.ts:
  - createExcessRenderDiagnosis: Determines the cause of excess renders, distinguishing between render loops, memo breaks, context cascades, and hook instability.
  - createMemoBreakDiagnosis: Analyzes render hotspots to detect memo breaks versus context/provider churn or internal state issues.
  - createRuntimeBugDiagnosis: Provides a verdict-first triage for vague runtime bugs, identifying hydration failures, race conditions, render instability, or generic runtime errors.
- Each function returns a standardized DiagnosticVerdict with actionable next steps and preserves the raw diagnostic payload. (src/diagnostics/investigation.ts)

Tool Integration and Catalog Updates:
- Updates the shadow_sandbox capability to include the new verify_fix tool, supporting validation of runtime fixes and regression detection in sandbox mode. (src/capabilities.ts)

These changes lay the groundwork for a more actionable and user-friendly diagnostics experience, making it easier to identify and address common React runtime issues.

## Summary