diff --git a/.gitignore b/.gitignore index 7bd11535a..43062943d 100644 --- a/.gitignore +++ b/.gitignore @@ -21,6 +21,13 @@ packages/evm-module/cache/ packages/evm-module/typechain/ .devnet/ +# Playwright E2E harness — daemon state cache is written at run time and +# contains a live bearer token. Never commit it. +packages/node-ui/e2e/setup/.daemon-state.json + +# JUnit output written by Playwright on CI, consumed by report-config.js. +packages/node-ui/results.xml + packages/evm-module/deployments/hardhat_contracts.json packages/evm-module/deployments/localhost_contracts.json snapshots/_cache_phase1_neuroweb_epoch16.json diff --git a/docs/testing/SDET_AGENT.md b/docs/testing/SDET_AGENT.md new file mode 100644 index 000000000..63ad8cfef --- /dev/null +++ b/docs/testing/SDET_AGENT.md @@ -0,0 +1,282 @@ +# SDET Agent Rules + +You're acting as a senior SDET on the DKG V10 monorepo. Your job is to write, +refactor, and triage end-to-end and integration tests. Treat tests like product +code: someone will read them in two years with none of the context you have. + +## Mindset + +- Test the user-visible contract. Implementation details are not your concern. +- A failing test should answer "what broke" in its name and the failed assertion. + Logs are a fallback, not the report. +- Flakes are bugs. Fix the race or quarantine the test — don't retry into green. +- One test, one reason to fail. Five unrelated assertions = five tests. +- Fewer sharp tests beat a sprawling matrix. Coverage is a side effect of + testing the right things, not a target. + +## Stack & layout + +UI E2E lives in `packages/node-ui/e2e/`: + +``` +e2e/ +├─ fixtures/base.ts test fixtures, registers every POM, injects auth token +├─ helpers/selectors.ts all selector strings, grouped by surface +├─ pages/ page objects (*.po.ts), one per panel/page/modal +│ └─ modals/ modal POMs +├─ setup/ daemon spawn + seeding harness +│ ├─ daemon.ts spawns the dkg daemon in a temp DKG_HOME +│ ├─ seed.ts creates the seeded CG + WM assertion +│ ├─ global-setup.ts Playwright globalSetup entry +│ ├─ global-teardown.ts Playwright globalTeardown entry +│ └─ state.ts reads the daemon state for fixtures +└─ specs/ test files (*.spec.ts), one per surface +``` + +- Runner: Playwright. Config at `packages/node-ui/playwright.config.ts` + (chromium-only, **serial / workers=1** because the daemon is a singleton). +- Unit / integration: Vitest, per-package `vitest.config.ts`. +- Run UI E2E: `pnpm --filter @origintrail-official/dkg-node-ui test:e2e`. +- Coverage tiers (TORNADO / BURA / KOSAVA) live in `docs/testing/COVERAGE.md`. + +## Live daemon, not mocks + +E2E runs against a real DKG daemon every time. The harness spawns it on a +fixed port (9200) in a temp `DKG_HOME`, mints an auth token, seeds one context +graph + one Working Memory assertion, then runs the suite. Teardown stops the +daemon and removes the temp directory. + +Why no mocks: the user-visible contract is "what real users see." Mocks lie. +A test that passes against a mocked endpoint but breaks against the real one +is worse than no test. + +Implications for spec authors: +- **Use the `seed` fixture** for any data-dependent assertion. Never hardcode + a project name like `'Pharma Drug Interactions'` — that was demo data and + is gone in live mode. Reach for `seed.contextGraphName`, + `seed.assertionName`, etc. +- **Be data-agnostic** for counts and timestamps. A fresh daemon may have 0 + peers / 0 notifications / 0 recent operations. Assert ranges (≥0) or + existence, not specific numbers. +- **Don't assume the seed is fixed forever.** If you need extra fixture data + (a SWM-promoted assertion, a connected integration, a M-of-N publish), + extend `e2e/setup/seed.ts` rather than mocking it. +- **Tests are serial.** Playwright's parallel mode is off — we share one + daemon. Don't expect test isolation to come from the runner; clean up + fixtures you create. + +## Page objects + +One class per logical surface — page, panel, or modal. File name +`kebab-case.po.ts`, class name `PascalCasePage`. (`NetworkPagePO` is an +outlier — match the `Page` suffix going forward.) + +```ts +// pages/widget.po.ts +import { type Page, type Locator } from '@playwright/test'; +import { sel } from '../helpers/selectors.js'; + +export class WidgetPage { + readonly page: Page; + readonly root: Locator; + readonly title: Locator; + + constructor(page: Page) { + this.page = page; + this.root = page.locator(sel.widget.root); + this.title = page.locator(sel.widget.title); + } + + async open() { + await this.page.getByRole('button', { name: 'Open widget' }).click(); + await this.root.waitFor({ state: 'visible' }); + } +} +``` + +Rules: +- Constructor binds locators. No I/O, no `await`. +- Methods speak in user actions: `submit()`, `selectProject(name)`. Not DOM ops. +- All methods async, return `void` or a typed value (string, number, struct). +- Don't return raw `Locator`s — return data or perform the action. +- Reuse before extend. If `LeftPanelPage` already exposes a locator, route + through it instead of binding the same selector twice. +- Modals belong under `pages/modals/`. + +## Selectors + +All raw selector strings live in `helpers/selectors.ts`, grouped by surface: + +```ts +export const sel = { + dashboard: { + root: '.v10-dashboard', + title: '.v10-dash-title', + statCard: '.v10-stat-card', + }, +}; +``` + +Priority order: +1. `getByRole('button', { name: 'Save' })` — semantic, a11y-aligned. +2. `getByText`, `getByLabel`, `getByPlaceholder` — user-visible affordances. +3. `data-testid="…"` — when nothing semantic is unique. Add the testid to the + source rather than reaching for a fragile class. +4. CSS class — last resort, and only when the class is stable and intentional. + +XPath is forbidden. So is matching against copy that marketing may rewrite. + +## Fixtures + +`fixtures/base.ts` is the only place tests import `test` and `expect` from. To +add a new POM, register it as a fixture so specs destructure it cleanly: + +```ts +test('opens widget', async ({ widget }) => { + await widget.open(); + await expect(widget.title).toHaveText('Widget'); +}); +``` + +Don't `new SomePage(page)` inside a spec. Use the fixture. + +## Specs + +```ts +import { test, expect } from '../fixtures/base.js'; + +test.describe('Widget', () => { + test.beforeEach(async ({ shell }) => { + await shell.goto(); + }); + + test('opens with title', async ({ widget }) => { + await widget.open(); + await expect(widget.title).toHaveText('Widget'); + }); +}); +``` + +- One spec file per surface. `widget.po.ts` ↔ `widget.spec.ts`. +- `describe` block is the surface name. No adjectives, no "tests". +- Test name is one short sentence — what the user does or sees. No "should", + no "verify that". +- `beforeEach` for navigation and setup. No assertions. +- Tests are independent. No shared state, no ordering. + +## Assertions — auto-retry by default + +Playwright's web-first assertions retry until timeout. Use them. Manual +`count()` / `textContent()` checks don't retry and will flake under load: + +```ts +// good +await expect(page.getByRole('alert')).toHaveText('Saved'); +await expect(widget.cards).toHaveCount(3); + +// bad — no retry, races the render +expect(await widget.cards.count()).toBe(3); +const text = await widget.title.textContent(); +expect(text).toBe('Widget'); +``` + +If you find yourself reading `textContent()` to assert on, you're doing it wrong. + +## Waits + +- `locator.waitFor()` only when you need a precondition no assertion covers. +- Never `page.waitForTimeout(n)`. If you reach for it, you need a `waitFor` on a + locator or `expect.poll` instead. +- Network: `page.waitForResponse(/api\/foo/)` only when the assertion can't + observe the result on screen. + +## Test data + +- The seeded fixture data (CG + WM assertion) lives in `e2e/setup/seed.ts`. + Tests get it via the `seed` fixture: `test('…', ({ seed }) => { … })`. +- Assert against `seed.contextGraphName` etc., never against literal mock + values. Mock values were removed when the suite switched to live daemon. +- For counts and timestamps that depend on daemon state, assert ranges or + existence — not exact numbers. Peer count = 0 on a freshly-started node. +- A test that creates extra data (e.g. a new CG via the UI) is responsible + for not leaking it into the next test. Either clean up or use a unique + name. Per-run uniqueness: `` `qa-project-${Date.now()}` ``. + +## Smells to fix before committing + +If a test has any of these, fix it before you push: + +- Raw selector string inside a spec. Move it to `selectors.ts`. +- `page.locator()` in a spec for something that already has a POM method. +- `await page.waitForTimeout(...)`. +- More than 3 unrelated assertions in one test. +- Test name starting with "should" or "verify". +- `textContent()` + manual compare — should be `toHaveText`. +- `try/catch` swallowing a failure. Let it throw. +- `if (await locator.isVisible()) { ... }` branches. The test must know which + path it expects. +- `page.$(...)` or any other `ElementHandle` API. Use `Locator` — it + re-evaluates on every action and runs actionability checks. ElementHandles + don't, and they flake under re-renders. + +## Real anti-patterns in this repo + +Patterns I saw while reading the existing tests — don't repeat: + +- `expect(await x.count()).toBe(4)` → `await expect(x).toHaveCount(4)`. +- `page.locator('.v10-recent-op').first().waitFor({ timeout: 5_000 })` + repeated in many tests → one POM method (`waitForRecentOps()`). +- Manual `for` loop reading `textContent()` then trimming → use + `locator.allTextContents()` and trim once. +- Hard-coded copy like `'Pharma Drug Interactions'` scattered through specs → + fixture data file. + +## Code style + +- 2 spaces, semicolons, single quotes, trailing commas. Match the repo. +- No comments unless the *why* is non-obvious. Don't narrate what code does. +- No emoji. No banner comments (`// ===== Setup =====`). +- Variable names short and obvious. `card`, not `dashboardStatisticsCardElement`. +- Don't over-abstract. Three similar lines beat a clever helper that hides + what's happening. +- Don't add error handling for things that can't fail — Playwright already + throws with a useful message. + +## Definition of done + +PR is reviewable when: + +1. `pnpm --filter @origintrail-official/dkg-node-ui test:e2e` passes locally + (this spawns the real daemon — first run is ~10s slower). +2. No new flakes in 5 consecutive runs. +3. New POMs follow the layout and naming above. +4. Every selector string is in `helpers/selectors.ts`. +5. Data-dependent assertions go through the `seed` fixture, not hardcoded + mock values. +6. Assertions use auto-retrying matchers. +7. No `waitForTimeout`, no visibility branching, no swallowed errors. +8. The CLI is built before running the suite (`pnpm --filter + @origintrail-official/dkg build`) — the harness spawns the built daemon. + +## Investigating a failure + +1. Read the spec name and the failed assertion. State the user-visible + behaviour that broke in one sentence. +2. Open the trace: `pnpm exec playwright show-trace test-results/.../trace.zip`. + Look — don't speculate. +3. Decide: product bug, test bug, or environment. Say which. +4. Test bug → fix in the same PR. Product bug → file it, quarantine the test + with a link to the issue. +5. Never `--retries` away a flake. Find the race. + +## What you don't do + +- Don't add new dependencies unless the existing tooling can't do the job. +- Don't restructure folders without asking. +- Don't skip with `.skip` to green CI. Quarantine with a tracked issue, or fix. +- Don't write a test for behaviour you can't see fail. If the assertion would + pass on a broken build, the test is wrong. +- Don't write tests AI-style: noisy comments, descriptive variable names that + read like documentation, "robust" error handling around things that won't + throw, `try/catch` for control flow. Write tests the way the rest of this + repo is written. diff --git a/docs/testing/UI_TEST_PLAN.md b/docs/testing/UI_TEST_PLAN.md new file mode 100644 index 000000000..5873bb207 --- /dev/null +++ b/docs/testing/UI_TEST_PLAN.md @@ -0,0 +1,285 @@ +# Node UI — E2E Test Plan + +Working reference for the SDET work on `packages/node-ui`. Built from a manual +walkthrough of the live UI (running against a real DKG daemon) plus a +source-side inventory of every Shell, View, Page, and Modal component. + +Run target: `pnpm --filter @origintrail-official/dkg-node-ui test:e2e` +Stack: Playwright + chromium, conventions in [SDET_AGENT.md](SDET_AGENT.md). + +## Test infrastructure (live daemon mode) + +E2E does **not** use mocks. Each run goes through this lifecycle: + +1. **globalSetup** ([e2e/setup/global-setup.ts](../../packages/node-ui/e2e/setup/global-setup.ts)) + - Spawns the DKG daemon in a fresh temp `DKG_HOME` (`dkg start -f` via the + built CLI at `packages/cli/dist/cli.js`). + - Waits for `/api/status` to respond on port 9200. + - Reads the auto-generated bearer token from `auth.token`. + - Seeds one context graph (`qa-cg`, name `QA Context Graph`) and imports + one small text file → Working Memory assertion `qa-seed-doc`. + - Persists the daemon + seed state to `e2e/setup/.daemon-state.json`. +2. **fixtures/base.ts** loads that state and: + - Injects `__DKG_TOKEN__` into every page via `addInitScript` so the UI's + mock-detector picks live API mode. + - Exposes `daemon` and `seed` fixtures to spec authors. +3. **Specs run serially** (workers=1) — the daemon is a singleton on a fixed + port, so parallel workers would race on shared backend state. +4. **globalTeardown** stops the daemon and removes the temp home. + +Prereq: the CLI must be built (`pnpm --filter @origintrail-official/dkg +build`). The harness spawns the built daemon, not source. + +## Status legend + +- ✅ covered — at least one assertion in an existing spec +- 🟡 partial — surface is touched, key interactions missing +- ❌ missing — no test exists; needs new spec or extension + +## Surface map + +| Surface | Source | POM | Spec | Status | +|---|---|---|---|---| +| AppShell layout | `Shell/Header.tsx` + `Panel*.tsx` | `app-shell.po.ts` | `shell-layout.spec.ts` | ✅ | +| Header | `Shell/Header.tsx` | `header.po.ts` | `header.spec.ts` | 🟡 | +| Left panel | `Shell/PanelLeft.tsx` | `left-panel.po.ts` | `left-navigation.spec.ts` | 🟡 | +| Center panel | `Shell/PanelCenter.tsx` | `center-panel.po.ts` | `tab-management.spec.ts` | ✅ | +| Right panel | `Shell/PanelRight.tsx` | `right-panel.po.ts` | `agent-panel.spec.ts` | 🟡 | +| Bottom panel | `Shell/PanelBottom.tsx` | `bottom-panel.po.ts` | `bottom-panel.spec.ts` | ✅ | +| Dashboard | `views/DashboardView.tsx` | `dashboard.po.ts` | `dashboard.spec.ts` | ✅ | +| Memory Stack | `views/MemoryStackView.tsx` | — | `memory-layers.spec.ts` | 🟡 | +| Project | `views/ProjectView.tsx` | `project-view.po.ts` | `project-view.spec.ts` | 🟡 | +| Memory Layer | `views/MemoryLayerView.tsx` | `memory-layer.po.ts` | `memory-layers.spec.ts` | 🟡 | +| Settings | `pages/Settings.tsx` | — | — | ❌ | +| Operations | `pages/Operations.tsx` | `operations.po.ts` | `operations.spec.ts` | 🟡 | +| Network | `pages/Network.tsx` | `network-page.po.ts` | `network-page.spec.ts` | ✅ (display-only) | +| Agent Hub | `pages/AgentHub.tsx` | — | covered via agent-panel | 🟡 | +| CreateProjectModal | `Modals/CreateProjectModal.tsx` | `modals/create-project.po.ts` | `create-project.spec.ts` | ✅ | +| JoinProjectModal | `Modals/JoinProjectModal.tsx` | — | — | ❌ | +| ImportFilesModal | `Modals/ImportFilesModal.tsx` | `modals/import-files.po.ts` | `import-files.spec.ts` | 🟡 | +| FilePreviewModal | `Modals/FilePreviewModal.tsx` | `modals/file-preview.po.ts` | — | ❌ | +| ShareProjectModal | `Modals/ShareProjectModal.tsx` | — | — | ❌ | + +Cross-cutting: `accessibility.spec.ts` ✅, `theme.spec.ts` ✅, +`keyboard-shortcuts.spec.ts` ✅. + +## Locator strategy + +All raw selectors live in `e2e/helpers/selectors.ts` under one of these groups: +`app, header, leftPanel, center, bottom, rightPanel, dashboard, +projectView, memoryLayer, memoryStack, settings, operations, network, +modals.create, modals.join, modals.import, modals.preview, modals.share`. + +Selector preference order is enforced by [SDET_AGENT.md](SDET_AGENT.md): +`getByRole` → `getByText/Label/Placeholder` → `data-testid` → CSS class. +The `.v10-…` class layer is the fallback today because the React tree leans on +class names for layout and not many elements expose roles cleanly. + +When a class is the only stable hook, add a `data-testid` to the source rather +than introducing a new fragile class. + +## Coverage gaps to close + +Each entry below maps to a TODO in the implementation list. Spec names follow +the existing pattern: kebab-case surface name, `.spec.ts` suffix. + +### Settings page — new spec `settings.spec.ts` + +Open path: header settings button (also `?tab=settings` query). + +Cases: +- LLM card renders, status badge `Not Configured` by default. +- API Key input accepts text, Show toggle reveals/masks the value. +- Model and Base URL inputs accept text. +- Save button enabled only when API Key is set; click triggers save state. +- Disconnect button visible only when configured (mock mock returns config). +- Telemetry section: share-telemetry toggle flips state. +- Local Data Retention dropdown: changing value surfaces Prune & Save / Cancel. + Cancel reverts. Save opens confirm modal; Confirm closes the modal. +- Developer Mode toggle reveals Observability tab in the page tabs. +- Background Sync Status: dropdown disabled when no CGs; Refresh is clickable. +- Danger Zone: Shutdown Node click changes label to Confirm Shutdown; second + click invokes the API. +- Node Identity / Blockchain Config show placeholder dashes when offline. + +### JoinProjectModal — new spec `join-project.spec.ts` + new POM + +Cases: +- Opens from left panel Join Project button. +- Title / subtitle render. +- Invite textarea accepts multiline input. +- Join button disabled until invite code present. +- Invite parser surfaces inline errors: + - missing project ID, + - invalid multiaddr, + - multiaddr without peer ID. +- Cancel and overlay click close the modal. +- Already-subscribed branch: error renders. +- Access-denied branch: Send Join Request button renders. + +(Backend-bound branches are stubbed at the API client level if available; +otherwise covered by error-surface assertion only.) + +### ShareProjectModal — new spec `share-project.spec.ts` + new POM + +Cases: +- Opens from project view Share button. +- Tabs Allowlist / Join Requests render; tab badge shows pending count. +- Allowlist tab: + - Network agents list renders or shows hidden when empty. + - Add Agent input rejects non-Ethereum address with inline error. + - Add Agent input accepts a valid `0x…` address; Add button enables. + - Allowed agents list shows entries with × remove. + - Invite Code `
` renders the CG ID; Copy Invite swaps to "Copied".
+- Join Requests tab:
+  - Empty state renders when none.
+  - Approve and Reject buttons appear per request and show busy state on click.
+
+### FilePreviewModal — new spec `file-preview.spec.ts`
+
+Cases:
+- Opens from a WM assertion preview link.
+- Loading state renders, then metadata (content type, triple count, pipeline).
+- PDF renders an `