Merged
134 changes: 123 additions & 11 deletions .github/workflows/ci.yml
@@ -1,37 +1,149 @@
name: CI

on:
workflow_dispatch:
push:
- branches: [ master ]
+ branches: [ master, main ]
pull_request:

permissions:
contents: read

jobs:
build-and-test:
runs-on: ubuntu-latest
timeout-minutes: 15
defaults:
run:
working-directory: universal-refiner
steps:
- - uses: actions/checkout@v4
+ - uses: actions/checkout@v5

- name: Setup Node 22
- uses: actions/setup-node@v4
+ uses: actions/setup-node@v5
with:
node-version: '22'

- name: Install dependencies
- run: npm install --no-fund
+ run: npm ci --no-fund

- name: Rebuild native modules
run: npm rebuild better-sqlite3

- name: Generate version file
run: node scripts/sync-version.mjs

- name: Type check
run: npx tsc --noEmit
- name: Build
run: npm run build

- name: Run tests
run: npm run test:coverage

acceptance:
runs-on: ubuntu-latest
timeout-minutes: 15
defaults:
run:
working-directory: universal-refiner
strategy:
fail-fast: false
matrix:
model-order:
- primary
- reversed
steps:
- uses: actions/checkout@v5
- uses: actions/setup-node@v5
with:
node-version: '22'
cache: npm
cache-dependency-path: universal-refiner/package-lock.json
- run: npm ci --no-fund
- run: npm rebuild better-sqlite3
- run: npm run build
- name: Run all-tool and provider acceptance
run: npm run test:acceptance
- name: Run fake-model semantic acceptance
env:
PROMPT_REFINER_PRIMARY_MODEL: ${{ matrix.model-order == 'primary' && 'gemma3:12b' || 'gemma3:1b' }}
PROMPT_REFINER_FALLBACK_MODEL: ${{ matrix.model-order == 'primary' && 'gemma3:1b' || 'gemma3:12b' }}
run: npm run acceptance:semantic

stress:
runs-on: ubuntu-latest
timeout-minutes: 15
defaults:
run:
working-directory: universal-refiner
steps:
- uses: actions/checkout@v5
- uses: actions/setup-node@v5
with:
node-version: '22'
cache: npm
cache-dependency-path: universal-refiner/package-lock.json
- run: npm ci --no-fund
- run: npm rebuild better-sqlite3
- run: npm run build
- name: Run restart and in-process concurrency tests
run: npm run test:stress
- name: Run multi-process EventStore stress
env:
PROMPT_REFINER_STRESS_WORKERS: '4'
PROMPT_REFINER_STRESS_WRITES: '100'
run: npm run stress:event-store

windows:
runs-on: windows-latest
timeout-minutes: 20
defaults:
run:
working-directory: universal-refiner
steps:
- uses: actions/checkout@v5
- uses: actions/setup-node@v5
with:
node-version: '22'
cache: npm
cache-dependency-path: universal-refiner/package-lock.json
- run: npm ci --no-fund
- run: npm rebuild better-sqlite3
- run: npm run build
- run: npm run test:coverage
- run: npm run test:acceptance
- run: npm run test:stress

supply-chain:
runs-on: ubuntu-latest
timeout-minutes: 15
defaults:
run:
working-directory: universal-refiner
steps:
- uses: actions/checkout@v5
- uses: actions/setup-node@v5
with:
node-version: '22'
cache: npm
cache-dependency-path: universal-refiner/package-lock.json
- run: npm ci --no-fund
- run: npm run security:audit
- run: npm run security:secrets
- run: npm run build
- run: npm run package:check

release-gate:
if: always()
needs: [build-and-test, acceptance, stress, windows, supply-chain]
runs-on: ubuntu-latest
steps:
- name: Require every enterprise gate
env:
BUILD: ${{ needs['build-and-test'].result }}
ACCEPTANCE: ${{ needs.acceptance.result }}
STRESS: ${{ needs.stress.result }}
WINDOWS: ${{ needs.windows.result }}
SUPPLY_CHAIN: ${{ needs['supply-chain'].result }}
run: |
- chmod +x node_modules/.bin/vitest 2>/dev/null || true
- node_modules/.bin/vitest run --exclude '**/correlation.test.ts'
test "$BUILD" = "success"
test "$ACCEPTANCE" = "success"
test "$STRESS" = "success"
test "$WINDOWS" = "success"
test "$SUPPLY_CHAIN" = "success"
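
The release gate runs under `if: always()`, so it must fail explicitly whenever an upstream job reports anything other than `success` (for example `failure`, `cancelled`, or `skipped`). The workflow does this with a chain of shell `test` commands; the TypeScript sketch below only illustrates the same logic, and `failedGates` and `gatePasses` are hypothetical names that do not exist in the repository.

```typescript
// Values that `needs.<job>.result` can take in GitHub Actions.
type JobResult = "success" | "failure" | "cancelled" | "skipped";

// Hypothetical helper mirroring the `test "$X" = "success"` chain:
// returns the names of every job that did not succeed.
function failedGates(results: Record<string, JobResult>): string[] {
  return Object.entries(results)
    .filter(([, result]) => result !== "success")
    .map(([job]) => job);
}

// The gate passes only when no job is listed as failed.
function gatePasses(results: Record<string, JobResult>): boolean {
  return failedGates(results).length === 0;
}
```

Treating `skipped` as a failure is the point of the explicit comparison: a gate that merely depended on `needs` without `if: always()` would itself be skipped rather than fail.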
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,6 +1,7 @@
# Dependencies and generated outputs
**/node_modules/
**/dist/
**/coverage/
*.tgz

# Runtime state and local databases
@@ -23,4 +24,4 @@
.vscode/
.idea/
.DS_Store
-Thumbs.db
+Thumbs.db
159 changes: 159 additions & 0 deletions .planning/phases/01-fs-watcher/01-01-PLAN.md
@@ -0,0 +1,159 @@
---
phase: 01-fs-watcher
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
- universal-refiner/src/watcher/file-watcher.ts
- universal-refiner/src/watcher/index.ts
- universal-refiner/tests/file-watcher.test.ts
- universal-refiner/src/index.ts
autonomous: true
requirements:
- AUTO-01
- AUTO-02
must_haves:
truths:
- "FileWatcher emits a 'change' event when a watched .ts file is written"
- "FileWatcher emits an 'add' event when a new .ts file appears"
- "FileWatcher does not emit for files inside node_modules"
- "FileWatcher does not emit for *.log or *.tmp files"
- "FileWatcher.stop() prevents any further events"
- "Detected changes are logged via RuntimeLogger on server startup"
artifacts:
- path: "universal-refiner/src/watcher/file-watcher.ts"
provides: "FileWatcher class with start/stop/on('change') interface"
exports: ["FileWatcher", "FileChangeEvent", "FileEventKind"]
- path: "universal-refiner/src/watcher/index.ts"
provides: "Re-exports for the watcher module"
- path: "universal-refiner/tests/file-watcher.test.ts"
provides: "5 Vitest tests covering AUTO-01 and AUTO-02"
key_links:
- from: "universal-refiner/src/index.ts"
to: "universal-refiner/src/watcher/index.ts"
via: "import FileWatcher, call start() at server init"
---

<objective>
Implement a real-time file system watcher (Phase 1) for the universal-refiner MCP server.

Purpose: Satisfy AUTO-01 (detect meaningful file save events) and AUTO-02 (filter noise paths) as the foundation for the Background Autonomy milestone.

Output:
- src/watcher/file-watcher.ts — FileWatcher class wrapping chokidar v5
- src/watcher/index.ts — re-exports
- tests/file-watcher.test.ts — 5 passing Vitest tests
- src/index.ts updated to start watcher on server init
</objective>

<execution_context>
chokidar v5.0.0 is already in dependencies. No new packages needed.
Uses RuntimeLogger (stderr, JSON-RPC safe) for all output.
</execution_context>

<context>
@.planning/ROADMAP.md
@.planning/REQUIREMENTS.md
@universal-refiner/src/core/logger.ts
@universal-refiner/src/index.ts
</context>

<tasks>

<task type="auto">
<name>Task 1: Create FileWatcher module (AUTO-01, AUTO-02)</name>
<files>
universal-refiner/src/watcher/file-watcher.ts
universal-refiner/src/watcher/index.ts
</files>
<action>
FileWatcher extends EventEmitter. Constructor takes rootPath: string.
start(): watches rootPath via chokidar.watch() with ignored: CHOKIDAR_IGNORE patterns.
stop(): closes the chokidar watcher, nulls inner reference.
emitChange() applies a three-step filter before emitting:
1. Path segment check: reject paths containing /node_modules/, /dist/, /.git/, /coverage/
2. Extension check: only emit for .ts, .js, .md, .txt, .prompt
3. Suffix noise check: reject .log, .tmp
Emits: { path: string, event: 'add'|'change'|'unlink', timestamp: Date }
Logs start/stop and per-event debug via RuntimeLogger.
index.ts re-exports FileWatcher, FileChangeEvent, FileEventKind.
</action>
<verify>npm run build -- succeeds with zero type errors</verify>
<done>Both files exist, TypeScript compiles clean, exports are correct.</done>
</task>
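
The filter described in Task 1 can be sketched as a pure function, which keeps it testable without starting chokidar. This is a minimal sketch under stated assumptions: the segment, extension, and suffix values come from the task text, but `shouldEmit` and the constant names are hypothetical, and the real `emitChange()` may be structured differently.

```typescript
import * as path from "node:path";

// Values taken from the Task 1 description; the names are illustrative only.
const IGNORED_SEGMENTS = new Set(["node_modules", "dist", ".git", "coverage"]);
const ALLOWED_EXTENSIONS = new Set([".ts", ".js", ".md", ".txt", ".prompt"]);
const NOISE_SUFFIXES = [".log", ".tmp"];

function shouldEmit(filePath: string): boolean {
  // Normalise Windows separators so segment matching works on both platforms.
  const normalized = filePath.replace(/\\/g, "/").toLowerCase();

  // 1. Path segment check: reject anything under an ignored directory.
  if (normalized.split("/").some((segment) => IGNORED_SEGMENTS.has(segment))) {
    return false;
  }

  // 2. Extension check: only allow-listed extensions pass.
  if (!ALLOWED_EXTENSIONS.has(path.posix.extname(normalized))) {
    return false;
  }

  // 3. Suffix noise check: reject log and temp files.
  return !NOISE_SUFFIXES.some((suffix) => normalized.endsWith(suffix));
}
```

With an extension allow-list in place, the suffix check is largely a second line of defence; it is kept here only because the task lists it as a separate step.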

<task type="auto" tdd="true">
<name>Task 2: Write Vitest tests (AUTO-01, AUTO-02)</name>
<files>universal-refiner/tests/file-watcher.test.ts</files>
<behavior>
- Test: write to existing .ts file in tmp dir -> 'change' event emitted with correct path and Date timestamp
- Test: write new .ts file -> 'add' event emitted
- Test: write .ts file inside node_modules subdirectory -> no event emitted (AUTO-02)
- Test: write .log file -> no event emitted (AUTO-02)
- Test: stop() called -> subsequent file write produces no events
</behavior>
<action>
Use vitest describe/it/expect. beforeEach creates a unique tmp dir via fs.mkdtempSync.
afterEach calls watcher.stop() and fs.rmSync.
Use a polling waitFor() helper with 6000ms timeout for positive assertions.
Allow 1500ms settle time after watcher.start() before writing files (Windows FS listener warm-up).
Set per-test timeout to 15_000.
</action>
<verify>npm test -- shows 5/5 file-watcher tests passing</verify>
<done>All 48 total tests pass including the 5 new watcher tests.</done>
</task>
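
The polling `waitFor()` helper mentioned in Task 2 is not shown in this plan. One possible shape, assuming a synchronous predicate and the 6000ms default from the task text (the signature and the `intervalMs` parameter are assumptions), is:

```typescript
// Hypothetical helper: polls `predicate` until it returns true,
// rejecting if `timeoutMs` elapses first.
async function waitFor(
  predicate: () => boolean,
  timeoutMs = 6000,
  intervalMs = 50,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (!predicate()) {
    if (Date.now() >= deadline) {
      throw new Error(`waitFor timed out after ${timeoutMs}ms`);
    }
    // Yield to the event loop so watcher callbacks can run between polls.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A positive test would collect events into an array and call `await waitFor(() => events.length > 0)`. Negative tests (no event expected) cannot poll for absence, so they would instead wait a fixed interval and then assert the array is still empty.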

<task type="auto">
<name>Task 3: Wire FileWatcher into server entry point</name>
<files>universal-refiner/src/index.ts</files>
<action>
Import FileWatcher from "./watcher/index.js".
After CommandCenterDashboard.start() and before runBackgroundTasks():
const fileWatcher = new FileWatcher(rootPath);
fileWatcher.on('change', (evt) => {
RuntimeLogger.info(`[FS] ${evt.event}: ${evt.path}`);
CommandCenterDashboard.log(`[FS] ${evt.event}: ${path.relative(rootPath, evt.path)}`);
});
fileWatcher.start();
BackgroundAutonomyService is left intact — FileWatcher is additive.
</action>
<verify>npm run build succeeds; server starts without error</verify>
<done>File watcher starts automatically when the MCP server initialises.</done>
</task>

</tasks>

<threat_model>
## Trust Boundaries

| Boundary | Description |
|----------|-------------|
| FS path → emitChange | File paths from chokidar are OS-provided and not sanitised before logging |

## STRIDE Threat Register

| Threat ID | Category | Component | Disposition | Mitigation Plan |
|-----------|----------|-----------|-------------|-----------------|
| T-01-01 | Information Disclosure | RuntimeLogger path output | accept | Logs go to stderr/runtime.log, not stdout (JSON-RPC channel). No PII in paths. |
| T-01-02 | Denial of Service | Rapid file changes flood emitChange | mitigate | awaitWriteFinish debounce (100ms stability) prevents event storms. |
| T-01-03 | Tampering | Malicious path with path-traversal characters | accept | FileWatcher is read-only (no FS writes). Path is logged, not executed. |
</threat_model>

<verification>
- npm run build -- zero TypeScript errors
- npm test -- 48/48 tests pass (5 new FileWatcher tests)
- Server starts and logs "[FileWatcher] Starting file system watcher" on init
</verification>

<success_criteria>
1. Writing a .ts file in the watched directory emits a change event within 3 seconds.
2. Files under node_modules, dist, .git, coverage are never emitted.
3. .log and .tmp files are never emitted.
4. stop() terminates all event delivery immediately.
5. Build and full test suite remain green.
</success_criteria>

<output>
Create .planning/phases/01-fs-watcher/01-01-SUMMARY.md after execution.
</output>
24 changes: 24 additions & 0 deletions README.md
@@ -79,6 +79,30 @@ cd Promptimprover

Both installers perform a deterministic dependency install, run the full test suite, build the package, install it globally, and verify the `gemini-prompt-refiner` command. Add that command to your MCP client configuration. See the [Setup Guide](https://github.com/Coding-Autopilot-System/Promptimprover/wiki/Setup-Guide) for full configuration instructions.

For optional automatic pre-prompt linting and post-execution recording, see the [cross-CLI automation guide](./docs/cross-cli-automation.md). Claude Code and Gemini CLI expose the required lifecycle hooks. Codex currently requires MCP-first instructions or explicit helper invocation because its hook lifecycle does not transparently intercept each prompt.

## Local Semantic Model

PromptImprover tries a local OpenAI-compatible endpoint before optional MCP sampling. The safe defaults target `http://localhost:9000/v1`, try `gemma3:12b` first, and fall back to `gemma3:1b`. If neither a local model nor MCP sampling is available, rule-based refinement continues without semantic output.

Override the defaults per repository with `.gemini-refiner.json`:

```json
{
"semantic": {
"localEnabled": true,
"mcpSamplingEnabled": true,
"baseUrl": "http://localhost:9000/v1",
"models": ["gemma3:12b", "gemma3:1b"],
"timeoutMs": 120000,
"temperature": 0.2,
"allowNonLoopback": false
}
}
```

Non-loopback model endpoints are rejected unless `allowNonLoopback` is explicitly enabled. Generated lessons and templates remain pending until reviewed through the MCP learning-review tools.
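
As an illustration of the loopback rule, the check could look roughly like the sketch below. It assumes that "loopback" means `localhost`, `127.0.0.0/8`, or `::1`; the function names are hypothetical, and the actual validation in PromptImprover may differ.

```typescript
// Hypothetical check: does the endpoint's host resolve to a loopback name or address?
function isLoopbackUrl(baseUrl: string): boolean {
  let host: string;
  try {
    host = new URL(baseUrl).hostname.toLowerCase();
  } catch {
    return false; // Unparsable URLs are treated as non-loopback.
  }
  // The WHATWG URL parser keeps the brackets around IPv6 hostnames.
  return host === "localhost" || host === "[::1]" || /^127(\.\d{1,3}){3}$/.test(host);
}

// Mirrors the documented rule: non-loopback endpoints need an explicit opt-in.
function endpointAllowed(baseUrl: string, allowNonLoopback: boolean): boolean {
  return allowNonLoopback || isLoopbackUrl(baseUrl);
}
```

Under these assumptions, `endpointAllowed("https://example.com/v1", false)` is rejected, while the default `http://localhost:9000/v1` is accepted without any opt-in.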

## License

MIT - see [LICENSE](LICENSE)
32 changes: 32 additions & 0 deletions docs/cross-cli-automation.md
@@ -0,0 +1,32 @@
# Cross-CLI Automation

PromptImprover ships fail-open pre-prompt and post-execution helpers:

- `promptimprover-hook-pre` makes one latency-safe rule-based `lint_prompt` call, creates a trackable prompt ID, and injects advisory context. Interactive MCP linting continues to use semantic providers by default.
- `promptimprover-hook-post` records privacy-safe completion metadata with `record_agent_output`.

Both commands read hook JSON from stdin, write JSON only to stdout, report failures to stderr, and always allow the client to continue. They start the same built MCP server used by `gemini-prompt-refiner`. Set `PROMPTIMPROVER_SERVER_PATH` only when testing a nonstandard build.

The helpers store only prompt ID, client name, and creation time in the OS temporary directory. They do not persist prompt or response bodies. Completion records contain output length rather than response text.
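
To make the privacy boundary concrete, the stored data can be pictured as the two record shapes below. This is only a sketch: the field names and the `toCompletionRecord` helper are hypothetical, and the real on-disk format is not shown in this document.

```typescript
// Hypothetical shape of the state written to the OS temporary directory.
interface HookState {
  promptId: string;
  client: string;
  createdAt: string; // ISO-8601 timestamp
}

// Hypothetical shape of a completion record: length only, never the text.
interface CompletionRecord {
  promptId: string;
  client: string;
  outputLength: number;
}

// Derives the record from the response without retaining the response itself.
function toCompletionRecord(state: HookState, output: string): CompletionRecord {
  return {
    promptId: state.promptId,
    client: state.client,
    outputLength: output.length,
  };
}
```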

## Claude Code

Claude Code supports `UserPromptSubmit` and `Stop`, so both phases can run transparently. Merge [`claude.settings.fragment.json`](../universal-refiner/hooks/config/claude.settings.fragment.json) into the desired user or project settings file after installing the package globally.

## Gemini CLI

Gemini CLI supports `BeforeAgent` and `AfterAgent`, so both phases can run transparently. Merge [`gemini.settings.fragment.json`](../universal-refiner/hooks/config/gemini.settings.fragment.json) into the desired user or project settings file after installing the package globally.

## Codex CLI

Codex CLI `0.138.0` has a stable hook system, but its exposed lifecycle currently does not provide transparent per-prompt pre/post hooks. Do not claim that `SessionStart` performs prompt interception.

Keep PromptImprover registered as an MCP server and use repo instructions that require `lint_prompt` and `record_agent_output`. External automation can pipe normalized JSON into the same helpers; see [`codex.config.fragment.toml`](../universal-refiner/hooks/config/codex.config.fragment.toml).

## Failure And Privacy Behavior

- MCP startup, timeout, parsing, and tool errors fail open.
- Default timeout is 15 seconds; set `PROMPTIMPROVER_HOOK_TIMEOUT_MS` to a positive millisecond value to change it.
- Hook stdout remains strict JSON.
- No credentials or environment values are read or logged.
- Prompt and response text are not written to hook state or completion summaries.
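
The timeout rule above (15 seconds by default, overridden only by a positive millisecond value) might be resolved along these lines. This is a sketch, not the shipped implementation; `resolveHookTimeoutMs` is a hypothetical name, and the caller is assumed to pass in the raw value of `PROMPTIMPROVER_HOOK_TIMEOUT_MS`.

```typescript
const DEFAULT_HOOK_TIMEOUT_MS = 15_000;

// Hypothetical resolver: accepts the raw setting and falls back to the
// default for missing, empty, non-numeric, zero, or negative values.
function resolveHookTimeoutMs(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_HOOK_TIMEOUT_MS;
  }
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_HOOK_TIMEOUT_MS;
}
```

Falling back to the default on bad input, rather than throwing, is consistent with the fail-open behavior: a misconfigured timeout should not block the client.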