AGENTS.md
24 lines changed: 24 additions & 0 deletions
@@ -167,3 +167,27 @@ The Windows release pipeline has had four landmines fixed, and each of those fixes cannot be merged
 2. Register it in `lib.rs` (`mod <name>;`).
 3. Wire it into `coordinator.rs` and expose any frontend-callable surface via `commands.rs` + `invoke_handler!`.
 4. Add the matching TS wrapper in `openless-all/app/src/lib/ipc.ts` (with a mock branch for browser dev).
+
+### Third-party service integrations & library / platform API research
+
+When implementing features that depend on **anything outside this repo** — external HTTP APIs (ASR providers, polish endpoints, GitHub API), unfamiliar crates / npm packages, platform APIs (Apple Security framework, Win32, CoreFoundation), or any SDK whose surface shifts faster than your training cut-off — do not write integration code from memory. API surfaces drift; model training data is stale by definition. The same workflow below applies whether you are calling an HTTP endpoint, learning a new Rust crate, or wiring a system framework — substitute "endpoint URL" / "function signature" / "feature flag" as appropriate.
+
+Follow this research-first workflow:
+
+1. **Analyze before coding.** Identify every external call this feature needs: endpoint URL, HTTP method, authentication mechanism, request body schema, expected response schema, and error codes.
+2. **Delegate web search to a sub-agent.** Spawn a read-only sub-agent whose sole job is to search for official documentation. The sub-agent runs in parallel — you continue other work instead of blocking on sequential web pages.
+3. **Filter sub-agent results.** When the sub-agent returns, extract only the information directly relevant to the current implementation. Discard marketing pages, unrelated API versions, or tangential tutorials.
+4. **Cross-verify one key finding.** Before writing code, validate at least one structural claim (endpoint URL, required header, auth format) with a direct `web_search` or `fetch_url` call. Sub-agents can hallucinate.
+5. **Implement from verified documentation.** Only write integration code after the above steps. Never guess.
+
+**Sub-agent search brief:**
+- Focus each sub-agent on a single external service or protocol — one service, one sub-agent.
+- Prioritize official documentation domains (e.g., `docs.volcengine.com`, `platform.openai.com/docs`), falling back to the project's GitHub README.
+- The sub-agent must return **structured** findings: endpoint URL, HTTP method, required headers, request body JSON Schema, response body JSON Schema, and error code meanings.
+- If the documentation covers multiple API versions, the sub-agent must note which version was referenced.
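The structured findings requested in the brief above could be modeled roughly as follows. This is a minimal sketch, not a type from the repository: `ApiFinding`, `missing_fields`, and every field name are hypothetical, chosen only to mirror the items the brief lists.

```rust
/// Hypothetical shape of one sub-agent report (illustrative names only).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ApiFinding {
    pub endpoint_url: String,
    pub http_method: String,
    pub required_headers: Vec<String>,
    /// JSON Schema documents kept as raw text in this sketch.
    pub request_schema: String,
    pub response_schema: String,
    /// (error code, meaning) pairs.
    pub error_codes: Vec<(String, String)>,
    /// True when the documentation describes more than one API version.
    pub multiple_versions_documented: bool,
    /// The version the findings were taken from, if noted.
    pub api_version: Option<String>,
}

/// Returns the names of required fields the report left empty.
/// Headers and error codes may legitimately be empty, so they are not checked.
pub fn missing_fields(f: &ApiFinding) -> Vec<&'static str> {
    let mut missing = Vec::new();
    if f.endpoint_url.trim().is_empty() {
        missing.push("endpoint_url");
    }
    if f.http_method.trim().is_empty() {
        missing.push("http_method");
    }
    if f.request_schema.trim().is_empty() {
        missing.push("request_schema");
    }
    if f.response_schema.trim().is_empty() {
        missing.push("response_schema");
    }
    // The version note is only mandatory when several versions are documented.
    if f.multiple_versions_documented && f.api_version.is_none() {
        missing.push("api_version");
    }
    missing
}
```

A check like this lets the main agent reject an incomplete report before any integration code is written, instead of discovering the gap mid-implementation.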
+
+**Anti-patterns (do not do these):**
+- ✗ Writing API integration code from memory without a documentation search.
+- ✗ Pasting entire web pages into the main agent context — the sub-agent does the filtering.
+- ✗ Mixing field names or endpoint paths from different API versions.
+- ✗ Skipping error handling — every external call must degrade gracefully when the service is unavailable.
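The last anti-pattern can be illustrated with a small sketch of graceful degradation. The diff does not say what the fallback should be for any given service; returning the raw transcript when a polish call fails is an assumption made here for illustration, and `Polished` and `polish_or_fallback` are hypothetical names. The external call is passed in as a closure so the example stays self-contained.

```rust
/// Outcome of a polish attempt (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum Polished {
    /// The external service answered with usable text.
    FromService(String),
    /// The service failed or returned nothing, so the raw text is kept.
    RawFallback(String),
}

/// Calls the external service and degrades to the raw input on any failure.
/// `call_service` stands in for the real HTTP call, which is not shown here.
pub fn polish_or_fallback<F>(raw: &str, call_service: F) -> Polished
where
    F: FnOnce(&str) -> Result<String, String>,
{
    match call_service(raw) {
        Ok(text) if !text.trim().is_empty() => Polished::FromService(text),
        // An empty answer is treated like an outage: keep the user's text.
        Ok(_) | Err(_) => Polished::RawFallback(raw.to_string()),
    }
}
```

The caller always gets text back, so an unavailable service costs polish quality but never the user's input.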
@@ -10,6 +10,10 @@ The active codebase lives at `openless-all/app/` and is **Tauri 2 + Rust backend
 
 UI must match `openless-all/design_handoff_openless/*.jsx` pixel-for-pixel; the JSX is reference-only, never imported.
 
+Adjacent docs:
+- `AGENTS.md` is the parallel of this file for **Codex** sessions; the research-before-coding rules at the bottom of this file delegate to it.
+- `README.md` / `README.zh.md` (root) are user-facing install + feature guides; `USAGE.md` covers runtime usage. Update them when shipping user-visible features, not for internal refactors.
+
 ## Build, Run, Test
 
 ### Tauri (current — start here)
@@ -53,27 +57,48 @@ There is no test runner wired in for the frontend. `src/lib/providerSetup.test.t
 
 ## Architecture
 
-`coordinator::Coordinator` is the **single owner of session state**. Hotkey edges drive a small phase enum (`Idle → Starting → Listening → Processing`); recorder, ASR, polish, insertion, and history are wired here and nowhere else. Library/module code never calls across modules — they each depend only on shared types.
+`coordinator::Coordinator` is the **single owner of all session state** — both the dictation phase machine (`Idle → Starting → Listening → Processing → Inserting → Done`) **and** the parallel QA phase machine (`Idle → Recording → Processing`). Hotkey edges drive both. Recorder, ASR, polish, insertion, selection capture, and history are wired here and nowhere else. Leaf modules never call across each other — they each depend only on `types.rs`.
+
+The coordinator was split into a module: `coordinator.rs` is the public entry; `coordinator/{dictation,qa,resources}.rs` carry per-pipeline logic; `coordinator_state.rs` is the pure (no Tauri / audio / clipboard) state-transition layer that makes phase decisions unit-testable.
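A pure transition layer of the kind described above might look like the following sketch. Only the phase names come from the diff; the `Event` enum, its variants, and `next_phase` are hypothetical stand-ins, not the contents of `coordinator_state.rs`.

```rust
/// Dictation phases as listed in the Architecture section.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Idle,
    Starting,
    Listening,
    Processing,
    Inserting,
    Done,
}

/// Hypothetical inputs to the phase machine (names invented for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    HotkeyEdge,
    RecorderReady,
    TextReady,
    InsertFinished,
    Reset,
}

/// Pure transition function: no I/O, so it can be unit-tested in isolation.
/// Any (phase, event) pair not listed leaves the phase unchanged.
pub fn next_phase(phase: Phase, event: Event) -> Phase {
    use Event::*;
    use Phase::*;
    match (phase, event) {
        (Idle, HotkeyEdge) => Starting,
        (Starting, RecorderReady) => Listening,
        (Listening, HotkeyEdge) => Processing,
        (Processing, TextReady) => Inserting,
        (Inserting, InsertFinished) => Done,
        (Done, Reset) => Idle,
        (other, _) => other,
    }
}
```

Because the function takes and returns plain values, a test can replay a whole session as a fold over events without touching Tauri, audio, or the clipboard.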
windows_ime_{ipc,profile,protocol,session}.rs Windows IME-side text injection over IPC (parallel insertion path; activates OpenLess TSF profile and submits text via named pipe)
+selection.rs Cross-platform selection capture for QA: macOS AX → Cmd/Ctrl+C simulate-copy → Linux PRIMARY (best-effort)
src/components/Capsule.tsx Capsule view + state enum
-src/ (React) Main window UI: Overview / History / Vocab / Style / Settings
-src/i18n/ react-i18next init + zh-CN / en resources
-src/pages/_atoms.tsx Recoil atoms — global frontend state
-src/state/HotkeySettingsContext.tsx HotkeySettings React context (capability + binding from backend)
+src/components/Capsule.tsx Capsule view + state enum
+src/ (React) Main window UI: Overview / History / Vocab / Style / Settings
+src/i18n/ react-i18next init + zh-CN / en resources (zh-CN is source of truth)
+src/pages/_atoms.tsx Recoil atoms — global frontend state
+src/state/HotkeySettingsContext.tsx HotkeySettings React context (capability + binding from backend)
 ```
 
 ### Dictation pipeline
@@ -89,6 +114,32 @@ Invariants:
 - **`BufferingAudioConsumer`** queues PCM until the WebSocket is ready, then drains. Recorder always pushes to it; ASR is attached after `openSession` resolves.
 - **Hotkey is toggle-only**, not press-and-hold. The monitor yields one edge per modifier-key keydown; the coordinator interprets odd/even.
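The odd/even interpretation of hotkey edges can be sketched as a small counter. `ToggleCounter` and `ToggleAction` are hypothetical names; the real coordinator may derive the same decision from its phase rather than from a count.

```rust
/// What a single hotkey edge means under toggle semantics.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToggleAction {
    Start,
    Stop,
}

/// Counts keydown edges; odd edges start a session, even edges stop it.
#[derive(Debug, Default)]
pub struct ToggleCounter {
    edges: u64,
}

impl ToggleCounter {
    /// Records one edge from the hotkey monitor and returns its meaning.
    pub fn on_edge(&mut self) -> ToggleAction {
        self.edges += 1;
        if self.edges % 2 == 1 {
            ToggleAction::Start
        } else {
            ToggleAction::Stop
        }
    }
}
```

No key-up event is ever consulted, which is what distinguishes toggle from press-and-hold.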
 
+### Q&A pipeline (selection-based ask-the-LLM)
+
+Parallel state machine, lives in `coordinator/qa.rs` + `qa_hotkey.rs` + `selection.rs`. Default trigger: `Cmd+Shift+;` (macOS) / `Ctrl+Shift+;` (Win/Linux).
+
+```
+QA hotkey edge → toggle panel: open  → capture front_app, clear messages, show QA window
+                               close → cancel session, hide window, sweep capsule
+Option/dictation edge → routed by panel_visible flag (see below):
+    while panel_visible & dictation Idle → handle_qa_option_edge:
+        QaPhase::Recording  → end_qa_session: Recorder.stop → ASR final → LLM (with selection as context) → emit qa:state
+        QaPhase::Processing → ignored (LLM in flight)
+    otherwise handle_pressed (normal dictation)
+```
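The routing rule in the diagram above reduces to a two-input decision, sketched here. `EdgeTarget` and `route_dictation_edge` are hypothetical names. The sketch follows the diagram literally and routes to QA only when dictation is exactly `Idle`; how the real code treats the `Done` phase is not stated, so that case is an assumption.

```rust
/// Dictation phases named in the Architecture section.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DictationPhase {
    Idle,
    Starting,
    Listening,
    Processing,
    Inserting,
    Done,
}

/// Which pipeline receives a dictation-hotkey edge (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EdgeTarget {
    /// Corresponds to handle_qa_option_edge in the diagram.
    Qa,
    /// Corresponds to handle_pressed (normal dictation).
    Dictation,
}

/// Routes an edge to QA only when the panel is visible and dictation is Idle.
pub fn route_dictation_edge(panel_visible: bool, dictation: DictationPhase) -> EdgeTarget {
    if panel_visible && dictation == DictationPhase::Idle {
        EdgeTarget::Qa
    } else {
        EdgeTarget::Dictation
    }
}
```

Keeping the decision in one pure function makes the mic-contention rule easy to test: an in-flight dictation session always keeps the edge, whatever the panel state.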
+
+Invariants & gotchas:
+- **Hotkey routing.** When the QA panel is visible, the dictation hotkey edge routes to QA — *unless* a dictation session is already mid-flight (`Starting/Listening/Processing/Inserting`), in which case the edge stays with dictation. Otherwise QA's `begin_qa_session` would race for the same mic device (cpal rejects the second `build_input_stream` on macOS/Win, PipeWire opens two streams on Linux — neither is recoverable from the QA panel UI). See audit 3.3.1 in `coordinator/dictation.rs`.
+- **Capsule sweep on panel open.** Open emits a fresh `CapsuleState::Idle` *only if* dictation is Idle. If dictation is Recording/Polishing/Inserting/Done, the sweep is suppressed so the user's in-flight feedback isn't wiped. See audit 3.3.4.
+- **Selection capture is a 3-tier fallback** (`selection.rs`): (1) macOS AX `kAXSelectedTextAttribute` direct read, no clipboard touched; (2) macOS/Windows simulate Cmd/Ctrl+C → snapshot + restore original clipboard, 80 ms read window; (3) Linux PRIMARY via `wl-paste` / `xclip` / `xsel`, best-effort. Returns `None` when the user genuinely selected nothing.
+- **Selection truncation.** Hard cap 4000 chars; over → keep first 2000 + `[…truncated…]` + last 2000. Don't raise this without checking LLM context budgeting — Gemini and Ark have different limits.
+- **Multi-turn memory.** `QaSessionState.messages` accumulates `user→assistant` pairs across turns within a single panel session; closing the panel clears them.
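The selection-truncation rule in the list above can be sketched as follows. The constants come from the text; counting "chars" as Unicode scalar values and joining the three parts with no extra whitespace are assumptions, and `truncate_selection` is a hypothetical name.

```rust
const MAX_CHARS: usize = 4000;
const KEEP_EACH_SIDE: usize = 2000;
const MARKER: &str = "[…truncated…]";

/// Keeps short selections intact; for long ones keeps the first and last
/// 2000 characters with a marker in between.
pub fn truncate_selection(text: &str) -> String {
    // Count characters, not bytes, so multi-byte text is never split mid-char.
    let total = text.chars().count();
    if total <= MAX_CHARS {
        return text.to_string();
    }
    let head: String = text.chars().take(KEEP_EACH_SIDE).collect();
    let tail: String = text.chars().skip(total - KEEP_EACH_SIDE).collect();
    format!("{}{}{}", head, MARKER, tail)
}
```

Iterating over `chars()` rather than slicing by byte index avoids panics on non-ASCII selections, which matters for an app whose primary locale is zh-CN.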
+
+### Insertion paths
+
+`insertion.rs` is the cross-platform default. On Windows there is a **second insertion path** in `windows_ime_{ipc,profile,protocol,session}.rs` that activates a TSF profile (CLSID + GUID baked in `windows_ime_profile.rs`) and submits text over a named-pipe IPC. The coordinator picks one based on user preference / fallback status; both routes return the same `InsertStatus` (`Inserted` / `CopiedFallback`). When changing insertion behavior, decide which path you're touching — they don't share code.
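The path selection described above might be sketched like this. Only `InsertStatus` and its two variants come from the diff; `PathChoice`, `choose_path`, and the three boolean inputs are hypothetical, since the actual preference and fallback signals are not shown.

```rust
/// Result shared by both insertion routes, as named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InsertStatus {
    Inserted,
    CopiedFallback,
}

/// Which insertion route to use (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PathChoice {
    /// The cross-platform default in `insertion.rs`.
    CrossPlatform,
    /// The Windows IME route over named-pipe IPC.
    WindowsIme,
}

/// Picks the IME route only when all three assumed conditions hold;
/// anything else falls back to the cross-platform default.
pub fn choose_path(on_windows: bool, user_prefers_ime: bool, ime_healthy: bool) -> PathChoice {
    if on_windows && user_prefers_ime && ime_healthy {
        PathChoice::WindowsIme
    } else {
        PathChoice::CrossPlatform
    }
}
```

Since both routes report the same `InsertStatus`, callers downstream of the choice do not need to know which path ran.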
+
 ### Permissions, credentials, on-disk state
 
 - **Bundle ID `com.openless.app`** is hard-coded in `openless-all/app/src-tauri/tauri.conf.json` and `CredentialsVault.serviceName`. Changing it breaks system credential vault lookups *and* every existing TCC grant.
@@ -161,3 +212,55 @@ If any step fails, do not announce the release; investigate `release-tauri.yml`
 2. Register it in `lib.rs` (`mod <name>;`).
 3. Wire it into `coordinator.rs` and expose any frontend-callable surface via `commands.rs` + `invoke_handler!`.
 4. Add the matching TS wrapper in `openless-all/app/src/lib/ipc.ts` (with a mock branch for browser dev).
+
+## Research before coding: dispatch a sub-agent to look up API / library / platform docs
+
+**The full rules are in the [AGENTS.md `Third-party service integrations & library / platform API research`](AGENTS.md) section (lines 171-191).** Listed here are the concrete tool mappings that Claude Code needs once it starts work.