Commit ea1b18e

author
baiqing
committed
chore(release): merge beta for Stable 1.3.0
2 parents d557e66 + 8ea4e89 commit ea1b18e

52 files changed

Lines changed: 7831 additions & 568 deletions

.github/workflows/release-tauri.yml

Lines changed: 4 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -300,8 +300,10 @@ jobs:
300300
throw "Required WiX object missing: $p — tauri build aborted before candle ran. Check the Build (Windows) step log."
301301
}
302302
}
303-
$light = (Get-ChildItem "$env:LOCALAPPDATA\tauri\WixTools314\light.exe" -ErrorAction SilentlyContinue | Select-Object -First 1).FullName
304-
if (-not $light) { throw "WiX light.exe not found in $env:LOCALAPPDATA\tauri\WixTools314" }
303+
$light = Get-ChildItem "$env:LOCALAPPDATA\tauri\WixTools*\light.exe" -ErrorAction SilentlyContinue |
304+
Sort-Object FullName |
305+
Select-Object -Last 1 -ExpandProperty FullName
306+
if (-not $light) { throw "WiX light.exe not found under $env:LOCALAPPDATA\tauri\WixTools*" }
305307
$version = (Get-Content src-tauri\tauri.conf.json -Raw | ConvertFrom-Json).version
306308
$bundleDir = Join-Path $appRoot 'src-tauri\target\release\bundle\msi'
307309
New-Item -ItemType Directory -Force -Path $bundleDir | Out-Null

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -25,6 +25,8 @@ promo-openless/
2525
promo-openless-v2/
2626
docs/old-promo/
2727
.worktrees/
28+
# Promo screen recordings / root directory for raw assets (GB-scale binaries; never commit to the repo)
29+
video-materials/
2830

2931
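# Derived artifacts (catch-all): the project has previously missed entries such as promo-openless-v2/node_modules,
3032
# so match globally here to keep build artifacts out of PRs when a subdirectory lacks its own .gitignore rules.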

AGENTS.md

Lines changed: 24 additions & 0 deletions
@@ -167,3 +167,27 @@ Four landmines have been fixed in the Windows release chain; each fix is a non-mergeable
167167
2. Register it in `lib.rs` (`mod <name>;`).
168168
3. Wire it into `coordinator.rs` and expose any frontend-callable surface via `commands.rs` + `invoke_handler!`.
169169
4. Add the matching TS wrapper in `openless-all/app/src/lib/ipc.ts` (with a mock branch for browser dev).
170+
171+
### Third-party service integrations & library / platform API research
172+
173+
When implementing features that depend on **anything outside this repo** — external HTTP APIs (ASR providers, polish endpoints, GitHub API), unfamiliar crates / npm packages, platform APIs (Apple Security framework, Win32, CoreFoundation), or any SDK whose surface shifts faster than your training cut-off — do not write integration code from memory. API surfaces drift; model training data is stale by definition. The same workflow below applies whether you are calling an HTTP endpoint, learning a new Rust crate, or wiring a system framework — substitute "endpoint URL" / "function signature" / "feature flag" as appropriate.
174+
175+
Follow this research-first workflow:
176+
177+
1. **Analyze before coding.** Identify every external call this feature needs: endpoint URL, HTTP method, authentication mechanism, request body schema, expected response schema, and error codes.
178+
2. **Delegate web search to a sub-agent.** Spawn a read-only sub-agent whose sole job is to search for official documentation. The sub-agent runs in parallel — you continue other work instead of blocking on sequential web pages.
179+
3. **Filter sub-agent results.** When the sub-agent returns, extract only the information directly relevant to the current implementation. Discard marketing pages, unrelated API versions, or tangential tutorials.
180+
4. **Cross-verify one key finding.** Before writing code, validate at least one structural claim (endpoint URL, required header, auth format) with a direct `web_search` or `fetch_url` call. Sub-agents can hallucinate.
181+
5. **Implement from verified documentation.** Only write integration code after the above steps. Never guess.
182+
183+
**Sub-agent search brief:**
184+
- Focus each sub-agent on a single external service or protocol — one service, one sub-agent.
185+
- Prioritize official documentation domains (e.g., `docs.volcengine.com`, `platform.openai.com/docs`), falling back to the project's GitHub README.
186+
- The sub-agent must return **structured** findings: endpoint URL, HTTP method, required headers, request body JSON Schema, response body JSON Schema, and error code meanings.
187+
- If the documentation covers multiple API versions, the sub-agent must note which version was referenced.
188+
189+
**Anti-patterns (do not do these):**
190+
- ✗ Writing API integration code from memory without a documentation search.
191+
- ✗ Pasting entire web pages into the main agent context — the sub-agent does the filtering.
192+
- ✗ Mixing field names or endpoint paths from different API versions.
193+
- ✗ Skipping error handling — every external call must degrade gracefully when the service is unavailable.
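
The structured findings listed in the sub-agent search brief could be captured in a small record type so that gaps are caught before any integration code is written. The following is only a sketch: `ApiFinding`, its field names, and `missing_fields` are hypothetical and do not exist in this repo.

```rust
/// Hypothetical record for one sub-agent finding (names are illustrative).
#[derive(Debug, Clone, Default)]
pub struct ApiFinding {
    pub endpoint_url: String,
    pub http_method: String,
    /// May legitimately be empty when the endpoint needs no special headers.
    pub required_headers: Vec<String>,
    /// JSON Schema text for the request body.
    pub request_schema: String,
    /// JSON Schema text for the response body.
    pub response_schema: String,
    /// (code, meaning) pairs.
    pub error_codes: Vec<(String, String)>,
    /// Which API version the documentation referred to, if it covers several.
    pub api_version: Option<String>,
}

impl ApiFinding {
    /// Returns the names of the required text fields that were left blank.
    pub fn missing_fields(&self) -> Vec<&'static str> {
        let checks = [
            ("endpoint_url", &self.endpoint_url),
            ("http_method", &self.http_method),
            ("request_schema", &self.request_schema),
            ("response_schema", &self.response_schema),
        ];
        let mut missing = Vec::new();
        for (name, value) in checks {
            if value.trim().is_empty() {
                missing.push(name);
            }
        }
        missing
    }
}
```

A caller could treat a non-empty `missing_fields()` result as a reason to re-run the search instead of filling the gap from memory.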

CLAUDE.md

Lines changed: 120 additions & 17 deletions
@@ -10,6 +10,10 @@ The active codebase lives at `openless-all/app/` and is **Tauri 2 + Rust backend
1010

1111
UI must match `openless-all/design_handoff_openless/*.jsx` pixel-for-pixel; the JSX is reference-only, never imported.
1212

13+
Adjacent docs:
14+
- `AGENTS.md` is the parallel of this file for **Codex** sessions; the research-before-coding rules at the bottom of this file delegate to it.
15+
- `README.md` / `README.zh.md` (root) are user-facing install + feature guides; `USAGE.md` covers runtime usage. Update them when shipping user-visible features, not for internal refactors.
16+
1317
## Build, Run, Test
1418

1519
### Tauri (current — start here)
@@ -53,27 +57,48 @@ There is no test runner wired in for the frontend. `src/lib/providerSetup.test.t
5357

5458
## Architecture
5559

56-
`coordinator::Coordinator` is the **single owner of session state**. Hotkey edges drive a small phase enum (`Idle → Starting → Listening → Processing`); recorder, ASR, polish, insertion, and history are wired here and nowhere else. Library/module code never calls across modules — they each depend only on shared types.
60+
`coordinator::Coordinator` is the **single owner of all session state** — both the dictation phase machine (`Idle → Starting → Listening → Processing → Inserting → Done`) **and** the parallel QA phase machine (`Idle → Recording → Processing`). Hotkey edges drive both. Recorder, ASR, polish, insertion, selection capture, and history are wired here and nowhere else. Leaf modules never call across each other — they each depend only on `types.rs`.
61+
62+
The coordinator was split into a module: `coordinator.rs` is the public entry; `coordinator/{dictation,qa,resources}.rs` carry per-pipeline logic; `coordinator_state.rs` is the pure (no Tauri / audio / clipboard) state-transition layer that makes phase decisions unit-testable.
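
To illustrate why a pure transition layer is easy to unit-test, here is a minimal sketch of the dictation phase machine as a plain function. The phase names come from the text above; the `DictationEvent` variants and the exact transition table are assumptions made for illustration and may not match `coordinator_state.rs`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DictationPhase {
    Idle,
    Starting,
    Listening,
    Processing,
    Inserting,
    Done,
}

/// Hypothetical events; the real module may model these differently.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DictationEvent {
    HotkeyEdge,
    AsrReady,
    TranscriptReady,
    InsertFinished,
    Reset,
}

impl DictationPhase {
    /// Pure transition: no Tauri, audio, or clipboard access.
    /// Any (phase, event) pair not listed leaves the phase unchanged.
    pub fn next(self, event: DictationEvent) -> Self {
        match (self, event) {
            (Self::Idle, DictationEvent::HotkeyEdge) => Self::Starting,
            (Self::Starting, DictationEvent::AsrReady) => Self::Listening,
            (Self::Listening, DictationEvent::HotkeyEdge) => Self::Processing,
            (Self::Processing, DictationEvent::TranscriptReady) => Self::Inserting,
            (Self::Inserting, DictationEvent::InsertFinished) => Self::Done,
            (Self::Done, DictationEvent::Reset) => Self::Idle,
            (phase, _) => phase,
        }
    }
}
```

Because `next` takes and returns plain values, a test can replay a sequence of events and assert on the resulting phase without starting the app.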
5763

5864
```
59-
Rust (openless-all/app/src-tauri/src) Purpose
60-
────────────────────────────────────── ────────────────────────────────
61-
types.rs Pure value types: DictationSession, PolishMode, HotkeyBinding, errors
62-
hotkey.rs Global hotkey monitor (modifier-key edges)
63-
recorder.rs Mic → 16 kHz mono Int16 PCM, RMS callback
64-
asr/{mod,frame,volcengine,whisper}.rs ASR providers: Volcengine streaming WebSocket + Whisper HTTP
65-
polish.rs OpenAI-compatible chat completions (Ark / DeepSeek / etc.)
66-
insertion.rs AX focused-element write → clipboard + Cmd+V → copy-only fallback
67-
persistence.rs History/preferences/vocab JSON + platform credential vault
68-
coordinator.rs + commands.rs + lib.rs State machine, IPC surface, tray icon, window plumbing
69-
permissions.rs TCC checks (Accessibility / Microphone)
65+
Rust (openless-all/app/src-tauri/src) Purpose
66+
────────────────────────────────────────── ────────────────────────────────
67+
types.rs Pure value types: sessions, PolishMode, HotkeyBinding, errors, QaChatMessage
68+
coordinator.rs Public entry; owns Inner, hotkey wiring, capsule emits
69+
coordinator/{dictation,qa,resources}.rs Dictation pipeline / QA pipeline / shared helpers (begin/end/cancel)
70+
coordinator_state.rs Pure state transitions — Tauri-free, unit-testable
71+
commands.rs + lib.rs + main.rs IPC surface (`invoke_handler!`), tray icon, window plumbing, entry
72+
permissions.rs TCC checks (Accessibility / Microphone / AppleEvents)
73+
74+
— Hotkeys (three parallel monitors) —
75+
hotkey.rs Modifier-only hotkey via native CGEventTap (macOS) / rdev (Win/Linux)
76+
combo_hotkey.rs Custom-combo dictation hotkey (when user picks combo over modifier-only)
77+
qa_hotkey.rs QA toggle hotkey (default Cmd/Ctrl+Shift+;) via `global-hotkey` crate
78+
global_hotkey_runtime.rs Shared `global-hotkey` Carbon/Win event runtime (combo + QA share it)
79+
shortcut_binding.rs Shared parse/validate of user-configurable bindings
80+
81+
— Audio / ASR / LLM —
82+
recorder.rs Mic → 16 kHz mono Int16 PCM, RMS callback
83+
audio_mute.rs System-output mute guard while recording (RAII)
84+
asr/{mod,frame,volcengine,whisper}.rs + asr/local/* ASR providers: Volcengine streaming WS, Whisper HTTP, Bailian, local Foundry
85+
polish.rs OpenAI-compatible chat completions (Ark / DeepSeek / Codex OAuth reuse)
86+
llm_gemini.rs Native Google Gemini client — NOT OpenAI-compatible (separate auth, thinkingConfig, role:model)
87+
correction.rs User-defined correction rules (separate from vocab dictionary)
88+
89+
— Insertion (two paths) —
90+
insertion.rs AX focused-element write → clipboard + paste shortcut → copy-only fallback
91+
windows_ime_{ipc,profile,protocol,session}.rs Windows IME-side text injection over IPC (parallel insertion path; activates OpenLess TSF profile and submits text via named pipe)
92+
selection.rs Cross-platform selection capture for QA: macOS AX → Cmd/Ctrl+C simulate-copy → Linux PRIMARY (best-effort)
93+
94+
persistence.rs history.json / preferences.json / dictionary.json + platform credential vault
7095
7196
Frontend (openless-all/app/src)
72-
src/components/Capsule.tsx Capsule view + state enum
73-
src/ (React) Main window UI: Overview / History / Vocab / Style / Settings
74-
src/i18n/ react-i18next init + zh-CN / en resources
75-
src/pages/_atoms.tsx Recoil atoms — global frontend state
76-
src/state/HotkeySettingsContext.tsx HotkeySettings React context (capability + binding from backend)
97+
src/components/Capsule.tsx Capsule view + state enum
98+
src/ (React) Main window UI: Overview / History / Vocab / Style / Settings
99+
src/i18n/ react-i18next init + zh-CN / en resources (zh-CN is source of truth)
100+
src/pages/_atoms.tsx Recoil atoms — global frontend state
101+
src/state/HotkeySettingsContext.tsx HotkeySettings React context (capability + binding from backend)
77102
```
78103

79104
### Dictation pipeline
@@ -89,6 +114,32 @@ Invariants:
89114
- **`BufferingAudioConsumer`** queues PCM until the WebSocket is ready, then drains. Recorder always pushes to it; ASR is attached after `openSession` resolves.
90115
- **Hotkey is toggle-only**, not press-and-hold. The monitor yields one edge per modifier-key keydown; the coordinator interprets odd/even.
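
The buffering invariant above (queue PCM until the socket is ready, then drain in order) can be sketched as follows. This is a single-threaded illustration only: `PcmSink`, `BufferingConsumer`, and `CollectingSink` are hypothetical names, and the real `BufferingAudioConsumer` presumably has to handle cross-thread access, which this sketch ignores.

```rust
use std::collections::VecDeque;

/// Hypothetical destination for PCM frames (stands in for the ASR session).
pub trait PcmSink {
    fn send(&mut self, frame: &[i16]);
}

pub struct BufferingConsumer<S: PcmSink> {
    pending: VecDeque<Vec<i16>>,
    sink: Option<S>,
}

impl<S: PcmSink> BufferingConsumer<S> {
    pub fn new() -> Self {
        Self { pending: VecDeque::new(), sink: None }
    }

    /// Recorder side: always accepts audio, whether or not a sink is attached.
    pub fn push(&mut self, frame: &[i16]) {
        match self.sink.as_mut() {
            Some(sink) => sink.send(frame),
            None => self.pending.push_back(frame.to_vec()),
        }
    }

    /// ASR side: drain everything queued so far in arrival order, then go live.
    pub fn attach(&mut self, mut sink: S) {
        while let Some(frame) = self.pending.pop_front() {
            sink.send(&frame);
        }
        self.sink = Some(sink);
    }

    pub fn pending_frames(&self) -> usize {
        self.pending.len()
    }

    pub fn sink(&self) -> Option<&S> {
        self.sink.as_ref()
    }
}

/// Sink that records every sample it receives; handy for tests.
#[derive(Default)]
pub struct CollectingSink {
    pub samples: Vec<i16>,
}

impl PcmSink for CollectingSink {
    fn send(&mut self, frame: &[i16]) {
        self.samples.extend_from_slice(frame);
    }
}
```

The point of the design is that the recorder never needs to know whether the ASR session has opened yet; ordering is preserved because queued frames are drained before any live frame is forwarded.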
91116

117+
### Q&A pipeline (selection-based ask-the-LLM)
118+
119+
Parallel state machine, lives in `coordinator/qa.rs` + `qa_hotkey.rs` + `selection.rs`. Default trigger: `Cmd+Shift+;` (macOS) / `Ctrl+Shift+;` (Win/Linux).
120+
121+
```
122+
QA hotkey edge → toggle panel: open → capture front_app, clear messages, show QA window
123+
close → cancel session, hide window, sweep capsule
124+
Option/dictation edge → routed by panel_visible flag (see below):
125+
while panel_visible & dictation Idle → handle_qa_option_edge:
126+
QaPhase::Idle → begin_qa_session: capture_selection() → Recorder.start → ASR.openSession
127+
QaPhase::Recording → end_qa_session: Recorder.stop → ASR final → LLM (with selection as context) → emit qa:state
128+
QaPhase::Processing→ ignored (LLM in flight)
129+
otherwise handle_pressed (normal dictation)
130+
```
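
The routing table above can be read as a pure decision function. A minimal sketch, assuming hypothetical names `EdgeAction` and `route_dictation_edge` (the real logic lives in `coordinator/qa.rs` and may be structured differently):

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DictationPhase {
    Idle,
    Starting,
    Listening,
    Processing,
    Inserting,
    Done,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum QaPhase {
    Idle,
    Recording,
    Processing,
}

/// Hypothetical result type naming the four outcomes in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EdgeAction {
    BeginQaSession,
    EndQaSession,
    Ignore,
    HandleDictation,
}

/// Decide who receives an Option/dictation hotkey edge.
/// QA only gets the edge while its panel is visible AND dictation is Idle;
/// every other combination stays with normal dictation.
pub fn route_dictation_edge(
    panel_visible: bool,
    dictation: DictationPhase,
    qa: QaPhase,
) -> EdgeAction {
    if panel_visible && dictation == DictationPhase::Idle {
        match qa {
            QaPhase::Idle => EdgeAction::BeginQaSession,
            QaPhase::Recording => EdgeAction::EndQaSession,
            // LLM request in flight: drop the edge.
            QaPhase::Processing => EdgeAction::Ignore,
        }
    } else {
        EdgeAction::HandleDictation
    }
}
```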
131+
132+
Invariants & gotchas:
133+
- **Hotkey routing.** When the QA panel is visible, the dictation hotkey edge routes to QA — *unless* a dictation session is already mid-flight (`Starting/Listening/Processing/Inserting`), in which case the edge stays with dictation. Otherwise QA's `begin_qa_session` would race for the same mic device (cpal rejects the second `build_input_stream` on macOS/Win, PipeWire opens two streams on Linux — neither is recoverable from the QA panel UI). See audit 3.3.1 in `coordinator/dictation.rs`.
134+
- **Capsule sweep on panel open.** Open emits a fresh `CapsuleState::Idle` *only if* dictation is Idle. If dictation is Recording/Polishing/Inserting/Done, the sweep is suppressed so the user's in-flight feedback isn't wiped. See audit 3.3.4.
135+
- **Selection capture is a 3-tier fallback** (`selection.rs`): (1) macOS AX `kAXSelectedTextAttribute` direct read, no clipboard touched; (2) macOS/Windows simulate Cmd/Ctrl+C → snapshot + restore original clipboard, 80 ms read window; (3) Linux PRIMARY via `wl-paste` / `xclip` / `xsel`, best-effort. Returns `None` when the user genuinely selected nothing.
136+
- **Selection truncation.** Hard cap 4000 chars; over → keep first 2000 + `[…truncated…]` + last 2000. Don't raise this without checking LLM context budgeting — Gemini and Ark have different limits.
137+
- **Multi-turn memory.** `QaSessionState.messages` accumulates `user→assistant` pairs across turns within a single panel session; closing the panel clears them.
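
The selection truncation rule in the list above can be sketched like this. It assumes "chars" means Unicode scalar values (Rust `char`) and that the marker is inserted with no surrounding whitespace; `truncate_selection` and the constant names are hypothetical, and `selection.rs` may count or format differently.

```rust
const MAX_SELECTION_CHARS: usize = 4000;
const KEEP_EACH_SIDE: usize = 2000;
const TRUNCATION_MARKER: &str = "[…truncated…]";

/// Keep selections up to the cap unchanged; otherwise keep the first and
/// last KEEP_EACH_SIDE characters with a marker in between.
pub fn truncate_selection(text: &str) -> String {
    let total = text.chars().count();
    if total <= MAX_SELECTION_CHARS {
        return text.to_string();
    }
    // Iterate by char so multi-byte text is never split mid-codepoint.
    let head: String = text.chars().take(KEEP_EACH_SIDE).collect();
    let tail: String = text.chars().skip(total - KEEP_EACH_SIDE).collect();
    format!("{}{}{}", head, TRUNCATION_MARKER, tail)
}
```

Counting by `char` instead of slicing by byte offset avoids panics on non-ASCII selections such as Chinese text.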
138+
139+
### Insertion paths
140+
141+
`insertion.rs` is the cross-platform default. On Windows there is a **second insertion path** in `windows_ime_{ipc,profile,protocol,session}.rs` that activates a TSF profile (CLSID + GUID baked in `windows_ime_profile.rs`) and submits text over a named-pipe IPC. The coordinator picks one based on user preference / fallback status; both routes return the same `InsertStatus` (`Inserted` / `CopiedFallback`). When changing insertion behavior, decide which path you're touching — they don't share code.
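
As a rough illustration of "picks one based on user preference / fallback status", the choice between the two paths might look like the sketch below. The `InsertStatus` variants come from the text; `InsertPath`, `choose_path`, and its three boolean inputs are assumptions, not the coordinator's real API.

```rust
/// Shared result type: both insertion routes report one of these.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InsertStatus {
    Inserted,
    CopiedFallback,
}

/// Hypothetical label for the two insertion routes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InsertPath {
    /// `insertion.rs`: AX write, then clipboard + paste, then copy-only.
    Default,
    /// `windows_ime_*.rs`: TSF profile + named-pipe IPC.
    WindowsIme,
}

/// Use the IME route only on Windows, only when the user prefers it,
/// and only while it is considered healthy; otherwise use the default.
pub fn choose_path(is_windows: bool, prefers_ime: bool, ime_healthy: bool) -> InsertPath {
    if is_windows && prefers_ime && ime_healthy {
        InsertPath::WindowsIme
    } else {
        InsertPath::Default
    }
}
```

Keeping the decision separate from the two implementations matches the note that the paths do not share code: only the returned `InsertStatus` is common.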
142+
92143
### Permissions, credentials, on-disk state
93144

94145
- **Bundle ID `com.openless.app`** is hard-coded in `openless-all/app/src-tauri/tauri.conf.json` and `CredentialsVault.serviceName`. Changing it breaks system credential vault lookups *and* every existing TCC grant.
@@ -161,3 +212,55 @@ If any step fails, do not announce the release; investigate `release-tauri.yml`
161212
2. Register it in `lib.rs` (`mod <name>;`).
162213
3. Wire it into `coordinator.rs` and expose any frontend-callable surface via `commands.rs` + `invoke_handler!`.
163214
4. Add the matching TS wrapper in `openless-all/app/src/lib/ipc.ts` (with a mock branch for browser dev).
215+
216+
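## Research before coding: dispatch a sub-agent to look up API / library / platform docs
217+
218+
**The full rules live in the [AGENTS.md `Third-party service integrations & library / platform API research`](AGENTS.md) section (lines 171-193).** What follows is the concrete tool mapping a Claude Code session needs once it starts work.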
219+
220+
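### Triggers: if any of these applies, dispatch a sub-agent to research first, then write code
221+
222+
- Third-party HTTP APIs (ASR vendors / LLM endpoints / GitHub API / Tauri plugin services, etc.)
223+
- Unfamiliar Rust crates / npm packages: when you are unsure even of the signatures and feature flags
224+
- Platform APIs: Apple Security framework / CoreFoundation / Win32 / Carbon / AppKit
225+
- What the version pinned in the repo's lock files actually supports: training memory and the real versions in `Cargo.lock` / `package-lock.json` may disagree
226+
- Any interface that has changed since the training cutoff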
227+
228+
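### When a sub-agent is not needed
229+
230+
- The repo already contains a working call → find a reference with `rg` / `grep` (the repo is the documentation)
231+
- General programming / algorithms / language features you can derive yourself
232+
- Surgical single-file changes where the API at the change site already has usage examples
233+
- Looking up modules that already exist in this repo (`types.rs` / `coordinator.rs`, etc.): just Read them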
234+
235+
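### Tool priority
236+
237+
```text
238+
1. Context7 MCP (highest priority: broad coverage of mainstream libraries, version-aware)
239+
- mcp__context7__resolve-library-id → get the library id
240+
- mcp__context7__query-docs → official documentation snippets for the current version
241+
242+
2. documentation-lookup skill
243+
/skill documentation-lookup: wraps Context7, with routing + caching.
244+
245+
3. Agent sub-agent (subagent_type=general-purpose)
246+
Use when: Context7 has no coverage (niche crate / new SDK / non-English docs),
247+
or several sources need cross-checking (official docs + GitHub README + Stack Overflow).
248+
The sub-agent combines WebFetch / WebSearch / Context7 and returns a structured result of 200-400 words.
249+
250+
4. Single-page fallback: WebFetch one documentation page directly (when only the most authoritative page needs reading)
251+
```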
252+
253+
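### Required fields in the sub-agent prompt
254+
255+
1. **Target question**: state the concrete technical problem in one sentence (no empty targets like "learn about X")
256+
2. **Current repo state**: the currently locked version (check `Cargo.lock` / `package-lock.json`) + existing call sites as `file:line` (if any)
257+
3. **Required return structure**: function/endpoint signature → minimal runnable example (≤20 lines) → **version compatibility range** (the key check against training memory) → known pitfalls / platform differences / deprecation plans
258+
4. **Prohibitions**: do not modify this repo's code; do not paste documentation verbatim (distill the key parts so the context does not overflow); dispatch separate agents for independent services, one service per agent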
259+
260+
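### Anti-patterns
261+
262+
- ✗ Writing third-party API calls from training memory and assuming the parameter signatures are right
263+
- ✗ Pasting whole sections of official documentation into the main context
264+
- ✗ Writing code first and checking the docs afterwards
265+
- ✗ Having a single sub-agent research 5 unrelated libraries at once (each library gets its own prompt + its own context)
266+
- ✗ Skipping cross-verification after the sub-agent returns and going straight to code; step 4 in AGENTS.md requires at least one direct `WebFetch` against the official source to check one key fact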
