feat: add core_failure telemetry with PII-safe input signatures (#245)
* feat: add `core_failure` telemetry with PII-safe masking
Add a new `core_failure` event emitted on both soft failures
(`metadata.success === false`) and uncaught tool exceptions, with
privacy-preserving context for debugging:
- `classifyError()` — keyword-based error classification (parse, connection, timeout, validation, permission, internal, unknown)
- `computeInputSignature()` — records key names + value types/lengths, never actual values; truncates by dropping keys to preserve valid JSON
- `maskArgs()` — PII masking aligned to Rust SDK: 19 sensitive keys redacted, string literals in SQL replaced with `?`, recursive object traversal
Telemetry is fully isolated from tool execution — all tracking calls
are wrapped in `try/catch` so telemetry failures never break tools.
`Truncate.output()` runs outside the telemetry error boundary so I/O
errors aren't misattributed as tool failures.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
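The input-signature idea described above (key names plus value types and lengths, truncated by dropping whole keys) could look roughly like the sketch below. It is not the code from this commit: the `describeValue` helper, the `maxLength` default, and the exact label format are assumptions made for illustration.

```typescript
// Hypothetical sketch; the real computeInputSignature() may differ in
// label format and size limit.
function describeValue(value: unknown): string {
  if (value === null) return "null";
  if (typeof value === "string") return `string(${value.length})`;
  if (Array.isArray(value)) return `array(${value.length})`;
  if (typeof value === "object") return `object(${Object.keys(value as object).length})`;
  return typeof value; // number, boolean, undefined, ...
}

function computeInputSignature(args: Record<string, unknown>, maxLength = 1000): string {
  // Record only the shape of each argument, never its value.
  const signature: Record<string, string> = {};
  for (const [key, value] of Object.entries(args)) {
    signature[key] = describeValue(value);
  }
  // Truncate by dropping whole keys (last first) so the result stays valid JSON.
  const keys = Object.keys(signature);
  let json = JSON.stringify(signature);
  while (json.length > maxLength && keys.length > 0) {
    delete signature[keys.pop() as string];
    json = JSON.stringify(signature);
  }
  return json;
}
```

Dropping keys instead of slicing the serialized string means a consumer can always `JSON.parse` the field, at the cost of losing the trailing keys entirely.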
* feat: add `skill_used` telemetry event
Tracks which skill is loaded and where it came from (`builtin`, `global`,
or `project`) with duration. Wrapped in try/catch — cannot break skill
loading. Docs table updated.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
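The isolation guarantee mentioned here and in the `core_failure` change (tracking calls wrapped in `try/catch`) can be illustrated with a small wrapper. This is a sketch under assumptions: `safeTrack`, `loadSkill`, and the `skill_name` / `duration_ms` field names are hypothetical and not taken from the commit.

```typescript
type TelemetryEvent = { type: string; [key: string]: unknown };
type Track = (event: TelemetryEvent) => void;

// Hypothetical wrapper: run a tracking call and swallow any error it raises.
function safeTrack(track: Track, event: TelemetryEvent): void {
  try {
    track(event);
  } catch {
    // Telemetry must never break the tool or skill that triggered it.
  }
}

// Hypothetical stand-in for skill loading, instrumented with skill_used.
function loadSkill(name: string, track: Track): string {
  const startTime = Date.now();
  const content = `skill:${name}`; // placeholder for the real loading work
  safeTrack(track, {
    type: "skill_used",
    skill_name: name,
    source: "builtin",
    duration_ms: Date.now() - startTime,
  });
  return content;
}
```

Even when the tracking function throws, `loadSkill` still returns its result.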
* feat: add `sql_execute_failure` telemetry for SQL execution errors
`core_failure` is for internal tool failures. SQL execution via the
dispatcher is a separate concern — soft errors are returned as results
(not thrown), so `core_failure` never fires for them.
New `sql_execute_failure` event captures: warehouse type, query type,
error message (truncated to 500 chars), and PII-masked SQL. Fires from
the `sql.execute` handler catch path.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: add persistent machine ID from `~/.altimate/machine-id`
Generated once as a random UUID and stored at `~/.altimate/machine-id`
(alongside `altimate.json`, `connections.json`, etc.). Sent as
`machine_id` in `customDimensions` on every App Insights event.
No PII — pure random UUID, never tied to user identity.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
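A generate-once machine ID of the kind described above might be implemented along these lines. This is only a sketch: `getMachineId` and its file-path parameter are hypothetical, and the demo writes to a temporary directory, not `~/.altimate/machine-id`.

```typescript
import { randomUUID } from "node:crypto";
import { existsSync, mkdirSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Hypothetical helper: return the stored ID, or create and persist a new one.
function getMachineId(filePath: string): string {
  if (existsSync(filePath)) {
    const existing = readFileSync(filePath, "utf8").trim();
    if (existing.length > 0) return existing;
  }
  const id = randomUUID(); // pure random UUID, not derived from user or host data
  mkdirSync(dirname(filePath), { recursive: true });
  writeFileSync(filePath, id, "utf8");
  return id;
}

// Demo against a throwaway directory; the second call reuses the stored value.
const demoPath = join(mkdtempSync(join(tmpdir(), "machine-id-")), ".altimate", "machine-id");
const firstId = getMachineId(demoPath);
const secondId = getMachineId(demoPath);
```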
* fix: correct `masked_sql` field and `ERROR_PATTERNS` ordering in telemetry
- `sql_execute_failure`: use `Telemetry.maskString(params.sql)` instead of
`Telemetry.maskArgs({ sql: params.sql })` — the latter serializes a JSON
object string `{"sql":"..."}` rather than the raw masked SQL
- `ERROR_PATTERNS`: move `permission` before `validation` so errors like
"Invalid permission denied" are not misclassified as `validation_error`
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
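The ordering fix above matters because a first-match classifier returns whichever pattern appears earliest in the list. The sketch below illustrates that behaviour; the keyword lists and the short class labels are assumptions, and only the relative order of `permission` and `validation` reflects the commit.

```typescript
type ErrorClass =
  | "parse" | "connection" | "timeout" | "permission" | "validation" | "internal" | "unknown";

// Hypothetical keyword table. First match wins, so `permission` must come
// before `validation` for messages that contain both kinds of keyword.
const ERROR_PATTERNS: Array<{ errorClass: ErrorClass; keywords: string[] }> = [
  { errorClass: "parse", keywords: ["parse", "syntax"] },
  { errorClass: "connection", keywords: ["connection", "econnrefused"] },
  { errorClass: "timeout", keywords: ["timeout", "timed out"] },
  { errorClass: "permission", keywords: ["permission", "denied", "unauthorized"] },
  { errorClass: "validation", keywords: ["invalid", "validation"] },
  { errorClass: "internal", keywords: ["internal"] },
];

function classifyError(message: string): ErrorClass {
  const lower = message.toLowerCase();
  for (const { errorClass, keywords } of ERROR_PATTERNS) {
    if (keywords.some((keyword) => lower.includes(keyword))) return errorClass;
  }
  return "unknown";
}
```

With this order, "Invalid permission denied" is classified as `permission`; swapping the two entries would classify it as `validation`.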
* perf: skip success `tool_call` telemetry for file tools
Read/write/edit/glob/grep/bash succeed constantly in normal operation —
tracking every success is high-volume noise with no actionable signal.
Failures (hard throws and soft failures) are still fully captured via
`tool_call` (status=error) and `core_failure`.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
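The filtering rule described above amounts to a small predicate. The sketch below assumes the tool names are the lowercase identifiers listed in the message; `FILE_TOOLS` and `shouldTrackToolCall` are hypothetical names.

```typescript
// Hypothetical set of high-volume file tools whose successes are not tracked.
const FILE_TOOLS = new Set(["read", "write", "edit", "glob", "grep", "bash"]);

function shouldTrackToolCall(tool: string, status: "success" | "error"): boolean {
  if (status === "error") return true; // failures are always captured
  return !FILE_TOOLS.has(tool); // successes only for non-file tools
}
```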
* docs: clarify `core_failure` event description in telemetry docs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs: simplify `core_failure` description
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: mask error messages before sending to telemetry
Error messages from SQL engines can embed data values (e.g.
"Value 'john@email.com' does not match type INTEGER"). Apply
maskString() to all error_message fields before transmission,
consistent with how args are already masked.
Affects: core_failure (tool.ts), sql_execute_failure (register.ts)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: security hardening for telemetry PII safety
- Mask error messages in `native_call` (dispatcher.ts) and `warehouse_connect` (registry.ts) — these were sending raw error strings that could embed credentials or query fragments
- Fix soft-failure `error_message` fallback: drop `result.output` as a source (raw tool output could contain file contents or secrets); fall back to `"unknown error"` instead
- Strip `_retried` internal flag from App Insights payload — was leaking into `properties` on retried events
- Add camelCase variants to `SENSITIVE_KEYS` (`authToken`, `bearerToken`, `jwtSecret`, etc.) — underscore prefix/suffix matching missed these
- Document `machine_id` in telemetry privacy docs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
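One way to catch both `auth_token` and `authToken` is to normalize a key before comparing it, as in the sketch below. This is a simplification, not the commit's approach (which adds explicit camelCase entries to `SENSITIVE_KEYS`): the key list here is a shortened, assumed subset, and the `"****"` placeholder is also an assumption.

```typescript
// Hypothetical, shortened key list; the real SENSITIVE_KEYS is longer.
const SENSITIVE_KEYS = ["password", "secret", "token", "apikey", "credential"];

function isSensitiveKey(key: string): boolean {
  // Lowercase and strip separators so auth_token, auth-token and authToken all match.
  const normalized = key.toLowerCase().replace(/[_-]/g, "");
  return SENSITIVE_KEYS.some((sensitive) => normalized.includes(sensitive));
}

// Recursively copy a value, redacting anything stored under a sensitive key.
function maskValue(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(maskValue);
  if (value !== null && typeof value === "object") {
    const masked: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      masked[key] = isSensitiveKey(key) ? "****" : maskValue(inner);
    }
    return masked;
  }
  return value;
}
```

Substring matching is deliberately broad: it also redacts harmless keys such as `token_count`, which trades some debugging detail for safety.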
* fix: address major review findings in telemetry PII masking
- Extend `maskString` to also mask double-quoted strings (`"John"`, `$$secret$$`-adjacent); the single-quoted-only regex was flagged as a PII leak
- Keep `connection` in `ERROR_PATTERNS` keywords (broad but intentional)
- Truncate `masked_sql` to 2000 chars before sending — was unbounded unlike `error_message` (500) and `masked_args` (2000)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
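The literal masking and length caps discussed in the last few changes can be sketched as follows. The regular expression is an assumption about how quoted literals might be detected, not the commit's actual pattern, and `maskAndTruncate` is a hypothetical helper; the 500 and 2000 character limits come from the messages above.

```typescript
// Replace single- and double-quoted literals with `?`.
// Note: this also masks double-quoted SQL identifiers, which is the safe direction.
function maskString(input: string): string {
  return input.replace(/'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*"/g, "?");
}

// Mask first, then truncate, so a literal cut in half by the limit cannot
// escape the pattern and leak its first characters.
function maskAndTruncate(input: string, maxLength: number): string {
  return maskString(input).slice(0, maxLength);
}

const maskedSql = maskAndTruncate(
  `SELECT * FROM users WHERE email = 'john@email.com' AND name = "John"`,
  2000,
);
const maskedError = maskAndTruncate("Value 'john@email.com' does not match type INTEGER", 500);
```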
* docs: update `core_failure` event description in telemetry reference
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore: add altimate_change markers to upstream-shared tool files
Wrap all telemetry additions in `packages/opencode/src/tool/tool.ts`
and `packages/opencode/src/tool/skill.ts` with `// altimate_change
start/end` markers so the upstream marker-guard CI passes.
- `tool.ts`: markers around `import { Telemetry }` and the full
telemetry instrumentation block (startTime through soft-failure
core_failure emission)
- `skill.ts`: markers around `classifySkillSource` helper, `startTime`
declaration, and the `Telemetry.track` try-catch for `skill_used`
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
docs/docs/reference/telemetry.md (10 additions & 7 deletions)
@@ -11,9 +11,9 @@ We collect the following categories of events:
 |`session_start`| A new CLI session begins |
 |`session_end`| A CLI session ends (includes duration) |
 |`session_forked`| A session is forked from an existing one |
-|`generation`| An AI model generation completes (model ID, token counts, duration, but no prompt content) |
-|`tool_call`| A tool is invoked (tool name and category, but no arguments or output) |
-|`bridge_call`| A native tool call completes (method name and duration, but no arguments) |
+|`generation`| An AI model generation completes (model ID, token counts, duration — no prompt content) |
+|`tool_call`| A tool is invoked (tool name and category — no arguments or output) |
+|`native_call`| A native engine call completes (method name and duration — no arguments) |
 |`command`| A CLI command is executed (command name only) |
 |`error`| An unhandled error occurs (error type and truncated message, but no stack traces) |
 |`auth_login`| Authentication succeeds or fails (provider and method, but no credentials) |
@@ -33,8 +33,11 @@ We collect the following categories of events:
 |`error_recovered`| Successful recovery from a transient error (error type, strategy, attempt count) |
 |`mcp_server_census`| MCP server capabilities after connect (tool and resource counts, but no tool names) |
 |`context_overflow_recovered`| Context overflow is handled (strategy) |
+|`skill_used`| A skill is loaded (skill name and source — `builtin`, `global`, or `project` — no skill content) |
+|`sql_execute_failure`| A SQL execution fails (warehouse type, query type, error message, PII-masked SQL — no raw values) |
+|`core_failure`| An internal tool error occurs (tool name, category, error class, truncated error message, PII-safe input signature, and optionally masked arguments — no raw values or credentials) |
 
-Each event includes a timestamp, anonymous session ID, and the CLI version.
+Each event includes a timestamp, anonymous session ID, CLI version, and an anonymous machine ID (a random UUID stored in `~/.altimate/machine-id`, generated once and never tied to any personal information).
 
 ## Delivery & Reliability
 
@@ -113,9 +116,9 @@ Event type names use **snake_case** with a `domain_action` pattern:
 
 ### Adding a New Event
 
-1. **Define the type.** Add a new variant to the `Telemetry.Event` union in `packages/altimate-code/src/telemetry/index.ts`
-2. **Emit the event.** Call `Telemetry.track()` at the appropriate location
-3. **Update docs.** Add a row to the event table above
+1. **Define the type**— Add a new variant to the `Telemetry.Event` union in `packages/opencode/src/altimate/telemetry/index.ts`
+2. **Emit the event** — Call `Telemetry.track()` at the appropriate location
+3. **Update docs** — Add a row to the event table above
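
The first two steps of "Adding a New Event" can be sketched with a miniature stand-in for the telemetry module. Everything below is hypothetical: the union members, their fields, the in-memory `sent` buffer, and the `cache_miss` event are invented for illustration, and the real `Telemetry.Event` union and `Telemetry.track()` will differ.

```typescript
// Hypothetical miniature of the telemetry module.
type TelemetryEvent =
  | { type: "session_start"; session_id: string }
  | { type: "skill_used"; skill_name: string; source: "builtin" | "global" | "project" }
  // Step 1: add the new variant to the union (snake_case, domain_action).
  | { type: "cache_miss"; cache_name: string };

const sent: TelemetryEvent[] = [];

const Telemetry = {
  track(event: TelemetryEvent): void {
    sent.push(event); // stand-in for the real buffering and delivery
  },
};

// Step 2: emit the event at the appropriate location.
Telemetry.track({ type: "cache_miss", cache_name: "schema" });
```

Because the union is discriminated on `type`, the compiler rejects a `track()` call whose fields do not match the declared variant.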