Skip to content

Add docs_lookup tool for fetching data engineering documentation#152

Closed
anandgupta42 wants to merge 10000 commits intomainfrom
claude/version-aware-docs-ev1et
Closed

Add docs_lookup tool for fetching data engineering documentation#152
anandgupta42 wants to merge 10000 commits intomainfrom
claude/version-aware-docs-ev1et

Conversation

@anandgupta42
Copy link
Contributor

@anandgupta42 anandgupta42 commented Mar 15, 2026

Summary

Adds a new docs_lookup tool that fetches up-to-date, version-specific documentation for data engineering tools and database platforms. The tool supports two documentation providers:

  1. webfetch (default) — Fetches directly from official documentation sites (Snowflake, Databricks, DuckDB, PostgreSQL, ClickHouse, BigQuery) with no third-party data sharing
  2. ctx7 (opt-in) — Uses Context7 CLI for richer Python library/SDK documentation (dbt, Airflow, Spark, SQLAlchemy, Polars, etc.)

The tool intelligently falls back from ctx7 to webfetch if the primary method fails, and includes comprehensive telemetry tracking for monitoring success rates and performance.

Also includes:

  • data-docs skill definition with activation guidelines and usage examples
  • library-ids.md reference document mapping 40+ data engineering tools to their documentation sources
  • Telemetry event type for tracking docs_lookup calls

Test Plan

No testing needed — this is a new tool with no existing tests to break. The tool includes:

  • Comprehensive error handling with fallback logic
  • Telemetry tracking for all outcomes (success, error, not_found)
  • Logging at info/warn/error levels for debugging
  • 30-second timeout on all network requests to prevent hangs

Manual verification can be done by calling docs_lookup(tool="dbt-core", query="incremental models") or similar queries against the supported tools listed in the skill documentation.

Checklist

  • Documentation added (SKILL.md and library-ids.md reference)
  • Telemetry event type added
  • Tool registered in tool registry

anandgupta42 and others added 30 commits March 2, 2026 19:33
- Redesign M as 5-wide with visible V-valley to distinguish from A
- Change E top from full bar to open-right, distinguishing from T
- Fix T with full-width crossbar and I as narrow column
- Fix D shape in CODE
- Render CODE in theme.accent (purple) instead of theme.primary (peach)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- publish.ts: change glob from `*/package.json` to `**/package.json` to
  find scoped package directories (@altimate/cli-*) which are 2 levels deep
- release.yml: add skip-existing to PyPI publish so it doesn't fail when
  the engine version hasn't changed between releases

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The npm org is @AltimateAI, not @Altimate. Update all package names,
workspace dependencies, imports, and documentation to use the correct
scope so npm publish succeeds.

Name mapping:
- @altimate/cli → @altimateai/altimate-code
- @altimate/cli-sdk → @altimateai/altimate-code-sdk
- @altimate/cli-plugin → @altimateai/altimate-code-plugin
- @altimate/cli-util → @altimateai/altimate-code-util
- @altimate/cli-script → @altimateai/altimate-code-script

Also updates publish.ts to emit the wrapper package as @altimateai/altimate-code
(no -ai suffix) and hardcodes the bin entry to altimate-code.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Two issues:
1. TypeScript permission-task tests: test fixture wrote config to
   `opencode.json` but the config loader only looks for
   `altimate-code.json`. Updated fixture to use correct filename.

2. Python tests: `pytest: command not found` because pyproject.toml
   had no `dev` optional dependency group. Added `dev` extras with
   pytest and ruff.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: rename opencode references to altimate-code in all test files

Update test files to use the correct names after the config loader
was renamed from opencode to altimate-code:

- `opencode.json` → `altimate-code.json`
- `.opencode/` → `.altimate-code/`
- `.git/opencode` → `.git/altimate-code`
- `OPENCODE_*` env vars → `ALTIMATE_CLI_*`
- Cache dir `opencode` → `altimate-code`
- Schema URL `opencode.ai` → `altimate-code.dev`

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve remaining test failures and build import issue

- Fix build.ts solid-plugin import to use bare specifier for monorepo hoisting
- Update agent tests: "build" → "builder", "plan" → "analyst" for disabled fallback
- Fix well-known config mock URL in config.test.ts
- Fix message-v2 test: "OpenCode" → "Altimate CLI"
- Fix retry.test.ts: replace unsupported test.concurrent with test
- Fix read.test.ts: update agent name to "builder"
- Fix agent-color.test.ts: update config keys to "builder"
- Fix registry.test.ts: remove unpublished plugin dep from test fixture
- Skip adding plugin dependency in local dev mode (installDependencies)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address Sentry review comments and Python CI deps

- Update theme schema URL from opencode.ai to altimate-code.dev (33 files)
- Rename opencode references in ACP README.md and AGENTS.md docs
- Update test fixture tmp dir prefix to altimate-code-test-
- Install warehouse extras in Python CI for duckdb/boto3 test deps

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: Python CI — SqlGuardResult allows None data, restrict pytest to tests/

- Allow SqlGuardResult.data to be None (fixes lineage.check Pydantic error)
- Set testpaths = ["tests"] in pyproject.toml to exclude src/test_local.py
  from pytest collection (it's a source module, not a test)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve ruff lint errors in Python engine

- Remove unused imports in server.py (duplicate imports, unused models)
- Remove unused `json` import in schema/cache.py
- Remove unused `os` import in sql/feedback_store.py
- Add noqa for keyring availability check import

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use import.meta.resolve to find the @opentui/core package directory
instead of hardcoding node_modules path, which fails with monorepo
hoisting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…aming

- Build: output binary as altimate-code instead of opencode
- Bin wrapper: look for @altimateai/altimate-code-* scoped packages
- Postinstall: resolve @AltimateAI scoped platform packages
- Publish: update Docker/AUR/Homebrew refs to AltimateAI/altimate-code
- Publish: make Docker/AUR/Homebrew non-fatal (infra not set up yet)
- Dockerfile: update binary paths and names

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
…sion (#15762)

Co-authored-by: Test User <test@test.com>
Co-authored-by: Shoubhit Dash <shoubhit2005@gmail.com>
Co-authored-by: Adam <2363879+adamdotdevin@users.noreply.github.com>
govindpawa and others added 20 commits March 7, 2026 13:50
Co-Authored-By: Kai (Claude Opus 4.6) <noreply@anthropic.com>
…onfig

fix: restore TUI crash after upstream merge
fix: correct TEAM_MEMBERS ref from 'dev' to 'main' in pr-standards workflow
- Add `AltimateApi` client for datamate CRUD and integration resolution
- Add `datamate` tool with 9 operations: list, show, create, update, delete,
  add (MCP connect), remove (MCP disconnect), list-integrations, status
- Extract shared MCP config utilities (`resolveConfigPath`, `addMcpToConfig`,
  `removeMcpFromConfig`, `listMcpInConfig`) to `mcp/config.ts`
- Add `/datamate-setup` skill for guided datamate onboarding
- Register datamate tool in tool registry and TUI sync context
- Add test suite for `AltimateApi` credential loading and API methods
feat: datamate manager — dynamic MCP server management
Replace arc-runner-altimate-code with ubuntu-latest across all
workflows to eliminate security risk on public repo.

Closes #109

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add `/feedback` command and `feedback_submit` tool (#86)

- Add `feedback_submit` tool at `src/altimate/tools/feedback-submit.ts`
  with `try-catch` around all `Bun.$` calls (ENOENT safety) and
  buffer-based stdout/stderr capture for better error reporting
- Add `/feedback` slash command with guided flow template
- Register `FeedbackSubmitTool` in tool registry
- Register feedback command in `command/index.ts`
- Add 49 tests (tool + command integration)
- Fix CI `pr-standards.yml`: change `ref: 'dev'` to `ref: 'main'`
  and wrap TEAM_MEMBERS lookup in `try-catch` for rebase resilience
- Add `/feedback` to docs `commands.md`

* fix: address code review findings in feedback-submit tool

- Fix misleading error: `gh auth status` catch now returns `gh_auth_check_failed`
  instead of `gh_not_installed` (gh IS installed at that point)
- Fix fragile version check: use `startsWith("gh version")` instead of
  `includes("not found")` for cross-platform correctness
- Add `trim().min(1)` validation to `title` and `description` zod schemas
  to reject empty/whitespace-only inputs
- Check `issueResult.exitCode !== 0` before stdout URL check on issue creation
- Restore `Bun.$` in `afterAll` to prevent mock leaking across test files
- Remove trailing `$ARGUMENTS` from end of feedback.txt template (duplicated
  from Step 1 pre-fill logic; could confuse model)
- Add 8 new tests: empty/whitespace validation, `gh_auth_check_failed` path,
  non-zero exitCode with stdout scenario, and updated "not found" test name

* test: fix Windows CI failures in install and bridge tests

- Replace `/bin/echo` with `process.execPath` in bridge test — /bin/echo
  does not exist on Windows, causing ENOENT on spawn; process.execPath
  (the current Bun binary) exists on all platforms and exits quickly
  without speaking JSON-RPC as expected
- Add `unixtest` guard to postinstall, bin-wrapper, and integration tests
  — on Windows, postinstall.mjs takes a different early-exit path that
  skips hard-link setup; dummy binaries are Unix shell scripts that
  cannot be executed on Windows; skip all Unix-specific test paths using
  the same `process.platform !== "win32" ? test : test.skip` pattern
  already used in fsmonitor.test.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
- Bump engine version to 0.2.5
- Update CHANGELOG.md with v0.2.5 release notes
…eam rebase

The upstream OpenCode rebase changed the package name from
`@altimateai/altimate-code` to `opencode`, causing npm publish to fail
with E403 (trying to publish to unscoped `opencode-*` packages).

- Restore `name` field in `packages/opencode/package.json`
- Remove upstream `opencode` bin entry
- Update workspace reference in `packages/web/package.json`
npm v7+ silences postinstall stdout, so the `printWelcome()` banner was
never visible to users despite running correctly.

**Fixes:**
- Postinstall now writes a `.installed-version` marker to `XDG_DATA_HOME`
- CLI reads the marker on startup, displays a styled welcome banner, then
  removes it — works regardless of npm's output suppression
- Fixed double-v bug (`vv0.2.4`) in `printWelcome()` version display

**Tests (10 new):**
- Postinstall: marker file creation, v-prefix stripping, missing version
- Welcome module: marker cleanup, empty marker, fs error resilience
) (#133)

After 20+ minutes idle, OAuth tokens expire and subsequent prompts show
unhelpful "Error" with no context or retry. This commit fixes the issue
across Anthropic and Codex OAuth plugins:

- Add 3-attempt retry with backoff for token refresh (network/5xx only)
- Fail fast on 4xx auth errors (permanent failures like revoked tokens)
- Add 30-second proactive refresh buffer to prevent mid-request expiry
- Update `currentAuth.expires` after successful refresh
- Classify token refresh failures as `ProviderAuthError` for actionable
  error messages with recovery instructions
- Make auth errors retryable at session level with user-facing guidance
- Improve generic `Error` display (no more bare "Error" in TUI)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* chore: rebrand all user-facing references from OpenCode to Altimate Code

Comprehensive rebranding of user-facing surfaces while preserving
internal code names for upstream compatibility:

- URLs: `opencode.ai` → `altimate.ai`
- GitHub org: `anomalyco/opencode` → `AltimateAI/altimate-code`
- Product name: "OpenCode" → "Altimate Code" in UI, prompts, docs
- CLI binary: `opencode` → `altimate-code` in install scripts, nix
- npm package: `opencode-ai` → `@altimateai/altimate-code`
- Extensions: VSCode rebranded
- Social: X handle → `@Altimateinc`
- Brew tap: `AltimateAI/tap/altimate-code`
- Container registry: `ghcr.io/AltimateAI`
- All 20 localized READMEs

Preserved internal names: `@opencode-ai/` scope, `packages/opencode/`
dir, `OPENCODE_*` env vars, `.opencode/` config paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: additional user-facing branding fixes missed from upstream rebase

- `.github/workflows/beta.yml`: "Install OpenCode" → "Install Altimate Code",
  `bun i -g opencode-ai` → `bun i -g @altimateai/altimate-code`
- `.github/actions/setup-git-committer/action.yml`: descriptions updated
  from "OpenCode GitHub App" → "Altimate Code GitHub App"
- `github/index.ts`: `spawn('opencode')` → `spawn('altimate-code')`

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add upstream merge automation tooling

Comprehensive automation for merging upstream OpenCode releases into
the Altimate Code fork, inspired by Kilo-Org/kilocode's approach.

Key scripts:
- `merge.ts` — 11-step merge orchestration with auto-conflict resolution,
  branding transforms, version preservation, and `--continue` support
- `analyze.ts` — `altimate_change` marker integrity audit + branding
  leak detection (CI-friendly exit codes)
- `list-versions.ts` — upstream tag listing with merge status indicators
- `verify-restructure.ts` — branch comparison verification

Transforms (10 files):
- Core branding engine with preservation-aware product name handling
- `keepOurs` / `skipFiles` / lock file conflict auto-resolution
- Specialized transforms for `package.json`, Nix, Tauri, i18n,
  extensions, web docs, workflows, and build scripts

Configuration:
- All branding rules in TypeScript (`utils/config.ts`) for type safety
- URL, GitHub, registry, email, app ID, social, npm, brew mappings
- Preserve patterns protect internal refs (`@opencode-ai/`, `OPENCODE_`)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: remove upstream-only packages, workflows, and artifacts (Kilo-style cleanup)

Remove upstream platform packages (app, console, containers, desktop,
desktop-electron, docs, enterprise, extensions, function, identity,
slack, storybook, ui, web), translated READMEs, upstream-only workflows,
nix packaging (nix/, flake.nix, flake.lock), SST infra (sst.config.ts,
sst-env.d.ts), specs/, and .signpath/ config.

- Update workspaces from `packages/*` glob to explicit package list
- Remove unused catalog entries and devDependencies (sst, @aws-sdk/client-s3)
- Remove unused scripts (dev:desktop, dev:web, dev:storybook)
- Remove `electron` from trustedDependencies
- Remove `@opencode-ai/app#test` task from turbo.json
- Update merge-config.json and config.ts skipFiles with new patterns

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: remove upstream SST `infra/` directory and add to skipFiles

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: rewrite upstream merge README as a practical runbook

Reorganized around the step-by-step workflow someone would follow
when doing an upstream merge. Covers fork strategy, prerequisites,
the full merge process, configuration reference, and troubleshooting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add comprehensive branding, build integrity, and upstream merge guard tests

Add 141 tests across 3 test files to prevent regressions:

- `branding.test.ts` (33 tests): Verify all user-facing surfaces show
  Altimate branding — package metadata, CLI entry points, logo, welcome
  banner, install script, GitHub Action, VSCode extension, postinstall,
  and a full codebase leak scanner for `opencode.ai`/`anomalyco`/`opncd.ai`
- `build-integrity.test.ts` (19 tests): Verify workspace config, turbo.json
  validity, package dependencies, binary entry points, skip/keepOurs
  consistency, and no orphaned package references
- `upstream-guard.test.ts` (89 tests): Verify skipFiles/keepOurs config
  completeness, deleted upstream packages stay deleted, branding rules
  coverage, preserve patterns, and no upstream artifacts reappear

Fix 14 upstream branding leaks found by the tests:
- Replace `opencode.ai` URLs with `altimate.ai` in config.ts, retry.ts,
  dialog-provider.tsx, oauth-provider.ts, migrate-tui-config.ts
- Replace `opncd.ai` with `altimate.ai` in share-next.ts and import.ts
- Replace "OpenCode Go" with "Altimate Code Go" in dialog-provider.tsx

Add CI enforcement:
- New `branding-check.yml` workflow with branding audit + test jobs
- Add `branding` job to existing `ci.yml` for PR checks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: remove separate branding CI workflow — tests run with `bun test`

The branding unit tests already run as part of the existing `typescript`
CI job via `bun test`. A separate workflow and branding audit on every
commit is overkill — the `analyze.ts --branding` audit is a merge-time
tool, not a per-commit check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: persist version snapshot in merge state for `--continue` flow

The `--continue` handler was re-snapshotting versions after the merge
commit, capturing upstream versions instead of our pre-merge versions.
Now the snapshot is saved in `.upstream-merge-state.json` and restored
when resuming.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address code review — XSS, branding leaks, regex safety

- Fix XSS in OAuth error page: escape HTML in error parameter
- Fix branding: `client_name` and HTML titles/text in OAuth pages
  now show "Altimate Code" instead of "OpenCode"
- Fix Anthropic plugin regex: add word boundaries (`\b`) to prevent
  replacing "opencode" inside URLs, paths, and identifiers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* ci: retry CI run

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Bumps [@modelcontextprotocol/sdk](https://github.com/modelcontextprotocol/typescript-sdk) from 1.25.2 to 1.26.0.
- [Release notes](https://github.com/modelcontextprotocol/typescript-sdk/releases)
- [Commits](modelcontextprotocol/typescript-sdk@v1.25.2...v1.26.0)

---
updated-dependencies:
- dependency-name: "@modelcontextprotocol/sdk"
  dependency-version: 1.26.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
… namespaces, citations, and audit logging (#136)

* feat: add Letta-style persistent memory blocks for cross-session agent context

Adds a file-based persistent memory system that allows the AI agent to
retain and recall context across sessions — warehouse configurations,
naming conventions, team preferences, and past analysis decisions.

Three new tools: memory_read, memory_write, memory_delete with global
and project scoping, YAML frontmatter format, atomic writes, size/count
limits, and system prompt injection support.

Closes #135

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: rebrand to Altimate Memory, add comprehensive docs and side-effect analysis

- Rename tool IDs to altimate_memory_read/write/delete
- Add comprehensive documentation at docs/data-engineering/tools/memory-tools.md
- Document context window impact, stale memory risks, wrong memory detection,
  security considerations, and mitigation strategies
- Add altimate_change markers consistent with codebase conventions
- Update tools index to include Altimate Memory category

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add TTL expiration, hierarchical namespaces, dedup detection, audit logging, citations, session extraction, and global opt-out to Altimate Memory

Implements P0/P1 improvements:
- TTL expiration via optional `expires` field with automatic filtering
- Hierarchical namespace IDs with slash-separated paths mapped to subdirectories
- Deduplication detection on write with tag-overlap warnings
- Audit log for all CREATE/UPDATE/DELETE operations
- Citation-backed memories with file/line/note references
- Session-end batch extraction tool (opt-in via ALTIMATE_MEMORY_AUTO_EXTRACT)
- Global opt-out via ALTIMATE_DISABLE_MEMORY environment variable
- Comprehensive tests: 175 tests covering all new features

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: harden Altimate Memory against path traversal, add adversarial tests

Security fixes:
- Replace permissive ID regex with segment-based validation that rejects
  '..', '.', '//', and all path traversal patterns (a/../b, a/./b, etc.)
- Use unique temp file names (timestamp + random suffix) to prevent race
  condition crashes during concurrent writes to the same block ID

The old regex /^[a-z0-9][a-z0-9_/.-]*[a-z0-9]$/ allowed dangerous IDs
like "a/../b" or "a/./b" that could escape the memory directory via
path.join(). The new regex validates each path segment individually.

Adds 71 adversarial tests covering:
- Path traversal attacks (10 tests)
- Frontmatter injection and parsing edge cases (9 tests)
- Unicode and special character handling (6 tests)
- TTL/expiration boundary conditions (6 tests)
- Deduplication edge cases (7 tests)
- Concurrent operations and race conditions (4 tests)
- ID validation gaps (11 tests)
- Malformed files on disk (7 tests)
- Serialization round-trip edge cases (5 tests)
- Schema validation with adversarial inputs (6 tests)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: expired blocks no longer count against capacity limit, add path guard

Addresses PR review comments:

1. Expired blocks counted against capacity (sentry[bot] MEDIUM):
   - write() now only counts non-expired blocks against MEMORY_MAX_BLOCKS_PER_SCOPE
   - Auto-cleans expired blocks from disk when total file count hits capacity
   - Users no longer see "scope full" errors when all blocks are expired

2. Path traversal defense-in-depth (sentry[bot] CRITICAL):
   - Added runtime path.resolve() guard in blockPath() to verify the resolved
     path stays within the memory directory, as a second layer behind the
     segment-based ID regex from the previous commit

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address consensus code review findings for Altimate Memory

- Add schema validation on disk reads (MemoryBlockSchema.safeParse)
- Add safe ID regex to MemoryReadTool and MemoryDeleteTool parameters
- Fix include_expired ignored when reading by specific ID
- Fix duplicate tags inflating dedup overlap count (dedupe with Set)
- Move expired block cleanup to after successful write
- Eliminate double directory scan in write() by passing preloaded blocks
- Fix docs/code mismatch: max ID length 128 -> 256
- Add 22 new tests covering all fixes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: wire up memory injection into system prompt with telemetry

- Inject memory blocks into system prompt at session start, gated by
  ALTIMATE_DISABLE_MEMORY flag
- Add memory_operation and memory_injection telemetry events to App Insights
- Add memory tool categorization for telemetry
- Document disabling memory for benchmarks/CI
- Add injection integration tests and telemetry event tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add a built-in skill that uses Context7 CLI to fetch up-to-date,
version-specific documentation for data engineering tools. Covers
dbt, Airflow, Spark, Snowflake, BigQuery, Databricks, Kafka,
SQLAlchemy, Polars, and Great Expectations. Ships out of the box
with pre-mapped library IDs — no user configuration needed.

https://claude.ai/code/session_01NZPdvEHNXDcmhgJt9RLMu1
…greSQL, ClickHouse)

Context7 doesn't index Snowflake/Databricks platform docs (SQL reference,
DDL, functions). Neither platform exposes a documentation MCP server that
works without account credentials.

Solution: use webfetch for platform SQL docs alongside ctx7 for library
docs. Added curated URL mappings for Snowflake, Databricks, DuckDB,
PostgreSQL, ClickHouse, and BigQuery official documentation. Also added
Python connector library IDs (duckdb, psycopg, clickhouse-connect) and
dbt adapters (dbt-duckdb, dbt-clickhouse) to the Context7 mappings.

https://claude.ai/code/session_01NZPdvEHNXDcmhgJt9RLMu1
Create a dedicated docs_lookup tool that wraps Context7 CLI and web
fetch with full telemetry integration. Tracks every documentation
lookup (tool_id, method, status, duration, errors) via the existing
Azure Application Insights pipeline.

Changes:
- New docs_lookup tool (src/altimate/tools/docs-lookup.ts)
  - Tries ctx7 first for library docs, falls back to webfetch for
    platform docs
  - Logs success, not_found, and error statuses with duration
  - Surfaces unknown tools and network failures clearly
- New telemetry event type: docs_lookup
  - Tracks: tool_id, method (ctx7/webfetch), status, duration_ms,
    error message, source_url
  - Added "docs" category to tool categorization
- Registered in tool registry alongside other altimate tools
- Updated data-docs skill to use docs_lookup tool instead of raw
  bash/webfetch calls

https://claude.ai/code/session_01NZPdvEHNXDcmhgJt9RLMu1
Fetch documentation directly from official docs sites (docs.snowflake.com,
duckdb.org, postgresql.org, etc.) by default. No user data is sent to
third-party services unless explicitly opted in via ALTIMATE_DOCS_PROVIDER=ctx7.

Key changes:
- Webfetch is now the default provider (direct to official docs)
- Context7 is opt-in only via ALTIMATE_DOCS_PROVIDER=ctx7 env var
- Added smart page matching: query keywords are matched against curated
  URL mappings to find the most relevant documentation page
- Library tools (e.g. duckdb, psycopg2) now fall back to their platform's
  official docs when no ctx7 provider is configured
- Extracted fetchFromWebsite and fetchFromCtx7 helper functions
- Updated SKILL.md with privacy section and updated instructions

https://claude.ai/code/session_01NZPdvEHNXDcmhgJt9RLMu1
@github-actions
Copy link

Hey! Your PR title Add docs_lookup tool for fetching data engineering documentation doesn't follow conventional commit format.

Please update it to start with one of:

  • feat: or feat(scope): new feature
  • fix: or fix(scope): bug fix
  • docs: or docs(scope): documentation changes
  • chore: or chore(scope): maintenance tasks
  • refactor: or refactor(scope): code refactoring
  • test: or test(scope): adding or updating tests

Where scope is the package name (e.g., app, desktop, opencode).

See CONTRIBUTING.md for details.

@github-actions
Copy link

This PR doesn't fully meet our contributing guidelines and PR template.

What needs to be fixed:

  • PR description is missing required template sections. Please use the PR template.

Please edit this PR description to address the above within 2 hours, or it will be automatically closed.

If you believe this was flagged incorrectly, please let a maintainer know.

Comment on lines +236 to +238
.quiet()
.timeout(30_000)
.text()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The code calls .timeout() on a Bun shell command result, but this method does not exist, which will cause a TypeError at runtime, breaking the ctx7 provider.
Severity: HIGH

Suggested Fix

Replace the .timeout(30_000) call with a supported timeout mechanism. Either use Bun.spawn() which supports timeouts directly, or pass an AbortSignal.timeout(30_000) to the shell command, consistent with patterns like fetchFromWebsite() in the same file.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: packages/opencode/src/altimate/tools/docs-lookup.ts#L236-L238

Potential issue: The `fetchFromCtx7` function attempts to call a non-existent
`.timeout()` method on the result of a Bun shell command (`$`). This will cause a
`TypeError` at runtime whenever the `ctx7` documentation provider is used. Although the
error is caught and the system falls back to another method, this renders the
`ALTIMATE_DOCS_PROVIDER=ctx7` feature completely non-functional. Existing code uses
`AbortSignal.timeout()` for similar timeout requirements.

Did we get this right? 👍 / 👎 to inform future reviews.

@dev-punia-altimate
Copy link

🤖 Behavioral Analysis — 2 Finding(s)

🟡 Warnings (1)

  • F1 packages/opencode/src/altimate/tools/docs-lookup.ts:392 [Silent functional failure — library-only tools return misleading error with default provider]
    19 of the ~38 listed tools are completely non-functional with the default webfetch provider, but the error message hides this. When toolLower is a library-only entry (e.g. dbt-core, airflow, pyspark, sqlalchemy, polars, pandas, confluent-kafka, all dbt adapters, etc.) it exists in CTX7_LIBRARIES but has no corresponding entry in PLATFORM_DOCS or LIBRARY_TO_PLATFORM. Trace: (1) ctx7Id is defined e.g. /dbt-labs/dbt-core; (2) platform = PLATFORM_DOCS[toolLower] = undefined; (3) platformFromLibrary = undefined; (4) resolvedPlatform = undefined, fetchUrl = undefined; (5) ctx7 block skipped (provider is webfetch); (6) webfetch block skipped (fetchUrl is undefined); (7) notFound = !ctx7Id && !resolvedPlatform && !hasUrl = false. Because notFound is false, the code falls through and returns 'Documentation fetch failed (network error or rate limit)' — blaming the network when the actual reason is that these library-only tools have no webfetch URL at all. Affected tools: dbt-core, airflow, pyspark, confluent-kafka, sqlalchemy, polars, pandas, great-expectations, dbt-utils, dbt-expectations, dbt-snowflake, dbt-bigquery, dbt-databricks, dbt-postgres, dbt-redshift, dbt-spark, dbt-duckdb, dbt-clickhouse, elementary.

🔵 Nits (1)

  • F2 packages/opencode/src/altimate/tools/docs-lookup.ts:393 — When a library-only tool (e.g. dbt-core) is called with the default webfetch provider, no telemetry event is emitted for the failure. The if (notFound) block is skipped because notFound is false. The webfetch and ctx7 try/catch blocks were never entered (fetchUrl is undefined, provider is webfetch). Result: failures for 19 tools are completely invisible to monitoring — they show zero events instead of a not_found or error count.

Analysis run | Powered by QA Autopilot

@dev-punia-altimate
Copy link

✅ Tests — All Passed

TypeScript — passed

Python — passed

Tested at 0395d758 | Run log | Powered by QA Autopilot

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.