feat: env-based skill selection — run LLM once, cache for session by kulvirgit · Pull Request #192 · AltimateAI/altimate-code

kulvirgit · 2026-03-16T20:24:33Z

Summary

Run the LLM skill selector once per session using only the environment fingerprint (not the per-turn user message), then cache the result; subsequent turns reuse the cached selection, so they add no selection latency and no API cost
Trim fingerprint detections to data-engineering only (dbt, sql, profiles.yml adapters, airflow, databricks); remove generic detections (node, python, docker, ci-cd, etc.)
Add env_fingerprint_skill_selection config toggle under experimental (default: true; set false to skip LLM selection entirely)
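
A minimal sketch of how the toggle described above might gate selection. The `Skill`, `ExperimentalConfig`, and `Selector` shapes and the `selectSkills` helper are hypothetical names for illustration, not the PR's actual code; the fallback to the full skill list on error or empty output is also an assumption.

```typescript
// Hypothetical shapes for illustration; the PR's real types are not shown here.
interface Skill {
  name: string
}

interface ExperimentalConfig {
  env_fingerprint_skill_selection?: boolean
}

// Stand-in for the LLM call: given all skills and fingerprint tags, pick a subset.
type Selector = (skills: Skill[], tags: string[]) => Promise<Skill[]>

async function selectSkills(
  skills: Skill[],
  tags: string[],
  config: ExperimentalConfig,
  llmSelect: Selector,
): Promise<Skill[]> {
  // Default is true: only an explicit `false` skips LLM selection entirely.
  if (config.env_fingerprint_skill_selection === false) return skills
  try {
    const picked = await llmSelect(skills, tags)
    // Assumed fallback: an empty selection returns every skill.
    return picked.length > 0 ? picked : skills
  } catch {
    // Assumed fallback: a failed selection returns every skill.
    return skills
  }
}
```

With the toggle set to `false`, the selector function is never invoked, which matches the second manual check in the test plan.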

Test plan

bun test test/altimate/skill-filtering.test.ts — 18 tests (selection, caching, fallbacks, prompt content)
bun test test/altimate/fingerprint.test.ts — 5 tests (detect, cache, refresh, dedup)
Manual: enable dynamic_skills, verify LLM called once on first turn, cached on second turn (check logs)
Manual: set env_fingerprint_skill_selection: false, verify all skills returned without LLM call

🤖 Generated with Claude Code

packages/opencode/src/altimate/skill-selector.ts

+  if (cachedResult) {
+    log.info("returning cached skill selection", {
+      count: cachedResult.length,
+    })
+    return cachedResult
+  }


packages/opencode/src/altimate/skill-selector.ts

+}
+
+async function defaultResolveModel(): Promise<LanguageModelV2 | undefined> {
+  const { providerID, modelID } = await Provider.defaultModel()


packages/opencode/src/altimate/skill-selector.ts

+  if (cachedResult && cwd === cachedCwd) {
+    log.info("returning cached skill selection", {
+      count: cachedResult.length,
+    })
+    return cachedResult
+  }
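
The snippet above shows only the read side of the cache. The following sketch, under assumed names, shows how the write side could pair with it so that a change of working directory triggers a fresh selection. `selectOnce`, `resetSkillCache`, and the `run` callback are hypothetical; only `cachedResult` and `cachedCwd` appear in the diff.

```typescript
interface Skill {
  name: string
}

// Module-level cache, mirroring the variable names visible in the diff.
let cachedResult: Skill[] | undefined
let cachedCwd: string | undefined

// `run` is a hypothetical stand-in for the expensive LLM selection call.
async function selectOnce(cwd: string, run: () => Promise<Skill[]>): Promise<Skill[]> {
  // Cache hit only when the working directory matches the one that produced it.
  if (cachedResult && cwd === cachedCwd) return cachedResult
  const result = await run()
  // Store the result together with the directory it was computed for.
  cachedResult = result
  cachedCwd = cwd
  return result
}

// Hypothetical helper for tests or an explicit refresh.
function resetSkillCache(): void {
  cachedResult = undefined
  cachedCwd = undefined
}
```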


packages/opencode/src/altimate/observability/tracing.ts

packages/opencode/src/altimate/skill-selector.ts

+    const result = await Promise.race([
+      generate(params),
+      new Promise<never>((_, reject) =>
+        setTimeout(() => reject(new Error("skill selection timeout")), TIMEOUT_MS),
+      ),
+    ])
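
One thing to watch with a `Promise.race` timeout like the one above is that the timer keeps running after the work settles. A possible variant that clears the timer is sketched below; `withTimeout` is a hypothetical helper, not something in the PR.

```typescript
// Hypothetical helper: reject after `ms` milliseconds, and always clear the
// timer once the wrapped promise settles so no stray timer is left pending.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`${label} timeout`)), ms)
    work.then(
      (value) => {
        clearTimeout(timer)
        resolve(value)
      },
      (err) => {
        clearTimeout(timer)
        reject(err)
      },
    )
  })
}
```

Under this sketch, the call in the diff would read `await withTimeout(generate(params), TIMEOUT_MS, "skill selection")`, assuming `generate` returns a promise.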


sentry · 2026-03-16T23:46:58Z

packages/opencode/src/altimate/fingerprint/index.ts

+    return detect(previousCwd)
+  }
+
+  export async function detect(cwd: string, root?: string): Promise<Result> {


Bug: The fingerprint cache is not invalidated between sessions for the same directory, leading to stale environment data being used for skill selection.
Severity: MEDIUM

Suggested Fix

The cache should be made session-aware. One approach is to call Fingerprint.refresh() at the start of each new session to force re-detection. Alternatively, the cache could be invalidated between sessions or the cache key could be modified to include a session-specific identifier.
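
The third option in the suggested fix (a session-specific cache key) could look roughly like the sketch below. Everything here is hypothetical: `sessionID`, `detectForSession`, `endSession`, and the `scan` callback are illustrative names, and the real `Fingerprint` module may be structured differently.

```typescript
interface FingerprintResult {
  cwd: string
  tags: string[]
}

// Cache keyed by session and directory rather than by directory alone.
const fingerprintCache = new Map<string, FingerprintResult>()

function cacheKey(sessionID: string, cwd: string): string {
  // JSON encoding avoids ambiguity if either part contains a separator character.
  return JSON.stringify([sessionID, cwd])
}

// `scan` is a stand-in for the real detection logic.
async function detectForSession(
  sessionID: string,
  cwd: string,
  scan: (cwd: string) => Promise<string[]>,
): Promise<FingerprintResult> {
  const key = cacheKey(sessionID, cwd)
  const hit = fingerprintCache.get(key)
  if (hit) return hit
  const result: FingerprintResult = { cwd, tags: await scan(cwd) }
  fingerprintCache.set(key, result)
  return result
}

// Drop a finished session's entries so the cache does not grow without bound.
function endSession(sessionID: string): void {
  fingerprintCache.forEach((_value, key) => {
    const [owner] = JSON.parse(key) as [string, string]
    if (owner === sessionID) fingerprintCache.delete(key)
  })
}
```

A new session in the same directory gets a fresh scan, so a `databricks.yml` added between sessions would be picked up, while turns within one session still share a single detection.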

Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid. Location: packages/opencode/src/altimate/fingerprint/index.ts#L28 Potential issue: A module-level cache for environment fingerprints is keyed only by the current working directory (`cwd`). When a user starts a new session in the same directory, `Fingerprint.detect()` is called with the same `cwd` and returns the cached result from the previous session. If the project's dependencies or configuration (e.g., adding a `databricks.yml` file) changed between sessions, the stale cache is used. This leads to incorrect environment detection and subsequent inaccurate skill selection, as the system will not be aware of the updated project state.

packages/opencode/src/altimate/observability/tracing.ts

packages/opencode/src/altimate/skill-selector.ts

+    log.info("returning cached skill selection", {
+      count: cachedResult.length,
+    })
+    Tracer.active?.logSpan({
+      name: "skill-selection",
+      startTime,
+      endTime: Date.now(),
+      input: { fingerprint: fingerprint?.tags, source: "cache" },
+      output: { count: cachedResult.length, skills: cachedResult.map((s) => s.name) },
+    })
+    return cachedResult


sentry · 2026-03-17T01:19:51Z

🚧 Skipped: PR exceeds review size limit.

Please split into smaller PRs and re-run.
Reference ID: 11858376

sentry · 2026-03-17T01:24:54Z

🚧 Skipped: PR exceeds review size limit.

Please split into smaller PRs and re-run.
Reference ID: 11858525

sentry · 2026-03-17T01:27:45Z

🚧 Skipped: PR exceeds review size limit.

Please split into smaller PRs and re-run.
Reference ID: 11858643

Run LLM skill selector once per session using environment fingerprint, cache by working directory, and apply filtering to both system prompt and tool description. Adds tracing spans for fingerprint, skill selection, and system prompt.

- Use LLM.stream for skill selection (proper provider auth)
- Plain text response parsing (one skill name per line)
- Cache keyed by cwd; invalidates on project change
- Filter skills in both SystemPrompt.skills() and SkillTool
- Add env_fingerprint_skill_selection config (default: true)
- Trim fingerprint to data-engineering detections only
- Add tracing for fingerprint, skill-selection, and system-prompt spans

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
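
The commit message mentions plain-text response parsing with one skill name per line. A rough sketch of such a parser follows; `parseSelection` is a hypothetical name, and the handling of list markers, unknown names, and duplicates reflects assumptions about what a model might emit rather than the PR's actual behavior.

```typescript
interface Skill {
  name: string
}

// Hypothetical parser: map each non-empty line of the model's reply to a known skill.
function parseSelection(text: string, available: Skill[]): Skill[] {
  const byName = new Map<string, Skill>()
  for (const skill of available) byName.set(skill.name, skill)

  const seen = new Set<string>()
  const picked: Skill[] = []
  for (const line of text.split(/\r?\n/)) {
    // Tolerate a leading "- " or "* " in case the model answers with a bullet list.
    const name = line.trim().replace(/^[-*]\s+/, "")
    if (name === "" || seen.has(name)) continue
    const skill = byName.get(name)
    // Names the model invents are silently dropped.
    if (!skill) continue
    seen.add(name)
    picked.push(skill)
  }
  return picked
}
```

Matching against the known skill list means a hallucinated or misspelled name cannot leak into the system prompt.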

sentry · 2026-03-17T01:32:24Z

packages/opencode/src/altimate/fingerprint/index.ts

+      try {
+        const sqlFiles = await Glob.scan("*.sql", {
+          cwd: dir,
+          include: "file",
+        })
+        if (sqlFiles.length > 0) {


Bug: The glob pattern *.sql only scans the root directory for SQL files, failing to detect them in subdirectories, which is standard for dbt projects.
Severity: MEDIUM

Suggested Fix

To correctly identify SQL files in a typical dbt project structure, update the glob pattern in fingerprint/index.ts from *.sql to **/*.sql. This will enable a recursive search through all subdirectories.
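
To illustrate the difference between a root-only and a recursive search without depending on the repository's `Glob.scan` helper (whose exact signature is not shown here), the sketch below uses only Node's standard `fs` and `path` modules. `hasSqlFile`, the depth limit, and the skipped directory names are assumptions for illustration.

```typescript
import * as fs from "node:fs"
import * as path from "node:path"

// Read a directory, treating unreadable or missing directories as empty.
function listEntries(dir: string): fs.Dirent[] {
  try {
    return fs.readdirSync(dir, { withFileTypes: true })
  } catch {
    return []
  }
}

// Hypothetical detector: true if any .sql file exists within `maxDepth` levels of `dir`.
// maxDepth = 0 reproduces the root-only behavior of a "*.sql" pattern.
function hasSqlFile(dir: string, maxDepth = 4): boolean {
  const entries = listEntries(dir)
  for (const entry of entries) {
    if (entry.isFile() && entry.name.endsWith(".sql")) return true
  }
  if (maxDepth <= 0) return false
  for (const entry of entries) {
    if (!entry.isDirectory()) continue
    // Skip dependency and hidden directories to keep the scan cheap.
    if (entry.name === "node_modules" || entry.name.startsWith(".")) continue
    if (hasSqlFile(path.join(dir, entry.name), maxDepth - 1)) return true
  }
  return false
}
```

A bounded walk that stops at the first match is one way to limit the cost of a full `**/*.sql` scan in large repositories; whether that trade-off is needed depends on how the real glob helper behaves.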

Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid. Location: packages/opencode/src/altimate/fingerprint/index.ts#L114-L119 Potential issue: The SQL detection logic uses `Glob.scan("*.sql", ...)` to identify SQL files for project fingerprinting. This pattern does not recursively search subdirectories. Since dbt projects typically organize SQL models within subdirectories like `models/`, the detection will fail for these standard project structures. This results in an incomplete project fingerprint, as the `sql` tag will be missing. Consequently, the LLM skill selector receives inaccurate environment information, leading to suboptimal skill selection. The failure is silent due to an exception handler.