
Add impact analysis, training import tools, and CI validation features#350

Open
anandgupta42 wants to merge 3 commits into main from
claude/improve-product-value-IjYGP

Conversation

@anandgupta42 (Contributor) commented Mar 21, 2026

Summary

This PR adds three major features, along with CLI and onboarding improvements, to enhance dbt DAG analysis, team knowledge management, and CI/CD integration:

  1. Impact Analysis Tool (impact-analysis.ts): New tool that analyzes downstream impact of model/column changes across the dbt DAG. Combines manifest parsing with column-level lineage to show affected models, tests, and exposures. Includes severity classification (SAFE/LOW/MEDIUM/HIGH) and actionable recommendations for different change types (remove, rename, retype, add, modify).

  2. Training Import Tool (training-import.ts): Bulk import training entries from markdown documents (style guides, naming conventions, glossaries, playbooks). Parses markdown sections and imports them into the training system with capacity management and dry-run preview mode.

  3. CI Check Template (ci-check.txt): New command template for pre-merge SQL validation that analyzes changed files, checks dbt project integrity, runs schema validation, and generates CI-friendly reports.

  4. Enhanced CLI Features:

    • Added --max-turns budget limit option to run command for CI/enterprise governance
    • Improved yolo mode to respect explicit deny rules from session config
    • Increased training limits: TRAINING_MAX_PATTERNS_PER_KIND from 20 to 50 entries and TRAINING_BUDGET from 16KB to 48KB for enterprise teams
  5. Improved Onboarding:

    • Updated quickstart guide with practical examples and command reference table
    • Enhanced TUI tips with prioritized beginner and data engineering tips
    • Added first-time user onboarding hint on home screen
    • Expanded /discover command to detect additional cloud warehouse credentials (Snowflake, BigQuery, PostgreSQL, Databricks, Redshift)
  6. Tool Registry: Registered new ImpactAnalysisTool and TrainingImportTool in the tool registry.
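For reference, the SAFE/LOW/MEDIUM/HIGH classification in item 1 maps the count of affected downstream models to a severity label. A minimal sketch of the thresholds as they appear in the impact-analysis.ts diff reviewed later in this PR (0, ≤3, ≤10):

```typescript
// Blast-radius severity thresholds from the impact-analysis.ts diff:
// 0 affected models → SAFE, ≤3 → LOW, ≤10 → MEDIUM, otherwise HIGH.
type Severity = "SAFE" | "LOW" | "MEDIUM" | "HIGH"

function classifySeverity(totalAffected: number): Severity {
  if (totalAffected === 0) return "SAFE"
  if (totalAffected <= 3) return "LOW"
  if (totalAffected <= 10) return "MEDIUM"
  return "HIGH"
}
```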

Test Plan

  • Impact analysis tool tested via manifest parsing and downstream traversal logic with sample dbt projects
  • Training import tool tested with markdown parsing and capacity management
  • CLI max-turns enforcement tested with session abort logic
  • Yolo mode deny rule checking tested with permission reply logic
  • Existing tests pass; new tools follow established Tool.define() patterns

Checklist

  • New tools follow established patterns (Tool.define with zod schemas)
  • Documentation updated (quickstart.md, command templates)
  • Tool registry updated with new tools
  • Error handling and user feedback included
  • Enterprise limits increased for team scalability

https://claude.ai/code/session_01M6rR2wXn4PfMoUASy1qghV

Summary by CodeRabbit

  • New Features

    • Added impact analysis tool to assess downstream effects of model changes
    • New training import tool for persisting team-specific knowledge
    • CI check template for pre-merge validation workflows
    • Enhanced onboarding experience with beginner-focused tips and credential detection
    • CLI max-turns option for controlling session length
  • Improvements

    • Expanded data engineering command references in documentation
    • Increased training capacity for team knowledge retention
    • Enhanced permission handling with deny rules support

claude added 2 commits March 21, 2026 20:34
…g scaling, CI checks

- Fix first-time user experience: show tips for new users (was hidden),
  add onboarding hint with /connect and /discover guidance
- Add prioritized beginner tips for data engineers (12 curated tips
  shown to new users instead of random selection from 107)
- Increase training limits from 20→50 entries/kind and 16KB→48KB budget
  to support enterprise teams with rich glossaries and standards
- Add impact_analysis tool: dbt DAG-aware change assessment showing
  downstream blast radius, affected tests, and column-level impact
- Add training_import tool: bulk import from markdown docs (style guides,
  glossaries, playbooks) into the training system
- Add /ci-check skill template for pre-merge SQL quality validation
- Expand /discover to detect cloud warehouse credentials (Snowflake
  config, BigQuery service accounts, DATABASE_URL, Databricks tokens)
- Rewrite quickstart docs with progressive verification steps and
  feature discovery table

https://claude.ai/code/session_01M6rR2wXn4PfMoUASy1qghV
…rns budget

- Fix yolo mode to respect explicit deny rules from session config
  instead of blindly auto-approving all permissions. Deny rules now
  block even in yolo mode with a clear "BLOCKED by deny rule" message.
- Add --max-turns CLI flag for CI/headless budget enforcement. Aborts
  the session when the assistant exceeds the configured turn limit,
  preventing runaway agents from burning API credits indefinitely.

Addresses enterprise governance gaps identified by platform eng review:
teams of 15+ engineers running CI pipelines need spend controls and
safety guarantees that yolo mode won't bypass critical deny rules.

https://claude.ai/code/session_01M6rR2wXn4PfMoUASy1qghV
@claude bot left a comment

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review.


coderabbitai bot commented Mar 21, 2026

📝 Walkthrough

Walkthrough

This pull request introduces two new tools (impact_analysis for dbt DAG traversal and impact calculation, and training_import for markdown-based training imports), enhances CLI with max-turns limiting and deny-rule-based permission handling, upgrades training capacity limits, adds first-time user onboarding UI, and expands documentation and templates.

Changes

  • New dbt Impact Analysis Tool (packages/opencode/src/altimate/tools/impact-analysis.ts): implements a DAG-aware tool to compute downstream model dependencies, affected tests, and optional column-level lineage impact; includes blast-radius severity labels and formatted reporting with change-type-specific recommendations.
  • New Training Import Tool & Configuration (packages/opencode/src/altimate/tools/training-import.ts, packages/opencode/src/altimate/training/types.ts): adds a tool to import training entries from markdown files with dry-run preview, capacity checking, and budget tracking; increases TRAINING_MAX_PATTERNS_PER_KIND (20→50) and TRAINING_BUDGET (16000→48000).
  • Tool Registry Integration (packages/opencode/src/tool/registry.ts): registers ImpactAnalysisTool unconditionally and TrainingImportTool when training is enabled.
  • CLI Enhancements (packages/opencode/src/cli/cmd/run.ts): adds a --max-turns option to limit assistant turns; implements deny-rule-based permission blocking (checks session rules before auto-approving with --yolo).
  • First-Time User Onboarding (packages/opencode/src/cli/cmd/tui/component/tips.tsx, packages/opencode/src/cli/cmd/tui/routes/home.tsx): introduces a BEGINNER_TIPS array and conditional logic in the Tips component; adds a first-time-user UI block with /connect and /discover hints in the home route.
  • Documentation & Templates (docs/docs/quickstart.md, packages/opencode/src/command/template/ci-check.txt, packages/opencode/src/command/template/discover.txt): updates the quickstart with "Verify It Works" (Step 4) and "Explore Data Engineering Features" (Step 5); adds a new multi-step CI validation template; expands the discover template with credential detection and new suggested commands (/ci-check, /train).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Suggested labels

contributor

Suggested reviewers

  • suryaiyer95
  • mdesmet

Poem

🐰 Two shiny tools hop into view,
Impact paths and training too!
First-time users now see the way,
With deny rules and turns at play,
The docs dance bright—hooray, hooray! 🌟

🚥 Pre-merge checks: 2 passed, 1 failed

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 9.09%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Title check — ✅ Passed: the title clearly and specifically summarizes the main additions (impact analysis tool, training import tool, and CI validation features), which matches the primary changes.
  • Description check — ✅ Passed: the description includes all required template sections: a detailed Summary explaining the six areas of changes, a Test Plan covering major components, and a complete Checklist with items marked as done.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai bot left a comment

Actionable comments posted: 14

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/docs/quickstart.md`:
- Around line 76-90: Update the three fenced code blocks that currently lack a
language tag by adding the language specifier `text` to each opening fence;
specifically change the blocks containing the lines "What SQL anti-patterns does
this query have: SELECT * FROM orders o JOIN customers c ON o.id = c.order_id
WHERE UPPER(c.name) = 'ACME'", "Show me the tables in my warehouse", and "Scan
my dbt project and summarize the models" so their opening fences read ```text
instead of ``` to satisfy markdownlint.

In `@packages/opencode/src/altimate/tools/impact-analysis.ts`:
- Around line 92-145: When args.column is provided, narrow the reported blast
radius to only models that actually consume that column instead of using the
full downstream set: after computing columnImpact, derive a filteredDownstream
set (use columnImpact entries to identify affected downstream model names) and,
if filteredDownstream is non-empty, use it to compute totalAffected, severity,
title and metadata counts (direct/transitive/test counts and column_impact) and
to filter affectedTests to tests that target those filtered models; if
columnImpact is empty, fall back to the existing model-level analysis. Update
any usages of downstream, affectedTests, totalAffected, severity, and the
title/metadata before calling formatImpactReport so the report reflects
column-level consumers when args.column is set.
- Around line 10-14: The impact analysis currently collects models, tests and
columnImpact but never traverses manifest.exposures or manifest.sources, so
update the execution path (the code that builds the impact result alongside
columnImpact and impacted models/tests) to also iterate manifest.exposures and
manifest.sources, compute whether each exposure/source is affected by the same
lineage/columnImpact logic, and include them in the final result object (the
impact summary returned by the function that aggregates
models/tests/columnImpact). Use the same mapping/filtering logic you apply to
models and tests so exposures and sources are added to the impactedConsumers
list (or equivalent fields) and ensure the final output includes these new
arrays.
- Around line 57-90: Select and use the canonical manifest node ID throughout
the DAG traversal: first resolve targetModel by preferring an exact match on
m.name and if you must fall back to a leaf-name match capture its canonical id
into a variable (e.g. canonicalId = targetModel.name), then pass canonicalId
into findDownstream and use exact equality comparisons (===) when filtering
downstream nodes and affected tests (replace dep.includes(...) and d.name
comparisons with dep === canonicalId or d.name === dep). Apply the same exact-ID
comparison fix where downstream is later used (the section mentioned at 166-192)
so all DAG and test matching uses canonical manifest IDs only.

In `@packages/opencode/src/altimate/tools/training-import.ts`:
- Around line 113-119: The import currently truncates sections by slicing
content to 1800 chars in training-import.ts before calling TrainingStore.save,
causing silent data loss; change this so you either split long section.content
into chunks and call TrainingStore.save for each chunk (preserving
section.name/scope/source and adding a chunk index or suffix) or detect
content.length > 1800 and include an explicit truncation warning in the import
result/report instead of silently slicing; update the logic around the slice in
training-import.ts to use chunking or to surface a truncation flag, and keep
TrainingStore.save calls and identifiers (kind, name, scope, source, content)
consistent with the existing TrainingStore.save API.
- Around line 177-230: The parsing emits section objects using
slugify(currentName) which can collide across different H1 contexts (e.g., "#
Orders / ## Naming" vs "# Customers / ## Naming"); update the code paths that
push sections (all places calling slugify(currentName) — including the final
save and H2/H1 save blocks) to incorporate the currentH1 into the stored name
before slugifying (for example prefix or join H1 and H2 like `${currentH1} -
${currentName}`) or alternatively detect existing slugs in the sections array
and disambiguate (append a numeric suffix) to prevent overwrites; ensure slugify
is still used on the combined string so IDs remain normalized.

In `@packages/opencode/src/cli/cmd/run.ts`:
- Around line 620-625: The budget-overrun branch currently only aborts the
session and breaks the loop, letting execute() return success; change it so the
command fails: after calling sdk.session.abort({ sessionID }) in the error
branch where UI.println prints the danger message, rethrow the error (or throw a
new Error describing the budget overrun) so execute() does not return normally
and the process exits non-zero; locate the block around UI.println(...) and
sdk.session.abort({ sessionID }) in run.ts and ensure an exception is thrown
(rather than just break) to propagate failure to the caller.
- Around line 693-701: Replace the naive deny check that uses the local rules
array and String.includes with a proper check against the effective deny rules
and correct pattern matching: in the block that sets isDenied (currently
referencing rules and using permission.patterns.some with String.includes), use
the resolved effective rules collection (e.g., effectiveRules or whatever
variable holds the merged/active rules for the session) and test each deny
rule's pattern against the request's patterns with proper semantics (treat an
empty permission.patterns as a wildcard match, treat rule.pattern === "*" as
global deny, and perform real pattern matching rather than substring includes —
call the existing pattern-matching helper if present or implement a
matchRulePattern(rule.pattern, requestPattern) that handles wildcards and
exact/partial logic). Ensure the code checks deny rules even when
permission.patterns is empty so a global deny still applies.

In `@packages/opencode/src/cli/cmd/tui/component/tips.tsx`:
- Around line 50-53: The Tips component currently reads props.isFirstTime into
top-level consts (pool and parts) which capture the initial value and lose
reactivity; change those to Solid createMemo hooks so they recompute when
props.isFirstTime changes: replace the top-level const pool = ... with const
pool = createMemo(() => props.isFirstTime ? BEGINNER_TIPS : TIPS) and replace
parts = parse(...) with const parts = createMemo(() =>
parse(pool()[Math.floor(Math.random() * pool().length)])); update downstream
usages to call pool() and parts() accordingly so the tip selection reacts to
prop changes.

In `@packages/opencode/src/cli/cmd/tui/routes/home.tsx`:
- Around line 43-45: The showTips memo currently returns !tipsHidden()
immediately, causing a flash before the KV store is hydrated; update the memo to
also check the KV hydration state from useKV() (e.g., kv.ready / kv.hydrated /
kv.isHydrated) so it only shows tips when the store is ready—i.e., change
createMemo(() => !tipsHidden()) to something like createMemo(() =>
kv.isHydrated() && !tipsHidden()) using the kv instance returned by useKV() to
avoid the initial flash for users who previously hid tips.

In `@packages/opencode/src/command/template/ci-check.txt`:
- Around line 16-20: The current flow passes changed file paths directly to
impact_analysis, causing "MODEL NOT FOUND" errors; after running dbt_manifest
inside the dbt-detection branch (i.e., where project_scan and dbt_manifest are
invoked), build a mapping from manifest nodes to their canonical dbt model names
(using node.path/file info, alias, package and unique identifier as needed),
then for each changed file path resolve it to the manifest node/model name and
call impact_analysis with that resolved model name; if resolution fails, log a
clear warning and skip that file to avoid false negatives.
- Around line 3-5: Replace the brittle `git diff --name-only HEAD~1...HEAD`
invocation with a merge-base-aware approach: resolve the PR merge base from the
CI environment variable if present (e.g., GITHUB_BASE_REF or a provided
MERGE_BASE) and otherwise compute it via `git merge-base origin/main HEAD`, then
run `git diff --name-only <merge-base>...HEAD -- '*.sql' '*.yml' '*.yaml'`;
update the template text in ci-check.txt where the diff command appears so it
documents using the resolved merge base fallback instead of `HEAD~1`.

In `@packages/opencode/src/command/template/discover.txt`:
- Around line 16-26: Update the Step 3 text in the discover template to never
echo raw credential material: when inspecting items listed in Step 3 (e.g.,
~/.snowsql/config, GOOGLE_APPLICATION_CREDENTIALS, DATABASE_URL,
PGHOST/PGUSER/PGDATABASE, DATABRICKS_HOST/TOKEN, ~/.bigqueryrc,
~/.aws/credentials) redact secrets and tokens and only present non-sensitive
connection metadata (host/account/project/profile name, connection type, and
source file or env var) and clearly mask any strings that contain secrets (e.g.,
replace with "<REDACTED>"); ask the user whether to add the connection using
that redacted metadata and ensure the template explicitly instructs agents not
to log or display full connection strings, private keys, or tokens.
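To make the run.ts deny-rule suggestion concrete, here is a minimal sketch of the pattern-matching semantics the comment describes; matchRulePattern and isDenied are hypothetical names illustrating the suggested behavior, not the PR's actual code:

```typescript
// Hypothetical wildcard matcher for permission deny rules, following the
// semantics in the review comment: "*" is a global deny, a trailing "*"
// matches by prefix, anything else must match exactly.
function matchRulePattern(rulePattern: string, requestPattern: string): boolean {
  if (rulePattern === "*") return true // global deny
  if (rulePattern.endsWith("*")) {
    return requestPattern.startsWith(rulePattern.slice(0, -1))
  }
  return rulePattern === requestPattern
}

function isDenied(denyPatterns: string[], requestPatterns: string[]): boolean {
  // An empty request pattern list is treated as a wildcard request,
  // so any configured deny rule still applies.
  if (requestPatterns.length === 0) return denyPatterns.length > 0
  return denyPatterns.some((rule) =>
    requestPatterns.some((req) => matchRulePattern(rule, req)),
  )
}
```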
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 6068526c-0c67-4e4b-b33a-e0fe98c9cc09

📥 Commits

Reviewing files that changed from the base of the PR and between 623d1ac and 6140b8f.

📒 Files selected for processing (10)
  • docs/docs/quickstart.md
  • packages/opencode/src/altimate/tools/impact-analysis.ts
  • packages/opencode/src/altimate/tools/training-import.ts
  • packages/opencode/src/altimate/training/types.ts
  • packages/opencode/src/cli/cmd/run.ts
  • packages/opencode/src/cli/cmd/tui/component/tips.tsx
  • packages/opencode/src/cli/cmd/tui/routes/home.tsx
  • packages/opencode/src/command/template/ci-check.txt
  • packages/opencode/src/command/template/discover.txt
  • packages/opencode/src/tool/registry.ts

Comment on lines +76 to +90
```
What SQL anti-patterns does this query have: SELECT * FROM orders o JOIN customers c ON o.id = c.order_id WHERE UPPER(c.name) = 'ACME'
```

Look at my snowflake account and do a comprehensive Analysis our Snowflake credit consumption over the last 30 days. After doing this generate a dashboard for my consumption.
If you connected a warehouse with `/discover`, try:

```

Show me the tables in my warehouse
```

Build me a real time, interactive dashboard for my macbook system metrics and health. Use python, iceberg, dbt for various time slices.
If you have a dbt project, try:

```
Scan my dbt project and summarize the models
```

⚠️ Potential issue | 🟡 Minor

Add languages to these fenced examples.

markdownlint is already flagging Lines 76, 82, and 88 for missing fence languages. Mark them as text so the docs stay lint-clean.

Suggested markdown fix
-```
+```text
 What SQL anti-patterns does this query have: SELECT * FROM orders o JOIN customers c ON o.id = c.order_id WHERE UPPER(c.name) = 'ACME'

@@
-```
+```text
Show me the tables in my warehouse

@@
-```
+```text
Scan my dbt project and summarize the models
🧰 Tools

🪛 markdownlint-cli2 (0.21.0)

[warning] 76-76: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

[warning] 82-82: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

[warning] 88-88: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@docs/docs/quickstart.md` around lines 76 - 90, update the three fenced code
blocks that currently lack a language tag by adding the language specifier
`text` to each opening fence; specifically change the blocks containing the
lines "What SQL anti-patterns does this query have: SELECT * FROM orders o JOIN
customers c ON o.id = c.order_id WHERE UPPER(c.name) = 'ACME'", "Show me the
tables in my warehouse", and "Scan my dbt project and summarize the models" so
their opening fences read ```text instead of ``` to satisfy markdownlint.




Comment on lines +10 to +14
description: [
"Analyze the downstream impact of a model or column change across the dbt DAG.",
"Combines dbt manifest parsing with column-level lineage to show all affected",
"models, tests, exposures, and sources. Use before making breaking changes to",
"understand blast radius.",

⚠️ Potential issue | 🟠 Major

The tool never reports exposures or sources.

The description says this covers affected “exposures, and sources”, but the execution path only gathers models, tests, and columnImpact. That under-reports real consumers of breaking changes and makes the output misleading until manifest.exposures and sources are traversed too.

Also applies to: 115-145

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opencode/src/altimate/tools/impact-analysis.ts` around lines 10 -
14, The impact analysis currently collects models, tests and columnImpact but
never traverses manifest.exposures or manifest.sources, so update the execution
path (the code that builds the impact result alongside columnImpact and impacted
models/tests) to also iterate manifest.exposures and manifest.sources, compute
whether each exposure/source is affected by the same lineage/columnImpact logic,
and include them in the final result object (the impact summary returned by the
function that aggregates models/tests/columnImpact). Use the same
mapping/filtering logic you apply to models and tests so exposures and sources
are added to the impactedConsumers list (or equivalent fields) and ensure the
final output includes these new arrays.

Comment on lines +57 to +90
const targetModel = manifest.models.find(
(m: { name: string }) => m.name === args.model || m.name.endsWith(`.${args.model}`),
)

if (!targetModel) {
const available = manifest.models
.slice(0, 10)
.map((m: { name: string }) => m.name)
.join(", ")
return {
title: "Impact: MODEL NOT FOUND",
metadata: { success: false },
output: `Model "${args.model}" not found in manifest. Available models: ${available}${manifest.models.length > 10 ? ` (+${manifest.models.length - 10} more)` : ""}`,
}
}

// Step 3: Build the dependency graph and find all downstream models
const modelsByName = new Map<string, any>()
for (const m of manifest.models) {
modelsByName.set(m.name, m)
}

// Find all models that depend on the target (direct + transitive)
const downstream = findDownstream(args.model, manifest.models)
const direct = downstream.filter((d) => d.depth === 1)
const transitive = downstream.filter((d) => d.depth > 1)

// Step 4: Find affected tests
const affectedTests = (manifest.tests ?? []).filter((t: { depends_on: string[] }) =>
t.depends_on?.some(
(dep: string) =>
dep.includes(args.model) || downstream.some((d) => dep.includes(d.name)),
),
)

⚠️ Potential issue | 🟠 Major

Use exact manifest IDs for DAG traversal.

This mixes endsWith, split(".").pop(), and includes() against model names. In multi-package dbt repos that conflates nodes sharing the same leaf name, and the substring check can even match unrelated IDs like stg_orders_archive, so the reported downstream models/tests can be wrong. Resolve the selected node's canonical manifest ID once and compare exact IDs throughout traversal.

Also applies to: 166-192

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opencode/src/altimate/tools/impact-analysis.ts` around lines 57 -
90, Select and use the canonical manifest node ID throughout the DAG traversal:
first resolve targetModel by preferring an exact match on m.name and if you must
fall back to a leaf-name match capture its canonical id into a variable (e.g.
canonicalId = targetModel.name), then pass canonicalId into findDownstream and
use exact equality comparisons (===) when filtering downstream nodes and
affected tests (replace dep.includes(...) and d.name comparisons with dep ===
canonicalId or d.name === dep). Apply the same exact-ID comparison fix where
downstream is later used (the section mentioned at 166-192) so all DAG and test
matching uses canonical manifest IDs only.
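A sketch of what exact-ID traversal could look like, assuming a simplified node shape; the real manifest types and the PR's findDownstream signature may differ:

```typescript
// Hypothetical BFS over dbt-manifest-like dependency edges using exact
// ID equality throughout, as the review suggests, instead of
// endsWith()/includes() leaf-name matching.
interface ManifestNode {
  name: string // canonical manifest ID, e.g. "model.my_pkg.stg_orders"
  depends_on: string[]
}

function findDownstreamExact(
  canonicalId: string,
  models: ManifestNode[],
): { name: string; depth: number }[] {
  const result: { name: string; depth: number }[] = []
  const visited = new Set<string>([canonicalId])
  let frontier = new Set<string>([canonicalId])
  let depth = 0
  while (frontier.size > 0) {
    depth += 1
    const next = new Set<string>()
    for (const m of models) {
      // exact equality against the frontier's canonical IDs only
      if (!visited.has(m.name) && m.depends_on.some((dep) => frontier.has(dep))) {
        visited.add(m.name)
        result.push({ name: m.name, depth })
        next.add(m.name)
      }
    }
    frontier = next
  }
  return result
}
```

Because comparisons are exact, an unrelated node like `model.pkg.stg_orders_archive` can no longer be pulled in by a substring match on `stg_orders`.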

Comment on lines +92 to +145
// Step 5: If column specified, attempt column-level lineage
let columnImpact: string[] = []
if (args.column && targetModel.sql) {
try {
const lineageResult = await Dispatcher.call("lineage.check", {
sql: targetModel.sql,
dialect: args.dialect,
})
if (lineageResult.data?.column_dict) {
// Find which downstream columns reference our target column
for (const [outCol, sources] of Object.entries(lineageResult.data.column_dict)) {
const srcArray = Array.isArray(sources) ? sources : [sources]
if (srcArray.some((s: any) => JSON.stringify(s).includes(args.column!))) {
columnImpact.push(outCol)
}
}
}
} catch {
// Column lineage is best-effort — continue without it
}
}

// Step 6: Format the impact report
const output = formatImpactReport({
model: args.model,
column: args.column,
changeType: args.change_type,
direct,
transitive,
affectedTests,
columnImpact,
totalModels: manifest.model_count,
})

const totalAffected = downstream.length
const severity =
totalAffected === 0
? "SAFE"
: totalAffected <= 3
? "LOW"
: totalAffected <= 10
? "MEDIUM"
: "HIGH"

return {
title: `Impact: ${severity} — ${totalAffected} downstream model${totalAffected !== 1 ? "s" : ""} affected`,
metadata: {
success: true,
severity,
direct_count: direct.length,
transitive_count: transitive.length,
test_count: affectedTests.length,
column_impact: columnImpact.length,
},

⚠️ Potential issue | 🟠 Major

Column-specific runs still report model-level blast radius.

When args.column is set, downstream, affectedTests, and the severity/title are still computed from every descendant of the model before any column filtering happens. A rename or removal of an unused column will therefore be reported as impacting the whole subtree; this needs to narrow the downstream set to actual column consumers or fall back to model-level analysis only when no column is provided.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opencode/src/altimate/tools/impact-analysis.ts` around lines 92 -
145, When args.column is provided, narrow the reported blast radius to only
models that actually consume that column instead of using the full downstream
set: after computing columnImpact, derive a filteredDownstream set (use
columnImpact entries to identify affected downstream model names) and, if
filteredDownstream is non-empty, use it to compute totalAffected, severity,
title and metadata counts (direct/transitive/test counts and column_impact) and
to filter affectedTests to tests that target those filtered models; if
columnImpact is empty, fall back to the existing model-level analysis. Update
any usages of downstream, affectedTests, totalAffected, severity, and the
title/metadata before calling formatImpactReport so the report reflects
column-level consumers when args.column is set.
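A minimal sketch of the suggested narrowing, assuming column lineage can be mapped to the downstream model names that consume the column; the helper name and shapes are illustrative, not the PR's code:

```typescript
// Narrow the blast radius to column-level consumers, falling back to the
// full model-level downstream set when column lineage found nothing.
interface Downstream {
  name: string
  depth: number
}

function narrowToColumnConsumers(
  downstream: Downstream[],
  columnConsumerNames: string[], // model names derived from columnImpact
): Downstream[] {
  if (columnConsumerNames.length === 0) return downstream // model-level fallback
  const consumers = new Set(columnConsumerNames)
  return downstream.filter((d) => consumers.has(d.name))
}
```

The narrowed set would then drive totalAffected, severity, and the affectedTests filter before formatImpactReport is called.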

Comment on lines +113 to +119
await TrainingStore.save({
kind: args.kind,
name: section.name,
scope: args.scope,
content: section.content.slice(0, 1800), // Enforce max content length
source: args.file_path,
})

⚠️ Potential issue | 🟠 Major

Imported sections are silently truncated at 1800 chars.

Line 117 slices every section before save, but packages/opencode/src/altimate/training/store.ts:28-74 shows TrainingStore.save() accepts the full content it is given. That means long standards and playbooks are imported incomplete with no user-visible warning; either chunk long sections or report truncation explicitly in the result.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opencode/src/altimate/tools/training-import.ts` around lines 113 -
119, The import currently truncates sections by slicing content to 1800 chars in
training-import.ts before calling TrainingStore.save, causing silent data loss;
change this so you either split long section.content into chunks and call
TrainingStore.save for each chunk (preserving section.name/scope/source and
adding a chunk index or suffix) or detect content.length > 1800 and include an
explicit truncation warning in the import result/report instead of silently
slicing; update the logic around the slice in training-import.ts to use chunking
or to surface a truncation flag, and keep TrainingStore.save calls and
identifiers (kind, name, scope, source, content) consistent with the existing
TrainingStore.save API.
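The chunking alternative could look like this sketch; the 1800-character limit comes from the code above, while the part-suffix naming convention is an assumption:

```typescript
// Split long section content into chunks of at most maxLen characters,
// suffixing the section name with a part index so each TrainingStore.save
// call stays under the per-entry limit without silently dropping text.
function chunkSection(
  name: string,
  content: string,
  maxLen = 1800,
): { name: string; content: string }[] {
  if (content.length <= maxLen) return [{ name, content }]
  const chunks: { name: string; content: string }[] = []
  for (let i = 0; i * maxLen < content.length; i++) {
    chunks.push({
      name: `${name} (part ${i + 1})`,
      content: content.slice(i * maxLen, (i + 1) * maxLen),
    })
  }
  return chunks
}
```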

Comment on lines +50 to +53
export function Tips(props: { isFirstTime?: boolean }) {
const theme = useTheme().theme
const parts = parse(TIPS[Math.floor(Math.random() * TIPS.length)])
const pool = props.isFirstTime ? BEGINNER_TIPS : TIPS
const parts = parse(pool[Math.floor(Math.random() * pool.length)])

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

In SolidJS, if a component reads `props.isFirstTime` into a top-level local const instead of inside `createMemo` or JSX, does that derived value update when the parent signal changes?

💡 Result:

No. In SolidJS, component functions execute only once during initial setup to establish reactivity, so a top-level const assignment like `const isFirst = props.isFirstTime` captures the initial value statically and never re-evaluates. Props are reactive via getters, but accessing them outside a tracking scope (JSX, createEffect, createMemo) creates no subscription, so changes do not propagate. To make the value reactive:
- use it directly in JSX: `{props.isFirstTime}`
- wrap it in createMemo: `const isFirst = createMemo(() => props.isFirstTime)`
- or expose it as an accessor: `const isFirst = () => props.isFirstTime`
This is confirmed across official docs and discussions: components run once, reactivity is tracked only in primitives and JSX, and destructuring or top-level reads lose reactivity.


Wrap pool and parts in createMemo to track isFirstTime changes.

Lines 50–53 read props.isFirstTime into a top-level const, which captures the initial value and loses reactivity. If the parent's first-time flag changes after hydration, the component stays stuck on the wrong tip set. Use createMemo to make the pool selection reactive.

Proposed fix
 export function Tips(props: { isFirstTime?: boolean }) {
   const theme = useTheme().theme
-  const pool = props.isFirstTime ? BEGINNER_TIPS : TIPS
-  const parts = parse(pool[Math.floor(Math.random() * pool.length)])
+  const parts = createMemo(() => {
+    const pool = props.isFirstTime ? BEGINNER_TIPS : TIPS
+    return parse(pool[Math.floor(Math.random() * pool.length)])
+  })
 
   return (
     <box flexDirection="row" maxWidth="100%">
       <text flexShrink={0} style={{ fg: theme.warning }}>
         ● Tip{" "}
       </text>
       <text flexShrink={1}>
-        <For each={parts}>
+        <For each={parts()}>
           {(part) => <span style={{ fg: part.highlight ? theme.text : theme.textMuted }}>{part.text}</span>}
         </For>
       </text>
     </box>
   )
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opencode/src/cli/cmd/tui/component/tips.tsx` around lines 50 - 53,
the Tips component currently reads props.isFirstTime into top-level consts (pool
and parts) which capture the initial value and lose reactivity; change those to
Solid createMemo hooks so they recompute when props.isFirstTime changes: replace
the top-level const pool = ... with const pool = createMemo(() =>
props.isFirstTime ? BEGINNER_TIPS : TIPS) and replace parts = parse(...) with
const parts = createMemo(() => parse(pool()[Math.floor(Math.random() *
pool().length)])); update downstream usages to call pool() and parts()
accordingly so the tip selection reacts to prop changes.

Comment on lines 43 to 45
 const showTips = createMemo(() => {
-  // Don't show tips for first-time users
-  if (isFirstTimeUser()) return false
+  // Always show tips — first-time users need guidance the most
+  return !tipsHidden()

⚠️ Potential issue | 🟡 Minor

Wait for KV hydration before showing tips.

useKV() hydrates kv.json asynchronously, so tipsHidden() is temporarily false on startup. With this change, users who already hid tips will still see them flash until the store becomes ready.

Proposed fix
   const showTips = createMemo(() => {
-    // Always show tips — first-time users need guidance the most
-    return !tipsHidden()
+    // Respect the persisted preference only after KV hydration completes.
+    return kv.ready && !tipsHidden()
   })
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-const showTips = createMemo(() => {
-  // Don't show tips for first-time users
-  if (isFirstTimeUser()) return false
-  // Always show tips — first-time users need guidance the most
-  return !tipsHidden()
+const showTips = createMemo(() => {
+  // Respect the persisted preference only after KV hydration completes.
+  return kv.ready && !tipsHidden()
+})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opencode/src/cli/cmd/tui/routes/home.tsx` around lines 43 - 45, the
showTips memo currently returns !tipsHidden() immediately, causing a flash
before the KV store is hydrated; update the memo to also check the KV hydration
state from useKV() (e.g., kv.ready / kv.hydrated / kv.isHydrated) so it only
shows tips when the store is ready—i.e., change createMemo(() => !tipsHidden())
to something like createMemo(() => kv.isHydrated() && !tipsHidden()) using the
kv instance returned by useKV() to avoid the initial flash for users who
previously hid tips.

Comment on lines +3 to +5
Step 1 — Identify changed files:
Run bash to find changed files: `git diff --name-only HEAD~1...HEAD -- '*.sql' '*.yml' '*.yaml'`
If a base branch is available (e.g., main), use: `git diff --name-only origin/main...HEAD -- '*.sql' '*.yml' '*.yaml'`

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "HEAD~1...HEAD"
git diff --name-only HEAD~1...HEAD -- '*.sql' '*.yml' '*.yaml' || true

base="$(git merge-base origin/main HEAD 2>/dev/null || true)"
if [ -n "$base" ]; then
  echo
  echo "$base...HEAD"
  git diff --name-only "$base"...HEAD -- '*.sql' '*.yml' '*.yaml'
fi

Repository: AltimateAI/altimate-code

Length of output: 134


🏁 Script executed:

# Check if the file exists and read its contents
if [ -f "packages/opencode/src/command/template/ci-check.txt" ]; then
  echo "File exists. Reading contents:"
  cat -n "packages/opencode/src/command/template/ci-check.txt"
else
  echo "File not found at: packages/opencode/src/command/template/ci-check.txt"
  # Try to find it
  find . -name "ci-check.txt" -type f 2>/dev/null || echo "File not found in repository"
fi

Repository: AltimateAI/altimate-code

Length of output: 2330


Use the PR merge base instead of HEAD~1 to identify changed files.

git diff HEAD~1...HEAD only compares the most recent commit to its parent, missing SQL/dbt files changed in earlier commits on a multi-commit PR. Use the PR merge-base instead: resolve it from the CI environment variable (if available) or fall back to git merge-base origin/main HEAD, then diff against that commit.

Suggested template tweak
-Step 1 — Identify changed files:
-Run bash to find changed files: `git diff --name-only HEAD~1...HEAD -- '*.sql' '*.yml' '*.yaml'`
-If a base branch is available (e.g., main), use: `git diff --name-only origin/main...HEAD -- '*.sql' '*.yml' '*.yaml'`
+Step 1 — Identify changed files:
+Resolve the PR base first (prefer the CI-provided base SHA, otherwise `git merge-base origin/main HEAD`).
+Run bash to find changed files: `git diff --name-only <base>...HEAD -- '*.sql' '*.yml' '*.yaml'`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opencode/src/command/template/ci-check.txt` around lines 3 - 5,
replace the brittle `git diff --name-only HEAD~1...HEAD` invocation with a
merge-base-aware approach: resolve the PR merge base from the CI environment
variable if present (e.g., GITHUB_BASE_REF or a provided MERGE_BASE) and
otherwise compute it via `git merge-base origin/main HEAD`, then run `git diff
--name-only <merge-base>...HEAD -- '*.sql' '*.yml' '*.yaml'`; update the
template text in ci-check.txt where the diff command appears so it documents
using the resolved merge base fallback instead of `HEAD~1`.
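The resolution order the prompt describes might be sketched as a small helper; `resolveDiffBase` and the `MERGE_BASE` variable are illustrative (only `GITHUB_BASE_REF` is a standard GitHub Actions variable), and the caller would still shell out to git for the fallback:

```typescript
// Pick the diff base for CI: prefer the CI-provided base branch,
// otherwise signal a merge-base computation against origin/main.
export function resolveDiffBase(env: Record<string, string | undefined>): string {
  // GitHub Actions exposes the PR's target branch name in GITHUB_BASE_REF.
  if (env.GITHUB_BASE_REF) return `origin/${env.GITHUB_BASE_REF}`
  // Some CI systems inject a precomputed merge-base SHA (illustrative name).
  if (env.MERGE_BASE) return env.MERGE_BASE
  // Sentinel telling the caller to run `git merge-base origin/main HEAD`.
  return "merge-base:origin/main"
}

// The caller would then run:
//   git diff --name-only <base>...HEAD -- '*.sql' '*.yml' '*.yaml'
```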

Comment on lines +16 to +20
Step 3 — Check dbt project integrity (if dbt detected):
- Run `project_scan` to detect dbt project
- If manifest exists, run `dbt_manifest` to parse it
- For each changed model, run `impact_analysis` to assess downstream risk
- Flag any changes with HIGH blast radius (>10 downstream models)

⚠️ Potential issue | 🟠 Major

Resolve dbt model names from the manifest before calling impact_analysis.

This flow only has changed file paths, but impact_analysis takes a dbt model name. Aliases, packages, and nested paths mean basename and model name often differ, so this will produce false “MODEL NOT FOUND” results unless the manifest is used to map each changed file to its node first.

Suggested template tweak
 - Run `project_scan` to detect dbt project
 - If manifest exists, run `dbt_manifest` to parse it
-- For each changed model, run `impact_analysis` to assess downstream risk
+- Map each changed file path to the corresponding dbt model name(s) from the manifest
+- For each resolved model name, run `impact_analysis` to assess downstream risk
 - Flag any changes with HIGH blast radius (>10 downstream models)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opencode/src/command/template/ci-check.txt` around lines 16 - 20,
the current flow passes changed file paths directly to impact_analysis, causing
"MODEL NOT FOUND" errors; after running dbt_manifest inside the dbt-detection
branch (i.e., where project_scan and dbt_manifest are invoked), build a mapping
from manifest nodes to their canonical dbt model names (using node.path/file
info, alias, package and unique identifier as needed), then for each changed
file path resolve it to the manifest node/model name and call impact_analysis
with that resolved model name; if resolution fails, log a clear warning and skip
that file to avoid false negatives.
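A minimal sketch of that mapping, assuming the standard dbt manifest layout where each entry under `nodes` carries `resource_type`, `name`, and `original_file_path` (`buildPathToModelMap` is a hypothetical helper):

```typescript
// Minimal manifest node shape (subset of dbt's manifest.json schema).
interface ManifestNode {
  resource_type: string
  name: string
  original_file_path: string
}

// Build a lookup from file path to model name(s) so changed files can be
// resolved to real dbt model names before calling impact_analysis.
export function buildPathToModelMap(nodes: Record<string, ManifestNode>): Map<string, string[]> {
  const map = new Map<string, string[]>()
  for (const node of Object.values(nodes)) {
    if (node.resource_type !== "model") continue
    const existing = map.get(node.original_file_path) ?? []
    existing.push(node.name)
    map.set(node.original_file_path, existing)
  }
  return map
}
```

A changed file with no entry in the map would then be logged and skipped rather than passed through as a bogus model name.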

Comment on lines +16 to +26
Step 3 — Check for additional cloud warehouse credentials:
Beyond what project_scan found, also check for:
- `~/.snowsql/config` for Snowflake connections (parse [connections] sections)
- `GOOGLE_APPLICATION_CREDENTIALS` env var for BigQuery service accounts
- `DATABASE_URL` env var for PostgreSQL/MySQL/Redshift connection strings
- `PGHOST`/`PGUSER`/`PGDATABASE` env vars for PostgreSQL connections
- `DATABRICKS_HOST`/`DATABRICKS_TOKEN` env vars for Databricks
- `~/.bigqueryrc` or `gcloud` config for BigQuery project ID
- `~/.aws/credentials` for Redshift connections (if combined with cluster endpoint env vars)

Present any new credentials found and ask the user if they want to add them.

⚠️ Potential issue | 🔴 Critical

Never echo raw credential material in /discover.

This step tells the agent to inspect token-bearing files and env vars and then “present any new credentials found”. That can leak passwords, access tokens, private keys, or full connection strings into the transcript and logs. Only show redacted connection metadata such as host/account/project/profile name when asking whether to add a connection.

Suggested template tweak
- Present any new credentials found and ask the user if they want to add them.
+ Present any new connections found using redacted metadata only (host/account/project/profile name).
+ Never print raw tokens, passwords, private keys, service-account JSON, or full connection strings.
+ Ask the user if they want to add the redacted connection.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/opencode/src/command/template/discover.txt` around lines 16 - 26,
update the Step 3 text in the discover template to never echo raw credential
material: when inspecting items listed in Step 3 (e.g., ~/.snowsql/config,
GOOGLE_APPLICATION_CREDENTIALS, DATABASE_URL, PGHOST/PGUSER/PGDATABASE,
DATABRICKS_HOST/TOKEN, ~/.bigqueryrc, ~/.aws/credentials) redact secrets and
tokens and only present non-sensitive connection metadata
(host/account/project/profile name, connection type, and source file or env var)
and clearly mask any strings that contain secrets (e.g., replace with
"<REDACTED>"); ask the user whether to add the connection using that redacted
metadata and ensure the template explicitly instructs agents not to log or
display full connection strings, private keys, or tokens.
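For URL-style values such as `DATABASE_URL`, the redaction could look roughly like this (a sketch; `redactConnectionString` is a hypothetical helper, and WHATWG `URL` parsing is assumed for the connection-string shape):

```typescript
// Redact the credential portion of a URL-style connection string
// while keeping host/port/database for display.
export function redactConnectionString(raw: string): string {
  try {
    const url = new URL(raw)
    const db = url.pathname.replace(/^\//, "")
    // Keep locator metadata, mask the secret.
    return `${url.protocol}//${url.username || "<user>"}:<REDACTED>@${url.host}/${db}`
  } catch {
    // Not URL-shaped: never echo it at all.
    return "<REDACTED>"
  }
}
```

The same rule generalizes to the other sources in Step 3: show the file or env var a connection came from plus host/account/project/profile name, and mask everything else.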
