feat: Add HCL/Terraform language support #109

Open
kenrick-g wants to merge 2 commits into justrach:main from kenrick-g:feature/hcl-support

@kenrick-g

Closes #108

Summary

  • Add hcl to Language enum with detection for .tf, .tfvars, .hcl extensions
  • Implement parseHclLine() recognising 7 top-level block types: resource, data, module, variable, output, provider, locals/terraform/moved
  • Record module source = "..." as imports for dependency graph
  • Add HCL comment support (#, //, /* */) to isCommentOrBlank
  • Add .terraform and .terragrunt-cache to skip_dirs in watcher.zig
  • Add 11 test cases covering all block types, comments, findSymbol, and language detection
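The extension-based detection in the first bullet could look roughly like the sketch below. It is only an illustration: the trimmed-down `Language` enum and the `detectLanguage` helper are hypothetical stand-ins, not the code in src/explore.zig.

```zig
const std = @import("std");

// Hypothetical, trimmed-down stand-in for the project's Language enum.
const Language = enum { hcl, unknown };

/// Map a file path to a language by its extension.
/// Only the three HCL extensions named in the summary are handled here.
fn detectLanguage(path: []const u8) Language {
    const hcl_exts = [_][]const u8{ ".tf", ".tfvars", ".hcl" };
    for (hcl_exts) |ext| {
        if (std.mem.endsWith(u8, path, ext)) return .hcl;
    }
    return .unknown;
}
```

Matching on the full suffix keeps `.tf` and `.tfvars` distinct without any extra parsing of the path.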

Files touched

| File | Lines | Change |
|---|---|---|
| src/watcher.zig | +2 | Add .terraform, .terragrunt-cache to skip_dirs |
| src/explore.zig | +240 | Language enum, detection, parseHclLine(), helpers |
| src/tests.zig | +251 | 11 HCL test cases |
| Total | +493 | |

Tests run

$ zig build test
(exit 0, all tests pass including 11 new HCL tests)

Before (on a ~20K-file Terraform monorepo)

  • 19,487 files indexed (17K+ from .terraform/ caches)
  • .tf files detected as unknown, 0 symbols extracted
  • outline, symbol, deps return nothing for Terraform files
  • Search results polluted by cached provider/module files

After

  • 1,836 files indexed (.terraform/ excluded)
  • 93ms snapshot time (down from 18s)
  • Full symbol extraction: resource, data, variable, output, module, provider, locals, terraform
  • find azurerm_kubernetes_cluster.main → exact match at cluster.tf:60
  • Clean search results, no cache noise
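To illustrate how a lookup such as `azurerm_kubernetes_cluster.main` can resolve, here is a minimal sketch that turns a `resource` header line into a `type.name` symbol. The helpers `nextLabel` and `resourceSymbol` are hypothetical, and the assumption that symbols are stored in `type.name` form is inferred from the `find` example above, not taken from the PR's code.

```zig
const std = @import("std");

/// Return the contents of the next double-quoted label in `rest`
/// and advance `rest` past its closing quote.
fn nextLabel(rest: *[]const u8) ?[]const u8 {
    const open = std.mem.indexOfScalar(u8, rest.*, '"') orelse return null;
    const after = rest.*[open + 1 ..];
    const close = std.mem.indexOfScalar(u8, after, '"') orelse return null;
    rest.* = after[close + 1 ..];
    return after[0..close];
}

/// Build "type.name" from a line such as `resource "type" "name" {`.
/// Returns null for any other line or if `buf` is too small.
fn resourceSymbol(line: []const u8, buf: []u8) ?[]const u8 {
    const trimmed = std.mem.trim(u8, line, " \t");
    if (!std.mem.startsWith(u8, trimmed, "resource ")) return null;
    var rest: []const u8 = trimmed["resource ".len..];
    const kind = nextLabel(&rest) orelse return null;
    const name = nextLabel(&rest) orelse return null;
    const out = std.fmt.bufPrint(buf, "{s}.{s}", .{ kind, name }) catch return null;
    return out;
}
```

The sketch covers only `resource` blocks; `data` blocks also carry two labels, while `variable`, `output`, `module`, and `provider` carry one, so a real parser would branch on the block keyword first.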

Nearby non-regression

Existing Python, TypeScript, Rust, and Zig parser tests all pass unchanged. No modifications to existing parser logic.

Branch

Rebased onto current main. No generated files, no lockfile changes.

Confirmation

This submission matches CONTRIBUTING.md.

Tested on a ~20K-file Terraform monorepo: file count dropped from 19,487 to 1,836,
snapshot time from 18s to 93ms, and symbol extraction works across all HCL block types.
@kenrick-g changed the title from "Add HCL/Terraform language support" to "feat: Add HCL/Terraform language support" on Apr 2, 2026

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6e2ff44035

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

…blocks

Addresses review feedback:

1. Track /* ... */ block comment state across lines so commented-out
   resource/variable blocks are not indexed as live symbols.

2. Only extract `source = "..."` as an import when inside a module block
   (tracked via brace depth), preventing false dep graph edges from
   provisioner/file source attributes.

Adds 2 test cases covering both fixes.
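
The cross-line block comment tracking described in item 1 can be sketched as a small piece of state carried between lines. This is a simplified illustration under stated assumptions: `skipForBlockComment` is a hypothetical helper, it only recognises `/*` at the start of a line, and it ignores any code that follows `*/` on the same line.

```zig
const std = @import("std");

/// Returns true when `line` should be skipped because it starts a block
/// comment or lies inside one. `in_block` carries the state across lines.
fn skipForBlockComment(line: []const u8, in_block: *bool) bool {
    if (in_block.*) {
        // Still inside a comment; leave it once the terminator shows up.
        if (std.mem.indexOf(u8, line, "*/") != null) in_block.* = false;
        return true;
    }
    const trimmed = std.mem.trim(u8, line, " \t");
    if (std.mem.startsWith(u8, trimmed, "/*")) {
        // The comment stays open unless it is closed on the same line.
        in_block.* = std.mem.indexOf(u8, trimmed[2..], "*/") == null;
        return true;
    }
    return false;
}
```

With this state in place, a `resource` or `variable` header that appears between `/*` and `*/` is skipped before any symbol extraction runs.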
@justrach
Owner

justrach commented Apr 2, 2026

@codex review


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 87728695ee

Comment on lines +1105 to +1107
    if (std.mem.indexOf(u8, line, "{") != null) {
        in_module.* = true;
        brace_depth.* = 1;
P2: Enter module context when block brace is on next line

in_module is only enabled when the module declaration line already contains {, so valid Terraform formatted as module "x" on one line and { on the next never enters module-tracking mode. In that case, subsequent source = "..." inside the module is not captured in outline.imports, which makes dependency graph results incomplete for a supported HCL style.

Useful? React with 👍 / 👎.

Comment on lines +1217 to +1219
    for (line) |ch| {
        if (ch == '{') brace_depth.* += 1;
        if (ch == '}') brace_depth.* -= 1;
P2: Ignore braces inside strings when tracking module depth

Module depth is updated by counting every { and } character in the raw line, including braces inside quoted values, so a line like description = "}" can prematurely close module context (or "{" can keep it open). This causes source extraction to be missed inside real modules or to leak into following non-module blocks, producing incorrect dependency edges.

@justrach
Owner

justrach commented Apr 2, 2026

Pushed fixes addressing all Codex findings:

  1. P2 fix — next-line brace: Module context now enters even when { is on the line after module "name" (common Terraform formatting)
  2. P2 fix — string-aware brace tracking: Braces inside double-quoted strings (e.g. description = "}") no longer corrupt module depth tracking

Both fixes have passing tests. The P1 (block comments) and original P2 (source scoping) were already fixed in the previous commit.

@codex review


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 87728695ee

Comment on lines +1105 to +1107
    if (std.mem.indexOf(u8, line, "{") != null) {
        in_module.* = true;
        brace_depth.* = 1;
P2: Enter module scope when opening brace is on next line

Set in_module when a module "..." block starts even if { is placed on the following line. As written, module context is only enabled when { is on the same line, so common formatting like module "x" then newline { causes all nested source = "..." values to be skipped, leaving dependency imports incomplete.

Comment on lines +1217 to +1219
    for (line) |ch| {
        if (ch == '{') brace_depth.* += 1;
        if (ch == '}') brace_depth.* -= 1;
P2: Ignore quoted braces when updating module brace depth

The module brace counter currently increments/decrements on every {/} byte, including braces inside string literals. In module blocks, values like description = "}" can prematurely drop brace_depth to zero and exit module scope, so a later source = "..." is missed (or scope may drift and capture non-module source lines). Brace tracking needs to skip quoted content.

@justrach
Owner

justrach commented Apr 2, 2026

Hey @kenrick-g, Codex flagged 4 issues (you can see them in the review comments above). The good news is your second commit already fixes 2 of them (P1 block comments + P2 source scoping). Two remain:

P2: Module { on next line not entering module context

When Terraform is formatted as:

module "vpc"
{
  source = "..."
}

The { on the next line doesn't trigger in_module, so source isn't captured.

Fix — in parseHclLine, change the module block tracking (around the if (std.mem.indexOf(u8, line, "{") != null) section) to always enter module mode:

                // Track that we entered a module block for source extraction
                if (std.mem.indexOf(u8, line, "{") != null) {
                    in_module.* = true;
                    brace_depth.* = 1;
                } else {
                    // Brace may be on the next line (common Terraform formatting)
                    in_module.* = true;
                    brace_depth.* = 0;
                }

P2: Braces inside strings corrupt module depth

A line like description = "}" decrements brace_depth, which can prematurely close module context.

Fix — replace the raw brace-counting loop with a string-aware version:

        // Track brace depth for module context (string-aware)
        if (in_module.*) {
            var in_string: bool = false;
            var escaped: bool = false;
            for (line) |ch| {
                if (in_string) {
                    if (escaped) {
                        escaped = false;
                    } else if (ch == '\\') {
                        escaped = true;
                    } else if (ch == '"') {
                        in_string = false;
                    }
                    continue;
                }
                if (ch == '"') {
                    in_string = true;
                } else if (ch == '{') {
                    brace_depth.* += 1;
                } else if (ch == '}') {
                    brace_depth.* -= 1;
                }
            }

Both fixes have tests we verified locally. Once you push these, comment @codex review again and it should come back clean. Also please rebase onto current main since we just merged #87 (PHP support). Thanks for the great work on this!
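
Both remaining fixes can be exercised together in isolation. The sketch below is a hypothetical, self-contained `ModuleScope` tracker, not the PR's `parseHclLine` code: it assumes a module block starts on a line beginning with `module `, records separately whether the opening brace has been seen (so the brace may sit on a later line), and skips braces inside double-quoted strings.

```zig
const std = @import("std");

/// Hypothetical tracker for "are we inside a module block?".
const ModuleScope = struct {
    active: bool = false, // header seen and block not yet closed
    opened: bool = false, // opening brace seen, possibly on a later line
    depth: i32 = 0,

    /// Feed one source line; `active` reflects the state after that line.
    fn feed(self: *ModuleScope, line: []const u8) void {
        const trimmed = std.mem.trim(u8, line, " \t");
        if (!self.active) {
            if (!std.mem.startsWith(u8, trimmed, "module ")) return;
            self.* = .{ .active = true };
        }
        var in_string = false;
        var escaped = false;
        for (line) |ch| {
            if (in_string) {
                // Inside a quoted value: only look for the closing quote.
                if (escaped) {
                    escaped = false;
                } else if (ch == '\\') {
                    escaped = true;
                } else if (ch == '"') {
                    in_string = false;
                }
                continue;
            }
            switch (ch) {
                '"' => in_string = true,
                '{' => {
                    self.depth += 1;
                    self.opened = true;
                },
                '}' => self.depth -= 1,
                else => {},
            }
        }
        // Leave module scope only after the block was actually opened.
        if (self.opened and self.depth <= 0) self.* = .{};
    }
};
```

Keeping `opened` separate from `depth` means a header line without a brace does not look like an already-closed block. The sketch resets at the end of the closing line, so a single-line `module "x" { source = "..." }` would need its `source` checked before the reset, and braces inside `#` or `//` comments are still counted.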


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 87728695ee

Comment on lines +1105 to +1108
    if (std.mem.indexOf(u8, line, "{") != null) {
        in_module.* = true;
        brace_depth.* = 1;
    }
P2: Enter module scope even when { is on next line

The parser only sets in_module when the module declaration line already contains {, so formatted Terraform like module "vpc" followed by { on the next line never enters module scope. In that case source = "..." inside the module is skipped, so dependency imports are silently missing for a common HCL style.

Comment on lines +1217 to +1219
    for (line) |ch| {
        if (ch == '{') brace_depth.* += 1;
        if (ch == '}') brace_depth.* -= 1;
P2: Ignore braces inside strings when tracking module depth

Brace depth is updated by scanning raw characters, which counts {/} inside quoted values (for example description = "}"). That can prematurely drop brace_depth to 0 and exit module scope before the real closing brace, causing later source = "..." lines in the same module to be missed.
