Gortex currently indexes 256 languages. Each language has an extractor that
walks the source, emits symbols (functions, methods, types, interfaces,
variables) into the graph, and records imports / calls edges.
Three engine tiers are used, in order of decreasing extraction depth:
- bespoke tree-sitter (~30 languages) — full concrete syntax tree via a vendored grammar with hand-tuned S-expression queries. Produces high-fidelity symbols, resolved call edges, ORM/contract/dataflow extraction, and accurate node ranges. Languages: Go, TypeScript, JavaScript, Python, Rust, Java, C#, Kotlin, Swift, Scala, PHP, Ruby, Elixir, C, C++, Dart, OCaml, Lua, Bash, SQL, HTML, CSS, Markdown, OrgMode, Protobuf, YAML, TOML, HCL, Dockerfile.
- regex (~60 languages) — pattern-matched line scanning with indent / brace / keyword block heuristics. Captures top-level symbols and imports; call edges vary per language. Used where no upstream tree-sitter grammar is available (Verse, AL, SAS, Stata, AutoHotkey, CoffeeScript) or for legacy / niche languages where the regex path was sufficient (ABAP, COBOL, Fortran, …).
- forest signature-only (~165 languages) — generic `*ts.Language`-driven extractor wrapping `github.com/alexaandru/go-sitter-forest`. Reads the grammar's bundled `tags.scm` (nvim-treesitter convention) when present and falls back to a node-kind heuristic walker otherwise. Emits definitions (function / method / type / interface / variable / constant / module) plus `EdgeDefines` from the file. `@reference.call` / `@reference.function` captures route to the enclosing function. No ORM / contract / dataflow / scope-aware resolution; graduate a language to the bespoke tier when those matter.
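
To make the regex tier's "line scanning with brace block heuristics" concrete, here is a minimal sketch for an imaginary brace-delimited language. The `Symbol` type, the `fn` keyword pattern, and the `braceBlockEnd` helper are hypothetical illustrations, not Gortex's actual types or helpers.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Symbol is a hypothetical, simplified stand-in for a graph node.
type Symbol struct {
	Name      string
	StartLine int // 1-based, inclusive
	EndLine   int // 1-based, inclusive
}

// fnRe matches declarations such as "fn name(" in an imaginary language.
var fnRe = regexp.MustCompile(`^\s*fn\s+([A-Za-z_][A-Za-z0-9_]*)\s*\(`)

// braceBlockEnd returns the index of the line on which the first brace opened
// at or after start is closed again. It does not skip braces inside strings or
// comments, which is the kind of imprecision a regex tier accepts.
func braceBlockEnd(lines []string, start int) int {
	depth, opened := 0, false
	for i := start; i < len(lines); i++ {
		for _, r := range lines[i] {
			switch r {
			case '{':
				depth++
				opened = true
			case '}':
				depth--
			}
		}
		if opened && depth <= 0 {
			return i
		}
	}
	return len(lines) - 1 // unterminated block: run to end of file
}

// extractFunctions scans line by line and records one Symbol per match.
func extractFunctions(src string) []Symbol {
	lines := strings.Split(src, "\n")
	var out []Symbol
	for i, line := range lines {
		m := fnRe.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		end := braceBlockEnd(lines, i)
		out = append(out, Symbol{Name: m[1], StartLine: i + 1, EndLine: end + 1})
	}
	return out
}

func main() {
	src := "fn add(a, b) {\n  return a + b\n}\n\nfn main() {\n  add(1, 2)\n}\n"
	for _, s := range extractFunctions(src) {
		fmt.Printf("%s %d-%d\n", s.Name, s.StartLine, s.EndLine)
	}
}
```

The sketch only finds definitions and their line ranges. Call edges, which the regex tier supports unevenly, would need a second pass over each block.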
For sixteen of these languages an LSP server can additionally upgrade
edges from ast_inferred to lsp_resolved and unlock the
on-demand action tools (get_diagnostics, get_code_actions,
apply_code_action, fix_all_in_file). See lsp.md for the
server matrix, install commands, lifecycle knobs, and config schema.
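
The edge upgrade described above can be pictured as a pass over existing edges that rewrites only the inferred ones. This is a sketch under assumptions: the `Edge` struct, the `Origin` field, and the `resolve` callback are hypothetical; only the `ast_inferred` and `lsp_resolved` labels come from this document.

```go
package main

import "fmt"

// Origin labels how an edge was established. The two values mirror the
// labels used in this document; the type itself is hypothetical.
type Origin string

const (
	OriginASTInferred Origin = "ast_inferred"
	OriginLSPResolved Origin = "lsp_resolved"
)

// Edge is a hypothetical, simplified call edge.
type Edge struct {
	From, To string
	Origin   Origin
}

// upgradeEdges asks resolve for a definite target for every ast_inferred edge.
// Edges the resolver cannot answer for are left untouched, so a missing or
// slow server never downgrades the graph. It returns the number of upgrades.
func upgradeEdges(edges []Edge, resolve func(Edge) (string, bool)) int {
	upgraded := 0
	for i := range edges {
		if edges[i].Origin != OriginASTInferred {
			continue
		}
		if target, ok := resolve(edges[i]); ok {
			edges[i].To = target
			edges[i].Origin = OriginLSPResolved
			upgraded++
		}
	}
	return upgraded
}

func main() {
	edges := []Edge{
		{From: "main", To: "Save", Origin: OriginASTInferred},
		{From: "main", To: "log", Origin: OriginASTInferred},
	}
	// Stand-in for a language-server lookup: it only knows about "Save".
	resolve := func(e Edge) (string, bool) {
		if e.To == "Save" {
			return "store.(*DB).Save", true
		}
		return "", false
	}
	n := upgradeEdges(edges, resolve)
	fmt.Println(n, edges)
}
```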
| Category | Count | Languages |
|---|---|---|
| Core programming | 10 | Go, TypeScript, JavaScript, Python, Rust, Java, C#, C, C++, Kotlin |
| JVM, .NET, systems | 10 | Scala, Swift, PHP, Ruby, Groovy, F#, D, Zig, Vala, Objective-C |
| Scripting & shell | 10 | Bash, PowerShell, Batch, Perl, Raku, Lua, Tcl, VimScript, AutoHotkey, CoffeeScript |
| Functional | 8 | Haskell, OCaml, Elixir, Clojure, Erlang, Racket, Gleam, Emacs Lisp |
| Systems / emerging | 8 | Nim, Crystal, Mojo, Odin, V, Hare, Carbon, ReScript |
| Scientific & enterprise | 12 | Julia, R, MATLAB, Mathematica, SAS, Stata, Fortran, COBOL, Ada, Pascal, ABAP, Apex |
| Mobile & game | 4 | Dart, GDScript, Verse, ActionScript |
| Blockchain / smart contracts | 6 | Solidity, Move, Cairo, Noir, Tact, Ballerina |
| Template engines | 8 | Blade, EJS, Handlebars, Jinja, Twig, ERB, Liquid, Pug |
| Data, config, build | 12 | JSON, YAML, TOML, HCL/Terraform, SQL, Protobuf, Markdown, HTML, CSS, Dockerfile, Makefile, CMake |
| Niche / domain | 4 | Nix, AL (Business Central), Assembly (NASM/GAS/ARM/WLA-DX/CA65), Shaders (GLSL/HLSL) |
| Forest — frontend / templates | ~16 | Vue, Svelte, Astro, htmldjango, gotmpl, Haml, Slim, Glimmer, Razor, Templ, Tera, Mustache, Vento, SuperHTML, HEEx |
| Forest — schemas / IaC / IDLs | ~25 | GraphQL, Prisma, Jsonnet, Dhall, CUE, Pkl, Nickel, KCL, Bicep, Smithy, Cap'n Proto, Thrift, KDL, RON, TypeSpec, DBML, HJSON, HOCON, INI, JSON5, JSONC, Properties, SCFG, YANG, XML, DTD, EditorConfig, dotenv, Desktop, Devicetree, Kconfig, Linker script |
| Forest — shaders / hardware | ~14 | WGSL, GLSL, HLSL, CUDA, ISPC, VHDL, SystemVerilog, MLIR, LLVM, Jasmin, QBE, FIRRTL, PIO ASM, GDShader |
| Forest — docs / typesetting | 8 | LaTeX, Typst, AsciiDoc, Djot, Mermaid, Norg, BibTeX, PlantUML |
| Forest — functional / niche | ~26 | Agda, Idris, PureScript, Roc, Gren, Elm, Fennel, Janet, Hack, Haxe, Pony, C3, Aiken, Effekt, Eiffel, Jule, Koka, Luau, MoonBit, Motoko, Ralph, Scheme, SML, Wing, Common Lisp |
| Forest — build / DSL / testing | ~16 | Meson, Just, Beancount, Ledger, Gherkin, Hurl, Robot, Earthfile, Ninja, BitBake, Caddy, Snakemake, GN, Cooklang, Requirements, Cedar, CEL, Circom, Clarity, Rego, TLA+, Quint, Structurizr, GritQL, QL |
| Forest — DB / query | 8 | SPARQL, SurrealQL, PromQL, Kusto, SOQL, SOSL, PRQL, Turtle |
| Forest — data / lockfiles / shells / configs | ~28 | TSV, PSV, textproto, .po, PGN, todo.txt, go.mod / go.sum / go.work, godot_resource, Fish, Nushell, jq, Awk, Elvish, gitconfig / gitattributes / gitcommit / gitignore, Hyprlang, nftables, passwd, PEM, PoE filter, Puppet, ssh_config, sxhkdrc, tmux |
| Forest — misc | ~14 | DOT, gnuplot, GPG, Strace, VRL, Zeek, Ziggy + Schema, Starlark, SourcePawn, SCSS, RBS, OCamllex, DataWeave, USD, WIT |
| Total | 256 | |
The tree-sitter-backed languages below have the most thorough extraction. Meta["methods"]
on interface nodes stores the expected method set for implementation matching.
| Language | Functions | Methods + MemberOf | Types | Interfaces | Imports | Calls | Variables |
|---|---|---|---|---|---|---|---|
| Go | Full | Full (receiver) | Full | Full + Meta["methods"] | Full | Full | Full |
| TypeScript | Full | Full | Full | Full + Meta["methods"] | Full | Full | Full |
| JavaScript | Full | Full | Full | - | Full | Full | Full |
| Python | Full | Full | Full | - | Full | Full | Partial |
| Rust | Full | Full (impl blocks) | Full | Full + Meta["methods"] | Full | Full | Full |
| Java | Full | Full | Full | Full + Meta["methods"] | Full | Full | Fields |
| C# | Full | Full | Full | Full + Meta["methods"] | Full | Full | Fields |
| Kotlin | Full | Full | Full | Full | Full | Full | Properties |
| Scala | Full | Full | Full | Full + Meta["methods"] | Full | Full | - |
| Swift | Full | Full | Full | Full + Meta["methods"] | Full | Full | - |
| PHP | Full | Full | Full | Full | Full | Full | - |
| Ruby | Full | Full | Full | - | Full | Full | Constants |
| Elixir | Full | Full (defmodule) | Modules | - | Full | Full | Attributes |
| C | Full | - | Structs/Enums | - | Full | Full | Globals |
| C++ | Full | Full | Classes/Structs | - | Full | Full | - |
| Dart | Full | Full | Classes/Enums/Mixins/Extensions | Abstract interface | Full | Full | Full |
| OCaml | Full | Full (class) | Types/Modules | Module types | open | Full | Full |
| Lua | Full | Full (M.func/M:method) | - | - | require() | Full | Full |
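
As an illustration of how a stored method set could drive implementation matching, the sketch below checks a type's methods against an interface node's Meta["methods"]. It assumes the value is a slice of method names and compares by name only; the `Node` type and the `implements` function are hypothetical, and the real matcher may also compare signatures.

```go
package main

import "fmt"

// Node is a hypothetical, simplified graph node.
type Node struct {
	Name string
	Meta map[string]any
}

// implements reports whether typeMethods contains every method name listed in
// the interface node's Meta["methods"]. An interface node without a usable
// method list never matches, so the check avoids false positives.
func implements(iface Node, typeMethods []string) bool {
	want, ok := iface.Meta["methods"].([]string)
	if !ok || len(want) == 0 {
		return false
	}
	have := make(map[string]struct{}, len(typeMethods))
	for _, m := range typeMethods {
		have[m] = struct{}{}
	}
	for _, m := range want {
		if _, found := have[m]; !found {
			return false
		}
	}
	return true
}

func main() {
	readCloser := Node{
		Name: "ReadCloser",
		Meta: map[string]any{"methods": []string{"Read", "Close"}},
	}
	fmt.Println(implements(readCloser, []string{"Read", "Close", "Seek"}))
	fmt.Println(implements(readCloser, []string{"Read"}))
}
```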
| Language | Extensions | What it extracts |
|---|---|---|
| JSON | .json, .json5, .jsonc | Top-level keys |
| YAML | .yaml, .yml | Top-level keys |
| TOML | .toml | Tables, key-value pairs |
| HCL / Terraform | .tf, .tfvars, .hcl | Resource / data / module / variable / output blocks |
| SQL | .sql | Tables (with columns), views, functions, indexes, triggers |
| Protobuf | .proto | Messages (with fields), services + RPCs, enums, imports |
| Markdown | .md | Headings, local file links, code-block languages |
| HTML | .html, .htm | Script / link references, element IDs |
| CSS | .css | Class selectors, ID selectors, custom properties, @import |
| Dockerfile | Dockerfile, Containerfile, .dockerfile | FROM (base images), ENV / ARG variables |
| Makefile | Makefile, GNUmakefile, .mk, .make | Targets, define…endef, VAR = …, include / -include |
| CMake | CMakeLists.txt, .cmake | function(…), macro(…), add_library, add_executable, include(…), set(…) |
| Language | Extensions | What it extracts |
|---|---|---|
| Blade (Laravel) | .blade, .blade.php | @section / @yield / @component / @include; @extends → import |
| EJS | .ejs | JS function / arrow inside `<% … %>`; include('x') → import |
| Handlebars / Mustache | .hbs, .handlebars, .mustache | {{#block}} as function; {{> partial}} → import; helper calls as edges |
| Jinja | .jinja, .jinja2, .j2 | {% block %} / {% macro %}; extends / include / import / from … import |
| Twig | .twig | Same shape as Jinja |
| ERB | .erb, .rhtml, .html.erb, .js.erb, .css.erb, .json.erb | Ruby def / class inside `<% … %>`; render 'x' → import |
| Liquid | .liquid | {% capture %} as function; {% assign %} as variable; {% include/render %} → import |
| Pug | .pug, .jade | mixin / block NAME as function; extends / include → import |
| Language | Extensions | What it extracts |
|---|---|---|
| Solidity | .sol | Contracts, functions, events, modifiers, structs |
| Move (Sui/Aptos) | .move | module, fun / public fun / entry fun, struct, use X::Y |
| Cairo (StarkNet) | .cairo | fn, struct / enum / trait / mod, use X::Y |
| Noir (Aztec) | .nr | fn, struct / trait / impl / mod, use dep::X::Y |
| Tact (TON) | .tact | contract / trait / message / struct, fun / receive / init, import "X" |
| Ballerina | .bal | function, service NAME on …, type NAME record {…}, class, import X/Y |
| Language | Extensions | What it extracts |
|---|---|---|
| Julia | .jl | function, struct, module, using / import |
| R | .r, .R | Function defs; library / require / source |
| MATLAB | .mlx | function (end-terminated), classdef, import a.b.c |
| Mathematica | .wl, .wls, .nb | name[args_] := body, SetDelayed, Get[…] / Needs[…] |
| SAS | .sas | proc / %macro as function, data as variable, %include / libname |
| Stata | .do, .ado | program define, local / global, use / do / include |
| Fortran | .f, .f90, .f95, .f03, .f08 | subroutine / function / module, use X |
| COBOL | .cob, .cbl, .cpy | Programs, paragraphs, sections, COPY |
| Ada | .ada, .adb, .ads | Packages, procedures, functions, with |
| Pascal / Delphi | .pas, .pp, .dpr | Units, procedures, functions, classes |
| ABAP (SAP) | .abap | FORM / FUNCTION / METHOD / CLASS…DEFINITION, INCLUDE |
| Apex (Salesforce) | .cls, .trigger, .apex | Classes, triggers, methods |
| Language | Extensions | What it extracts |
|---|---|---|
| Mojo | .mojo, .🔥 | fn / def, struct / trait, from … import / import |
| Odin | .odin | name :: proc, name :: struct / enum / union, import "X" / foreign import |
| V | .v, .vsh | fn, struct / interface / enum / type, import, module |
| Hare | .ha | [export] fn, type X = struct / union / enum, use X; |
| Carbon | .carbon | fn, class / interface / adapter / choice, import |
| ReScript | .res, .resi | let (function / variable), type, module, open / include |
| Gleam | .gleam | [pub] fn, [pub] type, import X/Y / import X.{Y} |
| Language | Extensions | What it extracts |
|---|---|---|
| Bash / Zsh | .sh, .bash, .zsh | Function defs, source / . |
| PowerShell | .ps1, .psm1, .psd1 | function, class, using |
| Batch | .bat, .cmd | :LABEL as function, call :LABEL / goto as call edges |
| Perl | .pl, .pm, .t | sub, package, use / require |
| Raku | .raku, .rakumod, .p6 | sub, class, use |
| Lua | .lua | Full tree-sitter (see core matrix) |
| Tcl | .tcl | proc, package require, source |
| VimScript | .vim, .vimrc | function, command, source |
| AutoHotkey | .ahk, .ahk1, .ahk2 | Hotkeys, labels, functions (v1 + v2) |
| CoffeeScript | .coffee | name = (args) -> / =>, class, require 'X' |
| Language | Extensions | What it extracts |
|---|---|---|
| Haskell | .hs, .lhs | Full (see core matrix) |
| OCaml | .ml, .mli | Full (see core matrix) |
| Clojure | .clj, .cljs, .cljc, .edn | defn, defrecord / deftype, defprotocol, require / use |
| Erlang | .erl, .hrl | Functions, -type / -record, -import |
| Elixir | .ex, .exs | Full (see core matrix) |
| Racket | .rkt, .ss | define, struct, require |
| F# | .fs, .fsi, .fsx | let, type, module, open |
| Emacs Lisp | .el | defun, defvar, defmacro, require |
| Language | Extensions | What it extracts |
|---|---|---|
| D | .d, .di | struct / class / interface / enum / union / template, import X.Y |
| Zig | .zig, .zon | Structs / enums / unions, @import, functions, globals |
| Nim | .nim, .nims, .nimble | proc / func / method / iterator / template / macro, type defs, import |
| Crystal | .cr | def, class, module, require |
| Vala | .vala, .vapi | namespace / class / interface / struct / enum, methods, using X; |
| Groovy / Gradle | .groovy, .gvy, .gy, .gradle | Classes, def, imports |
| Objective-C(++) | .m, .mm | @interface / @protocol / @implementation, method selectors, #import / @import |
| ActionScript | .as | package, classes, interfaces, function, import X.Y.*; |
| Dart | .dart | Full (see core matrix) |
| Swift | .swift | Full (see core matrix) |
| GDScript | .gd | func, class, signals |
| Verse (UEFN) | .verse | class / struct / enum / interface, functions with specifier blocks, using { /Path } |
| Nix | .nix | Attribute sets, functions, import / `<nixpkgs>` |
| AL (Business Central) | .al | Tables, pages, codeunits, procedures |
| Assembly | .asm, .s, .S, .nasm, .masm, .inc, .a65 | Labels as functions; call / jsr / bl / jmp as edges; NASM/MASM/GAS/WLA-DX/CA65/ARM |
| Shaders | .glsl, .vert, .frag, .hlsl, .compute | Functions, uniforms, #include |
A few extensions conflict across languages; the registration order in
internal/parser/languages/register.go decides which extractor wins.
| Extension | Registered as | Alternative |
|---|---|---|
| .m | Objective-C | MATLAB (uses .mlx instead) |
| .v | V | Verilog / Coq (not yet supported) |
| .d | D | D import files (.di) |
| .as | ActionScript | AssemblyScript (not supported) |
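
One way to surface such conflicts before they surprise anyone is to walk the registrations in order and report every extension claimed more than once. The sketch below assumes the first registration wins, which this document does not state explicitly; the `registration` and `conflict` types are hypothetical, and the MATLAB row claiming `.m` is purely illustrative.

```go
package main

import "fmt"

// registration is a hypothetical record of one extractor claiming extensions.
type registration struct {
	lang string
	exts []string
}

// conflict describes an extension claimed by more than one language.
type conflict struct {
	ext    string
	winner string
	losers []string
}

// findConflicts walks registrations in order and treats the first claimant of
// an extension as the winner (an assumption about the tie-break direction).
func findConflicts(regs []registration) []conflict {
	winner := map[string]string{}
	losers := map[string][]string{}
	var order []string // conflicting extensions, in first-seen order
	for _, r := range regs {
		for _, ext := range r.exts {
			w, claimed := winner[ext]
			if !claimed {
				winner[ext] = r.lang
				continue
			}
			if w == r.lang {
				continue // same language listed twice: not a conflict
			}
			if len(losers[ext]) == 0 {
				order = append(order, ext)
			}
			losers[ext] = append(losers[ext], r.lang)
		}
	}
	out := make([]conflict, 0, len(order))
	for _, ext := range order {
		out = append(out, conflict{ext: ext, winner: winner[ext], losers: losers[ext]})
	}
	return out
}

func main() {
	regs := []registration{
		{"Objective-C", []string{".m", ".mm"}},
		{"MATLAB", []string{".m", ".mlx"}}, // illustrative: pretend MATLAB also wanted .m
		{"V", []string{".v", ".vsh"}},
	}
	for _, c := range findConflicts(regs) {
		fmt.Printf("%s -> %s (shadowed: %v)\n", c.ext, c.winner, c.losers)
	}
}
```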
Three paths, in order of decreasing effort:
- Bespoke tree-sitter (deep extraction). Add a new sub-package under `internal/parser/tsitter/` wrapping the C grammar, then a hand-tuned extractor under `internal/parser/languages/` that compiles per-language S-expression queries. Use `golang.go` as a reference. Justified for languages where you need ORM / contract / dataflow / scope-aware call resolution.
- Regex (simple structural). Use `nim.go` or `abap.go` as templates. Pick this when no upstream grammar exists and signature-only is acceptable. Shared helpers in `helpers_indent.go` (`findBlockEnd`, `findIndentedBlockEnd`, `findKeywordBlockEnd`, `lineAt`).
- Forest signature-only (cheapest, broadest). If the language already has a grammar in `alexaandru/go-sitter-forest`, add `github.com/alexaandru/go-sitter-forest/<lang>` to `go.mod` and append one row to `forestLanguages` in `forest_registrations.go`: `{"<name>", []string{".<ext>"}, <pkg>.GetLanguage, <pkg>.GetQuery}`. `registerForestLanguages` skips the row at runtime if the name or any extension is already claimed by a hand-written extractor. The framework reads the grammar's bundled `tags.scm` when present and falls back to a generic node-kind walker otherwise; see `internal/parser/forest/` for the implementation.
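
The skip rule for forest rows can be sketched as follows. This is not the real `registerForestLanguages`: the `forestRow` type is a trimmed, hypothetical version of a row (the grammar's `GetLanguage` / `GetQuery` fields are omitted), and the sketch additionally assumes that a registered forest row claims its own name and extensions for the rows after it.

```go
package main

import "fmt"

// forestRow is a hypothetical, trimmed version of a forestLanguages row; the
// real row also carries the grammar's GetLanguage and GetQuery functions.
type forestRow struct {
	name string
	exts []string
}

// registerForest skips a row when its name or any of its extensions is
// already claimed. Rows that pass claim their name and all extensions, so the
// two maps are updated in place.
func registerForest(rows []forestRow, names, exts map[string]bool) (registered, skipped []string) {
	for _, row := range rows {
		taken := names[row.name]
		for _, e := range row.exts {
			if exts[e] {
				taken = true
				break
			}
		}
		if taken {
			skipped = append(skipped, row.name)
			continue
		}
		names[row.name] = true
		for _, e := range row.exts {
			exts[e] = true
		}
		registered = append(registered, row.name)
	}
	return registered, skipped
}

func main() {
	// Claims made earlier by hand-written extractors (illustrative values).
	names := map[string]bool{"v": true}
	exts := map[string]bool{".v": true, ".vsh": true}

	rows := []forestRow{
		{"graphql", []string{".graphql", ".gql"}},
		{"verilog", []string{".v"}}, // collides with the hand-written V extractor
	}
	registered, skipped := registerForest(rows, names, exts)
	fmt.Println("registered:", registered, "skipped:", skipped)
}
```

Skipping the whole row, rather than registering only the unclaimed extensions, keeps a language from being half-handled by two different extractors.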
All three paths must ship a _test.go with at least a happy-path and
empty-input case.