This note explains:
- current pain points when adding a new construct
- current compiler/LSP flow
- what must be updated today for cases like:
  - new data type
  - new expression/operator such as pipe
- proposal to reduce “many places to remember”
- proposed future flow
This is design/proposal only.
Current compiler architecture is phase-correct, but extension work is easy to miss in cross-cutting places.
Main pain:
- one new syntax node can require updates in many layers
- some of those are semantic phase updates
- some are mechanical traversal/registration updates
- forgetting one often causes:
  - silent missing behavior
  - partial support
  - hover/analysis drift
  - lowering/backend surprises later
Important distinction:
- multiple semantic phases are good
- multiple manual cross-cutting lists are the real problem
So goal should not be “merge phases”.
Goal should be:
- keep explicit compiler pipeline
- reduce manual touchpoints
- make forgotten updates fail loudly
Today, adding a new construct usually means touching both:
- semantic phase handlers
- mechanical cross-cutting wiring
Semantic phase handlers are legitimate places to update because each phase answers a different question:
- parser: syntax shape
- collector: top-level symbols/types
- resolver: names/scope
- typechecker: meaning/rules
- HIR lowering: canonical executable IR
- MIR/backend: runtime/codegen details
Mechanical cross-cutting wiring is where most “forgot to add it” bugs come from:
- AST inspection recursion
- LSP traversal
- hover node classification
- type-position detection helpers
- hover type/body formatting switches
- fingerprints / surfaces if relevant
- test helper coverage
Current concrete hotspot files/functions:
- internal/frontend/ast/inspect.go
- internal/lsp/handlers.go:isTypeExprPosition
- internal/lsp/handlers.go:formatHoverTypeBody
- internal/lsp/handlers.go:formatHoverTypeInline
- internal/lsp/handlers.go:walkModuleAST
- internal/lsp/handlers.go:HandleRename
Important note:
- walkModuleAST already centralizes one class of traversal
- HandleRename still does its own manual ast.Inspect walk
- so the repo currently has both improved and still-manual traversal styles
flowchart LR
A[Lexer] --> B[Parser AST]
B --> C[Collector]
C --> D[Resolver]
D --> E[Typechecker]
E --> F[HIR Lowering]
F --> G[CFG Analysis]
F --> H[MIR Lowering]
H --> I[LLVM Backend]
B --> J[LSP AST Traversal]
E --> J
This phase chain is correct and should stay explicit.
Problem is not this flow itself.
Problem is the number of manual extension points around it.
What must be updated today depends on what kind of construct you add.
Example:
- new type expression
- union/result/optional-like type
- new pointer/container type form
Likely touchpoints:
- internal/frontend/ast
  - node/type struct
  - location fields
- internal/frontend/ast/inspect.go
- internal/frontend/parser/parser.go
- maybe expression parser too if type appears in casts/composites
If top-level declarations or symbol shape change:
internal/semantics/collector/collector.go
If type references contain names or nested paths:
internal/semantics/resolver/resolver.go
Always if type has real semantics:
internal/semantics/typechecker/typechecker.go
If runtime representation matters:
- internal/ir/hir_lower/lower.go
- maybe internal/ir/mir/model.go
- maybe backend lowering
If hover/type rendering should understand it specially:
internal/lsp/handlers.go
Tests:
- parser tests
- typechecker tests
- pipeline tests
- maybe LSP hover tests
Example:
- pipe operator
- new postfix/prefix/infix expression
Likely touchpoints:
If token is new:
internal/frontend/lexer
- new expression node in internal/frontend/ast
- inspect.go
- internal/frontend/parser/parse_expr.go
  - precedence
  - associativity
  - recovery
If name binding/desugaring changes:
internal/semantics/resolver/resolver.go
internal/semantics/typechecker/typechecker.go
internal/ir/hir_lower/lower.go
MIR/backend: only if HIR cannot normalize/desugar it away
- hover expression typing normally comes from ExprTypes
- custom rendering only if needed
Tests:
- parser
- typechecker
- pipeline
- maybe hover
Example:
- trait
- module
- new top-level declaration kind
Likely touchpoints:
- AST struct
- parser top-level declaration parsing
- ast.Inspect
- collector
- resolver
- typechecker
- maybe pipeline/lowering if executable semantics exist
- LSP declaration hover logic
- tests
Current “new construct” rollout looks like this:
flowchart TD
N[New Construct] --> A[AST Node/Type]
A --> B[ast.Inspect]
A --> C[Parser]
A --> D[Collector maybe]
A --> E[Resolver]
A --> F[Typechecker]
A --> G[HIR Lowering]
A --> H[MIR/Backend maybe]
A --> I[LSP traversal/hover maybe]
A --> J[Tests]
The risky part is not Resolver or Typechecker existing.
The risky part is:
- Inspect is manual
- phase switches are manual
- LSP special handling is manual
- there is no single “extension checklist” encoded in repo structure
Most dangerous current omission point:
- internal/frontend/ast/inspect.go has no fail-loud default branch
- a new node can be silently skipped by every consumer using ast.Inspect
So forgetting is mostly a coordination problem.
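A minimal Go sketch of this failure mode. All types here (Node, Ident, ParenExpr, PipeExpr) and the helper names are hypothetical stand-ins, not the repo's real AST. It only illustrates how a traversal switch without a default branch hides a new node's children from every consumer:

```go
package main

// Node, Ident, ParenExpr and PipeExpr are hypothetical stand-ins,
// not the real types from internal/frontend/ast.
type Node interface{ node() }

type Ident struct{ Name string }
type ParenExpr struct{ X Node }

// PipeExpr is the newly added node that was never wired into inspect.
type PipeExpr struct{ LHS, RHS Node }

func (*Ident) node()     {}
func (*ParenExpr) node() {}
func (*PipeExpr) node()  {}

// inspect mimics a traversal switch with no default branch:
// unknown node kinds are visited, but their children are silently skipped.
func inspect(n Node, visit func(Node) bool) {
	if n == nil || !visit(n) {
		return
	}
	switch x := n.(type) {
	case *ParenExpr:
		inspect(x.X, visit)
	}
}

// collectNames stands in for any consumer (resolver, hover, rename).
func collectNames(root Node) []string {
	var names []string
	inspect(root, func(n Node) bool {
		if id, ok := n.(*Ident); ok {
			names = append(names, id.Name)
		}
		return true
	})
	return names
}
```

With this sketch, a tree such as ParenExpr(PipeExpr(value, f)) yields no names at all, and nothing reports the gap.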
Keep explicit phases.
Reduce manual cross-cutting extension points.
Treat new constructs as one of:
- Expr
- TypeExpr
- Decl
- Stmt
For each category, create a stable checklist document and/or local template.
Example:
For a new Expr, must check:
- lexer token if needed
- AST node
- ast.Inspect
- parser precedence/recovery
- resolver
- typechecker
- HIR lowering
- tests
- LSP if special behavior exists
For a new TypeExpr, must check:
- AST node
- ast.Inspect
- parser type parsing
- resolver/type lookup
- typechecker rules
- lowering/runtime layout if needed
- tests
This does not reduce phases, but it reduces uncertainty.
Recommended storage:
- docs/checklists/expr.md
- docs/checklists/type_expr.md
- docs/checklists/decl.md
- docs/checklists/stmt.md
And reference them from RULES.md, otherwise they will drift into side-doc territory and stop helping.
Current repo already improved traversal in LSP with:
- shared walkModuleAST
Similar direction should continue:
- one canonical AST traversal policy
- fewer hand-maintained recursive walks outside ast.Inspect
Long-term best improvement:
- reduce direct ad hoc tree walking
- push consumers through shared traversal helpers
This reduces “forgot to descend into child node X”.
Concrete existing debt:
- HandleRename should migrate to shared traversal instead of doing its own import/stmt walk with ast.Inspect
- any future cursor-driven feature should prefer shared traversal first
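A rough Go sketch of what "consumers go through one shared walk" could look like. Everything here is hypothetical: the Children method, walkModule, and renameSites are illustration names, and the real walkModuleAST signature may differ. The point is only that a rename-style consumer supplies a callback and never writes its own recursion:

```go
package main

// All types here are hypothetical; the real AST and walkModuleAST may differ.
type Node interface{ Children() []Node }

type Ident struct{ Name string }
type Import struct{ Alias *Ident }
type CallExpr struct {
	Fun  Node
	Args []Node
}
type Module struct {
	Imports []*Import
	Body    []Node
}

func (*Ident) Children() []Node { return nil }
func (i *Import) Children() []Node {
	if i.Alias == nil {
		return nil
	}
	return []Node{i.Alias}
}
func (c *CallExpr) Children() []Node { return append([]Node{c.Fun}, c.Args...) }

// walkModule is the single shared entry point: imports first, then body,
// descending depth-first into every child.
func walkModule(m *Module, visit func(Node)) {
	var walk func(Node)
	walk = func(n Node) {
		visit(n)
		for _, c := range n.Children() {
			walk(c)
		}
	}
	for _, imp := range m.Imports {
		walk(imp)
	}
	for _, n := range m.Body {
		walk(n)
	}
}

// renameSites is a rename-style consumer built on the shared walk
// instead of its own import/statement recursion.
func renameSites(m *Module, name string) int {
	count := 0
	walkModule(m, func(n Node) {
		if id, ok := n.(*Ident); ok && id.Name == name {
			count++
		}
	})
	return count
}
```

If a new node kind is added, only the shared walk (or the node's own child listing) needs to learn about it; renameSites stays unchanged.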
Silent omission is worst outcome.
This is directly aligned with RULES.md §5:
- internal invariant violations should panic
- silent skip on unknown internal node kinds is wrong behavior
Preferred behavior:
- exhaustive switches where practical
- explicit panic on impossible internal unhandled node kinds in lowering/backends
- tests that assert newly introduced constructs survive full pipeline
If new node is not handled:
- fail immediately
- not silently degrade to nil or unknown
Highest-priority concrete target:
internal/frontend/ast/inspect.go
Reason:
- it is shared by many consumers
- it is currently easy to forget
- silent skip there propagates into resolver/collector/LSP traversals indirectly
So first concrete follow-up should be:
- make ast.Inspect fail loudly on unhandled node types, or introduce another equally strict exhaustiveness mechanism
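One possible shape for the fail-loud variant, as a hedged Go sketch. The node types and this Inspect are hypothetical, not the code in internal/frontend/ast/inspect.go. The only idea shown is a default branch that panics with the concrete node type instead of returning quietly:

```go
package main

import "fmt"

// Node and the concrete types below are hypothetical stand-ins for the real AST.
type Node interface{ node() }

type Ident struct{ Name string }
type BinaryExpr struct{ Left, Right Node }

// PipeExpr models a new node that has not been added to Inspect yet.
type PipeExpr struct{ LHS, RHS Node }

func (*Ident) node()      {}
func (*BinaryExpr) node() {}
func (*PipeExpr) node()   {}

// Inspect visits n and, if visit returns true, descends into its children.
// Unknown node kinds are treated as an internal invariant violation.
func Inspect(n Node, visit func(Node) bool) {
	if n == nil || !visit(n) {
		return
	}
	switch x := n.(type) {
	case *Ident:
		// leaf: nothing to descend into
	case *BinaryExpr:
		Inspect(x.Left, visit)
		Inspect(x.Right, visit)
	default:
		panic(fmt.Sprintf("ast.Inspect: unhandled node type %T", n))
	}
}
```

With this shape, the first test that walks a tree containing the new node fails immediately and names the missing type.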
Good use of central tables:
- node category registration
- hover/declaration classification
- mechanical child traversal metadata
- type-position slot metadata
- render-shape metadata for hover formatting
Bad use:
- collapsing resolver/typechecker/lowering into one mega “node handler registry”
Reason:
- that would remove phase clarity
- semantics differ by phase
- it becomes one hidden second architecture
So centralize mechanical metadata, not semantic meaning.
Where it should live:
- AST-owned traversal/child metadata should live with ast
- LSP-owned hover/render classification should live with lsp
- avoid a separate generic registry package that becomes hidden architecture
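A small Go sketch of a mechanical metadata table, under stated assumptions: Kind, Meta, metaTable, and UnregisteredKinds are hypothetical names, and the fields are illustrative. The table holds only mechanical facts (category, child slots), and a sentinel constant lets a test detect any kind that was added without an entry:

```go
package main

// Kind enumerates node kinds; the names are illustrative only.
type Kind int

const (
	KindIdent Kind = iota
	KindBinaryExpr
	KindPipeExpr
	kindCount // sentinel: must stay last
)

// Meta holds mechanical facts only; no resolver/typechecker/lowering logic.
type Meta struct {
	Name       string   // display name
	Category   string   // "Expr", "TypeExpr", "Decl" or "Stmt"
	ChildSlots []string // field names a traversal should descend into
}

// metaTable is indexed by Kind, so a forgotten entry is a zero Meta.
var metaTable = [kindCount]Meta{
	KindIdent:      {Name: "Ident", Category: "Expr"},
	KindBinaryExpr: {Name: "BinaryExpr", Category: "Expr", ChildSlots: []string{"Left", "Right"}},
	KindPipeExpr:   {Name: "PipeExpr", Category: "Expr", ChildSlots: []string{"LHS", "RHS"}},
}

// UnregisteredKinds returns every kind whose table entry was never filled in.
func UnregisteredKinds() []Kind {
	var missing []Kind
	for k := Kind(0); k < kindCount; k++ {
		if metaTable[k].Name == "" {
			missing = append(missing, k)
		}
	}
	return missing
}
```

A single test asserting that UnregisteredKinds returns nothing would turn a forgotten registration into a failing build step rather than a silent gap. Semantic rules stay in their own phases.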
For every new construct, add at least:
- parser test
- semantic test
- pipeline/lowering test
Optional:
- LSP hover test
This ensures construct is not “parsed only”.
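The minimum-test rule above can be checked mechanically. A hedged Go sketch, where MissingLayers, requiredLayers, and the layer labels are hypothetical names chosen for illustration: given which test layers exist per construct, it reports the required ones that are absent.

```go
package main

// requiredLayers lists the minimum test layers; LSP hover stays optional.
var requiredLayers = []string{"parser", "semantic", "pipeline"}

// MissingLayers reports, per construct, which required test layers have no test yet.
// Constructs with full coverage are omitted from the result.
func MissingLayers(covered map[string][]string) map[string][]string {
	missing := map[string][]string{}
	for construct, layers := range covered {
		have := map[string]bool{}
		for _, l := range layers {
			have[l] = true
		}
		for _, want := range requiredLayers {
			if !have[want] {
				missing[construct] = append(missing[construct], want)
			}
		}
	}
	return missing
}
```

A construct that only has a parser test shows up with "semantic" and "pipeline" missing, which is exactly the "parsed only" state to avoid.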
Proposed extension workflow:
flowchart TD
A[Choose construct category Expr/TypeExpr/Decl/Stmt]
A --> B[Use category checklist]
B --> C[Add AST node]
C --> D[Register shared traversal/mechanical metadata]
D --> E[Update parser]
E --> F[Update only relevant semantic phases]
F --> G[Add full-pipeline tests]
G --> H[Optional LSP special handling]
H --> I[Review against checklist]
This is better because:
- semantic phases stay explicit
- extension work starts from category checklist
- mechanical wiring becomes more centralized
- omissions become easier to spot
Suppose we add:
value |> f()
Best shape:
- parse as dedicated PipeExpr
- define precedence and associativity clearly
- either typecheck as native operator rule
- or desugar mentally to call form during checking
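To make "define precedence and associativity clearly" concrete, here is a minimal precedence-climbing sketch in Go. It is not the repo's parser. The binding powers are assumptions (pipe left-associative and looser than + and *), the tokenizer just splits on whitespace, and the output is an S-expression so the grouping is visible:

```go
package main

import "strings"

// binding holds left and right binding powers; left < right gives left associativity.
type binding struct{ left, right int }

// infix is a hypothetical precedence table: pipe binds looser than + and *.
var infix = map[string]binding{
	"|>": {1, 2},
	"+":  {3, 4},
	"*":  {5, 6},
}

type parser struct {
	toks []string
	pos  int
}

// peek returns the current token, or "" at end of input.
func (p *parser) peek() string {
	if p.pos < len(p.toks) {
		return p.toks[p.pos]
	}
	return ""
}

// parseExpr implements precedence climbing over whitespace-separated tokens.
func (p *parser) parseExpr(minBP int) string {
	if p.pos >= len(p.toks) {
		return "" // missing operand; a real parser would report and recover here
	}
	lhs := p.toks[p.pos]
	p.pos++
	for {
		op := p.peek()
		bp, ok := infix[op]
		if !ok || bp.left < minBP {
			return lhs
		}
		p.pos++
		rhs := p.parseExpr(bp.right)
		lhs = "(" + op + " " + lhs + " " + rhs + ")"
	}
}

// Parse returns the grouped form of src.
func Parse(src string) string {
	p := &parser{toks: strings.Fields(src)}
	return p.parseExpr(0)
}
```

Under these assumed binding powers, `a |> f |> g` groups as `(|> (|> a f) g)` and `a + b |> f` groups as `(|> (+ a b) f)`. Choosing different powers changes the grouping, which is why the table should be decided and tested explicitly.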
In lowering, prefer to normalize:
- lower PipeExpr(lhs, rhs) into normal call-shaped HIR
If HIR is normalized, then:
- MIR does not need dedicated pipe concept
- backend does not need pipe concept
That is good reduction of downstream touchpoints.
So principle is:
- keep syntax distinct in parser/AST
- normalize early in lowering if semantic shape matches existing construct
That is real source-of-truth reduction.
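A hedged Go sketch of that normalization. The AST and HIR types are hypothetical, and the rewrite rule is an assumption, not something this note has decided: the piped value becomes the first argument of the right-hand call, and a bare callee on the right is treated as a call with no other arguments.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical AST: syntax stays distinct, PipeExpr is its own node.
type Expr interface{ expr() }

type Ident struct{ Name string }
type CallExpr struct {
	Fun  Expr
	Args []Expr
}
type PipeExpr struct{ LHS, RHS Expr }

func (*Ident) expr()    {}
func (*CallExpr) expr() {}
func (*PipeExpr) expr() {}

// Hypothetical HIR: only names and calls, no pipe concept.
type HIR interface{ String() string }

type HIRName struct{ Name string }
type HIRCall struct {
	Fun  HIR
	Args []HIR
}

func (n HIRName) String() string { return n.Name }
func (c HIRCall) String() string {
	parts := make([]string, len(c.Args))
	for i, a := range c.Args {
		parts[i] = a.String()
	}
	return c.Fun.String() + "(" + strings.Join(parts, ", ") + ")"
}

// Lower rewrites PipeExpr into call-shaped HIR and panics on unknown nodes.
func Lower(e Expr) HIR {
	switch x := e.(type) {
	case *Ident:
		return HIRName{Name: x.Name}
	case *CallExpr:
		args := make([]HIR, 0, len(x.Args))
		for _, a := range x.Args {
			args = append(args, Lower(a))
		}
		return HIRCall{Fun: Lower(x.Fun), Args: args}
	case *PipeExpr:
		call, ok := x.RHS.(*CallExpr)
		if !ok {
			// Assumed rule: a bare callee is treated as a call with no other arguments.
			call = &CallExpr{Fun: x.RHS}
		}
		// Assumed rule: the piped value is inserted as the first argument.
		args := append([]Expr{x.LHS}, call.Args...)
		return Lower(&CallExpr{Fun: call.Fun, Args: args})
	default:
		panic(fmt.Sprintf("hir lowering: unhandled expression %T", e))
	}
}
```

Because the result contains only HIRName and HIRCall, later stages in this sketch never see a pipe node, which is the touchpoint reduction described above.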
Suppose we add:
type Maybe = union {
Some(i32),
None,
}
This likely cannot be normalized away cheaply.
So updates would naturally span more phases:
- parser must know syntax
- resolver must know referenced names
- typechecker must know compatibility/matching rules
- lowering must know runtime representation
- backend may need layout/codegen changes
That is not duplication.
That is legitimate multi-phase meaning.
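As one illustration of phase-specific meaning that cannot be shared with the parser, here is a hedged Go sketch of a typechecker-side rule. It assumes the language has some match-like construct over union variants, which this note does not confirm. UnionType and MissingArms are hypothetical names:

```go
package main

// UnionType is a hypothetical checked representation of a union declaration.
type UnionType struct {
	Name     string
	Variants []string
}

// MissingArms returns the variants of u that no match arm covers,
// in declaration order. An empty result means the match is exhaustive.
func MissingArms(u UnionType, arms []string) []string {
	covered := make(map[string]bool, len(arms))
	for _, a := range arms {
		covered[a] = true
	}
	var missing []string
	for _, v := range u.Variants {
		if !covered[v] {
			missing = append(missing, v)
		}
	}
	return missing
}
```

For the Maybe example, a match that only handles Some would report None as missing. The parser has no use for this rule and lowering needs different knowledge (layout), so each phase keeps its own logic.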
So for types, best improvement is not “merge logic”.
Best improvement is:
- checklist
- shared traversal
- exhaustive handling
- tighter tests
I would recommend:
- Keep explicit phase chain exactly as architecture intends.
- Add repo-level extension checklists for Expr, TypeExpr, Decl, Stmt.
- Make ast.Inspect omission impossible or loudly failing.
- Migrate remaining manual traversals like HandleRename toward shared traversal.
- Centralize more mechanical traversal/metadata.
- Normalize syntax sugar early in HIR lowering where possible.
- Add full-pipeline tests for every new construct.
Your concern is valid.
But root issue is not “too many phases”.
Root issue is:
- too many manual cross-cutting lists
- not enough structured rollout checklist
So best solution is:
- fewer manual mechanical touchpoints
- not fewer semantic phases
That gives:
- less chance to forget updates
- cleaner compiler architecture
- no shortcut collapse of parser/resolver/typechecker/lowering responsibilities