42 changes: 15 additions & 27 deletions README.md
@@ -1,15 +1,11 @@
# SemLang Packages

This public repository contains SemLang package resources for agent tools while SemLang core source remains private in `Unsupervisedcom/semlang-core`.

Public SemLang documentation lives in `docs/`, supplementary design notes live in `design-docs/`, and shared documentation/test examples live in `examples/`.
The private core repository consumes these examples directly from this public package checkout so the examples used in docs and tests stay in one place.

The `skills/` directory is generated from the authoritative `skills_for_cli_packages/` directory in `semlang-core` during core releases.

## Pi

@@ -19,39 +15,31 @@ Install the Pi package from this repository:
pi install git:https://github.com/Unsupervisedcom/semlang
```

For local development from a `semlang-core` checkout that has this repository as a submodule:

```sh
pi install "$PWD/packages/semlang"
```

The package explicitly loads the bundled `pi-mcp-adapter` extension from `node_modules/pi-mcp-adapter/index.ts`.
The Pi package exposes all SemLang skills in `skills/`, including `semlang-setup`, `semlang`, and `initial-ontology-creation`.

Use the `semlang-setup` skill to inspect or add SemLang MCP configuration.
After MCP config changes, run `/reload` or restart Pi.

## Claude Code

This repository includes the Claude Code plugin manifest in `.claude-plugin/plugin.json`, an MCP server config in `.mcp.json`, and SemLang skills in `skills/`.

After adding the public SemLang package repository to a Claude Code plugin marketplace, install it with:

```sh
claude plugin install semlang@semlang
```

If your marketplace entry uses a different marketplace name, replace the final `semlang` after `@` with that name.

SemLang MCP starts with the published SemLang package, pinned to this package version, for example `npx -y semlang@0.1.2 mcp`.

SemLang MCP respects `SEMLANG_*` environment settings.
Run `semlang setup` to inspect resolved SemLang project settings.
15 changes: 10 additions & 5 deletions design-docs/actions-requirements.md
@@ -1,6 +1,7 @@
# SemLang Actions Requirements

This document defines the first implementation slice for SemLang actions.
The language reference in `packages/semlang/docs/language-reference/actions.md` is the user-facing contract; this file is the implementation checklist.

## Goals

@@ -168,13 +169,16 @@ Recommended diagnostic codes:
- `WRITEABLE_DIMENSION_REQUIRES_MAPPING`
- `INVALID_WRITE_MAPPING`

Validation can defer expression type-checking for guard predicates, edit expressions, write expressions, and agent metadata.
Those expressions should still be preserved with enough fidelity for a future manifest emitter to consume them.

## Lowering Requirements

Malloy emission must ignore actions and write mappings.
Existing Malloy output for read models should remain stable except for harmless formatting changes around parsed declarations.

The MCP `invoke_action` adapter may lower supported actions to SQL through the configured Malloy connection.
SQL action lowering must remain separate from Malloy read/query lowering, avoid dialect-specific `RETURNING`, `UPDATE ... FROM`, and `DELETE ... USING` constructs in the default path, quote schema-qualified table path components separately, and reject write selectors that can fan out one subject identity into multiple rows.
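
The requirement to quote schema-qualified table path components separately can be illustrated with a minimal sketch. It assumes ANSI-style double-quoted identifiers and a dot-separated table path; `quoteIdentifier` and `quoteTablePath` are hypothetical helpers, not SemLang APIs, and a real adapter would need to follow the quoting rules of the configured connection's dialect.

```typescript
// Hypothetical helper: quote one identifier, assuming ANSI-style double quotes.
function quoteIdentifier(part: string): string {
  // Escape embedded double quotes by doubling them.
  return `"${part.replace(/"/g, '""')}"`;
}

// Hypothetical helper: quote every component of a dot-separated table path on
// its own. This naive split does not support identifiers that contain dots.
function quoteTablePath(path: string): string {
  const parts = path.split(".");
  if (parts.some((p) => p.length === 0)) {
    throw new Error(`Invalid table path: ${path}`);
  }
  return parts.map(quoteIdentifier).join(".");
}

const quotedOrders = quoteTablePath("analytics.orders");
// quotedOrders is "analytics"."orders" (two identifiers), not "analytics.orders" (one).
```

Quoting the whole path as a single identifier would make the database look for a table literally named `analytics.orders`, which is why each component is quoted on its own.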

The first implementation does not need to expose a public action manifest emitter, but the AST and semantic model should be structured so a manifest emitter can be added without reparsing action bodies.

@@ -191,4 +195,5 @@ Add focused tests for:
- rejecting a writeable dimension with no mapping
- ensuring Malloy output ignores actions and still emits the concept source

Update fixture examples with realistic actions in at least the manufacturing, retail, healthcare, banking, and SaaS examples.
Examples should compile without diagnostics.
67 changes: 47 additions & 20 deletions design-docs/architecture.md
@@ -1,6 +1,8 @@
# SemLang Compiler Architecture

This document explains the current SemLang compiler strategy for future implementers.
The compiler is intentionally conservative: it accepts the subset of SemLang described in `packages/semlang/design-docs/language.md`, builds a semantic model, and emits deterministic Malloy text.
When a construct cannot be parsed, resolved, validated, expanded through lenses, or emitted predictably, the compiler should report diagnostics instead of guessing.

## Public Entry Points

@@ -17,13 +19,16 @@ The public API is exported from `src/index.ts`:

The main public artifacts are:

- `SemLangAst`: the syntactic tree returned by the parser.
It preserves source locations and declaration structure.
- `SemanticModel`: the resolved package graph.
It indexes types, concepts, lenses, and queries by their compiler names.
- `ResolvedConcept`: a concept declaration plus compiler metadata such as `sourceName` and `roleBaseNames`.
- `CompileResult`: the aggregate result containing any available `ast`, `model`, emitted `malloy`, emitted `jsonSchema`, and all diagnostics.
- `Diagnostic`: a stable error or warning record with severity, code, message, and optional source location.

Callers should treat diagnostics as part of the API contract.
If `ast`, `model`, `malloy`, or `jsonSchema` is absent, the diagnostics explain why that stage could not continue.
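
A caller-side sketch of that contract follows. The interfaces are assumptions reconstructed from the field names listed above; the real types exported from `src/index.ts` may differ, and `requireMalloy` is a hypothetical helper rather than part of the public API.

```typescript
// Assumed shapes for illustration only; the real exported types may differ.
interface SourcePosition {
  line: number;
  column: number;
}

interface Diagnostic {
  severity: "error" | "warning";
  code: string;
  message: string;
  location?: SourcePosition;
}

interface CompileResult {
  ast?: unknown;
  model?: unknown;
  malloy?: string;
  jsonSchema?: unknown;
  diagnostics: Diagnostic[];
}

// Hypothetical helper: return the emitted Malloy, or throw with the error
// diagnostics that explain why no Malloy was produced.
function requireMalloy(result: CompileResult): string {
  if (result.malloy !== undefined) {
    return result.malloy;
  }
  const errors = result.diagnostics
    .filter((d) => d.severity === "error")
    .map((d) => `${d.code}: ${d.message}`);
  throw new Error(`Compilation produced no Malloy output:\n${errors.join("\n")}`);
}
```

Checking for an absent artifact first, and only then reading diagnostics, mirrors the rule that a missing stage output is always explained by the diagnostics list.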

## Pipeline Overview

@@ -35,11 +40,14 @@ Callers should treat diagnostics as part of the API contract. If `ast`, `model`,
4. Emit Malloy from the validated semantic model.
5. Emit JSON Schema from the validated semantic model.

The implementation keeps the stages separate even where the resolver currently owns both resolution and validation.
That separation is useful: future work can add richer parsing, semantic passes, emitters, or tooling without changing the high-level contract.

## Parse Stage

The parser in `src/parser.ts` is a pragmatic line/block parser rather than a full grammar-driven parser.
It recognizes top-level package, include, ignored, source, type, concept, lens, and query declarations.
Concept and refinement bodies are parsed into common member structures so normal concepts and lens refinements share the same declaration shape.

Parser responsibilities:

@@ -48,11 +56,15 @@ Parser responsibilities:
- Reject invalid declaration shapes early with parse diagnostics.
- Return no AST when any parse error is present.

The parser should stay syntax-focused.
It should not need global knowledge of known types, concepts, roles, joins, or lenses.
Those checks belong in resolution and validation.

## Resolve Stage

Resolution lives in `src/resolver.ts`.
`resolveSemLang` first loads the include graph through the configured `PackageLoader`.
Includes are resolved before the including file, and include cycles are reported as `INCLUDE_CYCLE`.
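
The include ordering and cycle reporting described here can be sketched as a depth-first walk. This is not the resolver's code: `IncludeLookup` is a hypothetical stand-in for whatever `PackageLoader` provides, and only the `INCLUDE_CYCLE` code comes from the text above.

```typescript
// Hypothetical loader shape: maps a file path to the paths it includes.
type IncludeLookup = (path: string) => string[];

interface IncludeDiagnostic {
  code: "INCLUDE_CYCLE";
  message: string;
}

// Depth-first walk that lists included files before the files that include
// them, and records a diagnostic when a file is reached while it is still on
// the active stack.
function orderIncludes(
  entry: string,
  includesOf: IncludeLookup,
): { order: string[]; diagnostics: IncludeDiagnostic[] } {
  const order: string[] = [];
  const diagnostics: IncludeDiagnostic[] = [];
  const done: Record<string, boolean> = {};
  const stack: string[] = [];

  const visit = (path: string): void => {
    if (done[path]) return; // already resolved through another include
    const at = stack.indexOf(path);
    if (at !== -1) {
      const cycle = stack.slice(at).concat(path).join(" -> ");
      diagnostics.push({ code: "INCLUDE_CYCLE", message: `Include cycle: ${cycle}` });
      return;
    }
    stack.push(path);
    for (const child of includesOf(path)) visit(child);
    stack.pop();
    done[path] = true;
    order.push(path); // appended only after all of its includes
  };

  visit(entry);
  return { order, diagnostics };
}
```

Tracking the active stack separately from the finished set is what distinguishes a true cycle from a file that is simply included from two places.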

After loading, `mergeAst` builds a `SemanticModel`:

@@ -63,11 +75,14 @@ After loading, `mergeAst` builds a `SemanticModel`:
- Queries are appended to `model.queries`.
- Duplicate package-level names are diagnosed during merge.

The current compiler uses compiler names directly as model keys for package-level declarations.
Roles are concept-local: each role has a canonical qualified name such as `Customer.Active`, while short role names are resolved through the tested path when possible.
There is not yet a general import alias model beyond file includes, so added language features should still be careful about symbol visibility and collision behavior.

## Validate Stage

Validation currently runs inside `validateModel` in the resolver.
It validates the base model and then validates each query against the query model produced by lens expansion.

Validation responsibilities:

@@ -80,28 +95,36 @@ Validation responsibilities:
- Check query roots and path expressions.
- Reject aggregate aliases that reference raw row-level fields outside aggregate functions.

Expression validation is intentionally lightweight.
It tokenizes paths and recognizes a small set of aggregate functions, scalar functions, scalar date/time properties, and expression keywords.
This keeps the compiler useful for current examples without pretending to be a full Malloy or SQL expression analyzer.

## Lens Expansion

Lenses are query-time semantic overlays.
They do not mutate the base `SemanticModel`.

`applyQueryLenses` clones the model, then applies the lenses listed on the query from left to right.
`applyLens` recursively applies parent lenses before the child lens and tracks the current stack to report `LENS_CYCLE`.
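
The ordering rules in this paragraph (copy first, lenses left to right, parents before children, cycle detection through a stack) can be sketched with deliberately simplified data. The shapes and the `...Sketch` names below are assumptions for illustration; the real `applyQueryLenses` and `applyLens` operate on the full `SemanticModel`, and only the `LENS_CYCLE` code is taken from the text.

```typescript
// Simplified, assumed shapes; the real lens and model types are richer.
interface LensSketch {
  parents: string[]; // lenses this lens builds on
  filters: string[]; // stands in for the lens's `where:` refinements
}

interface LensDiagnostic {
  code: "LENS_CYCLE";
  message: string;
}

// Append the filters of `lensName` to `target`, parents first. `stack` holds
// the lenses currently being applied, so meeting one again means a cycle.
function applyLensSketch(
  lensName: string,
  lenses: Record<string, LensSketch>,
  target: string[],
  stack: string[],
  diagnostics: LensDiagnostic[],
): void {
  if (stack.indexOf(lensName) !== -1) {
    const cycle = stack.concat(lensName).join(" -> ");
    diagnostics.push({ code: "LENS_CYCLE", message: `Lens cycle: ${cycle}` });
    return;
  }
  const lens = lenses[lensName];
  if (!lens) return; // unknown lenses are ignored in this sketch
  stack.push(lensName);
  for (const parent of lens.parents) {
    applyLensSketch(parent, lenses, target, stack, diagnostics);
  }
  stack.pop();
  for (const filter of lens.filters) target.push(filter);
}

// Copy the base filters, then apply the query's lenses from left to right.
function applyQueryLensesSketch(
  baseFilters: string[],
  queryLenses: string[],
  lenses: Record<string, LensSketch>,
): { filters: string[]; diagnostics: LensDiagnostic[] } {
  const filters = baseFilters.slice(); // the base list is never mutated
  const diagnostics: LensDiagnostic[] = [];
  for (const lensName of queryLenses) {
    applyLensSketch(lensName, lenses, filters, [], diagnostics);
  }
  return { filters, diagnostics };
}
```

Working on a copy is what lets the base model stay available for other queries, and applying parents before the child means a child lens always layers on top of what it extends.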

Current lens behavior:

- Lens-defined types are added to the cloned model if the name is not already present.
- Each `refine: Concept extend { ... }` block appends its members to the cloned target concept.
- Lens `where:` refinements become additional concept filters.
- Multiple lens filters compose by conjunction during emission because each filter emits as a separate Malloy `where:`.
- Lens filters apply across the query-local concept graph.
A query rooted at `Customer` can join to a lens-expanded `SaleLine` source, and measures such as `sale_lines.sum(net_sales_amount)` aggregate over the filtered sale-line source rather than the base source.
- The base model remains available for other queries and for non-lensed emission.

Emission also uses lens expansion.
For lensed queries, the emitter generates query-local sources with names like `source_name__query_name` so the lens-expanded concept graph can coexist with the base sources in the same Malloy output.

## Emit Stage

`src/emitter.ts` lowers a validated `SemanticModel` to Malloy text.
It emits named raw sources, concept sources, query-backed sources, queries, and finally concepts backed by query results.
If a query uses lenses, it emits the lens-expanded sources for that query before the query itself.

Important lowering rules:

@@ -113,7 +136,8 @@ Important lowering rules:
- Dimensions, measures, views, and queries preserve the SemLang declaration shape where possible.
- Currency metadata may emit a Malloy annotation when the compiler can infer it from referenced fields.

The emitter assumes it receives a validated model.
It may still append diagnostics for lens expansion failures discovered while emitting lensed queries, but semantic failures should normally have been found before emission.

## Diagnostics Philosophy

@@ -131,7 +155,8 @@ The current stage boundaries are:
- Resolution or validation errors prevent `model` and Malloy output from being returned by `compileSemLang`.
- Emission diagnostics are accumulated into `CompileResult.diagnostics` alongside earlier diagnostics.

Warnings are supported by the type system but are not heavily used yet.
Before adding warnings, decide whether callers should treat them as advisory metadata or as policy signals.

## Test Strategy

@@ -156,7 +181,8 @@ Use integration tests for:
- Boundaries with optional Malloy/DuckDB runtime packages.
- Verifying generated text remains suitable for downstream runtime loading without depending on unstable Malloy internals.

When adding a language feature, prefer one small semantic test that isolates the feature and one fixture-oriented assertion if the examples rely on it.
Diagnostics tests should assert codes and important locations, not full prose, unless the wording itself is the behavior being protected.
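
One way to follow that advice is to reduce diagnostics to their stable parts before comparing. The sketch below assumes a diagnostic shape with an optional line/column location; `summarizeDiagnostics` is a hypothetical test helper, not an existing utility in the repository.

```typescript
// Assumed diagnostic shape, based on the fields named in this document.
interface DiagnosticSketch {
  severity: "error" | "warning";
  code: string;
  message: string;
  location?: { line: number; column: number };
}

// Reduce diagnostics to "CODE@line:column" (or just "CODE" when there is no
// location) so tests compare stable facts and ignore message wording.
function summarizeDiagnostics(diagnostics: DiagnosticSketch[]): string[] {
  return diagnostics.map((d) =>
    d.location ? `${d.code}@${d.location.line}:${d.location.column}` : d.code,
  );
}

// Example input; the message text and location here are made up.
const sampleDiagnostics: DiagnosticSketch[] = [
  {
    severity: "error",
    code: "INCLUDE_CYCLE",
    message: "any wording",
    location: { line: 3, column: 1 },
  },
];
const diagnosticSummary = summarizeDiagnostics(sampleDiagnostics);
// diagnosticSummary is ["INCLUDE_CYCLE@3:1"]
```

A test that compares this summary keeps passing when message prose is reworded, and fails only when a code or location changes.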

## Known Limitations

@@ -173,4 +199,5 @@ The current compiler is deliberately V1-shaped:
- The compiler does not verify that referenced physical table columns exist.
- Runtime execution is outside the compiler boundary; tests mostly verify generated Malloy text and optional runtime package availability.

Future implementers should keep these limitations visible.
The safest evolution path is to replace narrow heuristics with explicit intermediate representations only when a feature needs that precision, while preserving the current public artifacts and diagnostic behavior for callers.