From 7f75fd1681bd5b2fd7da6e1f6faaaa7616a67707 Mon Sep 17 00:00:00 2001
From: Noah Horton
Date: Thu, 28 May 2026 15:46:03 -0600
Subject: [PATCH] Format Markdown with snapper

---
 README.md | 42 ++++-----
 design-docs/actions-requirements.md | 15 ++--
 design-docs/architecture.md | 67 ++++++++++-----
 design-docs/language.md | 84 ++++++++++++------
 design-docs/semlang-vs-palantir.md | 9 +-
 design-docs/supported_malloy_features.md | 6 +-
 docs/language-reference/actions.md | 52 ++++++++----
 docs/language-reference/concepts.md | 37 +++++---
 docs/language-reference/declarations.md | 22 +++--
 .../diagnostics-lowering.md | 8 +-
 docs/language-reference/expressions.md | 25 ++++--
 docs/language-reference/index.md | 10 ++-
 docs/language-reference/lenses.md | 19 +++--
 docs/language-reference/schema-vocabulary.md | 10 ++-
 docs/language-reference/sources.md | 6 +-
 .../supported_malloy_features.md | 6 +-
 docs/mcp-server/configuration.md | 27 ++++--
 docs/mcp-server/index.md | 10 ++-
 docs/mcp-server/lens-tools.md | 12 ++-
 docs/mcp-server/malloy-connections.md | 31 ++++---
 docs/mcp-server/ontology-tools.md | 21 +++--
 docs/mcp-server/query-and-action-tools.md | 23 +++--
 docs/mcp-server/reasoning-tools.md | 3 +-
 docs/mcp-server/source-and-search.md | 16 ++--
 docs/mcp-server/tools-overview.md | 14 ++-
 docs/semlang-concepts.md | 15 ++--
 .../about.md | 6 +-
 .../about.md | 6 +-
 .../about.md | 6 +-
 .../about.md | 7 +-
 .../saas-product-usage-and-revenue/about.md | 6 +-
 skills/initial-ontology-creation/SKILL.md | 65 +++++++++-----
 skills/semlang-setup/SKILL.md | 31 +++----
 skills/semlang/SKILL.md | 85 +++++++++++++------
 34 files changed, 525 insertions(+), 277 deletions(-)

diff --git a/README.md b/README.md
index 836a849..8ee0fe0 100644
--- a/README.md
+++ b/README.md
@@ -1,15 +1,11 @@
 # SemLang Packages

-This public repository contains SemLang package resources for agent tools while
-SemLang core source remains private in `Unsupervisedcom/semlang-core`.
+This public repository contains SemLang package resources for agent tools while SemLang core source remains private in `Unsupervisedcom/semlang-core`. -Public SemLang documentation lives in `docs/`, supplementary design notes live -in `design-docs/`, and shared documentation/test examples live in `examples/`. -The private core repository consumes these examples directly from this public -package checkout so the examples used in docs and tests stay in one place. +Public SemLang documentation lives in `docs/`, supplementary design notes live in `design-docs/`, and shared documentation/test examples live in `examples/`. +The private core repository consumes these examples directly from this public package checkout so the examples used in docs and tests stay in one place. -The `skills/` directory is generated from the authoritative -`skills_for_cli_packages/` directory in `semlang-core` during core releases. +The `skills/` directory is generated from the authoritative `skills_for_cli_packages/` directory in `semlang-core` during core releases. ## Pi @@ -19,39 +15,31 @@ Install the Pi package from this repository: pi install git:https://github.com/Unsupervisedcom/semlang ``` -For local development from a `semlang-core` checkout that has this repository as -a submodule: +For local development from a `semlang-core` checkout that has this repository as a submodule: ```sh pi install "$PWD/packages/semlang" ``` -The package explicitly loads the bundled `pi-mcp-adapter` extension from -`node_modules/pi-mcp-adapter/index.ts`. The Pi package exposes all SemLang skills -in `skills/`, including `semlang-setup`, `semlang`, and -`initial-ontology-creation`. +The package explicitly loads the bundled `pi-mcp-adapter` extension from `node_modules/pi-mcp-adapter/index.ts`. +The Pi package exposes all SemLang skills in `skills/`, including `semlang-setup`, `semlang`, and `initial-ontology-creation`. -Use the `semlang-setup` skill to inspect or add SemLang MCP configuration. 
After -MCP config changes, run `/reload` or restart Pi. +Use the `semlang-setup` skill to inspect or add SemLang MCP configuration. +After MCP config changes, run `/reload` or restart Pi. ## Claude Code -This repository includes the Claude Code plugin manifest in -`.claude-plugin/plugin.json`, an MCP server config in `.mcp.json`, and SemLang -skills in `skills/`. +This repository includes the Claude Code plugin manifest in `.claude-plugin/plugin.json`, an MCP server config in `.mcp.json`, and SemLang skills in `skills/`. -After adding the public SemLang package repository to a Claude Code plugin -marketplace, install it with: +After adding the public SemLang package repository to a Claude Code plugin marketplace, install it with: ```sh claude plugin install semlang@semlang ``` -If your marketplace entry uses a different marketplace name, replace the final -`semlang` after `@` with that name. +If your marketplace entry uses a different marketplace name, replace the final `semlang` after `@` with that name. -SemLang MCP starts with the published SemLang package, pinned to this package -version, for example `npx -y semlang@0.1.2 mcp`. +SemLang MCP starts with the published SemLang package, pinned to this package version, for example `npx -y semlang@0.1.2 mcp`. -SemLang MCP respects `SEMLANG_*` environment settings. Run `semlang setup` to -inspect resolved SemLang project settings. +SemLang MCP respects `SEMLANG_*` environment settings. +Run `semlang setup` to inspect resolved SemLang project settings. diff --git a/design-docs/actions-requirements.md b/design-docs/actions-requirements.md index 389071f..d1b9bb3 100644 --- a/design-docs/actions-requirements.md +++ b/design-docs/actions-requirements.md @@ -1,6 +1,7 @@ # SemLang Actions Requirements -This document defines the first implementation slice for SemLang actions. The language reference in `packages/semlang/docs/language-reference/actions.md` is the user-facing contract; this file is the implementation checklist. 
+This document defines the first implementation slice for SemLang actions. +The language reference in `packages/semlang/docs/language-reference/actions.md` is the user-facing contract; this file is the implementation checklist. ## Goals @@ -168,13 +169,16 @@ Recommended diagnostic codes: - `WRITEABLE_DIMENSION_REQUIRES_MAPPING` - `INVALID_WRITE_MAPPING` -Validation can defer expression type-checking for guard predicates, edit expressions, write expressions, and agent metadata. Those expressions should still be preserved exactly enough for a future manifest emitter. +Validation can defer expression type-checking for guard predicates, edit expressions, write expressions, and agent metadata. +Those expressions should still be preserved exactly enough for a future manifest emitter. ## Lowering Requirements -Malloy emission must ignore actions and write mappings. Existing Malloy output for read models should remain stable except for harmless formatting changes around parsed declarations. +Malloy emission must ignore actions and write mappings. +Existing Malloy output for read models should remain stable except for harmless formatting changes around parsed declarations. -The MCP `invoke_action` adapter may lower supported actions to SQL through the configured Malloy connection. SQL action lowering must remain separate from Malloy read/query lowering, avoid dialect-specific `RETURNING`, `UPDATE ... FROM`, and `DELETE ... USING` constructs in the default path, quote schema-qualified table path components separately, and reject write selectors that can fan out one subject identity into multiple rows. +The MCP `invoke_action` adapter may lower supported actions to SQL through the configured Malloy connection. +SQL action lowering must remain separate from Malloy read/query lowering, avoid dialect-specific `RETURNING`, `UPDATE ... FROM`, and `DELETE ... 
USING` constructs in the default path, quote schema-qualified table path components separately, and reject write selectors that can fan out one subject identity into multiple rows. The first implementation does not need to expose a public action manifest emitter, but the AST and semantic model should be structured so a manifest emitter can be added without reparsing action bodies. @@ -191,4 +195,5 @@ Add focused tests for: - rejecting a writeable dimension with no mapping - ensuring Malloy output ignores actions and still emits the concept source -Update fixture examples with realistic actions in at least the manufacturing, retail, healthcare, banking, and SaaS examples. Examples should compile without diagnostics. +Update fixture examples with realistic actions in at least the manufacturing, retail, healthcare, banking, and SaaS examples. +Examples should compile without diagnostics. diff --git a/design-docs/architecture.md b/design-docs/architecture.md index 342c7e4..54b3943 100644 --- a/design-docs/architecture.md +++ b/design-docs/architecture.md @@ -1,6 +1,8 @@ # SemLang Compiler Architecture -This document explains the current SemLang compiler strategy for future implementers. The compiler is intentionally conservative: it accepts the subset of SemLang described in `packages/semlang/design-docs/language.md`, builds a semantic model, and emits deterministic Malloy text. When a construct cannot be parsed, resolved, validated, expanded through lenses, or emitted predictably, the compiler should report diagnostics instead of guessing. +This document explains the current SemLang compiler strategy for future implementers. +The compiler is intentionally conservative: it accepts the subset of SemLang described in `packages/semlang/design-docs/language.md`, builds a semantic model, and emits deterministic Malloy text. 
+When a construct cannot be parsed, resolved, validated, expanded through lenses, or emitted predictably, the compiler should report diagnostics instead of guessing. ## Public Entry Points @@ -17,13 +19,16 @@ The public API is exported from `src/index.ts`: The main public artifacts are: -- `SemLangAst`: the syntactic tree returned by the parser. It preserves source locations and declaration structure. -- `SemanticModel`: the resolved package graph. It indexes types, concepts, lenses, and queries by their compiler names. +- `SemLangAst`: the syntactic tree returned by the parser. + It preserves source locations and declaration structure. +- `SemanticModel`: the resolved package graph. + It indexes types, concepts, lenses, and queries by their compiler names. - `ResolvedConcept`: a concept declaration plus compiler metadata such as `sourceName` and `roleBaseNames`. - `CompileResult`: the aggregate result containing any available `ast`, `model`, emitted `malloy`, emitted `jsonSchema`, and all diagnostics. - `Diagnostic`: a stable error or warning record with severity, code, message, and optional source location. -Callers should treat diagnostics as part of the API contract. If `ast`, `model`, `malloy`, or `jsonSchema` is absent, the diagnostics explain why that stage could not continue. +Callers should treat diagnostics as part of the API contract. +If `ast`, `model`, `malloy`, or `jsonSchema` is absent, the diagnostics explain why that stage could not continue. ## Pipeline Overview @@ -35,11 +40,14 @@ Callers should treat diagnostics as part of the API contract. If `ast`, `model`, 4. Emit Malloy from the validated semantic model. 5. Emit JSON Schema from the validated semantic model. -The implementation keeps the stages separate even where the resolver currently owns both resolution and validation. That separation is useful: future work can add richer parsing, semantic passes, emitters, or tooling without changing the high-level contract. 
+The implementation keeps the stages separate even where the resolver currently owns both resolution and validation. +That separation is useful: future work can add richer parsing, semantic passes, emitters, or tooling without changing the high-level contract. ## Parse Stage -The parser in `src/parser.ts` is a pragmatic line/block parser rather than a full grammar-driven parser. It recognizes top-level package, include, ignored, source, type, concept, lens, and query declarations. Concept and refinement bodies are parsed into common member structures so normal concepts and lens refinements share the same declaration shape. +The parser in `src/parser.ts` is a pragmatic line/block parser rather than a full grammar-driven parser. +It recognizes top-level package, include, ignored, source, type, concept, lens, and query declarations. +Concept and refinement bodies are parsed into common member structures so normal concepts and lens refinements share the same declaration shape. Parser responsibilities: @@ -48,11 +56,15 @@ Parser responsibilities: - Reject invalid declaration shapes early with parse diagnostics. - Return no AST when any parse error is present. -The parser should stay syntax-focused. It should not need global knowledge of known types, concepts, roles, joins, or lenses. Those checks belong in resolution and validation. +The parser should stay syntax-focused. +It should not need global knowledge of known types, concepts, roles, joins, or lenses. +Those checks belong in resolution and validation. ## Resolve Stage -Resolution lives in `src/resolver.ts`. `resolveSemLang` first loads the include graph through the configured `PackageLoader`. Includes are resolved before the including file, and include cycles are reported as `INCLUDE_CYCLE`. +Resolution lives in `src/resolver.ts`. +`resolveSemLang` first loads the include graph through the configured `PackageLoader`. 
+Includes are resolved before the including file, and include cycles are reported as `INCLUDE_CYCLE`. After loading, `mergeAst` builds a `SemanticModel`: @@ -63,11 +75,14 @@ After loading, `mergeAst` builds a `SemanticModel`: - Queries are appended to `model.queries`. - Duplicate package-level names are diagnosed during merge. -The current compiler uses compiler names directly as model keys for package-level declarations. Roles are concept-local: each role has a canonical qualified name such as `Customer.Active`, while short role names are resolved through the tested path when possible. There is not yet a general import alias model beyond file includes, so added language features should still be careful about symbol visibility and collision behavior. +The current compiler uses compiler names directly as model keys for package-level declarations. +Roles are concept-local: each role has a canonical qualified name such as `Customer.Active`, while short role names are resolved through the tested path when possible. +There is not yet a general import alias model beyond file includes, so added language features should still be careful about symbol visibility and collision behavior. ## Validate Stage -Validation currently runs inside `validateModel` in the resolver. It validates the base model and then validates each query against the query model produced by lens expansion. +Validation currently runs inside `validateModel` in the resolver. +It validates the base model and then validates each query against the query model produced by lens expansion. Validation responsibilities: @@ -80,13 +95,17 @@ Validation responsibilities: - Check query roots and path expressions. - Reject aggregate aliases that reference raw row-level fields outside aggregate functions. -Expression validation is intentionally lightweight. It tokenizes paths and recognizes a small set of aggregate functions, scalar functions, scalar date/time properties, and expression keywords. 
This keeps the compiler useful for current examples without pretending to be a full Malloy or SQL expression analyzer. +Expression validation is intentionally lightweight. +It tokenizes paths and recognizes a small set of aggregate functions, scalar functions, scalar date/time properties, and expression keywords. +This keeps the compiler useful for current examples without pretending to be a full Malloy or SQL expression analyzer. ## Lens Expansion -Lenses are query-time semantic overlays. They do not mutate the base `SemanticModel`. +Lenses are query-time semantic overlays. +They do not mutate the base `SemanticModel`. -`applyQueryLenses` clones the model, then applies the lenses listed on the query from left to right. `applyLens` recursively applies parent lenses before the child lens and tracks the current stack to report `LENS_CYCLE`. +`applyQueryLenses` clones the model, then applies the lenses listed on the query from left to right. +`applyLens` recursively applies parent lenses before the child lens and tracks the current stack to report `LENS_CYCLE`. Current lens behavior: @@ -94,14 +113,18 @@ Current lens behavior: - Each `refine: Concept extend { ... }` block appends its members to the cloned target concept. - Lens `where:` refinements become additional concept filters. - Multiple lens filters compose by conjunction during emission because each filter emits as a separate Malloy `where:`. -- Lens filters apply across the query-local concept graph. A query rooted at `Customer` can join to a lens-expanded `SaleLine` source, and measures such as `sale_lines.sum(net_sales_amount)` aggregate over the filtered sale-line source rather than the base source. +- Lens filters apply across the query-local concept graph. + A query rooted at `Customer` can join to a lens-expanded `SaleLine` source, and measures such as `sale_lines.sum(net_sales_amount)` aggregate over the filtered sale-line source rather than the base source. 
- The base model remains available for other queries and for non-lensed emission. -Emission also uses lens expansion. For lensed queries, the emitter generates query-local sources with names like `source_name__query_name` so the lens-expanded concept graph can coexist with the base sources in the same Malloy output. +Emission also uses lens expansion. +For lensed queries, the emitter generates query-local sources with names like `source_name__query_name` so the lens-expanded concept graph can coexist with the base sources in the same Malloy output. ## Emit Stage -`src/emitter.ts` lowers a validated `SemanticModel` to Malloy text. It emits named raw sources, concept sources, query-backed sources, queries, and finally concepts backed by query results. If a query uses lenses, it emits the lens-expanded sources for that query before the query itself. +`src/emitter.ts` lowers a validated `SemanticModel` to Malloy text. +It emits named raw sources, concept sources, query-backed sources, queries, and finally concepts backed by query results. +If a query uses lenses, it emits the lens-expanded sources for that query before the query itself. Important lowering rules: @@ -113,7 +136,8 @@ Important lowering rules: - Dimensions, measures, views, and queries preserve the SemLang declaration shape where possible. - Currency metadata may emit a Malloy annotation when the compiler can infer it from referenced fields. -The emitter assumes it receives a validated model. It may still append diagnostics for lens expansion failures discovered while emitting lensed queries, but semantic failures should normally have been found before emission. +The emitter assumes it receives a validated model. +It may still append diagnostics for lens expansion failures discovered while emitting lensed queries, but semantic failures should normally have been found before emission. 
## Diagnostics Philosophy @@ -131,7 +155,8 @@ The current stage boundaries are: - Resolution or validation errors prevent `model` and Malloy output from being returned by `compileSemLang`. - Emission diagnostics are accumulated into `CompileResult.diagnostics` alongside earlier diagnostics. -Warnings are supported by the type system but are not heavily used yet. Before adding warnings, decide whether callers should treat them as advisory metadata or as policy signals. +Warnings are supported by the type system but are not heavily used yet. +Before adding warnings, decide whether callers should treat them as advisory metadata or as policy signals. ## Test Strategy @@ -156,7 +181,8 @@ Use integration tests for: - Boundaries with optional Malloy/DuckDB runtime packages. - Verifying generated text remains suitable for downstream runtime loading without depending on unstable Malloy internals. -When adding a language feature, prefer one small semantic test that isolates the feature and one fixture-oriented assertion if the examples rely on it. Diagnostics tests should assert codes and important locations, not full prose, unless the wording itself is the behavior being protected. +When adding a language feature, prefer one small semantic test that isolates the feature and one fixture-oriented assertion if the examples rely on it. +Diagnostics tests should assert codes and important locations, not full prose, unless the wording itself is the behavior being protected. ## Known Limitations @@ -173,4 +199,5 @@ The current compiler is deliberately V1-shaped: - The compiler does not verify that referenced physical table columns exist. - Runtime execution is outside the compiler boundary; tests mostly verify generated Malloy text and optional runtime package availability. -Future implementers should keep these limitations visible. 
The safest evolution path is to replace narrow heuristics with explicit intermediate representations only when a feature needs that precision, while preserving the current public artifacts and diagnostic behavior for callers. +Future implementers should keep these limitations visible. +The safest evolution path is to replace narrow heuristics with explicit intermediate representations only when a feature needs that precision, while preserving the current public artifacts and diagnostic behavior for callers. diff --git a/design-docs/language.md b/design-docs/language.md index c9208a5..f35b6e1 100644 --- a/design-docs/language.md +++ b/design-docs/language.md @@ -1,8 +1,10 @@ # SemLang V1 Language Specification -SemLang is a semantic modeling language that stays close to Malloy so it can be compiled into Malloy for query execution. It adds an ontology layer inspired by gUFO and OntoUML: business concepts, roles, relators, situations, temporal axes, and validation predicates live beside the analytical model instead of in a separate diagram. +SemLang is a semantic modeling language that stays close to Malloy so it can be compiled into Malloy for query execution. +It adds an ontology layer inspired by gUFO and OntoUML: business concepts, roles, relators, situations, temporal axes, and validation predicates live beside the analytical model instead of in a separate diagram. -V1 is defined by the retail SemLang examples in `packages/semlang/examples/retail-omnichannel-margin-and-returns` and by the recurring Malloy patterns in the banking, healthcare, manufacturing, retail, and SaaS examples. The compiler is intentionally conservative: every accepted construct must lower to deterministic Malloy or produce diagnostics. +V1 is defined by the retail SemLang examples in `packages/semlang/examples/retail-omnichannel-margin-and-returns` and by the recurring Malloy patterns in the banking, healthcare, manufacturing, retail, and SaaS examples. 
+The compiler is intentionally conservative: every accepted construct must lower to deterministic Malloy or produce diagnostics. ## Packages and Includes @@ -18,7 +20,9 @@ Files may include other SemLang files by relative path: include "./example.semlang" ``` -Includes are loaded before the including file is resolved. Each resolved include file is merged once per compilation, so shared files can be safely included through multiple paths in the include graph. Include cycles are invalid. +Includes are loaded before the including file is resolved. +Each resolved include file is merged once per compilation, so shared files can be safely included through multiple paths in the include graph. +Include cycles are invalid. ## Semantic Types @@ -32,9 +36,14 @@ type: Dollars is currency { } ``` -V1 primitive bases are `string`, `number`, `date`, `timestamp`, `currency`, and `boolean`. Type bodies are metadata maps. Recognized JSON Schema-style metadata includes `description`, `enum`, `const`, `default`, `examples`, numeric and string bounds, `pattern`, and `format`. SemLang-specific metadata includes `scale_type`, `identifies`, `identifies_role`, `currency`, `unit`, and `render_format`. Unknown metadata is preserved in the AST and semantic model but does not affect Malloy emission. +V1 primitive bases are `string`, `number`, `date`, `timestamp`, `currency`, and `boolean`. +Type bodies are metadata maps. +Recognized JSON Schema-style metadata includes `description`, `enum`, `const`, `default`, `examples`, numeric and string bounds, `pattern`, and `format`. +SemLang-specific metadata includes `scale_type`, `identifies`, `identifies_role`, `currency`, `unit`, and `render_format`. +Unknown metadata is preserved in the AST and semantic model but does not affect Malloy emission. -Field annotations use Malloy-like `::` syntax. A trailing `?` marks nullable values: +Field annotations use Malloy-like `::` syntax. 
+A trailing `?` marks nullable values: ```semlang customer_id :: CustomerId? @@ -50,7 +59,8 @@ concept SaleLine is situation from duckdb.table('retail_line_items') { } ``` -The `from` clause follows Malloy source semantics. It can reference a table or view through a named connection, a SQL source, a named source, a concept source, or a query result: +The `from` clause follows Malloy source semantics. +It can reference a table or view through a named connection, a SQL source, a named source, a concept source, or a query result: ```semlang source: recent_sales is duckdb.sql("""select * from sales where sold_at >= '2026-01-01'""") @@ -81,7 +91,9 @@ ignored duckdb.table('staging_customer_raw') { } ``` -`reason` is required. Ignored declarations are metadata only: they do not produce concepts, fields, sources, queries, or any Malloy output. Tooling can read them from the resolved semantic model and JSON Schema metadata to distinguish deliberately excluded tables from tables that have not yet been modeled. +`reason` is required. +Ignored declarations are metadata only: they do not produce concepts, fields, sources, queries, or any Malloy output. +Tooling can read them from the resolved semantic model and JSON Schema metadata to distinguish deliberately excluded tables from tables that have not yet been modeled. V1 concept stereotypes are: @@ -91,7 +103,9 @@ V1 concept stereotypes are: - `relator`: relationship object such as a promotion allocation. - `phase of Parent`: temporal/specialized state of a parent concept. -`identity` declares one or more source-backed key fields. Composite identities are comma-separated. Malloy emission maps a single identity to `primary_key: field`; composite identities lower through a deterministic generated dimension, with `primary_key:` pointing at that generated field. +`identity` declares one or more source-backed key fields. +Composite identities are comma-separated. 
+Malloy emission maps a single identity to `primary_key: field`; composite identities lower through a deterministic generated dimension, with `primary_key:` pointing at that generated field. ## Fields, Joins, and Temporal Axes @@ -105,9 +119,11 @@ field: } ``` -Identities, fields, dimensions, and measures may include a block-level `description`. Field and definition blocks can also carry write mappings where write behavior is declared. +Identities, fields, dimensions, and measures may include a block-level `description`. +Field and definition blocks can also carry write mappings where write behavior is declared. -Identity and field names that match SemLang keywords, such as `measure`, are accepted in unambiguous declarations but reported as validation lint warnings during ontology loading. Reference the name wherever an expression is expected, such as `where: measure > 0` or `dimension: measurement_value is measure`; only the section header form with a colon, such as `measure:`, is parsed as language syntax. +Identity and field names that match SemLang keywords, such as `measure`, are accepted in unambiguous declarations but reported as validation lint warnings during ontology loading. +Reference the name wherever an expression is expected, such as `where: measure > 0` or `dimension: measurement_value is measure`; only the section header form with a colon, such as `measure:`, is parsed as language syntax. `join_one` and `join_many` declare Malloy joins and semantic participation: @@ -119,10 +135,13 @@ join_one profile: duckdb.table('customer_profiles') on customer_id = profile.cus join_cross fiscal_calendar: FiscalCalendar ``` -The `?` marker means participation is optional. It is semantic metadata; Malloy emission still uses the appropriate join kind. +The `?` marker means participation is optional. +It is semantic metadata; Malloy emission still uses the appropriate join kind. 
`with` joins use Malloy's foreign-key shorthand and require the target concept to have an identity when the target is known. -When a one-to-one auxiliary table should enrich a concept without changing the concept's master row population, `join_one` may target an inline named-connection source expression. This mirrors Malloy source-extension syntax and keeps the primary `from` source as the concept's master list. -Inline filters are not part of `join_one` syntax. To filter an auxiliary source before joining it, declare a named source query: +When a one-to-one auxiliary table should enrich a concept without changing the concept's master row population, `join_one` may target an inline named-connection source expression. +This mirrors Malloy source-extension syntax and keeps the primary `from` source as the concept's master list. +Inline filters are not part of `join_one` syntax. +To filter an auxiliary source before joining it, declare a named source query: ```semlang source: active_profiles is duckdb.table('customer_profiles') -> { @@ -180,11 +199,15 @@ Roles are usable in expressions: customer is Customer.Loyalty ``` -The canonical role name is the owning concept plus the local role name, such as `Customer.Loyalty`. Bare role names are also accepted when the tested path identifies the owning concept, such as `customer is Loyalty` when `customer` joins to `Customer`. If a bare role name is ambiguous, use the qualified form. +The canonical role name is the owning concept plus the local role name, such as `Customer.Loyalty`. +Bare role names are also accepted when the tested path identifies the owning concept, such as `customer is Loyalty` when `customer` joins to `Customer`. +If a bare role name is ambiguous, use the qualified form. -Role `label` and `aliases` metadata support discovery and presentation. 
Array-valued metadata may use either bracketed literals or top-level comma-separated values, so `aliases: ["Rewards Customer", "Member Customer"]` and `aliases: "Rewards Customer", "Member Customer"` are equivalent. +Role `label` and `aliases` metadata support discovery and presentation. +Array-valued metadata may use either bracketed literals or top-level comma-separated values, so `aliases: ["Rewards Customer", "Member Customer"]` and `aliases: "Rewards Customer", "Member Customer"` are equivalent. -During Malloy emission, role tests lower to their predicates with the correct path prefix. A join target may name a role, including a qualified role such as `Customer.Loyalty`; V1 resolves it to the role's base concept and applies the role predicate as part of validation and expression lowering. +During Malloy emission, role tests lower to their predicates with the correct path prefix. +A join target may name a role, including a qualified role such as `Customer.Loyalty`; V1 resolves it to the role's base concept and applies the role predicate as part of validation and expression lowering. ## Dimensions, Measures, Views, and Queries @@ -234,9 +257,13 @@ query: monthly_margin_and_returns is SaleLine -> { max_possible_unique_customers is identified_customers + unrecognized_cash_sales ``` -Aliases may reference visible measures, aggregate functions, and earlier aggregate aliases. Raw row-level fields must appear inside aggregate functions. +Aliases may reference visible measures, aggregate functions, and earlier aggregate aliases. +Raw row-level fields must appear inside aggregate functions. -Query and view bodies support the Malloy clauses `where:`, `select:`/`project:`, `group_by:`, `aggregate:`, `having:`, `calculate:`, `nest:`, `index:`, `order_by:`, and `limit:`/`top:`. `select:` creates projection-style views and queries. `project:` is accepted for Malloy compatibility and emitted as `select:`. 
`calculate:` is passed through as Malloy analytic/window calculation syntax after expression validation. +Query and view bodies support the Malloy clauses `where:`, `select:`/`project:`, `group_by:`, `aggregate:`, `having:`, `calculate:`, `nest:`, `index:`, `order_by:`, and `limit:`/`top:`. +`select:` creates projection-style views and queries. +`project:` is accepted for Malloy compatibility and emitted as `select:`. +`calculate:` is passed through as Malloy analytic/window calculation syntax after expression validation. ## Validations @@ -250,7 +277,8 @@ validation: } ``` -V1 preserves validations in the semantic model. They are not emitted into analytical Malloy queries by default. +V1 preserves validations in the semantic model. +They are not emitted into analytical Malloy queries by default. ## Lenses @@ -278,11 +306,11 @@ A query applies lenses with `with`: query: western_margin is SaleLine with western_region -> { ... } ``` -V1 lens application copies the semantic model for the query, applies lenses left-to-right, merges `refine: X extend { ... }` members into concept `X`, and treats `where:` refinements as source filters. Multiple filters compose by conjunction. +V1 lens application copies the semantic model for the query, applies lenses left-to-right, merges `refine: X extend { ... }` members into concept `X`, and treats `where:` refinements as source filters. +Multiple filters compose by conjunction. -Lens filters are not limited to the query root. They apply to the query-local -concept graph before the query body is lowered, so a root-grain query can -aggregate through a filtered joined grain. +Lens filters are not limited to the query root. +They apply to the query-local concept graph before the query body is lowered, so a root-grain query can aggregate through a filtered joined grain. 
```semlang concept ProductSKU is kind from duckdb.table('products') { @@ -333,10 +361,9 @@ query: young_adult_apple_value is Customer with apple_products, young_adult_cust } ``` -This query is rooted at `Customer`, but `apple_product_spend` aggregates through -`sale_lines`. With the lenses applied, the generated customer source joins the -query-local `SaleLine` source, and that `SaleLine` source carries the Apple -filter. The young-adult filter applies at the customer source at the same time. +This query is rooted at `Customer`, but `apple_product_spend` aggregates through `sale_lines`. +With the lenses applied, the generated customer source joins the query-local `SaleLine` source, and that `SaleLine` source carries the Apple filter. +The young-adult filter applies at the customer source at the same time. The base model remains unchanged for other queries. ## Malloy Lowering @@ -347,7 +374,8 @@ Source expressions emit in Malloy's connection-qualified form: source: retail_line_items is duckdb.table('retail_line_items') extend { ... } ``` -SQL sources emit as `connection.sql("""...""")`, named source references emit by name, and concepts backed by queries are emitted after the query declaration they reference. The compiler may emit semantically equivalent Malloy rather than byte-for-byte matching hand-written fixtures. +SQL sources emit as `connection.sql("""...""")`, named source references emit by name, and concepts backed by queries are emitted after the query declaration they reference. +The compiler may emit semantically equivalent Malloy rather than byte-for-byte matching hand-written fixtures. 
Semantic-only constructs lower as follows: diff --git a/design-docs/semlang-vs-palantir.md b/design-docs/semlang-vs-palantir.md index 7bdaf11..30e72fc 100644 --- a/design-docs/semlang-vs-palantir.md +++ b/design-docs/semlang-vs-palantir.md @@ -1,6 +1,7 @@ # SemLang vs Palantir Ontology -This comparison focuses on ontology-system functionality: modeling, querying, operational writeback, governance, developer access, and application/runtime surfaces. SemLang status is based on the local V1 language and action documentation; Palantir status is based on public Palantir Foundry documentation reviewed in May 2026. +This comparison focuses on ontology-system functionality: modeling, querying, operational writeback, governance, developer access, and application/runtime surfaces. +SemLang status is based on the local V1 language and action documentation; Palantir status is based on public Palantir Foundry documentation reviewed in May 2026. Legend: `✓` = supported, `△` = partial or planned, `X` = not supported or not documented as a comparable capability. @@ -48,6 +49,8 @@ Legend: `✓` = supported, `△` = partial or planned, `X` = not supported or no ## High-Level Gap Summary -SemLang is strongest as a compact, text-first semantic modeling and analytical query layer. It has richer explicit ontological classifiers than Palantir's public Ontology model and a clean Malloy lowering story for read analytics. +SemLang is strongest as a compact, text-first semantic modeling and analytical query layer. +It has richer explicit ontological classifiers than Palantir's public Ontology model and a clean Malloy lowering story for read analytics. -Palantir is strongest as an operational ontology platform. 
The biggest gaps for SemLang relative to Palantir are durable object storage/indexing, governed writeback, function-backed actions, object-level security, end-user applications, generated SDKs, branching/proposal workflows, materializations, lineage, observability, and semantic/vector search over object data. +Palantir is strongest as an operational ontology platform. +The biggest gaps for SemLang relative to Palantir are durable object storage/indexing, governed writeback, function-backed actions, object-level security, end-user applications, generated SDKs, branching/proposal workflows, materializations, lineage, observability, and semantic/vector search over object data. diff --git a/design-docs/supported_malloy_features.md b/design-docs/supported_malloy_features.md index 6542959..fd6ca78 100644 --- a/design-docs/supported_malloy_features.md +++ b/design-docs/supported_malloy_features.md @@ -1,8 +1,10 @@ # Supported Malloy Features in SemLang -This audit compares SemLang's current compiler surface with the official Malloy documentation. It focuses on what the SemLang parser, resolver, and emitter accept today and whether the emitted Malloy preserves the documented Malloy behavior. +This audit compares SemLang's current compiler surface with the official Malloy documentation. +It focuses on what the SemLang parser, resolver, and emitter accept today and whether the emitted Malloy preserves the documented Malloy behavior. -`Supported in SemLang` is exactly `Supported` only when the feature works as expected. Other statuses call out reduced syntax, validation limits, or intentionally deferred areas. +`Supported in SemLang` is exactly `Supported` only when the feature works as expected. +Other statuses call out reduced syntax, validation limits, or intentionally deferred areas. 
## Official Malloy Sources Reviewed diff --git a/docs/language-reference/actions.md b/docs/language-reference/actions.md index 5aa9b97..a52706c 100644 --- a/docs/language-reference/actions.md +++ b/docs/language-reference/actions.md @@ -3,9 +3,11 @@ title: Actions sidebar_position: 6 --- -Actions describe permitted write operations on ontology objects. They are concept-local because every action has a subject: an existing object, a new object of the owning concept, or a collection of owning-concept objects. +Actions describe permitted write operations on ontology objects. +They are concept-local because every action has a subject: an existing object, a new object of the owning concept, or a collection of owning-concept objects. -SemLang still lowers analytical reads to Malloy. Actions lower to a separate action manifest for a runtime adapter, API gateway, MCP server, or app surface that can validate parameters, evaluate guards, perform writes, and record an action log. +SemLang still lowers analytical reads to Malloy. +Actions lower to a separate action manifest for a runtime adapter, API gateway, MCP server, or app surface that can validate parameters, evaluate guards, perform writes, and record an action log. ## Concept-Local Actions @@ -43,7 +45,8 @@ concept SupplierLot is kind from duckdb.table('supplier_lots') { } ``` -The implicit `this` binding is the subject object for `subject: single` and each item under evaluation for `subject: collection`. For `subject: new`, `this` is bound by the `insert` edit. +The implicit `this` binding is the subject object for `subject: single` and each item under evaluation for `subject: collection`. +For `subject: new`, `this` is bound by the `insert` edit. ## Subject @@ -55,7 +58,9 @@ subject: new subject: collection ``` -`single` means the action targets one existing object of the owning concept. `new` means the action creates one object of the owning concept. 
`collection` means the action targets a list of existing owning-concept objects. +`single` means the action targets one existing object of the owning concept. +`new` means the action creates one object of the owning concept. +`collection` means the action targets a list of existing owning-concept objects. Collection subjects can add execution semantics: @@ -66,7 +71,8 @@ subject: collection { } ``` -`atomic: true` means all items commit or none commit. `atomic: false` allows per-item success and failure reporting. +`atomic: true` means all items commit or none commit. +`atomic: false` allows per-item success and failure reporting. ## Parameters @@ -93,11 +99,13 @@ action quarantine { } ``` -The action manifest exports parameter schemas using the JSON Schema metadata declared on semantic types. Action-local parameter metadata can refine the type when needed, but it cannot relax the named type. +The action manifest exports parameter schemas using the JSON Schema metadata declared on semantic types. +Action-local parameter metadata can refine the type when needed, but it cannot relax the named type. ## Guards -Guards are submission criteria. They must be true before the edit plan can run: +Guards are submission criteria. +They must be true before the edit plan can run: ```semlang guard: @@ -108,7 +116,8 @@ guard: else "Only quality managers can quarantine lots." ``` -Guards can reference `this`, parameters, fields, dimensions, joins, roles, and user context exposed by the runtime. For `subject: collection`, guards are evaluated for each item unless the guard is explicitly marked as collection-level by the runtime manifest. +Guards can reference `this`, parameters, fields, dimensions, joins, roles, and user context exposed by the runtime. +For `subject: collection`, guards are evaluated for each item unless the guard is explicitly marked as collection-level by the runtime manifest. 
## Writeable Fields and Dimensions @@ -125,7 +134,8 @@ For a source-backed field, `writeable` implies the default write implementation: write: column status = value ``` -where `status` is both the semantic field name and the physical column name. The runtime owns the `UPDATE`, `WHERE`, transaction, parameter binding, and authorization checks. +where `status` is both the semantic field name and the physical column name. +The runtime owns the `UPDATE`, `WHERE`, transaction, parameter binding, and authorization checks. Derived dimensions are not writeable unless they declare an explicit write mapping: @@ -138,7 +148,8 @@ dimension: } ``` -`value` is the value assigned by the action. The compiler rejects assignments to non-writeable fields, derived dimensions without write mappings, measures, joins, roles, and aggregate values. +`value` is the value assigned by the action. +The compiler rejects assignments to non-writeable fields, derived dimensions without write mappings, measures, joins, roles, and aggregate values. ## Custom Write Mappings @@ -173,7 +184,8 @@ field: } ``` -Raw SQL write mappings are assignment fragments, not full statements. The runtime must parameterize `{value}` and must not string-interpolate user input. +Raw SQL write mappings are assignment fragments, not full statements. +The runtime must parameterize `{value}` and must not string-interpolate user input. ## Edits @@ -186,7 +198,8 @@ edit: set quarantined_at = current_time ``` -For `subject: single`, `set field = expression` assigns a writeable member on `this`. For `subject: new`, use `insert`: +For `subject: single`, `set field = expression` assigns a writeable member on `this`. +For `subject: new`, use `insert`: ```semlang concept RecallCampaign is kind from duckdb.table('recall_campaigns') { @@ -223,7 +236,8 @@ effect after_commit: } ``` -`before_commit` effects can block the transaction. `after_commit` effects run after durable writes and are logged independently. 
+`before_commit` effects can block the transaction. +`after_commit` effects run after durable writes and are logged independently. Action logs describe the audit object emitted by the runtime: @@ -261,7 +275,8 @@ action quarantine { } ``` -When this is added, raw execution blocks must declare their semantic write scope so agents, reviewers, audit tools, and policy checks can reason about the change. Whole-action raw SQL execution is not part of the first parser and validation slice. +When this is added, raw execution blocks must declare their semantic write scope so agents, reviewers, audit tools, and policy checks can reason about the change. +Whole-action raw SQL execution is not part of the first parser and validation slice. ## Agent Exposure @@ -275,11 +290,13 @@ agent: idempotency_key: concat('quarantine:', this.supplier_lot_id) ``` -Agent metadata is not authorization. It tells tool surfaces how to present the action, whether confirmation is required, and how to avoid accidental duplicate submissions. +Agent metadata is not authorization. +It tells tool surfaces how to present the action, whether confirmation is required, and how to avoid accidental duplicate submissions. ## Lowering -Actions do not lower to Malloy. The compiler emits Malloy for reads and an action manifest for writes: +Actions do not lower to Malloy. +The compiler emits Malloy for reads and an action manifest for writes: ```text SemLang @@ -288,4 +305,5 @@ SemLang -> action manifest ``` -The action manifest contains the parameter JSON Schema, subject mode, guards, writable-member mappings, edit plan, side-effect plan, log configuration, and agent metadata. Runtime adapters turn the manifest into SQL, API calls, queue messages, or other write mechanisms. +The action manifest contains the parameter JSON Schema, subject mode, guards, writable-member mappings, edit plan, side-effect plan, log configuration, and agent metadata. 
+Runtime adapters turn the manifest into SQL, API calls, queue messages, or other write mechanisms. diff --git a/docs/language-reference/concepts.md b/docs/language-reference/concepts.md index 4f3bc7d..aeebafa 100644 --- a/docs/language-reference/concepts.md +++ b/docs/language-reference/concepts.md @@ -3,7 +3,8 @@ title: Concepts sidebar_position: 2 --- -Concepts are SemLang's main modeling unit. A concept declares an ontological classifier and the Malloy source expression that backs it. +Concepts are SemLang's main modeling unit. +A concept declares an ontological classifier and the Malloy source expression that backs it. ```semlang concept SaleLine is situation from duckdb.table('retail_line_items') { @@ -11,7 +12,8 @@ concept SaleLine is situation from duckdb.table('retail_line_items') { } ``` -The compiler emits each concept as a Malloy source. Semantic members such as roles, temporal axes, and validations enrich that source before or during lowering. +The compiler emits each concept as a Malloy source. +Semantic members such as roles, temporal axes, and validations enrich that source before or during lowering. Concept `from` clauses use Malloy source description: @@ -34,7 +36,8 @@ concept SaleStatus is situation from sales_by_status { } ``` -Use explicit connection names, such as `duckdb.table('customers')` and `duckdb.sql("""...""")`. Named sources, concept sources, and query declarations can also be used as source references. +Use explicit connection names, such as `duckdb.table('customers')` and `duckdb.sql("""...""")`. +Named sources, concept sources, and query declarations can also be used as source references. ## Stereotypes @@ -71,7 +74,8 @@ Composite identities are comma-separated: identity store_id :: StoreId, snapshot_date :: BusinessDate ``` -When a concept lowers to Malloy, a single identity becomes `primary_key: field`. Composite identities lower through a deterministic generated dimension, with `primary_key:` pointing at that generated field. 
+When a concept lowers to Malloy, a single identity becomes `primary_key: field`. +Composite identities lower through a deterministic generated dimension, with `primary_key:` pointing at that generated field. ## Fields @@ -86,10 +90,12 @@ field: } ``` -The trailing `?` marks a nullable value. The optional `unique` marker records uniqueness metadata on a field. +The trailing `?` marks a nullable value. +The optional `unique` marker records uniqueness metadata on a field. Identities, fields, dimensions, and measures may include a block-level `description`; descriptions are preserved for schema export and MCP ontology introspection. -Identity and field names that match SemLang keywords, such as `measure`, are accepted in unambiguous declarations but reported as validation lint warnings during ontology loading. Reference the name wherever an expression is expected, such as `where: measure > 0` or `dimension: measurement_value is measure`; only the section header form with a colon, such as `measure:`, is parsed as language syntax. +Identity and field names that match SemLang keywords, such as `measure`, are accepted in unambiguous declarations but reported as validation lint warnings during ontology loading. +Reference the name wherever an expression is expected, such as `where: measure > 0` or `dimension: measurement_value is measure`; only the section header form with a colon, such as `measure:`, is parsed as language syntax. ## Joins @@ -103,12 +109,16 @@ join_one profile: duckdb.table('customer_profiles') on customer_id = profile.cus join_cross fiscal_calendar: FiscalCalendar ``` -The `?` marker after the join name means participation is optional. It is semantic metadata; Malloy emission still uses the declared join kind. +The `?` marker after the join name means participation is optional. +It is semantic metadata; Malloy emission still uses the declared join kind. 
`with` joins use Malloy's foreign-key shorthand and require a target identity when SemLang can resolve the target concept. -A join target can also name a role. V1 resolves the role to its base concept and applies the role predicate as part of validation and expression lowering. -For one-to-one auxiliary tables, `join_one` can target an inline named-connection source expression. The owning concept's `from` source remains the master row population; the inline source is a Malloy-shaped enrichment join. -Inline filters are not part of `join_one` syntax. To filter an auxiliary source before joining it, declare a named source query: +A join target can also name a role. +V1 resolves the role to its base concept and applies the role predicate as part of validation and expression lowering. +For one-to-one auxiliary tables, `join_one` can target an inline named-connection source expression. +The owning concept's `from` source remains the master row population; the inline source is a Malloy-shaped enrichment join. +Inline filters are not part of `join_one` syntax. +To filter an auxiliary source before joining it, declare a named source query: ```semlang source: active_profiles is duckdb.table('customer_profiles') -> { @@ -169,8 +179,11 @@ Roles can be tested in expressions: customer is Customer.Loyalty ``` -The canonical role name is the owning concept plus the local role name, such as `Customer.Loyalty`. Bare role names are accepted when the tested path identifies the owning concept, such as `customer is Loyalty` when `customer` joins to `Customer`. If a bare role name is ambiguous, use the qualified form. +The canonical role name is the owning concept plus the local role name, such as `Customer.Loyalty`. +Bare role names are accepted when the tested path identifies the owning concept, such as `customer is Loyalty` when `customer` joins to `Customer`. +If a bare role name is ambiguous, use the qualified form. 
-Role `label` and `aliases` metadata support discovery and presentation. Array-valued metadata may use either bracketed literals or top-level comma-separated values, so `aliases: ["Rewards Customer", "Member Customer"]` and `aliases: "Rewards Customer", "Member Customer"` are equivalent. +Role `label` and `aliases` metadata support discovery and presentation. +Array-valued metadata may use either bracketed literals or top-level comma-separated values, so `aliases: ["Rewards Customer", "Member Customer"]` and `aliases: "Rewards Customer", "Member Customer"` are equivalent. During Malloy emission, role tests lower to their predicates with the correct path prefix. diff --git a/docs/language-reference/declarations.md b/docs/language-reference/declarations.md index 508664b..e21fa15 100644 --- a/docs/language-reference/declarations.md +++ b/docs/language-reference/declarations.md @@ -3,7 +3,8 @@ title: Declarations sidebar_position: 3 --- -SemLang declarations define packages, reusable semantic types, concepts, analytical members, validations, lenses, and queries. Declarations use a Malloy-like shape but carry additional semantic information for SemLang resolution and lowering. +SemLang declarations define packages, reusable semantic types, concepts, analytical members, validations, lenses, and queries. +Declarations use a Malloy-like shape but carry additional semantic information for SemLang resolution and lowering. ## Package and Include @@ -19,7 +20,9 @@ Use `include` to load another SemLang file before resolving the current file: include "./shared-types.semlang" ``` -Includes are relative paths. Each resolved include file is merged once per compilation, so shared files can be included by both a root file and downstream domain files. Include cycles are invalid. +Includes are relative paths. +Each resolved include file is merged once per compilation, so shared files can be included by both a root file and downstream domain files. +Include cycles are invalid. 
## Semantic Types @@ -42,7 +45,10 @@ V1 primitive bases are: - `currency` - `boolean` -Type bodies are metadata maps. Recognized JSON Schema-style metadata includes `description`, `enum`, `const`, `default`, `examples`, numeric and string bounds, `pattern`, and `format`. SemLang-specific metadata includes `scale_type`, `identifies`, `identifies_role`, `currency`, `unit`, and `render_format`. Unknown metadata is preserved in the AST and semantic model but does not affect Malloy emission. +Type bodies are metadata maps. +Recognized JSON Schema-style metadata includes `description`, `enum`, `const`, `default`, `examples`, numeric and string bounds, `pattern`, and `format`. +SemLang-specific metadata includes `scale_type`, `identifies`, `identifies_role`, `currency`, `unit`, and `render_format`. +Unknown metadata is preserved in the AST and semantic model but does not affect Malloy emission. ## Sources and Concepts @@ -65,7 +71,9 @@ concept Store is kind from store_rows { } ``` -The source expression uses Malloy's named connection forms. Use `duckdb.table('stores')`, `bigquery.table('dataset.table')`, `duckdb.sql("""select ...""")`, or a named source/query reference. SemLang does not invent an implicit connection for `table('stores')`. +The source expression uses Malloy's named connection forms. +Use `duckdb.table('stores')`, `bigquery.table('dataset.table')`, `duckdb.sql("""select ...""")`, or a named source/query reference. +SemLang does not invent an implicit connection for `table('stores')`. Concept bodies can contain identities, temporal axes, fields, joins, roles, dimensions, measures, views, validations, and `where` filters. @@ -90,7 +98,8 @@ measure: gross_sales :: Dollars is sum(gross_sales_amount) ``` -Definitions may include a block-level `description`. Descriptions on identities, fields, dimensions, and measures are preserved in the semantic model and are exposed through JSON Schema export and MCP ontology introspection. 
+Definitions may include a block-level `description`. +Descriptions on identities, fields, dimensions, and measures are preserved in the semantic model and are exposed through JSON Schema export and MCP ontology introspection. ## Views @@ -120,7 +129,8 @@ validation: } ``` -V1 preserves validations in the semantic model. They are not emitted into analytical Malloy queries by default. +V1 preserves validations in the semantic model. +They are not emitted into analytical Malloy queries by default. ## Queries diff --git a/docs/language-reference/diagnostics-lowering.md b/docs/language-reference/diagnostics-lowering.md index bf20058..0c6aa57 100644 --- a/docs/language-reference/diagnostics-lowering.md +++ b/docs/language-reference/diagnostics-lowering.md @@ -15,7 +15,9 @@ source: retail_line_items is duckdb.table('retail_line_items') extend { } ``` -Source declarations must use real Malloy source syntax, including named connections such as `duckdb.table('retail_line_items')` or `duckdb.sql("""select ...""")`. Unqualified `table('...')` is diagnosed because it would hide a connection decision in the SemLang compiler. The compiler may emit semantically equivalent Malloy rather than byte-for-byte matching hand-written fixtures. +Source declarations must use real Malloy source syntax, including named connections such as `duckdb.table('retail_line_items')` or `duckdb.sql("""select ...""")`. +Unqualified `table('...')` is diagnosed because it would hide a connection decision in the SemLang compiler. +The compiler may emit semantically equivalent Malloy rather than byte-for-byte matching hand-written fixtures. Semantic-only constructs lower as follows: @@ -42,7 +44,9 @@ query: monthly_margin is SaleLine -> { } ``` -Lowering resolves the root concept to the generated Malloy source name and emits a Malloy query. 
Query and view bodies preserve Malloy-shaped `where:`, `select:`/`project:`, `group_by:`, `aggregate:`, `having:`, `calculate:`, `nest:`, `index:`, `order_by:`, and `limit:`/`top:` clauses. When a query applies lenses, the compiler creates a query-specific semantic model, emits lens-refined sources for that query, and points the query at the refined root source. +Lowering resolves the root concept to the generated Malloy source name and emits a Malloy query. +Query and view bodies preserve Malloy-shaped `where:`, `select:`/`project:`, `group_by:`, `aggregate:`, `having:`, `calculate:`, `nest:`, `index:`, `order_by:`, and `limit:`/`top:` clauses. +When a query applies lenses, the compiler creates a query-specific semantic model, emits lens-refined sources for that query, and points the query at the refined root source. ## Diagnostics diff --git a/docs/language-reference/expressions.md b/docs/language-reference/expressions.md index b86c711..a81c84d 100644 --- a/docs/language-reference/expressions.md +++ b/docs/language-reference/expressions.md @@ -3,7 +3,8 @@ title: Expressions sidebar_position: 5 --- -SemLang expressions intentionally stay close to Malloy expressions. The compiler preserves row-level and aggregate expressions where possible, while adding semantic lowering for role tests, temporal joins, lenses, and query aliases. +SemLang expressions intentionally stay close to Malloy expressions. +The compiler preserves row-level and aggregate expressions where possible, while adding semantic lowering for role tests, temporal joins, lenses, and query aliases. ## Typed Names @@ -14,7 +15,8 @@ customer_id :: CustomerId closed_date :: BusinessDate? ``` -The trailing `?` marks a nullable value. Typed names appear in identities, fields, and optional type annotations on dimensions and measures. +The trailing `?` marks a nullable value. +Typed names appear in identities, fields, and optional type annotations on dimensions and measures. 
## Definitions @@ -30,7 +32,8 @@ measure: } ``` -Definitions can wrap onto continuation lines when the expression is long. They may also include a block-level `description`, which is preserved for schema export, MCP introspection, and semantic search. +Definitions can wrap onto continuation lines when the expression is long. +They may also include a block-level `description`, which is preserved for schema export, MCP introspection, and semantic search. ## Role Tests @@ -43,7 +46,9 @@ dimension: loyalty_segment is case when customer is Customer.Loyalty then 'Loyalty' else 'Other' end ``` -During lowering, the role test is replaced by the role predicate. If the test uses a path such as `customer is Customer.Loyalty`, field references inside the predicate are prefixed with that path. Bare role names can be used when the tested path identifies the owning concept, such as `customer is Loyalty`. +During lowering, the role test is replaced by the role predicate. +If the test uses a path such as `customer is Customer.Loyalty`, field references inside the predicate are prefixed with that path. +Bare role names can be used when the tested path identifies the owning concept, such as `customer is Loyalty`. ## Join Conditions @@ -54,7 +59,8 @@ join_one store: Store on store_id join_many returns: ReturnLine on line_item_id = original_line_item_id ``` -If the condition is a single field name, lowering treats it as equality between the source field and the same field on the join target. Explicit equality conditions can name source and target fields directly. +If the condition is a single field name, lowering treats it as equality between the source field and the same field on the join target. +Explicit equality conditions can name source and target fields directly. Temporal joins can add `at expression`: @@ -86,7 +92,8 @@ Lens filters compose by conjunction when multiple lenses or refinements apply. 
## Query Items and Aliases -`select:`, `group_by:`, `aggregate:`, `calculate:`, and `order_by:` sections contain expressions. Aggregate entries may define query-local aliases: +`select:`, `group_by:`, `aggregate:`, `calculate:`, and `order_by:` sections contain expressions. +Aggregate entries may define query-local aliases: ```semlang aggregate: @@ -94,8 +101,10 @@ aggregate: max_possible_unique_customers is identified_customers + unrecognized_cash_sales ``` -Aliases may reference visible measures, aggregate functions, and earlier aggregate aliases. Raw row-level fields must appear inside aggregate functions. +Aliases may reference visible measures, aggregate functions, and earlier aggregate aliases. +Raw row-level fields must appear inside aggregate functions. -`order_by:` items may include `asc` or `desc` after the expression. `limit:` accepts an integer row count. +`order_by:` items may include `asc` or `desc` after the expression. +`limit:` accepts an integer row count. Malloy filter forms such as `status ? 'new' | 'open'`, ranges with `to`, regex/string matching with `~` and `!~`, and filter strings such as `f'this week'` are validated for referenced paths and emitted unchanged. diff --git a/docs/language-reference/index.md b/docs/language-reference/index.md index 3e38aa6..c34ec47 100644 --- a/docs/language-reference/index.md +++ b/docs/language-reference/index.md @@ -3,9 +3,12 @@ title: SemLang Language Reference sidebar_position: 1 --- -SemLang is a semantic modeling language that stays close to Malloy so SemLang models can compile into Malloy for query execution. It adds an ontology layer beside the analytical model: business concepts, roles, relators, situations, temporal axes, lenses, and validation predicates live in the same file as dimensions, measures, views, and queries. +SemLang is a semantic modeling language that stays close to Malloy so SemLang models can compile into Malloy for query execution. 
+It adds an ontology layer beside the analytical model: business concepts, roles, relators, situations, temporal axes, lenses, and validation predicates live in the same file as dimensions, measures, views, and queries. -Version 1 is intentionally conservative. Every accepted construct must either lower to deterministic Malloy or produce diagnostics. The language shape is defined by the retail SemLang examples and by recurring Malloy patterns in the banking, healthcare, manufacturing, retail, and SaaS examples. +Version 1 is intentionally conservative. +Every accepted construct must either lower to deterministic Malloy or produce diagnostics. +The language shape is defined by the retail SemLang examples and by recurring Malloy patterns in the banking, healthcare, manufacturing, retail, and SaaS examples. ## File Shape @@ -21,7 +24,8 @@ Files may include other SemLang files by relative path: include "./example.semlang" ``` -Includes are loaded before the including file is resolved. Include cycles are invalid. +Includes are loaded before the including file is resolved. +Include cycles are invalid. After the package and any includes, a file can declare semantic types, named sources, concepts, lenses, and queries: diff --git a/docs/language-reference/lenses.md b/docs/language-reference/lenses.md index 97f0f98..bf94e1a 100644 --- a/docs/language-reference/lenses.md +++ b/docs/language-reference/lenses.md @@ -3,7 +3,8 @@ title: Lenses sidebar_position: 6 --- -A lens is a query-time semantic overlay. Lenses let a query refine concepts without changing the base semantic model. +A lens is a query-time semantic overlay. +Lenses let a query refine concepts without changing the base semantic model. ```semlang lens: western_region is { @@ -51,11 +52,13 @@ lens: western_margin_operations is western_region, margin_operations extend { } ``` -V1 applies lenses left-to-right. 
The compiler copies the semantic model for the query, applies each lens, and merges each `refine: X extend { ... }` block into concept `X`. +V1 applies lenses left-to-right. +The compiler copies the semantic model for the query, applies each lens, and merges each `refine: X extend { ... }` block into concept `X`. ## Filters -`where:` refinements become query-local source filters on the refined concepts. Multiple filters compose by conjunction: +`where:` refinements become query-local source filters on the refined concepts. +Multiple filters compose by conjunction: ```semlang lens: active_western_stores is western_region extend { @@ -69,7 +72,8 @@ Applying `active_western_stores` includes both the inherited western-region filt ## Deep Lens Application -Lens filters apply to the whole query-local concept graph, not only to the query root. This matters when the query is rooted at one grain but a metric aggregates through a joined grain. +Lens filters apply to the whole query-local concept graph, not only to the query root. +This matters when the query is rooted at one grain but a metric aggregates through a joined grain. ```semlang concept ProductSKU is kind from duckdb.table('products') { @@ -120,7 +124,9 @@ query: young_adult_apple_value is Customer with apple_products, young_adult_cust } ``` -The query asks a customer-grain question. The Apple lens filters the product and sale-line grains, while the young-adult lens filters the customer grain. During lowering, the compiler emits query-local sources for `Customer`, `SaleLine`, and `ProductSKU`; the customer source joins the lens-expanded sale-line source, and `apple_product_spend` aggregates over that filtered upstream source. +The query asks a customer-grain question. +The Apple lens filters the product and sale-line grains, while the young-adult lens filters the customer grain. 
+During lowering, the compiler emits query-local sources for `Customer`, `SaleLine`, and `ProductSKU`; the customer source joins the lens-expanded sale-line source, and `apple_product_spend` aggregates over that filtered upstream source. The generated Malloy has this shape: @@ -146,7 +152,8 @@ query: young_adult_apple_value is customers__young_adult_apple_value -> { } ``` -This is the important lens contract: filters from active lenses are applied upstream to the refined concept before root-grain metrics aggregate through that concept. The base `Customer`, `SaleLine`, and `ProductSKU` sources remain available unchanged for non-lensed queries. +This is the important lens contract: filters from active lenses are applied upstream to the refined concept before root-grain metrics aggregate through that concept. +The base `Customer`, `SaleLine`, and `ProductSKU` sources remain available unchanged for non-lensed queries. ## Lens-Local Types diff --git a/docs/language-reference/schema-vocabulary.md b/docs/language-reference/schema-vocabulary.md index dd91cc8..713f85b 100644 --- a/docs/language-reference/schema-vocabulary.md +++ b/docs/language-reference/schema-vocabulary.md @@ -3,7 +3,8 @@ title: Schema Vocabulary sidebar_position: 7 --- -SemLang can project its semantic type system to JSON Schema draft 2020-12. The exported schema uses native JSON Schema keywords for value validation and the SemLang vocabulary URI for semantic metadata: +SemLang can project its semantic type system to JSON Schema draft 2020-12. 
+The exported schema uses native JSON Schema keywords for value validation and the SemLang vocabulary URI for semantic metadata: ```json { @@ -30,7 +31,8 @@ type: EmailAddress is string { } ``` -Recognized JSON Schema metadata includes `title`, `description`, `default`, `deprecated`, `readOnly`, `writeOnly`, `examples`, `enum`, `const`, numeric bounds, string bounds, `pattern`, `format`, content annotations, array bounds, object bounds, `properties`, `items`, and related applicator keywords. SemLang validates the simple scalar and array shapes it can check locally. +Recognized JSON Schema metadata includes `title`, `description`, `default`, `deprecated`, `readOnly`, `writeOnly`, `examples`, `enum`, `const`, numeric bounds, string bounds, `pattern`, `format`, content annotations, array bounds, object bounds, `properties`, `items`, and related applicator keywords. +SemLang validates the simple scalar and array shapes it can check locally. SemLang-specific type metadata remains available for semantic meaning: @@ -90,7 +92,9 @@ Concept row schemas export under `$defs` names beginning with `concept.`: } ``` -Identity and field descriptions export as property-level `description` values. Joins, roles, temporal axes, validations, dimensions, and measures are semantic model features rather than plain JSON value constraints, so they export as `x-semlang-*` metadata; dimension and measure descriptions are preserved inside those metadata objects. Role metadata includes the local name, qualified name, predicate, optional label, and aliases. +Identity and field descriptions export as property-level `description` values. +Joins, roles, temporal axes, validations, dimensions, and measures are semantic model features rather than plain JSON value constraints, so they export as `x-semlang-*` metadata; dimension and measure descriptions are preserved inside those metadata objects. +Role metadata includes the local name, qualified name, predicate, optional label, and aliases. 
## CLI diff --git a/docs/language-reference/sources.md b/docs/language-reference/sources.md index b9623fa..49a70b3 100644 --- a/docs/language-reference/sources.md +++ b/docs/language-reference/sources.md @@ -3,7 +3,8 @@ title: Sources sidebar_position: 4 --- -SemLang source clauses are intentionally Malloy-shaped. A concept's `from` clause takes a Malloy source expression; the compiler validates the expression form and preserves it when lowering. +SemLang source clauses are intentionally Malloy-shaped. +A concept's `from` clause takes a Malloy source expression; the compiler validates the expression form and preserves it when lowering. ## Tables @@ -15,7 +16,8 @@ concept SaleLine is situation from duckdb.table('retail_line_items') { } ``` -SemLang does not treat `table('retail_line_items')` as a magic default. If the source is DuckDB, BigQuery, Postgres, or another Malloy connection, put that connection name in the declaration. +SemLang does not treat `table('retail_line_items')` as a magic default. +If the source is DuckDB, BigQuery, Postgres, or another Malloy connection, put that connection name in the declaration. ## SQL Sources diff --git a/docs/language-reference/supported_malloy_features.md b/docs/language-reference/supported_malloy_features.md index 8f7705b..0d471c4 100644 --- a/docs/language-reference/supported_malloy_features.md +++ b/docs/language-reference/supported_malloy_features.md @@ -5,9 +5,11 @@ sidebar_position: 9 # Supported Malloy Features in SemLang -This audit compares SemLang's current compiler surface with the official Malloy documentation. It focuses on what the SemLang parser, resolver, and emitter accept today and whether the emitted Malloy preserves the documented Malloy behavior. +This audit compares SemLang's current compiler surface with the official Malloy documentation. +It focuses on what the SemLang parser, resolver, and emitter accept today and whether the emitted Malloy preserves the documented Malloy behavior. 
-`Supported in SemLang` is exactly `Supported` only when the feature works as expected. Other statuses call out reduced syntax, validation limits, or intentionally deferred areas. +`Supported in SemLang` is exactly `Supported` only when the feature works as expected. +Other statuses call out reduced syntax, validation limits, or intentionally deferred areas. ## Official Malloy Sources Reviewed diff --git a/docs/mcp-server/configuration.md b/docs/mcp-server/configuration.md index b819815..e041e27 100644 --- a/docs/mcp-server/configuration.md +++ b/docs/mcp-server/configuration.md @@ -5,7 +5,8 @@ sidebar_position: 2 # Configuration -SemLang projects use a project-local `.semlang/settings.yml` file. Normal MCP use should not pass project roots or model path arrays to tools; run setup once, then let the server discover this file. +SemLang projects use a project-local `.semlang/settings.yml` file. +Normal MCP use should not pass project roots or model path arrays to tools; run setup once, then let the server discover this file. ```bash semlang setup @@ -30,11 +31,13 @@ ontology: entrypoint: model.semlang ``` -SemLang also uses the `.semlang` directory for managed local state such as stats caches and default row exports. `settings.yml` is the durable project config inside that directory; generated cache contents remain under `.semlang/cache`. +SemLang also uses the `.semlang` directory for managed local state such as stats caches and default row exports. +`settings.yml` is the durable project config inside that directory; generated cache contents remain under `.semlang/cache`. ## Setup -`semlang setup` creates `.semlang/settings.yml` in the current project directory. If a config already exists above the current directory, setup uses that directory's parent as the project root. +`semlang setup` creates `.semlang/settings.yml` in the current project directory. +If a config already exists above the current directory, setup uses that directory's parent as the project root. 
Useful options: @@ -44,13 +47,16 @@ Useful options: - `--malloy-config-path ` or `--config-path ` writes an explicit Malloy config path. - `--export-directory ` writes an explicit export directory. -Setup discovers ontology entrypoints in this order: `--path`, `model.semlang`, `semlang.semlang`, `models/model.semlang`, then a single shallow `.semlang` file candidate. If multiple candidates exist, setup reports them and asks for `--path`. +Setup discovers ontology entrypoints in this order: `--path`, `model.semlang`, `semlang.semlang`, `models/model.semlang`, then a single shallow `.semlang` file candidate. +If multiple candidates exist, setup reports them and asks for `--path`. -Setup discovers Malloy config from the ontology entrypoint directory up to the SemLang project root, checking `malloy-config-local.json` before `malloy-config.json`. If none is found, `malloy.configPath` is omitted. +Setup discovers Malloy config from the ontology entrypoint directory up to the SemLang project root, checking `malloy-config-local.json` before `malloy-config.json`. +If none is found, `malloy.configPath` is omitted. ## Runtime Discovery -`semlang mcp` starts even when `.semlang/settings.yml` does not exist. Tool calls that need project config, including `load_ontology({})`, return setup guidance until the config is created. +`semlang mcp` starts even when `.semlang/settings.yml` does not exist. +Tool calls that need project config, including `load_ontology({})`, return setup guidance until the config is created. Once configured, agents should load the ontology with an empty request: @@ -58,14 +64,17 @@ Once configured, agents should load the ontology with an empty request: {} ``` -Relative paths in `.semlang/settings.yml` resolve from the project root, which is the directory containing `.semlang`. This keeps `ontology.entrypoint: model.semlang` pointed at `/model.semlang`. 
+Relative paths in `.semlang/settings.yml` resolve from the project root, which is the directory containing `.semlang`. +This keeps `ontology.entrypoint: model.semlang` pointed at `/model.semlang`. ## Malloy Config -SemLang does not inline Malloy connection configuration. `malloy-config.json` and `malloy-config-local.json` are standard Malloy config files, so SemLang references them from `.semlang/settings.yml` when needed. +SemLang does not inline Malloy connection configuration. +`malloy-config.json` and `malloy-config-local.json` are standard Malloy config files, so SemLang references them from `.semlang/settings.yml` when needed. Use `malloy-config-local.json` for machine-local credentials or paths and keep it out of version control. ## Compatibility -The MCP runtime still accepts legacy `paths`, `projectDir`, `configPath`, and `malloyConfigPath` arguments for a transition period. New docs and agents should use `semlang setup` and `load_ontology({})` instead. +The MCP runtime still accepts legacy `paths`, `projectDir`, `configPath`, and `malloyConfigPath` arguments for a transition period. +New docs and agents should use `semlang setup` and `load_ontology({})` instead. diff --git a/docs/mcp-server/index.md b/docs/mcp-server/index.md index 3d518ea..3458102 100644 --- a/docs/mcp-server/index.md +++ b/docs/mcp-server/index.md @@ -7,7 +7,8 @@ sidebar_position: 1 The SemLang MCP server gives agents a small set of tools for semantic discovery, ontology navigation, lens planning, query validation, Malloy-backed query execution, and supported local action invocation. -SemLang models use Malloy-style named connections in source declarations. Configure those connections in Malloy project or global config using the same names referenced by `.semlang` files; see [Malloy Connections](./malloy-connections.md) for setup details. +SemLang models use Malloy-style named connections in source declarations. 
+Configure those connections in Malloy project or global config using the same names referenced by `.semlang` files; see [Malloy Connections](./malloy-connections.md) for setup details. ## Live Source Install @@ -19,7 +20,9 @@ npm install npm link ``` -This exposes `semlang` anywhere on the machine. The MCP command is intentionally source-backed: every MCP process starts through the checked-out TypeScript source with the repo-local `tsx`, so new agents pick up code changes without waiting for a build. Restart an already-running MCP session to load edits made after it started. +This exposes `semlang` anywhere on the machine. +The MCP command is intentionally source-backed: every MCP process starts through the checked-out TypeScript source with the repo-local `tsx`, so new agents pick up code changes without waiting for a build. +Restart an already-running MCP session to load edits made after it started. ## Project Configuration @@ -63,7 +66,8 @@ MCP client configuration can usually stay this small: } ``` -`run_query` returns a transaction GUID for tracing. If executed row output is larger than 10 lines, SemLang writes the rows to `/.json` and returns that path instead of inline rows. +`run_query` returns a transaction GUID for tracing. +If executed row output is larger than 10 lines, SemLang writes the rows to `/.json` and returns that path instead of inline rows. ## Tool Surface diff --git a/docs/mcp-server/lens-tools.md b/docs/mcp-server/lens-tools.md index bcec062..55efe0f 100644 --- a/docs/mcp-server/lens-tools.md +++ b/docs/mcp-server/lens-tools.md @@ -5,7 +5,8 @@ sidebar_position: 5 # Lens Detail -Lens capabilities are exposed through the consolidated `search` and `describe` tools. Use `search` to discover candidate lenses for a question, and use `describe` to inspect, expand, audit required fields, or plan lens application before validating queries. +Lens capabilities are exposed through the consolidated `search` and `describe` tools. 
+Use `search` to discover candidate lenses for a question, and use `describe` to inspect, expand, audit required fields, or plan lens application before validating queries. ## Discover Lenses @@ -52,7 +53,8 @@ Call `describe` with `kind: "lens"` and `operation: "expand"` to apply one or mo ### Output -Returns diagnostics, an expanded model summary, and refinements from the requested lenses. If expansion fails, the response includes diagnostics and an error. +Returns diagnostics, an expanded model summary, and refinements from the requested lenses. +If expansion fails, the response includes diagnostics and an error. ## Required Fields @@ -69,7 +71,8 @@ Call `describe` with `kind: "lens"` and `operation: "required_fields"` to report ### Output -Returns one entry per selected lens refinement. Each entry includes exposed fields, expression text, required expression fields, and field-specific matches when `field` or `fields` is provided. +Returns one entry per selected lens refinement. +Each entry includes exposed fields, expression text, required expression fields, and field-specific matches when `field` or `fields` is provided. ### Example @@ -94,7 +97,8 @@ Call `describe` with `kind: "lens"` and `operation: "plan"` to plan lens applica ### Output -Returns described lenses and ordered steps. Each step includes parent lenses to apply first, affected concepts, and added semantic types. +Returns described lenses and ordered steps. +Each step includes parent lenses to apply first, affected concepts, and added semantic types. ### Example diff --git a/docs/mcp-server/malloy-connections.md b/docs/mcp-server/malloy-connections.md index 5c06629..68ce9b7 100644 --- a/docs/mcp-server/malloy-connections.md +++ b/docs/mcp-server/malloy-connections.md @@ -5,20 +5,23 @@ sidebar_position: 7 # Malloy Connections -SemLang source declarations use Malloy connection syntax, and the MCP server preserves those connection names when it compiles a model. 
A source such as `warehouse.table('analytics.orders')` means: +SemLang source declarations use Malloy connection syntax, and the MCP server preserves those connection names when it compiles a model. +A source such as `warehouse.table('analytics.orders')` means: - `warehouse` is the Malloy connection name. - `table('analytics.orders')` is the table path Malloy will resolve for that connection type. - SemLang validates the shape and emits the connection-qualified Malloy source unchanged. -The compiler preserves connection-qualified source expressions; MCP query execution uses Malloy config captured when the ontology source is loaded. SemLang still requires connection names in source declarations and does not invent a connection for unqualified `table(...)` calls. +The compiler preserves connection-qualified source expressions; MCP query execution uses Malloy config captured when the ontology source is loaded. +SemLang still requires connection names in source declarations and does not invent a connection for unqualified `table(...)` calls. There are two different names involved: - The **connection name** appears in SemLang and Malloy source text, such as `warehouse.table(...)`. - The **connection type** appears in Malloy config, such as `"is": "databricks"` or `"is": "duckdb"`. -The connection name does not need to match the backend type. Prefer stable project names such as `warehouse`, `analytics`, or `finance` when the same model might move between engines. +The connection name does not need to match the backend type. +Prefer stable project names such as `warehouse`, `analytics`, or `finance` when the same model might move between engines. ## Choose Connection Names @@ -38,7 +41,8 @@ source: recent_orders is warehouse.sql(""" """) ``` -Built-in default-looking names such as `duckdb` are fine when Malloy can create or discover those connections. Custom names such as `warehouse`, `prod_bq`, or `finance_pg` must be present in Malloy configuration. 
+Built-in default-looking names such as `duckdb` are fine when Malloy can create or discover those connections. +Custom names such as `warehouse`, `prod_bq`, or `finance_pg` must be present in Malloy configuration. ## Project Config @@ -53,7 +57,8 @@ malloy: configPath: config/databricks-malloy.json ``` -During setup, SemLang discovers `malloy-config-local.json` or `malloy-config.json` by walking up from the SemLang model file to the project root. If it finds no config, `malloy.configPath` is omitted. +During setup, SemLang discovers `malloy-config-local.json` or `malloy-config.json` by walking up from the SemLang model file to the project root. +If it finds no config, `malloy.configPath` is omitted. ```json { @@ -99,16 +104,20 @@ Malloy merges `malloy-config-local.json` with the sibling `malloy-config.json`, ## Engine Packages -Malloy's SDK needs a connection package loaded for each configured connection type. SemLang MCP currently registers: +Malloy's SDK needs a connection package loaded for each configured connection type. +SemLang MCP currently registers: | Config `is` value | Package loaded by SemLang MCP | | ----------------- | ------------------------------ | | `duckdb` | `@malloydata/db-duckdb/native` | | `databricks` | `@malloydata/db-databricks` | -If `malloy-config.json` uses a connection type that SemLang MCP does not yet register, `run_query` fails before execution with an error naming the missing type. Adding a new engine is a SemLang MCP code/dependency change; changing connection names or credentials is project configuration. +If `malloy-config.json` uses a connection type that SemLang MCP does not yet register, `run_query` fails before execution with an error naming the missing type. +Adding a new engine is a SemLang MCP code/dependency change; changing connection names or credentials is project configuration. -Do not rewrite every SemLang model from `duckdb.table(...)` to `databricks.table(...)` just because the deployment target is Databricks. 
Update source declarations only when the connection name or table path should change. A model can use `warehouse.table('main.analytics.orders')` while config maps `warehouse` to `"is": "databricks"`. +Do not rewrite every SemLang model from `duckdb.table(...)` to `databricks.table(...)` just because the deployment target is Databricks. +Update source declarations only when the connection name or table path should change. +A model can use `warehouse.table('main.analytics.orders')` while config maps `warehouse` to `"is": "databricks"`. ## CLI Setup And Verification @@ -141,7 +150,8 @@ malloy-cli --project-dir /path/to/project connections list malloy-cli --project-dir /path/to/project connections test warehouse ``` -The connection name used in the SemLang model must appear in these commands. If the model says `finance_pg.table('public.invoice')`, test `finance_pg`, not `postgres`. +The connection name used in the SemLang model must appear in these commands. +If the model says `finance_pg.table('public.invoice')`, test `finance_pg`, not `postgres`. ## MCP Usage @@ -158,7 +168,8 @@ The MCP server can then: - Return generated Malloy containing the configured connection names. - Execute named or temporary queries through the Malloy SDK. -`run_query` requires the Malloy config captured by `load_ontology`. For custom names such as `warehouse`, add the connection to `malloy-config.json` or `malloy-config-local.json`, then run `semlang setup`. +`run_query` requires the Malloy config captured by `load_ontology`. +For custom names such as `warehouse`, add the connection to `malloy-config.json` or `malloy-config-local.json`, then run `semlang setup`. ## Troubleshooting diff --git a/docs/mcp-server/ontology-tools.md b/docs/mcp-server/ontology-tools.md index 0bf36bc..317f150 100644 --- a/docs/mcp-server/ontology-tools.md +++ b/docs/mcp-server/ontology-tools.md @@ -5,11 +5,13 @@ sidebar_position: 4 # Ontology Tools -Ontology tools inspect the compiled SemLang model. 
Use them after `load_ontology` and before query generation when an agent needs exact semantic structure. +Ontology tools inspect the compiled SemLang model. +Use them after `load_ontology` and before query generation when an agent needs exact semantic structure. ## `describe` -Describes one ontology object or lens-oriented view. This consolidated tool replaces separate concept, action, role, metric, temporal-axis, lens detail, lens expansion, required-field, and lens-plan tools. +Describes one ontology object or lens-oriented view. +This consolidated tool replaces separate concept, action, role, metric, temporal-axis, lens detail, lens expansion, required-field, and lens-plan tools. ### Inputs @@ -23,9 +25,17 @@ Describes one ontology object or lens-oriented view. This consolidated tool repl ### Output -Concept detail returns source details, identities, fields, joins, roles, dimensions, measures, views, validations, temporal axes, actions, filters, and role base names. Action detail returns subject mode, params, guards, edits, write mappings, logs, effects, agent metadata, and write targets. Role detail returns owning concept, local name, qualified name, label, aliases, and predicate. Metric detail returns matching measures with concept, expression, type name, dependencies, and source location. Temporal-axis detail returns axes with concept name, axis name, expression, and location. +Concept detail returns source details, identities, fields, joins, roles, dimensions, measures, views, validations, temporal axes, actions, filters, and role base names. +Action detail returns subject mode, params, guards, edits, write mappings, logs, effects, agent metadata, and write targets. +Role detail returns owning concept, local name, qualified name, label, aliases, and predicate. +Metric detail returns matching measures with concept, expression, type name, dependencies, and source location. 
+Temporal-axis detail returns axes with concept name, axis name, expression, and location. -Lens detail returns lens names, parent lenses, descriptions, declared types, and refinements. Lens expansion returns diagnostics, expanded model summary, and refinements. Required-field detail reports fields exposed by lens refinements and fields referenced by lens expressions. Lens plans return described lenses and ordered application steps. When a detail mode receives multiple independent names, the response includes a `results` array with one description result per name. +Lens detail returns lens names, parent lenses, descriptions, declared types, and refinements. +Lens expansion returns diagnostics, expanded model summary, and refinements. +Required-field detail reports fields exposed by lens refinements and fields referenced by lens expressions. +Lens plans return described lenses and ordered application steps. +When a detail mode receives multiple independent names, the response includes a `results` array with one description result per name. ### Examples @@ -62,7 +72,8 @@ Finds declared join paths from a source concept or role target to one or more ta ### Output -Returns one result per target plus a flattened `paths` array. Each path includes the concept chain and join steps with join name, kind, target, `on`, and temporal `at` fields. +Returns one result per target plus a flattened `paths` array. +Each path includes the concept chain and join steps with join name, kind, target, `on`, and temporal `at` fields. ### Example diff --git a/docs/mcp-server/query-and-action-tools.md b/docs/mcp-server/query-and-action-tools.md index dd9f8a7..bc94b2b 100644 --- a/docs/mcp-server/query-and-action-tools.md +++ b/docs/mcp-server/query-and-action-tools.md @@ -5,7 +5,8 @@ sidebar_position: 6 # Query and Action Tools -`run_query` validates and runs SemLang queries. `invoke_action` invokes supported action edits through the configured Malloy connection. 
+`run_query` validates and runs SemLang queries. +`invoke_action` invokes supported action edits through the configured Malloy connection. ## `run_query` @@ -43,7 +44,10 @@ Validates a named query, full query declaration, or temporary query body against ### Output -For named queries, returns the resolved query, diagnostics, extracted `queryMalloy`, and an `execution` object. For temporary queries, returns the generated query name, root, lenses, diagnostics, extracted `queryMalloy`, and `execution`. When `dry_run_only` is true, `execution` is present with `skipped: true`, `execution.ok` is omitted, and `query_limit_seconds` is not required. The full compiled Malloy model is not returned by `run_query`; request it from `load_ontology` with `return_malloy_model` when debugging the whole generated model. +For named queries, returns the resolved query, diagnostics, extracted `queryMalloy`, and an `execution` object. +For temporary queries, returns the generated query name, root, lenses, diagnostics, extracted `queryMalloy`, and `execution`. +When `dry_run_only` is true, `execution` is present with `skipped: true`, `execution.ok` is omitted, and `query_limit_seconds` is not required. +The full compiled Malloy model is not returned by `run_query`; request it from `load_ontology` with `return_malloy_model` when debugging the whole generated model. ### Examples @@ -61,9 +65,14 @@ For named queries, returns the resolved query, diagnostics, extracted `queryMall } ``` -Execution uses the Malloy config context captured by `load_ontology`. Named queries and temporary root/body queries are both eligible for execution. If the loaded SemLang config does not identify a Malloy config, execution fails with a clear setup error. If execution exceeds `query_limit_seconds`, SemLang terminates the isolated Malloy execution worker and returns a timeout result. +Execution uses the Malloy config context captured by `load_ontology`. 
+Named queries and temporary root/body queries are both eligible for execution. +If the loaded SemLang config does not identify a Malloy config, execution fails with a clear setup error. +If execution exceeds `query_limit_seconds`, SemLang terminates the isolated Malloy execution worker and returns a timeout result. -Custom connection names such as `warehouse.table('analytics.orders')` must be present in Malloy config. If a model references an unknown custom connection, `run_query` returns a clear Malloy execution error naming the missing connection. See [Malloy Connections](./malloy-connections.md). +Custom connection names such as `warehouse.table('analytics.orders')` must be present in Malloy config. +If a model references an unknown custom connection, `run_query` returns a clear Malloy execution error naming the missing connection. +See [Malloy Connections](./malloy-connections.md). ### Execution Results @@ -80,7 +89,8 @@ Default successful responses omit verbose execution internals including generate ## `invoke_action` -Invokes a supported action by generating SQL and executing it with the ontology's configured Malloy connection. Generated write SQL avoids `RETURNING`, `UPDATE ... FROM`, and `DELETE ... USING` so the core lowering can run on more Malloy-backed SQL engines. +Invokes a supported action by generating SQL and executing it with the ontology's configured Malloy connection. +Generated write SQL avoids `RETURNING`, `UPDATE ... FROM`, and `DELETE ... USING` so the core lowering can run on more Malloy-backed SQL engines. ### Inputs @@ -106,7 +116,8 @@ The invoker skips or rejects unsupported edit kinds, malformed raw SQL write map ### Output -Returns the resolved concept and action, generated SQL, changed row count, selected affected rows, diagnostics, timeout metadata (`query_limit_seconds`, `timed_out`), and a verification query when available. Use `dry_run_only: true` to return generated SQL without execution. 
+Returns the resolved concept and action, generated SQL, changed row count, selected affected rows, diagnostics, timeout metadata (`query_limit_seconds`, `timed_out`), and a verification query when available. +Use `dry_run_only: true` to return generated SQL without execution. ### Example diff --git a/docs/mcp-server/reasoning-tools.md b/docs/mcp-server/reasoning-tools.md index 25823d5..bfab705 100644 --- a/docs/mcp-server/reasoning-tools.md +++ b/docs/mcp-server/reasoning-tools.md @@ -5,7 +5,8 @@ sidebar_position: 7 # Reasoning Workflow -SemLang does not expose a separate reasoning tool. Agents should compose the compact public tools instead: +SemLang does not expose a separate reasoning tool. +Agents should compose the compact public tools instead: - Use `search` to find candidate concepts, members, metrics, queries, lenses, or entities. - Use `describe` to inspect the selected ontology objects and lens details. diff --git a/docs/mcp-server/source-and-search.md b/docs/mcp-server/source-and-search.md index 9d561bb..3e6d1d2 100644 --- a/docs/mcp-server/source-and-search.md +++ b/docs/mcp-server/source-and-search.md @@ -20,11 +20,13 @@ Compiles one or more SemLang files or inline source strings and stores the resul | `malloyConfigPath` | string | Explicit Malloy config file escape hatch. Normal projects should use `.semlang/settings.yml`. | | `returnMalloyModel` | boolean | When true, include the full compiled Malloy model in `malloyModel`. Defaults to false. | -With no inputs, `load_ontology` uses the entrypoint and runtime paths from `.semlang/settings.yml`. If no config is available, it returns setup guidance instead of guessing paths. +With no inputs, `load_ontology` uses the entrypoint and runtime paths from `.semlang/settings.yml`. +If no config is available, it returns setup guidance instead of guessing paths. 
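As a rough illustration of the `load_ontology` inputs described above, the TypeScript sketch below builds an argument object. Only `malloyConfigPath` and `returnMalloyModel` come from the inputs table; the `LoadOntologyArgs` type, the `buildLoadOntologyArgs` helper, and its option names are hypothetical, and the transport that sends the arguments to the MCP server is left out.

```typescript
// Hypothetical argument shape; only these two field names appear in the inputs table.
interface LoadOntologyArgs {
  malloyConfigPath?: string;
  returnMalloyModel?: boolean;
}

// Build the smallest argument object that expresses the caller's intent.
// An empty object lets load_ontology fall back to .semlang/settings.yml.
function buildLoadOntologyArgs(
  options: { debugModel?: boolean; explicitMalloyConfig?: string } = {}
): LoadOntologyArgs {
  const args: LoadOntologyArgs = {};
  if (options.explicitMalloyConfig !== undefined) {
    // Escape hatch only; normal projects rely on .semlang/settings.yml.
    args.malloyConfigPath = options.explicitMalloyConfig;
  }
  if (options.debugModel === true) {
    // Request the full compiled Malloy model, which is omitted by default.
    args.returnMalloyModel = true;
  }
  return args;
}
```

Calling `buildLoadOntologyArgs()` with no options yields `{}`, which matches the documented no-input path.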
### Output -Returns `ok`, `diagnostics`, and a `context` summary with package name, loaded files, counts, source names, type names, concept names, lens names, query names, and Malloy project/config context when available. The full compiled Malloy model is omitted by default and returned as `malloyModel` only when requested with `returnMalloyModel`. +Returns `ok`, `diagnostics`, and a `context` summary with package name, loaded files, counts, source names, type names, concept names, lens names, query names, and Malloy project/config context when available. +The full compiled Malloy model is omitted by default and returned as `malloyModel` only when requested with `returnMalloyModel`. ### Example @@ -34,7 +36,8 @@ Returns `ok`, `diagnostics`, and a `context` summary with package name, loaded f ## `search` -Searches concepts, metrics, members, queries, and lenses using terms from a user question or phrase. It can also resolve ontology names or business labels when `kind` is `entity`. +Searches concepts, metrics, members, queries, and lenses using terms from a user question or phrase. +It can also resolve ontology names or business labels when `kind` is `entity`. ### Inputs @@ -46,9 +49,12 @@ Searches concepts, metrics, members, queries, and lenses using terms from a user ### Output -Metadata search returns matching concepts, metrics, members, queries, lenses, actions, and roles. Each match includes a score and matched terms. Entity resolution returns matching sources, types, concepts, members, lenses, queries, candidate identifiers, candidate fields, matching rows when local DuckDB example data is available, and roles. +Metadata search returns matching concepts, metrics, members, queries, lenses, actions, and roles. +Each match includes a score and matched terms. +Entity resolution returns matching sources, types, concepts, members, lenses, queries, candidate identifiers, candidate fields, matching rows when local DuckDB example data is available, and roles. 
-Lens-oriented responses include scored lenses with descriptions, parents, refined concepts, scores, and matched terms. Use `find_paths` when the exact join route matters. +Lens-oriented responses include scored lenses with descriptions, parents, refined concepts, scores, and matched terms. +Use `find_paths` when the exact join route matters. ### Example diff --git a/docs/mcp-server/tools-overview.md b/docs/mcp-server/tools-overview.md index 4fc75ec..c9a5ef6 100644 --- a/docs/mcp-server/tools-overview.md +++ b/docs/mcp-server/tools-overview.md @@ -7,7 +7,8 @@ sidebar_position: 2 The SemLang MCP server exposes a compact ontology-aware tool surface for agents that need to discover a model, inspect semantic structure, plan lens overlays, validate queries, run Malloy-backed queries, and invoke supported local actions. -Call `load_ontology` first in each MCP session. All other tools read the compiled model held in the server context and return an error if no source has been loaded. +Call `load_ontology` first in each MCP session. +All other tools read the compiled model held in the server context and return an error if no source has been loaded. ## Public Tools @@ -20,10 +21,15 @@ Call `load_ontology` first in each MCP session. All other tools read the compile | `run_query` | Validate named or temporary queries and execute them through the Malloy SDK unless `dry_run_only` is true. | | `invoke_action` | Generate and execute supported local action SQL through the configured Malloy connection, or return generated SQL with `dry_run_only`. | -The public manifest intentionally avoids duplicate aliases and narrowly sliced helper tools. Consolidated tools use structured input schemas with explicit modes so agents can choose valid arguments without carrying a long list of overlapping tool names in context. +The public manifest intentionally avoids duplicate aliases and narrowly sliced helper tools. 
+Consolidated tools use structured input schemas with explicit modes so agents can choose valid arguments without carrying a long list of overlapping tool names in context. ## Response Shape -Tools return structured JSON. Successful responses generally include `ok: true`; failed or skipped operations return `ok: false` with an `error`, `reason`, `diagnostics`, or `candidates` field. +Tools return structured JSON. +Successful responses generally include `ok: true`; failed or skipped operations return `ok: false` with an `error`, `reason`, `diagnostics`, or `candidates` field. -`load_ontology({})` reads `.semlang/settings.yml` for the ontology entrypoint and runtime paths. If config is missing, it returns setup guidance. `run_query` executes through the Malloy SDK using the captured Malloy config and requires `query_limit_seconds` unless `dry_run_only` is true. `invoke_action` uses the same captured Malloy connection context to execute generated action SQL. +`load_ontology({})` reads `.semlang/settings.yml` for the ontology entrypoint and runtime paths. +If config is missing, it returns setup guidance. +`run_query` executes through the Malloy SDK using the captured Malloy config and requires `query_limit_seconds` unless `dry_run_only` is true. +`invoke_action` uses the same captured Malloy connection context to execute generated action SQL. diff --git a/docs/semlang-concepts.md b/docs/semlang-concepts.md index 8136ddd..e8b1171 100644 --- a/docs/semlang-concepts.md +++ b/docs/semlang-concepts.md @@ -2,13 +2,15 @@ title: SemLang Concepts --- -SemLang concepts name what rows mean. A concept should not simply mirror a table, and it should not split a business object apart just because different systems store different columns. +SemLang concepts name what rows mean. +A concept should not simply mirror a table, and it should not split a business object apart just because different systems store different columns. 
The central modeling question is: > Are these rows another description of the same thing, or are they a different thing connected to it? -Prefer one `kind` when sources describe the same durable business thing at the same identity grain, use the same ordinary business noun, and share a lifecycle. Split into a joined concept when the second source introduces a different lifecycle, temporal validity, relationship, event, or measurement grain. +Prefer one `kind` when sources describe the same durable business thing at the same identity grain, use the same ordinary business noun, and share a lifecycle. +Split into a joined concept when the second source introduces a different lifecycle, temporal validity, relationship, event, or measurement grain. ## Concept Types @@ -24,7 +26,8 @@ Prefer one `kind` when sources describe the same durable business thing at the s ## Worked Example: Customer -Assume `Customer` is a `kind`: a durable customer account that sales, billing, product, and support teams all recognize. Other tables can still be modeled relative to that one kind in different ways. +Assume `Customer` is a `kind`: a durable customer account that sales, billing, product, and support teams all recognize. +Other tables can still be modeled relative to that one kind in different ways. | Source or table | What people call it | Grain | Lifecycle | Model it as | Why | | ----------------------------- | ------------------------- | ------------------------------------------- | ------------------------------------------------------- | ------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | @@ -41,7 +44,8 @@ Assume `Customer` is a `kind`: a durable customer account that sales, billing, p | `invoices` | Invoice | One row per issued invoice. | Invoices are issued, adjusted, paid, or voided. 
| `event` or `kind`, joined to `Customer` | Use `event` for invoice issuance; use `kind` when invoices are durable business documents with their own lifecycle. | | `customer_report_view` | Customer report row | One row per report-specific projection. | Report shape changes with the analysis, not the domain. | Query, view, or `lens` | Reporting convenience should not create a new ontology object. | -This example is intentionally merge-friendly. The CRM, billing, and customer-success sources all describe the same `Customer` kind because they share the ordinary noun, identity grain, and lifecycle. +This example is intentionally merge-friendly. +The CRM, billing, and customer-success sources all describe the same `Customer` kind because they share the ordinary noun, identity grain, and lifecycle. ## Decision Heuristics @@ -56,7 +60,8 @@ This example is intentionally merge-friendly. The CRM, billing, and customer-suc | Connects two or more concepts and carries its own fields, status, or dates. | Join through a `relator`. | The relationship has its own identity or lifecycle. | | Exists only for one report, audience, or workflow. | Use a query, view, or `lens`. | Report shape should not become ontology shape. | -The important distinction is not "same table" or "different table." The important distinction is whether the rows have the same semantic identity and lifecycle. +The important distinction is not "same table" or "different table." +The important distinction is whether the rows have the same semantic identity and lifecycle. 
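The merge-or-split heuristics above can be sketched as a small decision function. This is an illustrative simplification, not part of SemLang tooling: the `SourceTraits` fields, the `Suggestion` labels, and the `suggestModeling` helper are hypothetical names, and the order of the checks is an assumption rather than a documented rule.

```typescript
// Hypothetical description of how a candidate source relates to an existing kind.
interface SourceTraits {
  sameIdentityGrain: boolean; // one row per the same durable business thing
  sameBusinessNoun: boolean; // people use the same ordinary noun for it
  sharedLifecycle: boolean; // rows are created, changed, and retired together
  connectsConcepts: boolean; // links two or more concepts and carries its own fields
  reportOnly: boolean; // exists only for one report, audience, or workflow
}

type Suggestion =
  | "merge into kind"
  | "join through relator"
  | "query, view, or lens"
  | "separate joined concept";

// Apply the heuristics in a fixed order (an assumption of this sketch).
function suggestModeling(t: SourceTraits): Suggestion {
  if (t.reportOnly) return "query, view, or lens";
  if (t.connectsConcepts) return "join through relator";
  if (t.sameIdentityGrain && t.sameBusinessNoun && t.sharedLifecycle) {
    return "merge into kind";
  }
  // A different grain, noun, or lifecycle points to a separate joined concept.
  return "separate joined concept";
}
```

The function only encodes the headline question of semantic identity and lifecycle; choosing between `event`, `situation`, and `phase` for a split concept still needs the fuller table.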
## Common Modeling Smells diff --git a/examples/banking-credit-risk-and-customer-exposure/about.md b/examples/banking-credit-risk-and-customer-exposure/about.md index a66ca07..6db2c8c 100644 --- a/examples/banking-credit-risk-and-customer-exposure/about.md +++ b/examples/banking-credit-risk-and-customer-exposure/about.md @@ -1,8 +1,10 @@ # Banking Credit Risk and Customer Exposure -This package models a compact banking risk scenario focused on legal customers, facilities, loans, exposure snapshots, collateral, guarantees, ratings, model scores, and quarterly review evidence. It is designed to answer credit exposure questions without hiding the grains that make banking risk data difficult. +This package models a compact banking risk scenario focused on legal customers, facilities, loans, exposure snapshots, collateral, guarantees, ratings, model scores, and quarterly review evidence. +It is designed to answer credit exposure questions without hiding the grains that make banking risk data difficult. -The analytical center is `loan_exposure_snapshots`: one row per loan, as-of date, accounting basis, stress scenario, and model run. Collateral and guarantees attach through separate many-to-many paths, ratings and collateral valuations carry effective dates, and model scores keep scenario and model-version fields explicit so comparisons are deliberate. +The analytical center is `loan_exposure_snapshots`: one row per loan, as-of date, accounting basis, stress scenario, and model run. +Collateral and guarantees attach through separate many-to-many paths, ratings and collateral valuations carry effective dates, and model scores keep scenario and model-version fields explicit so comparisons are deliberate. 
```mermaid erDiagram diff --git a/examples/healthcare-patient-journey-and-quality-measures/about.md b/examples/healthcare-patient-journey-and-quality-measures/about.md index 48401fb..b7d5298 100644 --- a/examples/healthcare-patient-journey-and-quality-measures/about.md +++ b/examples/healthcare-patient-journey-and-quality-measures/about.md @@ -1,8 +1,10 @@ # Healthcare Patient Journey and Quality Measures -This package models a compact healthcare analytics scenario focused on patient journeys, inpatient discharge denominators, diagnosis intervals, lab observations, claims, payer coverage, and quality-measure populations. It is designed to keep clinical, billing, and regulatory grains separate while still making common questions easy to ask. +This package models a compact healthcare analytics scenario focused on patient journeys, inpatient discharge denominators, diagnosis intervals, lab observations, claims, payer coverage, and quality-measure populations. +It is designed to keep clinical, billing, and regulatory grains separate while still making common questions easy to ask. -The analytical center is `inpatient_stays`: one row per inpatient stay and discharge. Encounters preserve broader clinical utilization, diagnosis intervals carry clinical valid dates, lab results remain observation facts, claims represent billing outcomes, payer coverage is selected by encounter date, and quality-measure population rows make the denominator grain explicit. +The analytical center is `inpatient_stays`: one row per inpatient stay and discharge. +Encounters preserve broader clinical utilization, diagnosis intervals carry clinical valid dates, lab results remain observation facts, claims represent billing outcomes, payer coverage is selected by encounter date, and quality-measure population rows make the denominator grain explicit. 
```mermaid erDiagram diff --git a/examples/manufacturing-supply-chain-traceability-and-quality/about.md b/examples/manufacturing-supply-chain-traceability-and-quality/about.md index 9e2691c..58af942 100644 --- a/examples/manufacturing-supply-chain-traceability-and-quality/about.md +++ b/examples/manufacturing-supply-chain-traceability-and-quality/about.md @@ -1,8 +1,10 @@ # Manufacturing Supply Chain Traceability and Quality -This package models a compact manufacturing analytics scenario focused on supplier lots, product BOM versions, production orders, serialized units, inspections, defects, shipments, warranty claims, and recall scope. It is designed to demonstrate traceability and quality analysis without flattening physical flow into one table. +This package models a compact manufacturing analytics scenario focused on supplier lots, product BOM versions, production orders, serialized units, inspections, defects, shipments, warranty claims, and recall scope. +It is designed to demonstrate traceability and quality analysis without flattening physical flow into one table. -The analytical center is `serialized_units`: one row per manufactured unit. Production orders bind each unit to the BOM version used at build time, supplier lot consumption records which lots fed each order, lot genealogy records split and merge relationships, inspections and defects attach at unit and inspection grain, shipments move units to customers, warranty claims attach after shipment, and recall affected units define the scoped population for campaigns. +The analytical center is `serialized_units`: one row per manufactured unit. 
+Production orders bind each unit to the BOM version used at build time, supplier lot consumption records which lots fed each order, lot genealogy records split and merge relationships, inspections and defects attach at unit and inspection grain, shipments move units to customers, warranty claims attach after shipment, and recall affected units define the scoped population for campaigns. ```mermaid erDiagram diff --git a/examples/retail-omnichannel-margin-and-returns/about.md b/examples/retail-omnichannel-margin-and-returns/about.md index 3c55e47..4c7ccfa 100644 --- a/examples/retail-omnichannel-margin-and-returns/about.md +++ b/examples/retail-omnichannel-margin-and-returns/about.md @@ -1,8 +1,11 @@ # Retail Omnichannel Margin and Returns -This package models a compact retail analytics scenario focused on sales, line items, returns, product history, stores, recognized customers, loyalty point balances, promotions, and inventory snapshots. It is designed to demonstrate common retail questions without flattening different grains into one table. +This package models a compact retail analytics scenario focused on sales, line items, returns, product history, stores, recognized customers, loyalty point balances, promotions, and inventory snapshots. +It is designed to demonstrate common retail questions without flattening different grains into one table. -The model separates `transactions` from `retail_line_items`: sales/customer/store questions use one row per checkout or order, while merchandise, margin, promotions, and returns use one row per sold SKU line. Product attributes are joined using valid-time history, nullable `transactions.customer_id` distinguishes recognized shoppers from unrecognized cash purchases, loyalty point balances add daily loyalty-member state, and inventory is modeled separately as daily SKU/store snapshots. 
The `customers` table also carries direct PII columns, but the base SemLang example intentionally exposes only hashes and consent metadata; the lens example shows how a privileged query-time lens can reveal those fields. +The model separates `transactions` from `retail_line_items`: sales/customer/store questions use one row per checkout or order, while merchandise, margin, promotions, and returns use one row per sold SKU line. +Product attributes are joined using valid-time history, nullable `transactions.customer_id` distinguishes recognized shoppers from unrecognized cash purchases, loyalty point balances add daily loyalty-member state, and inventory is modeled separately as daily SKU/store snapshots. +The `customers` table also carries direct PII columns, but the base SemLang example intentionally exposes only hashes and consent metadata; the lens example shows how a privileged query-time lens can reveal those fields. ```mermaid erDiagram diff --git a/examples/saas-product-usage-and-revenue/about.md b/examples/saas-product-usage-and-revenue/about.md index 9c6022e..74e65cb 100644 --- a/examples/saas-product-usage-and-revenue/about.md +++ b/examples/saas-product-usage-and-revenue/about.md @@ -1,8 +1,10 @@ # SaaS Product Usage and Revenue -This package models a compact SaaS analytics scenario focused on accounts, workspaces, users, plans, subscriptions, contracts, invoices, revenue recognition, entitlements, product usage, support, incidents, and renewals. It is designed to answer recurring-revenue and product-adoption questions without flattening finance, usage, and lifecycle facts into one unsafe customer table. +This package models a compact SaaS analytics scenario focused on accounts, workspaces, users, plans, subscriptions, contracts, invoices, revenue recognition, entitlements, product usage, support, incidents, and renewals. 
+It is designed to answer recurring-revenue and product-adoption questions without flattening finance, usage, and lifecycle facts into one unsafe customer table. -The analytical center is `subscription_periods`: one row per subscription period with contracted ARR, MRR run-rate, expansion, contraction, and churn timing. Recognized revenue is modeled separately from invoice lines in `revenue_recognition`, product activity is modeled at user-day and feature-day grains, and entitlements are effective-dated so historical usage can be evaluated against the plan rights in force on the activity date. +The analytical center is `subscription_periods`: one row per subscription period with contracted ARR, MRR run-rate, expansion, contraction, and churn timing. +Recognized revenue is modeled separately from invoice lines in `revenue_recognition`, product activity is modeled at user-day and feature-day grains, and entitlements are effective-dated so historical usage can be evaluated against the plan rights in force on the activity date. ```mermaid erDiagram diff --git a/skills/initial-ontology-creation/SKILL.md b/skills/initial-ontology-creation/SKILL.md index 48e12d7..5add2f7 100644 --- a/skills/initial-ontology-creation/SKILL.md +++ b/skills/initial-ontology-creation/SKILL.md @@ -7,35 +7,45 @@ description: Use when creating an initial SemLang ontology from a data source, o Use this skill when a user wants to create an initial ontology for a domain, data product, warehouse schema, application database, or analytics package. -Assume the ontology is for production analytics unless the user says otherwise. The goal is not to mirror every table. The goal is to infer the domain's durable concepts, relationships, useful states, events, metrics, and question vocabulary, then validate that ontology with the user before treating it as authoritative. +Assume the ontology is for production analytics unless the user says otherwise. +The goal is not to mirror every table. 
+The goal is to infer the domain's durable concepts, relationships, useful states, events, metrics, and question vocabulary, then validate that ontology with the user before treating it as authoritative. ## Operating Principles - Start from the user's business domain and questions, not only from the physical schema. - Treat existing documentation, data catalogs, ERDs, dbt docs, BI dashboards, metric definitions, tickets, and stakeholder notes as optional but high-value context. -- Preserve uncertainty. When a relationship, concept type, temporal grain, or metric meaning is inferred, mark it as inferred and validate it explicitly. -- When delegation is available and permitted, use a sub-agent for the data-connection and source-introspection work, then reuse that same sub-agent for the first-pass ontology creation. The calling agent should preserve context for user review, questions, and modeling decisions instead of spending it on the iterative mechanics of connection setup and bulk drafting. +- Preserve uncertainty. + When a relationship, concept type, temporal grain, or metric meaning is inferred, mark it as inferred and validate it explicitly. +- When delegation is available and permitted, use a sub-agent for the data-connection and source-introspection work, then reuse that same sub-agent for the first-pass ontology creation. + The calling agent should preserve context for user review, questions, and modeling decisions instead of spending it on the iterative mechanics of connection setup and bulk drafting. - Prefer SemLang concept stereotypes deliberately: - `kind` for identity-bearing entities such as Customer, Event, Venue, Product, Account, or Supplier. - `event` for temporal occurrences such as Order, TicketScan, MessageSend, Incident, or Payment. - `situation` for state or measurement snapshots such as InventoryLevel, PriceSnapshot, SubscriptionStatus, or DailyBalance. 
- `relator` for association or bridge concepts such as EventAttraction, AccountMembership, ProductBundleItem, or ProviderFacilityAffiliation. - `phase` only for lifecycle stages of an entity, not table variants. -- Keep SemLang Malloy-shaped where possible. Do not invent syntax that cannot lower clearly. -- Model joins as a dedicated pass after the first concept inventory exists. Schema extraction usually exposes fields, not business relationships. -- Validate incrementally with the SemLang MCP `load_ontology` tool. Large ontologies should be organized into domain files and loaded in batches. +- Keep SemLang Malloy-shaped where possible. + Do not invent syntax that cannot lower clearly. +- Model joins as a dedicated pass after the first concept inventory exists. + Schema extraction usually exposes fields, not business relationships. +- Validate incrementally with the SemLang MCP `load_ontology` tool. + Large ontologies should be organized into domain files and loaded in batches. ## Phase 1: Intake -Ask for the minimum information needed to begin. Prefer concise questions and proceed with reasonable assumptions when the user cannot answer everything. +Ask for the minimum information needed to begin. +Prefer concise questions and proceed with reasonable assumptions when the user cannot answer everything. Ask about the data source: -- What type of source system contains the data: warehouse/database, application database, API export, local files, remote files, or something else? If the source is local or remote files, use DuckDB to inspect and model them. +- What type of source system contains the data: warehouse/database, application database, API export, local files, remote files, or something else? + If the source is local or remote files, use DuckDB to inspect and model them. - Do you already know the catalog, schema, database, directory, file paths, or relevant source names, or should I introspect the source and bring back options? 
- Are there any schemas, tables, files, or domains that should obviously be ignored? -Try to connect with the available information. Ask follow-up questions reactively only when connection attempts or source inspection reveal a concrete blocker. +Try to connect with the available information. +Ask follow-up questions reactively only when connection attempts or source inspection reveal a concrete blocker. Ask one optional context question: @@ -58,7 +68,8 @@ For each relevant table, file, or source: - Identify money, count, duration, percentage, score, quantity, status, category, and identifier fields. - Mark sources that appear intentionally out of scope, staging-only, legacy, duplicated, or unsafe to query. -If the source supports metadata commands, prefer structured metadata over scraping display text. For Databricks, use current CLI shapes such as `databricks tables get ..` when available. +If the source supports metadata commands, prefer structured metadata over scraping display text. +For Databricks, use current CLI shapes such as `databricks tables get ..` when available. ## Phase 3: Documentation And Usage Analysis @@ -74,13 +85,16 @@ Look for: - Synonyms and aliases used by different teams for the same concept. - Questions that require joins, time windows, role filters, or aggregate measures. -Compare the docs against source inventory and note conflicts. When documentation and physical schema disagree, ask the user to resolve the business meaning before encoding it as canonical. +Compare the docs against source inventory and note conflicts. +When documentation and physical schema disagree, ask the user to resolve the business meaning before encoding it as canonical. ## Phase 4: Scaffold Concept Files -Start writing SemLang early. Do not create a separate long-lived inventory document unless the user asks for one. +Start writing SemLang early. +Do not create a separate long-lived inventory document unless the user asks for one. -Create domain-oriented SemLang files and use comments near the top of each file, source, or concept to hold the working notes that would otherwise live in a separate inventory. As the model becomes clearer, move more information out of comments and into real SemLang declarations. +Create domain-oriented SemLang files and use comments near the top of each file, source, or concept to hold the working notes that would otherwise live in a separate inventory. +As the model becomes clearer, move more information out of comments and into real SemLang declarations. For each candidate concept, capture in comments or declarations: @@ -108,7 +122,9 @@ concept Order is event from databricks.table('prod.sales.orders') { } ``` -Group concepts into domain files by business area once the inventory is large enough. Use a single entry-point file that includes shared types before domain files. Avoid re-including shared files from every domain file. +Group concepts into domain files by business area once the inventory is large enough.
+Use a single entry-point file that includes shared types before domain files. +Avoid re-including shared files from every domain file. ## Phase 5: Relationship Pass @@ -139,11 +155,13 @@ For each concept, identify: - Views: common analytical shapes that combine dimensions, measures, filters, and joins. - Validations: data-quality expectations, not ordinary query filters. -Use roles only when the name carries reusable business meaning. If a filter merely narrows a source for one analysis, use a `where:` clause or lens instead. +Use roles only when the name carries reusable business meaning. +If a filter merely narrows a source for one analysis, use a `where:` clause or lens instead. ## Phase 7: Draft And Validate The Ontology -Create the first SemLang draft in small, valid increments. Valid means loading the entry-point file with the SemLang MCP `load_ontology` tool and using the feedback to fix parse, semantic, source, and lowering issues. +Create the first SemLang draft in small, valid increments. +Valid means loading the entry-point file with the SemLang MCP `load_ontology` tool and using the feedback to fix parse, semantic, source, and lowering issues. - Put `package` first in every SemLang file. - Put `include` declarations immediately after `package`. @@ -158,7 +176,8 @@ Create the first SemLang draft in small, valid increments. Valid means loading t ## Phase 8: Independent Audit -Before presenting the ontology for validation, run a systematic audit. When delegation is available and permitted, have a sub-agent perform this audit independently, then iterate with that sub-agent until the issues are resolved or clearly deferred. +Before presenting the ontology for validation, run a systematic audit. +When delegation is available and permitted, have a sub-agent perform this audit independently, then iterate with that sub-agent until the issues are resolved or clearly deferred. 
Check for: @@ -175,11 +194,13 @@ Check for: - Source tables that were skipped without an explicit reason. - Sample questions that cannot be answered from the current model. -Fix obvious issues before the user validation session. Keep unresolved business questions visible. +Fix obvious issues before the user validation session. +Keep unresolved business questions visible. ## Phase 9: User Validation Review -Review the ontology with the user in business language before treating it as complete. Present concise summaries and ask for corrections. +Review the ontology with the user in business language before treating it as complete. +Present concise summaries and ask for corrections. First, validate the core concepts and relationships: @@ -231,7 +252,8 @@ Finally, solicit more real questions: What are five to ten real questions people ask about this domain that are painful, frequent, high-stakes, or currently require manual work? ``` -For each added question, record whether the current ontology can answer it. If not, identify the missing concept, relationship, role, measure, temporal axis, validation, or source. +For each added question, record whether the current ontology can answer it. +If not, identify the missing concept, relationship, role, measure, temporal axis, validation, or source. ## Phase 10: Iterate And Handoff @@ -249,4 +271,5 @@ The handoff should include: - Example questions the ontology can answer now. - Suggested next modeling passes. -Run the project's validation command before handoff when working in a repository. For SemLang projects, prefer the repository's full check command when available. +Run the project's validation command before handoff when working in a repository. +For SemLang projects, prefer the repository's full check command when available. 
diff --git a/skills/semlang-setup/SKILL.md b/skills/semlang-setup/SKILL.md index 8962ac7..78c487c 100644 --- a/skills/semlang-setup/SKILL.md +++ b/skills/semlang-setup/SKILL.md @@ -6,8 +6,7 @@ description: Inspect and configure SemLang MCP for the current Pi project. Use w # SemLang Setup for Pi Use this skill to help a user configure SemLang MCP in the current project. -Keep the setup minimal: this package loads `pi-mcp-adapter`; the skill only -helps manage MCP JSON configuration. +Keep the setup minimal: this package loads `pi-mcp-adapter`; the skill only helps manage MCP JSON configuration. ## Workflow @@ -16,26 +15,19 @@ helps manage MCP JSON configuration. - `.mcp.json` - optionally `~/.pi/agent/mcp.json` - optionally `~/.config/mcp/mcp.json` -2. Determine whether any inspected config already defines - `mcpServers.semlang`. -3. If a `semlang` MCP server already exists, summarize where it was found and - do not add another server unless the user explicitly asks. -4. If SemLang MCP is missing, propose adding project-local Pi config at - `.pi/mcp.json`. Prefer this Pi-owned project config unless an existing - `.mcp.json` already has, or clearly should have, the SemLang config. -5. Ask for user confirmation before creating or editing MCP config unless the - user has already explicitly authorized the edit. -6. Preserve existing MCP config contents. Merge only the missing - `mcpServers.semlang` entry and keep other servers/settings unchanged. +2. Determine whether any inspected config already defines `mcpServers.semlang`. +3. If a `semlang` MCP server already exists, summarize where it was found and do not add another server unless the user explicitly asks. +4. If SemLang MCP is missing, propose adding project-local Pi config at `.pi/mcp.json`. + Prefer this Pi-owned project config unless an existing `.mcp.json` already has, or clearly should have, the SemLang config. +5. 
Ask for user confirmation before creating or editing MCP config unless the user has already explicitly authorized the edit. +6. Preserve existing MCP config contents. + Merge only the missing `mcpServers.semlang` entry and keep other servers/settings unchanged. 7. After writing config, tell the user to run `/reload` or restart Pi. -8. Mention that SemLang MCP respects `SEMLANG_*` environment settings and that - `semlang setup` can inspect resolved SemLang settings. +8. Mention that SemLang MCP respects `SEMLANG_*` environment settings and that `semlang setup` can inspect resolved SemLang settings. ## Server snippet -Unless a project-local SemLang binary strategy is explicitly requested and is -more reliable for the project, add this server entry pinned to the package -version: +Unless a project-local SemLang binary strategy is explicitly requested and is more reliable for the project, add this server entry pinned to the package version: ```json { @@ -49,5 +41,4 @@ version: } ``` -For an existing config file, merge the `semlang` object under the existing -`mcpServers` object instead of replacing the file. +For an existing config file, merge the `semlang` object under the existing `mcpServers` object instead of replacing the file. diff --git a/skills/semlang/SKILL.md b/skills/semlang/SKILL.md index eaf83a2..a063a41 100644 --- a/skills/semlang/SKILL.md +++ b/skills/semlang/SKILL.md @@ -7,7 +7,8 @@ description: Use when reading, writing, or running SemLang semantic-model files, This is a concise guide for agents that need to read or write SemLang. -SemLang is best understood as Malloy with a semantic ontology layer. Keep Malloy's mental model for sources, joins, dimensions, measures, views, and queries, then add the SemLang differences below. +SemLang is best understood as Malloy with a semantic ontology layer. +Keep Malloy's mental model for sources, joins, dimensions, measures, views, and queries, then add the SemLang differences below. 
## Start From Malloy @@ -15,71 +16,99 @@ SemLang is best understood as Malloy with a semantic ontology layer. Keep Malloy - A Malloy table stays close by: `concept Sale is event from duckdb.table('transactions')`. - Malloy-style `dimension:`, `measure:`, `view:`, `where:`, `group_by:`, `aggregate:`, and `query: ... -> { ... }` remain the default shape. - Queries target concepts, not physical source names: `query: q is SaleLine -> { ... }`. -- Every SemLang file starts with exactly one `package` declaration. `include` declarations may follow the package and must come before sources, types, concepts, lenses, and queries. -- SemLang should compile to Malloy. Do not invent syntax that cannot lower clearly. +- Every SemLang file starts with exactly one `package` declaration. + `include` declarations may follow the package and must come before sources, types, concepts, lenses, and queries. +- SemLang should compile to Malloy. + Do not invent syntax that cannot lower clearly. ## Query-First Workflow -- Start by using the existing core semantic model directly. Load the relevant SemLang files and try to answer the task with queries over existing concepts, joins, dimensions, measures, views, and lenses before changing model files. +- Start by using the existing core semantic model directly. + Load the relevant SemLang files and try to answer the task with queries over existing concepts, joins, dimensions, measures, views, and lenses before changing model files. - When a query cannot express the answer because the core model lacks a reusable business concept, relationship, type, measure, role, view, lens, validation, or source abstraction, add that reusable semantic content to the existing SemLang files that own the domain. -- Keep reusable additions durable and named in business language. Do not hide repeatable semantics inside one-off query text. 
-- If the work requires complex logic that is specific to a single investigation and is not worth reusing, create a separate `.semlang` file for the task. Put its `package` first, include the relevant core semantic files immediately after the package declaration, define only the task-specific sources, lenses, views, or queries needed there, and run that file instead of polluting the core model. +- Keep reusable additions durable and named in business language. + Do not hide repeatable semantics inside one-off query text. +- If the work requires complex logic that is specific to a single investigation and is not worth reusing, create a separate `.semlang` file for the task. + Put its `package` first, include the relevant core semantic files immediately after the package declaration, define only the task-specific sources, lenses, views, or queries needed there, and run that file instead of polluting the core model. ## What SemLang Adds -- `type:` declarations give primitive values semantic meaning, such as `Dollars`, `CustomerId`, or `BusinessDate`. Primitive bases are `string`, `number`, `date`, `timestamp`, `currency`, and `boolean`. -- Type bodies use JSON Schema-style metadata such as `description`, `enum`, `const`, `default`, `examples`, bounds, `pattern`, and `format`. Use `enum`, not legacy `allowed_values`; use `description`, not legacy `semantics`. -- `source:` declarations name reusable Malloy source expressions. They can wrap a table, SQL expression, another source/concept/query, or a root plus query body: `source: active_sales is Sale -> { where: status = 'active' }`. -- `concept X is kind/event/situation/relator/phase ...` names what a row means, not just where it is stored. A phase names its parent: `concept ClosedStore is phase of Store from ...`. +- `type:` declarations give primitive values semantic meaning, such as `Dollars`, `CustomerId`, or `BusinessDate`. + Primitive bases are `string`, `number`, `date`, `timestamp`, `currency`, and `boolean`. 
+- Type bodies use JSON Schema-style metadata such as `description`, `enum`, `const`, `default`, `examples`, bounds, `pattern`, and `format`. + Use `enum`, not legacy `allowed_values`; use `description`, not legacy `semantics`. +- `source:` declarations name reusable Malloy source expressions. + They can wrap a table, SQL expression, another source/concept/query, or a root plus query body: `source: active_sales is Sale -> { where: status = 'active' }`. +- `concept X is kind/event/situation/relator/phase ...` names what a row means, not just where it is stored. + A phase names its parent: `concept ClosedStore is phase of Store from ...`. - Concept `description:` metadata is preserved and appears in exported concept JSON Schema. - `identity` declares the semantic key and lowers to Malloy `primary_key`. - A composite identity is comma-separated and lowers through a deterministic generated primary-key dimension. -- `field:` declarations attach semantic types to source-backed fields. A trailing `?` marks nullability; `unique` records uniqueness metadata. -- Joins support `join_one`, `join_many`, and `join_cross`. `join_one` and `join_many` use `on` or `with`; `join_cross` may omit `on` and cannot use `with`. +- `field:` declarations attach semantic types to source-backed fields. + A trailing `?` marks nullability; `unique` records uniqueness metadata. +- Joins support `join_one`, `join_many`, and `join_cross`. + `join_one` and `join_many` use `on` or `with`; `join_cross` may omit `on` and cannot use `with`. - A `?` after the join name, as in `join_one customer?: Customer on customer_id`, marks optional participation metadata without changing the emitted Malloy join kind. - Join targets can be concepts, named sources, roles, or `join_one` inline named-connection sources such as `duckdb.table('customer_profiles')`. -- `role Name when predicate` names a meaningful classification when the name adds business meaning. 
Its canonical name is `Concept.Name`; use optional `label` and `aliases` metadata for business-language discovery. +- `role Name when predicate` names a meaningful classification when the name adds business meaning. + Its canonical name is `Concept.Name`; use optional `label` and `aliases` metadata for business-language discovery. - Role tests can use `path is Concept.Role` or an unambiguous short name such as `path is Active`. - Temporal axes such as `occurrence_time:`, `valid_time:`, `observation_time:`, and `recorded_time:` document event time, valid-time state, observed state time, and load/record time. - Temporal joins can use `at expression` instead of repeating period containment predicates. -- `validation:` declarations are data-quality rules. They are not query filters. +- `validation:` declarations are data-quality rules. + They are not query filters. - Concept `where:` filters restrict the concept source and lower to Malloy source filters. -- `lens:` declarations are query-time overlays for contextual filters or vocabulary, similar in spirit to Malloy source extension but applied to existing semantic names. Lenses can inherit parent lenses, add lens-local types, and refine concepts with fields, joins, roles, dimensions, measures, views, validations, temporal axes, identities, and `where` filters. -- Actions are concept-local write declarations. They describe permitted writes with `subject`, `param:`, `guard:`, `edit:`, write mappings, logs, effects, and `agent:` metadata. They do not lower to Malloy reads. -- Mark source-backed fields `writeable` only when actions may assign them. A writeable source field implies `write: column field_name = value`; a writeable derived dimension must declare explicit write mappings. -- JSON Schema export projects semantic types and source-backed concept row shapes. 
Ontology features without exact JSON Schema semantics are preserved as `x-semlang-*` metadata rather than translated into misleading native validation keywords. +- `lens:` declarations are query-time overlays for contextual filters or vocabulary, similar in spirit to Malloy source extension but applied to existing semantic names. + Lenses can inherit parent lenses, add lens-local types, and refine concepts with fields, joins, roles, dimensions, measures, views, validations, temporal axes, identities, and `where` filters. +- Actions are concept-local write declarations. + They describe permitted writes with `subject`, `param:`, `guard:`, `edit:`, write mappings, logs, effects, and `agent:` metadata. + They do not lower to Malloy reads. +- Mark source-backed fields `writeable` only when actions may assign them. + A writeable source field implies `write: column field_name = value`; a writeable derived dimension must declare explicit write mappings. +- JSON Schema export projects semantic types and source-backed concept row shapes. + Ontology features without exact JSON Schema semantics are preserved as `x-semlang-*` metadata rather than translated into misleading native validation keywords. ## Authoring Rules - Prefer the smallest SemLang construct that carries new meaning. - Use a role only when the role is reusable and meaningful in business language. - If a lens only narrows data, write `where:` directly rather than declaring a role for the same predicate. -- Keep grains separate. Do not flatten events, situations, relators, and snapshots into one concept just to make a query shorter. -- Use declared joins and measures instead of spelling long paths from scratch. Prefer `with foreign_key` when joining to a concept by its identity and the shorthand preserves the intended relationship. +- Keep grains separate. + Do not flatten events, situations, relators, and snapshots into one concept just to make a query shorter. 
+- Use declared joins and measures instead of spelling long paths from scratch. + Prefer `with foreign_key` when joining to a concept by its identity and the shorthand preserves the intended relationship. - Put fields in `field:` blocks and derived values in `dimension:` or `measure:` blocks. - Keep aggregate aliases aggregate-safe: raw row fields must be inside aggregate functions. -- Use concept-local `view:` declarations for reusable analytical shapes. Views and queries support `where`, `select`, `project`, `group_by`, `aggregate`, `having`, `calculate`, `nest`, `index`, `order_by`, and `limit`/`top` where valid. +- Use concept-local `view:` declarations for reusable analytical shapes. + Views and queries support `where`, `select`, `project`, `group_by`, `aggregate`, `having`, `calculate`, `nest`, `index`, `order_by`, and `limit`/`top` where valid. - Use `project:` only as compatibility spelling; expect it to emit as Malloy `select:`. - Use `top:` only as compatibility spelling; expect it to behave as a row limit. - Filters may use Malloy-shaped forms such as `?` alternation, `to` ranges, `~`/`!~` matching, and filter strings like `f'this week'`. -- Use lenses for query-time changes to semantic meaning. A query applies them with `with`, and parent/named lenses apply left-to-right to a copied query model. -- Do not use validations as ordinary filters. If a query should exclude rows, use `where:` on the concept, lens refinement, view, or query. -- Do not assign actions to measures, joins, roles, views, validations, identities, or temporal axes. Action edits can target only writeable fields or writeable dimensions. +- Use lenses for query-time changes to semantic meaning. + A query applies them with `with`, and parent/named lenses apply left-to-right to a copied query model. +- Do not use validations as ordinary filters. + If a query should exclude rows, use `where:` on the concept, lens refinement, view, or query. 
+- Do not assign actions to measures, joins, roles, views, validations, identities, or temporal axes. + Action edits can target only writeable fields or writeable dimensions. - Treat action `agent:` metadata as presentation metadata, never authorization. ## File and Source Rules - Keep `package` first except for comments and blank lines. -- Put all `include` declarations immediately after the package. Includes are model loading, not textual macros, and cycles are invalid. -- Use named Malloy connections in physical sources: `duckdb.table('orders')` or `duckdb.sql("""select ...""")`. Do not write unqualified `table('orders')`. +- Put all `include` declarations immediately after the package. + Includes are model loading, not textual macros, and cycles are invalid. +- Use named Malloy connections in physical sources: `duckdb.table('orders')` or `duckdb.sql("""select ...""")`. + Do not write unqualified `table('orders')`. - A concept `from` clause may reference a table, SQL source, named source, another concept source, or a query result. -- When a named source query is rooted in a concept, SemLang validates the query body against that concept. When rooted in a non-concept source, keep it Malloy-shaped and avoid SemLang-only expression constructs that need concept context. +- When a named source query is rooted in a concept, SemLang validates the query body against that concept. + When rooted in a non-concept source, keep it Malloy-shaped and avoid SemLang-only expression constructs that need concept context. ## Diagnostics and Validation - The compiler reports parse, duplicate-symbol, unresolved-reference, invalid syntax, temporal join, lens refinement, and aggregate-alias diagnostics with source locations where available. - Successful compiler artifacts are stage-safe: parse errors block the AST, semantic errors block the semantic model, and a missing semantic model blocks Malloy and JSON Schema output. -- Lint diagnostics are opt-in. 
They can flag missing `occurrence_time` on events, missing `observation_time` on situations, suspicious type/join modeling, and Malloy SDK validation problems. +- Lint diagnostics are opt-in. + They can flag missing `occurrence_time` on events, missing `observation_time` on situations, suspicious type/join modeling, and Malloy SDK validation problems. - When lint validates generated Malloy, diagnostics may include generated Malloy context and mapped original SemLang locations. ## Requirements Discipline