Feature request
Add a first-class data source for YTsaurus, an open-source distributed storage and execution system, via its CHYT (ClickHouse-over-YT) clique.
Why
YTsaurus is the storage and compute backbone for petabyte-scale analytics at several organizations and is a common destination for Wren-style semantic SQL workloads. Today, users running on YT have no first-class Wren connector and have to either:
- Bypass Wren entirely and query CHYT directly (losing MDL semantics, dry-run validation, denied-functions policy, memory module, text-to-SQL surface), or
- Point the `clickhouse` connector at the YT HTTP proxy, which fails because CHYT mounts at a non-root URL path, routes by a URL parameter (`chyt.clique_alias`) rather than the ClickHouse `database` field, and accepts only `Authorization: OAuth <token>` (it rejects Basic and Bearer).
The CHYT clique exposes a ClickHouse-compatible HTTP protocol, so the bulk of Wren's existing ClickHouse / Ibis / sqlglot machinery applies unchanged — only the auth/routing shim is new.
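To make the shim concrete, here is a minimal sketch of how one CHYT request might be assembled from the three differences listed above. The function name `build_chyt_request`, the host, the clique alias, and the `/chyt` mount path are all illustrative assumptions, not part of Wren or a confirmed YT default. The code only builds the request and does not send it.

```python
from urllib.parse import urlencode


def build_chyt_request(proxy, clique, token, sql, *, query_path, secure=False, port=None):
    """Return (url, headers, body) for one CHYT query over the YT HTTP proxy."""
    scheme = "https" if secure else "http"
    host = f"{proxy}:{port}" if port is not None else proxy
    # CHYT is mounted at a non-root path on the proxy, so the path is explicit.
    path = "/" + query_path.strip("/")
    # The clique is selected by a URL parameter, not the ClickHouse database field.
    params = urlencode({"chyt.clique_alias": clique})
    url = f"{scheme}://{host}{path}?{params}"
    # Only the OAuth scheme is accepted; Basic and Bearer are rejected.
    headers = {"Authorization": f"OAuth {token}"}
    return url, headers, sql.encode("utf-8")


url, headers, body = build_chyt_request(
    "yt.example.internal",  # hypothetical proxy host
    "analytics",            # hypothetical clique alias
    "dummy-token",          # placeholder, never a real token
    "SELECT 1",
    query_path="/chyt",     # assumed mount path; check your cluster
    secure=True,
)
print(url)
```

Everything else (result parsing, type mapping, SQL generation) would stay on the existing ClickHouse path, which is why the connector can be a thin subclass.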
Proposed scope
- New `DataSource.ytsaurus` enum value and factory entry
- New `YTsaurusConnectionInfo` Pydantic model: `proxy`, `clique`, `token` (with `YT_TOKEN` env fallback), `secure`, `port`, `query_path`, `settings`, `kwargs`
- New `wren.connector.ytsaurus` module (subclass of the Ibis ClickHouse connector with YT-specific auth header injection, clique-alias URL param patching, and a CHYT-friendly `query` / `dry_run` path that bypasses ibis' `CREATE VIEW`-based introspection)
- New pip extra: `wren-engine[ytsaurus]`
- sqlglot dialect map: `ytsaurus -> clickhouse`
- Optional MDL-level rewrite: when `data_source == ytsaurus`, replace `<schema>.<table>` references in the physical SQL with backticked YT paths sourced from each model's `properties.ytPath`
- Docs: `core/wren/docs/connectors/ytsaurus.md`, `README` install row, `docs/connections.md` JSON example
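The optional MDL-level rewrite in the list above could look roughly like the sketch below. It is a deliberately naive, regex-based illustration: the helper name `rewrite_yt_paths`, the mapping shape, and the example paths are assumptions, and a real implementation would more likely operate on the parsed SQL so that string literals, quoted identifiers, and `schema.table.column` references are handled correctly.

```python
import re


def rewrite_yt_paths(sql, yt_paths):
    """Replace schema.table references with backticked YT paths.

    yt_paths maps "schema.table" to that model's properties.ytPath value.
    """
    if not yt_paths:
        return sql
    # Longest names first so a short name cannot shadow a longer one.
    names = sorted(yt_paths, key=len, reverse=True)
    pattern = re.compile(
        r"(?<![\w.`])(" + "|".join(re.escape(n) for n in names) + r")(?![\w.`])"
    )
    # A function replacement avoids backslash handling in the path text.
    return pattern.sub(lambda m: f"`{yt_paths[m.group(1)]}`", sql)


# Hypothetical model-to-path mapping, as it might be collected from the MDL.
yt_paths = {
    "sales.orders": "//home/analytics/orders",
    "sales.orders_archive": "//home/analytics/orders_archive",
}
sql = "SELECT o.id FROM sales.orders AS o JOIN sales.orders_archive AS a ON o.id = a.id"
rewritten = rewrite_yt_paths(sql, yt_paths)
print(rewritten)
```

Alias-qualified columns such as `o.id` are left alone because they do not match any mapped `schema.table` name.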
Out of scope
- Native YQL (Query Tracker) — CHYT only; users who need raw YQL can fork the connector
- SPYT (Spark-on-YT)
Implementation
PR forthcoming, will reference this issue.