feat(wren): add YTsaurus (CHYT) connector #2257

@nar3k

Feature request

Add a first-class data source for YTsaurus, an open-source distributed storage and execution system, via its CHYT (ClickHouse-over-YT) clique.

Why

YTsaurus is the storage and compute backbone for petabyte-scale analytics at several organizations and is a common destination for Wren-style semantic SQL workloads. Today, users running on YT have no first-class Wren connector and have to either:

  • Bypass Wren entirely and query CHYT directly (losing MDL semantics, dry-run validation, the denied-functions policy, the memory module, and the text-to-SQL surface), or
  • Point the clickhouse connector at the YT HTTP proxy, which fails because CHYT mounts at a non-root URL path, routes by a URL parameter (chyt.clique_alias) rather than the ClickHouse database field, and accepts only Authorization: OAuth <token> (it rejects Basic and Bearer).

The CHYT clique exposes a ClickHouse-compatible HTTP protocol, so the bulk of Wren's existing ClickHouse / Ibis / sqlglot machinery applies unchanged; only the auth/routing shim is new.
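
To make the auth/routing difference concrete, here is a minimal sketch of the request shape described above, using only the standard library. It builds the request without sending it. The function name `build_chyt_request`, the host `yt.example.com`, the clique name `my_clique`, and the default `query_path="/chyt"` are illustrative assumptions, not part of the proposed connector.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_chyt_request(proxy, clique, token, sql, query_path="/chyt", secure=False):
    """Build (but do not send) an HTTP request aimed at a CHYT clique.

    Hypothetical helper; query_path is an assumed default and should be
    set to whatever non-root path the target cluster actually uses.
    """
    scheme = "https" if secure else "http"
    # The clique is selected by a URL parameter, not the ClickHouse database field.
    params = urlencode({"chyt.clique_alias": clique})
    url = f"{scheme}://{proxy}{query_path}?{params}"
    return Request(
        url,
        data=sql.encode("utf-8"),
        # CHYT accepts only the OAuth scheme; Basic and Bearer are rejected.
        headers={"Authorization": f"OAuth {token}"},
        method="POST",
    )


req = build_chyt_request("yt.example.com", "my_clique", "secret-token", "SELECT 1")
print(req.full_url)
print(req.get_header("Authorization"))
```

A stock ClickHouse client gets all three pieces wrong (root path, database field, Basic auth), which is why the shim is needed even though the wire protocol is otherwise compatible.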

Proposed scope

  • New DataSource.ytsaurus enum value and factory entry
  • New YTsaurusConnectionInfo Pydantic model: proxy, clique, token (with YT_TOKEN env fallback), secure, port, query_path, settings, kwargs
  • New wren.connector.ytsaurus module (subclass of the Ibis ClickHouse connector with YT-specific auth header injection, clique-alias URL param patching, and a CHYT-friendly query / dry_run path that bypasses Ibis's CREATE VIEW-based introspection)
  • New pip extra: wren-engine[ytsaurus]
  • sqlglot dialect map: ytsaurus -> clickhouse
  • Optional MDL-level rewrite: when data_source == ytsaurus, replace <schema>.<table> references in the physical SQL with backticked YT paths sourced from each model's properties.ytPath
  • Docs: core/wren/docs/connectors/ytsaurus.md, README install row, docs/connections.md JSON example
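
As a rough sketch of the connection-info fields listed above, the following uses a standard-library dataclass as a stand-in so it runs without dependencies; the proposal itself calls for a Pydantic model. The field names come from the scope list, while the types, the defaults, and the choice to raise when no token is available are assumptions.

```python
import os
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class YTsaurusConnectionInfo:
    """Dataclass stand-in for the proposed Pydantic model (types/defaults assumed)."""

    proxy: str
    clique: str
    token: Optional[str] = None
    secure: bool = True
    port: Optional[int] = None
    query_path: str = "/chyt"  # assumed default
    settings: Dict[str, Any] = field(default_factory=dict)
    kwargs: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self):
        # Fall back to the YT_TOKEN environment variable when no token is passed.
        if self.token is None:
            self.token = os.environ.get("YT_TOKEN")
        if not self.token:
            raise ValueError("no token given and YT_TOKEN is not set")


os.environ["YT_TOKEN"] = "token-from-env"  # demo value only
info = YTsaurusConnectionInfo(proxy="yt.example.com", clique="my_clique")
print(info.token)
```

An explicitly passed token takes precedence over the environment variable in this sketch.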

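The optional MDL-level rewrite in the scope list could look roughly like the sketch below. It is regex-based purely for illustration and would also touch matching text inside string literals; a real implementation would more likely rewrite the parsed SQL tree. The function name `rewrite_table_refs`, the mapping shape, and the example YT paths are hypothetical.

```python
import re


def rewrite_table_refs(sql, yt_paths):
    """Replace schema.table references with backticked YT paths.

    yt_paths maps "schema.table" to the value of that model's
    properties.ytPath (hypothetical mapping shape).
    """
    if not yt_paths:
        return sql
    # Longest names first, so the alternation prefers the most specific name.
    names = sorted(yt_paths, key=len, reverse=True)
    # The lookarounds stop matches inside longer identifiers, dotted column
    # references (schema.table.col), and already-backticked names.
    pattern = re.compile(
        r"(?<![\w.`])(" + "|".join(re.escape(n) for n in names) + r")(?![\w.`])"
    )
    return pattern.sub(lambda m: f"`{yt_paths[m.group(1)]}`", sql)


sql = "SELECT o.id FROM sales.orders AS o JOIN sales.customers AS c ON o.cid = c.id"
paths = {
    "sales.orders": "//home/dwh/orders",
    "sales.customers": "//home/dwh/customers",
}
print(rewrite_table_refs(sql, paths))
```
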
Out of scope

  • Native YQL (Query Tracker): this connector targets CHYT only; users who need raw YQL can fork the connector
  • SPYT (Spark-on-YT)

Implementation

PR forthcoming, will reference this issue.
