diff --git a/polaris-shell/README.md b/polaris-shell/README.md new file mode 100644 index 00000000..26279d2c --- /dev/null +++ b/polaris-shell/README.md @@ -0,0 +1,182 @@ +# Polaris Shell + +An interactive SQL shell for exploring [Apache Iceberg](https://iceberg.apache.org/) tables and catalog metadata through [Apache Polaris](https://polaris.apache.org/) via its REST catalog API. No Spark, no Flink, no heavyweight engine — just a single fat JAR and a properties file. + +Polaris Shell complements Polaris with a SQL interface for answering routine questions about your catalog — how many tables are in a namespace, how many snapshots a table has, where it is stored, whether it has too many small files — without spinning up Trino, Spark, or pyiceberg. + +> **SELECT queries read data directly via the Iceberg Java library and are intended for sampling and exploration, not production workloads.** + +> **Try it in minutes** — a fully self-contained Docker demo is included. See [demo/README.md](demo/README.md). + +--- + +## How it works + +Polaris Shell connects to a Polaris server using the **Iceberg REST catalog protocol** and OAuth2 client credentials. It parses SQL statements with an [ANTLR 4](https://www.antlr.org/) grammar, converts them to Iceberg API calls, and prints results to the terminal. No JDBC driver, no query engine — queries are executed directly through the Iceberg Java library against the catalog. + +``` +SQL input → ANTLR parser → QueryPlan → Iceberg REST catalog API → results +``` + +--- + +## Supported commands + +| Command | Example | +|---|---| +| `SELECT` with predicate, projection, ORDER BY, LIMIT | `SELECT id, amount FROM retail.orders WHERE region = 'us-east-1' ORDER BY amount DESC LIMIT 10` | +| `SHOW TABLES IN ` | `SHOW TABLES IN retail` | +| `DESCRIBE STATS ` | `DESCRIBE STATS retail.orders` | +| `SHOW TABLE LOCATION
` | `SHOW TABLE LOCATION retail.products` | +| `SHOW TABLE POLICIES
` | `SHOW TABLE POLICIES retail.orders` | +| `DIAGNOSE TABLE
` | `DIAGNOSE TABLE retail.orders` | +| `EXPLAIN SELECT ...` | `EXPLAIN SELECT * FROM retail.orders WHERE region = 'us-east-1'` | + +**`EXPLAIN`** shows the Iceberg scan plan: snapshot info, partition spec, manifest and data-file counts before and after filter pushdown, estimated bytes scanned, and any warnings (small files, missing column statistics). + +**`DIAGNOSE`** scans the table's data files and reports how many are below the 128 MiB target size — a quick check for compaction candidates. + +SQL keywords are case-insensitive. Predicates support `=`, `!=`, `<>`, `<`, `<=`, `>`, `>=`, `IS NULL`, `IS NOT NULL`, `IN (...)`, `NOT IN (...)`, `AND`, `OR`, and `NOT`. + +### Sample output + +``` +sql> SELECT id, region, amount FROM retail.orders WHERE region = 'us-east-1' LIMIT 3 +id=1, region=us-east-1, amount=312 +id=2, region=us-east-1, amount=87 +id=5, region=us-east-1, amount=204 +(3 rows) + +sql> SHOW TABLES IN retail + namespace: retail + tableCount: 3 + tables: [retail.orders, retail.products, retail.regions] + +sql> DIAGNOSE TABLE retail.orders + smallFileThresholdBytes: 134217728 + smallFileCount: 4 + +sql> EXPLAIN SELECT * FROM retail.orders WHERE region = 'us-east-1' +┌──────────────────────────────────────────────────────────────────────┐ +│ ICEBERG SCAN PLAN — retail.orders │ +├──────────────────────────────────────────────────────────────────────┤ +│ Snapshot ID │ 7326491023847162 │ +│ Snapshot timestamp (ms) │ 1715000000000 │ +│ Partition spec │ [region: identity] │ +│ Schema columns │ 5 │ +│ Projected columns │ 5 │ +├──────────────────────────────────────┬──────────────────────────────┤ +│ Total manifest files │ 3 │ +│ Manifests after pruning │ 1 │ +│ Data files total │ 10 │ +│ Data files after filter │ 2 (80.0% eliminated) │ +│ Estimated bytes scanned │ 1.2 MiB │ +│ Pushdown filter │ ref(name="region") == ... │ +└──────────────────────────────────────┴──────────────────────────────┘ +``` + +--- + +## Limitations + +- **Single-table reads only** — no `JOIN` +- **No aggregate functions** — `COUNT`, `SUM`, `AVG`, `MIN`, `MAX`, and `GROUP BY` are not supported +- **No DML** — `INSERT`, `UPDATE`, and `DELETE` are not supported +- **No DDL** — `CREATE TABLE`, `DROP TABLE`, and `ALTER TABLE` are not supported +- **`ORDER BY` is in-memory** — all rows matching the filter are fetched before sorting; use `LIMIT` to bound the result set +- **No subqueries or CTEs** + +--- + +## Quick start + +### Prerequisites +- Java 21+ +- A running Polaris server (or use the [demo](demo/README.md) — no server setup required) + +### 1. Build + +```bash +./gradlew generateGrammarSource shadowJar +``` + +This produces `build/libs/polaris-shell-demo.jar`. + +### 2. Configure + +Copy the example properties file and fill in your Polaris connection details: + +```bash +cp polaris-sql-demo.properties.example polaris-sql-demo.properties +``` + +```properties +# Required +polaris.uri=http://localhost:8181/api/catalog +polaris.warehouse=my-catalog +polaris.client.id=root +polaris.client.secret=s3cr3t + +# Optional +polaris.token.endpoint=http://localhost:8181/api/catalog/v1/oauth/tokens +cli.max-display-rows=100 + +# S3 / MinIO FileIO properties (pass-through to the Iceberg catalog) +# s3.endpoint=http://localhost:9000 +# s3.path-style-access=true +# io-impl=org.apache.iceberg.aws.s3.S3FileIO +``` + +Any property not prefixed with `polaris.` or `cli.` is passed directly to the Iceberg catalog (useful for S3 region, MinIO credentials, custom FileIO implementations, etc.). + +### 3. Run + +```bash +java -jar build/libs/polaris-shell-demo.jar polaris-sql-demo.properties +``` + +``` +Connecting to Polaris at http://localhost:8181/api/catalog ... +Authenticated. Type SQL statements or 'exit' to quit. + +sql> SHOW TABLES IN retail +sql> SELECT * FROM retail.orders WHERE amount > 100 LIMIT 5 +sql> EXPLAIN SELECT * FROM retail.orders WHERE region = 'us-east-1' +sql> exit +``` + +--- + +## Demo + +The [`demo/`](demo/README.md) directory contains a fully local environment using **Docker Compose + MinIO** — no AWS account or external Polaris server required. It spins up Polaris, MinIO, and seeds three sample Iceberg tables in under a minute. + +See **[demo/README.md](demo/README.md)** for step-by-step instructions. + +--- + +## Configuration reference + +| Property | Required | Default | Description | +|---|---|---|---| +| `polaris.uri` | Yes | — | Polaris REST catalog base URI | +| `polaris.warehouse` | Yes | — | Warehouse / catalog name | +| `polaris.client.id` | Yes | — | OAuth2 client ID | +| `polaris.client.secret` | Yes | — | OAuth2 client secret | +| `polaris.token.endpoint` | No | `{polaris.uri}/v1/oauth/tokens` | Token endpoint override | +| `cli.max-display-rows` | No | `100` | Row cap for SELECT output | +| *(any other key)* | No | — | Passed through to the Iceberg catalog | + +--- + +## Building from source + +```bash +# Generate ANTLR sources and build the fat JAR +./gradlew generateGrammarSource shadowJar + +# Run tests +./gradlew test +``` + +Requires Java 21. The Gradle wrapper is included — no local Gradle installation needed. diff --git a/polaris-shell/build.gradle.kts b/polaris-shell/build.gradle.kts new file mode 100644 index 00000000..b677b0eb --- /dev/null +++ b/polaris-shell/build.gradle.kts @@ -0,0 +1,146 @@ + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import com.github.jengelman.gradle.plugins.shadow.tasks.ShadowJar + +plugins { + id("java") + alias(libs.plugins.shadow) +} + +java { + toolchain { + languageVersion = JavaLanguageVersion.of(21) + } +} + +// Isolated configuration: ANTLR 4 tool jar, does not leak into compile/runtime +val antlrTool: Configuration by configurations.creating + +val antlrOutputDir = layout.buildDirectory.dir("generated/antlr/main") +val antlrPackageDir = layout.buildDirectory.dir("generated/antlr/main/org/apache/polaris/tools/grammar") +val grammarFile = file("src/main/antlr/IcebergSQL.g4") + +val generateGrammarSource by tasks.registering(JavaExec::class) { + description = "Generate Java sources from IcebergSQL.g4 using ANTLR 4" + group = "build" + + classpath = antlrTool + mainClass = "org.antlr.v4.Tool" + + doFirst { antlrPackageDir.get().asFile.mkdirs() } + + args = listOf( + "-visitor", + "-no-listener", + "-package", "org.apache.polaris.tools.grammar", + "-o", antlrPackageDir.get().asFile.absolutePath, // output into full package path + grammarFile.absolutePath + ) + + inputs.file(grammarFile) + outputs.dir(antlrOutputDir) // declare root as output for incremental build tracking +} + +sourceSets { + main { + java { srcDir(antlrOutputDir) } // root — Java compiler walks subdirs automatically + } +} + +tasks.named("compileJava") { + dependsOn(generateGrammarSource) +} + +dependencies { + antlrTool(libs.antlr4) // ANTLR 4 tool — code-gen only, not shipped + implementation(libs.antlr4.engine.runtime) // ANTLR 4 runtime — shipped in our jar + + implementation(platform(libs.iceberg.bom)) + implementation("org.apache.iceberg:iceberg-api") + implementation("org.apache.iceberg:iceberg-core") + implementation("org.apache.iceberg:iceberg-data") + implementation("org.apache.iceberg:iceberg-parquet") + implementation("org.apache.iceberg:iceberg-aws") + implementation("org.apache.parquet:parquet-column:1.16.0") + // iceberg-aws declares ALL AWS SDK deps as compileOnly — none appear in its + // published metadata, so every module it references must be added explicitly + // here to be bundled in the shadow jar. + runtimeOnly(libs.awssdk.s3) + runtimeOnly(libs.awssdk.sts) + runtimeOnly(libs.awssdk.kms) + runtimeOnly(libs.awssdk.dynamodb) + runtimeOnly(libs.awssdk.glue) + runtimeOnly(libs.awssdk.lakeformation) + runtimeOnly(libs.awssdk.url.connection.client) + + implementation(libs.guava) + implementation(libs.slf4j.api) + runtimeOnly("org.slf4j:slf4j-simple:${libs.versions.slf4j.get()}") + + implementation(libs.hadoop.common) + implementation(libs.hadoop.client.runtime) + + // ── Test dependencies ────────────────────────────────────────────────────── + testImplementation(platform(libs.junit.bom)) + testImplementation("org.junit.jupiter:junit-jupiter-api") + testRuntimeOnly("org.junit.jupiter:junit-jupiter-engine") + testRuntimeOnly("org.junit.platform:junit-platform-launcher") + + testImplementation(libs.assertj) + testImplementation(libs.mockito.junit.jupiter) + + testImplementation(platform(libs.testcontainers.bom)) + testImplementation("org.testcontainers:testcontainers") + testImplementation("org.testcontainers:testcontainers-junit-jupiter") + // Provides org.apache.polaris.test.minio.MinioContainer + testImplementation("org.apache.polaris:polaris-minio-testcontainer:1.4.1") +} + +tasks.named("test") { + useJUnitPlatform() + // Integration tests require Docker + running Polaris/MinIO containers; + // exclude them from the default test task so normal builds succeed. + exclude("**/*IntegrationTest*") +} + +tasks.register("integrationTest") { + description = "Runs integration tests that require Docker (Polaris + MinIO)." + group = "verification" + useJUnitPlatform() + testClassesDirs = sourceSets["test"].output.classesDirs + classpath = sourceSets["test"].runtimeClasspath + include("**/*IntegrationTest*") +} + +// ── Demo fat jar ────────────────────────────────────────────────────────────── +tasks.named("shadowJar") { + archiveClassifier.set("demo") + mergeServiceFiles() + isZip64 = true + manifest { + attributes("Main-Class" to "org.apache.polaris.tools.cli.PolarisShell") + } + // Exclude SLF4J 1.7.x bindings that leak in from Hadoop transitive deps; + // we ship slf4j-simple 2.x as the provider instead. + exclude("org/slf4j/impl/StaticLoggerBinder.class") + exclude("org/slf4j/impl/StaticMDCBinder.class") + exclude("org/slf4j/impl/StaticMarkerBinder.class") +} \ No newline at end of file diff --git a/polaris-shell/demo/README.md b/polaris-shell/demo/README.md new file mode 100644 index 00000000..fa3a0661 --- /dev/null +++ b/polaris-shell/demo/README.md @@ -0,0 +1,60 @@ +# Polaris Shell — Local Demo + +No AWS account or external Polaris server required. Everything runs locally via Docker. + +## Prerequisites +- Docker + Docker Compose +- Java 21+ + +## Steps + +0. **Build the fat jar** (from the `polaris-shell` directory) + ```bash + cd polaris-shell + ./gradlew generateGrammarSource shadowJar + ``` + Then change into the demo directory: + ```bash + cd demo + ``` + +1. **Start the environment** + ```bash + docker compose up -d + ``` + +2. **Seed demo data** (run once) + ```bash + ./seed.sh + ``` + This creates three Iceberg tables in MinIO: + - `retail.orders` — 200 rows, partitioned by `region` + - `retail.products` — 50 rows, unpartitioned + - `retail.regions` — 3 rows, reference data + +3. **Launch the SQL shell** + ```bash + java -jar ../build/libs/polaris-shell-demo.jar demo.properties + ``` + +## Example queries +```sql +SHOW TABLES IN retail + +SELECT * FROM retail.orders WHERE region = 'us-east-1' LIMIT 10 + +SELECT * FROM retail.orders WHERE amount > 200 ORDER BY amount DESC LIMIT 5 + +DESCRIBE STATS retail.orders + +DIAGNOSE TABLE retail.orders + +EXPLAIN SELECT * FROM retail.orders WHERE region = 'us-east-1' + +SHOW TABLE LOCATION retail.products +``` + +## Tear down +```bash +docker compose down -v +``` diff --git a/polaris-shell/demo/demo.properties b/polaris-shell/demo/demo.properties new file mode 100644 index 00000000..eeaab2c8 --- /dev/null +++ b/polaris-shell/demo/demo.properties @@ -0,0 +1,20 @@ +# Polaris SQL Engine — local demo properties +# Start the environment first: docker compose up -d +# Then seed data: ./seed.sh +# Then run the shell: java -jar ../build/libs/polaris-sql-engine-*-demo.jar demo.properties + +polaris.uri=http://localhost:8181/api/catalog +polaris.warehouse=demo +polaris.client.id=demo-client +polaris.client.secret=demo-secret + +# MinIO S3 settings — routes from the host JVM to the MinIO container +io-impl=org.apache.iceberg.aws.s3.S3FileIO +s3.endpoint=http://localhost:9000 +# Endpoint Polaris uses to reach MinIO inside Docker (Docker-internal hostname) +s3.internal-endpoint=http://minio:9000 +s3.path-style-access=true +s3.access-key-id=demoaccesskey +s3.secret-access-key=demosecretkey + +cli.max-display-rows=50 diff --git a/polaris-shell/demo/docker-compose.yml b/polaris-shell/demo/docker-compose.yml new file mode 100644 index 00000000..4ce1108e --- /dev/null +++ b/polaris-shell/demo/docker-compose.yml @@ -0,0 +1,60 @@ +# Demo environment for the Polaris SQL Engine +# Usage: docker compose up -d && ./seed.sh && java -jar ../build/libs/polaris-shell-*-demo.jar demo.properties +services: + + minio: + image: minio/minio:latest + container_name: polaris-demo-minio + command: server /data --console-address ":9001" + environment: + MINIO_ROOT_USER: demoaccesskey + MINIO_ROOT_PASSWORD: demosecretkey + ports: + - "9000:9000" # S3 API + - "9001:9001" # MinIO console (http://localhost:9001) + volumes: + - minio-data:/data + healthcheck: + test: ["CMD", "mc", "ready", "local"] + interval: 5s + retries: 10 + + minio-init: + image: minio/mc:latest + container_name: polaris-demo-minio-init + depends_on: + minio: + condition: service_healthy + entrypoint: > + /bin/sh -c " + mc alias set local http://minio:9000 demoaccesskey demosecretkey && + mc mb --ignore-existing local/demo-bucket + " + + polaris: + image: apache/polaris:latest + container_name: polaris-demo + depends_on: + minio-init: + condition: service_completed_successfully + ports: + - "8181:8181" + - "8182:8182" + environment: + POLARIS_BOOTSTRAP_CREDENTIALS: "POLARIS,demo-client,demo-secret" + quarkus.otel.sdk.disabled: "true" + AWS_REGION: us-east-1 + AWS_ACCESS_KEY_ID: demoaccesskey + AWS_SECRET_ACCESS_KEY: demosecretkey + polaris.features."ALLOW_INSECURE_STORAGE_TYPES": "true" + polaris.features."SUPPORTED_CATALOG_STORAGE_TYPES": "[\"S3\"]" + # false → Polaris uses its own credentials and applies endpointInternal from storageConfigInfo + polaris.features."SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION": "false" + polaris.readiness.ignore-severe-issues: "true" + healthcheck: + test: ["CMD-SHELL", "curl -sf http://localhost:8182/q/health || exit 1"] + interval: 10s + retries: 18 + +volumes: + minio-data: diff --git a/polaris-shell/demo/seed.sh b/polaris-shell/demo/seed.sh new file mode 100755 index 00000000..ef684ad9 --- /dev/null +++ b/polaris-shell/demo/seed.sh @@ -0,0 +1,26 @@ +#!/usr/bin/env bash +# Seeds the local Polaris + MinIO demo environment with Iceberg tables and data. +# Run this once after: docker compose up -d +set -euo pipefail + +echo "STARTING SEED" +SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" + +JAR="$(ls "$SCRIPT_DIR"/../build/libs/polaris-shell-demo.jar | head -1)" + +echo "JAR is $JAR" + +if [[ -z "$JAR" ]]; then + echo "Demo jar not found. Build it first with:" + echo " ./gradlew generateGrammarSource shadowJar" + exit 1 +fi + +echo "Waiting for Polaris to be ready..." +for i in $(seq 1 30); do + curl -sf http://localhost:8182/q/health > /dev/null && break + echo " attempt $i/30..." + sleep 5 +done + +java -cp "$JAR" org.apache.polaris.tools.cli.DemoDataSeeder "$SCRIPT_DIR/demo.properties" diff --git a/polaris-shell/gradle/libs.versions.toml b/polaris-shell/gradle/libs.versions.toml new file mode 100644 index 00000000..6e0d2bf7 --- /dev/null +++ b/polaris-shell/gradle/libs.versions.toml @@ -0,0 +1,54 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +# + +[versions] +antlr4 = "4.13.2" +assertj = "3.27.3" +awssdk = "2.42.13" +guava = "33.4.0-jre" +hadoop = "3.4.1" +iceberg = "1.10.0" +junit = "5.12.0" +mockito = "5.17.0" +shadow = "9.3.2" +slf4j = "2.0.17" +testcontainers = "2.0.5" + +[libraries] +antlr4 = { module = "org.antlr:antlr4", version.ref = "antlr4" } +antlr4-engine-runtime = { module = "org.antlr:antlr4-runtime", version.ref = "antlr4" } +assertj = { module = "org.assertj:assertj-core", version.ref = "assertj" } +awssdk-dynamodb = { module = "software.amazon.awssdk:dynamodb", version.ref = "awssdk" } +awssdk-glue = { module = "software.amazon.awssdk:glue", version.ref = "awssdk" } +awssdk-kms = { module = "software.amazon.awssdk:kms", version.ref = "awssdk" } +awssdk-lakeformation = { module = "software.amazon.awssdk:lakeformation", version.ref = "awssdk" } +awssdk-s3 = { module = "software.amazon.awssdk:s3", version.ref = "awssdk" } +awssdk-sts = { module = "software.amazon.awssdk:sts", version.ref = "awssdk" } +awssdk-url-connection-client = { module = "software.amazon.awssdk:url-connection-client", version.ref = "awssdk" } +guava = { module = "com.google.guava:guava", version.ref = "guava" } +hadoop-common = { module = "org.apache.hadoop:hadoop-common", version.ref = "hadoop" } +hadoop-client-runtime = { module = "org.apache.hadoop:hadoop-client-runtime", version.ref = "hadoop" } +iceberg-bom = { module = "org.apache.iceberg:iceberg-bom", version.ref = "iceberg" } +junit-bom = { module = "org.junit:junit-bom", version.ref = "junit" } +mockito-junit-jupiter = { module = "org.mockito:mockito-junit-jupiter", version.ref = "mockito" } +slf4j-api = { module = "org.slf4j:slf4j-api", version.ref = "slf4j" } +testcontainers-bom = { module = "org.testcontainers:testcontainers-bom", version.ref = "testcontainers" } + +[plugins] +shadow = { id = "com.gradleup.shadow", version.ref = "shadow" } diff --git a/polaris-shell/gradle/wrapper/gradle-wrapper.properties b/polaris-shell/gradle/wrapper/gradle-wrapper.properties new file mode 100644 index 00000000..19a6bdeb --- /dev/null +++ b/polaris-shell/gradle/wrapper/gradle-wrapper.properties @@ -0,0 +1,7 @@ +distributionBase=GRADLE_USER_HOME +distributionPath=wrapper/dists +distributionUrl=https\://services.gradle.org/distributions/gradle-9.3.0-bin.zip +networkTimeout=10000 +validateDistributionUrl=true +zipStoreBase=GRADLE_USER_HOME +zipStorePath=wrapper/dists diff --git a/polaris-shell/gradlew b/polaris-shell/gradlew new file mode 100755 index 00000000..adff685a --- /dev/null +++ b/polaris-shell/gradlew @@ -0,0 +1,248 @@ +#!/bin/sh + +# +# Copyright © 2015 the original authors. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# https://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# SPDX-License-Identifier: Apache-2.0 +# + +############################################################################## +# +# Gradle start up script for POSIX generated by Gradle. +# +# Important for running: +# +# (1) You need a POSIX-compliant shell to run this script. If your /bin/sh is +# noncompliant, but you have some other compliant shell such as ksh or +# bash, then to run this script, type that shell name before the whole +# command line, like: +# +# ksh Gradle +# +# Busybox and similar reduced shells will NOT work, because this script +# requires all of these POSIX shell features: +# * functions; +# * expansions «$var», «${var}», «${var:-default}», «${var+SET}», +# «${var#prefix}», «${var%suffix}», and «$( cmd )»; +# * compound commands having a testable exit status, especially «case»; +# * various built-in commands including «command», «set», and «ulimit». +# +# Important for patching: +# +# (2) This script targets any POSIX shell, so it avoids extensions provided +# by Bash, Ksh, etc; in particular arrays are avoided. +# +# The "traditional" practice of packing multiple parameters into a +# space-separated string is a well documented source of bugs and security +# problems, so this is (mostly) avoided, by progressively accumulating +# options in "$@", and eventually passing that to Java. +# +# Where the inherited environment variables (DEFAULT_JVM_OPTS, JAVA_OPTS, +# and GRADLE_OPTS) rely on word-splitting, this is performed explicitly; +# see the in-line comments for details. +# +# There are tweaks for specific operating systems such as AIX, CygWin, +# Darwin, MinGW, and NonStop. +# +# (3) This script is generated from the Groovy template +# https://github.com/gradle/gradle/blob/HEAD/platforms/jvm/plugins-application/src/main/resources/org/gradle/api/internal/plugins/unixStartScript.txt +# within the Gradle project. +# +# You can find Gradle at https://github.com/gradle/gradle/. +# +############################################################################## + +# Attempt to set APP_HOME + +# Resolve links: $0 may be a link +app_path=$0 + +# Need this for daisy-chained symlinks. +while + APP_HOME=${app_path%"${app_path##*/}"} # leaves a trailing /; empty if no leading path + [ -h "$app_path" ] +do + ls=$( ls -ld "$app_path" ) + link=${ls#*' -> '} + case $link in #( + /*) app_path=$link ;; #( + *) app_path=$APP_HOME$link ;; + esac +done + +# This is normally unused +# shellcheck disable=SC2034 +APP_BASE_NAME=${0##*/} +# Discard cd standard output in case $CDPATH is set (https://github.com/gradle/gradle/issues/25036) +APP_HOME=$( cd -P "${APP_HOME:-./}" > /dev/null && printf '%s\n' "$PWD" ) || exit + +# Use the maximum available, or set MAX_FD != -1 to use that value. +MAX_FD=maximum + +warn () { + echo "$*" +} >&2 + +die () { + echo + echo "$*" + echo + exit 1 +} >&2 + +# OS specific support (must be 'true' or 'false'). +cygwin=false +msys=false +darwin=false +nonstop=false +case "$( uname )" in #( + CYGWIN* ) cygwin=true ;; #( + Darwin* ) darwin=true ;; #( + MSYS* | MINGW* ) msys=true ;; #( + NONSTOP* ) nonstop=true ;; +esac + + + +# Determine the Java command to use to start the JVM. +if [ -n "$JAVA_HOME" ] ; then + if [ -x "$JAVA_HOME/jre/sh/java" ] ; then + # IBM's JDK on AIX uses strange locations for the executables + JAVACMD=$JAVA_HOME/jre/sh/java + else + JAVACMD=$JAVA_HOME/bin/java + fi + if [ ! -x "$JAVACMD" ] ; then + die "ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME + +Please set the JAVA_HOME variable in your environment to match the +location of your Java installation." + fi +else + JAVACMD=java + if ! command -v java >/dev/null 2>&1 + then + die "ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH. + +Please set the JAVA_HOME variable in your environment to match the +location of your Java installation." + fi +fi + +# Increase the maximum file descriptors if we can. +if ! "$cygwin" && ! "$darwin" && ! "$nonstop" ; then + case $MAX_FD in #( + max*) + # In POSIX sh, ulimit -H is undefined. That's why the result is checked to see if it worked. + # shellcheck disable=SC2039,SC3045 + MAX_FD=$( ulimit -H -n ) || + warn "Could not query maximum file descriptor limit" + esac + case $MAX_FD in #( + '' | soft) :;; #( + *) + # In POSIX sh, ulimit -n is undefined. That's why the result is checked to see if it worked. + # shellcheck disable=SC2039,SC3045 + ulimit -n "$MAX_FD" || + warn "Could not set maximum file descriptor limit to $MAX_FD" + esac +fi + +# Collect all arguments for the java command, stacking in reverse order: +# * args from the command line +# * the main class name +# * -classpath +# * -D...appname settings +# * --module-path (only if needed) +# * DEFAULT_JVM_OPTS, JAVA_OPTS, and GRADLE_OPTS environment variables. + +# For Cygwin or MSYS, switch paths to Windows format before running java +if "$cygwin" || "$msys" ; then + APP_HOME=$( cygpath --path --mixed "$APP_HOME" ) + + JAVACMD=$( cygpath --unix "$JAVACMD" ) + + # Now convert the arguments - kludge to limit ourselves to /bin/sh + for arg do + if + case $arg in #( + -*) false ;; # don't mess with options #( + /?*) t=${arg#/} t=/${t%%/*} # looks like a POSIX filepath + [ -e "$t" ] ;; #( + *) false ;; + esac + then + arg=$( cygpath --path --ignore --mixed "$arg" ) + fi + # Roll the args list around exactly as many times as the number of + # args, so each arg winds up back in the position where it started, but + # possibly modified. + # + # NB: a `for` loop captures its iteration list before it begins, so + # changing the positional parameters here affects neither the number of + # iterations, nor the values presented in `arg`. + shift # remove old arg + set -- "$@" "$arg" # push replacement arg + done +fi + + +# Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script. +DEFAULT_JVM_OPTS='"-Xmx64m" "-Xms64m"' + +# Collect all arguments for the java command: +# * DEFAULT_JVM_OPTS, JAVA_OPTS, and optsEnvironmentVar are not allowed to contain shell fragments, +# and any embedded shellness will be escaped. +# * For example: A user cannot expect ${Hostname} to be expanded, as it is an environment variable and will be +# treated as '${Hostname}' itself on the command line. + +set -- \ + "-Dorg.gradle.appname=$APP_BASE_NAME" \ + -jar "$APP_HOME/gradle/wrapper/gradle-wrapper.jar" \ + "$@" + +# Stop when "xargs" is not available. +if ! command -v xargs >/dev/null 2>&1 +then + die "xargs is not available" +fi + +# Use "xargs" to parse quoted args. +# +# With -n1 it outputs one arg per line, with the quotes and backslashes removed. +# +# In Bash we could simply go: +# +# readarray ARGS < <( xargs -n1 <<<"$var" ) && +# set -- "${ARGS[@]}" "$@" +# +# but POSIX shell has neither arrays nor command substitution, so instead we +# post-process each arg (as a line of input to sed) to backslash-escape any +# character that might be a shell metacharacter, then use eval to reverse +# that process (while maintaining the separation between arguments), and wrap +# the whole thing up as a single "set" statement. +# +# This will of course break if any of these variables contains a newline or +# an unmatched quote. +# + +eval "set -- $( + printf '%s\n' "$DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS" | + xargs -n1 | + sed ' s~[^-[:alnum:]+,./:=@_]~\\&~g; ' | + tr '\n' ' ' + )" '"$@"' + +exec "$JAVACMD" "$@" diff --git a/polaris-shell/gradlew.bat b/polaris-shell/gradlew.bat new file mode 100644 index 00000000..c4bdd3ab --- /dev/null +++ b/polaris-shell/gradlew.bat @@ -0,0 +1,93 @@ +@rem +@rem Copyright 2015 the original author or authors. +@rem +@rem Licensed under the Apache License, Version 2.0 (the "License"); +@rem you may not use this file except in compliance with the License. +@rem You may obtain a copy of the License at +@rem +@rem https://www.apache.org/licenses/LICENSE-2.0 +@rem +@rem Unless required by applicable law or agreed to in writing, software +@rem distributed under the License is distributed on an "AS IS" BASIS, +@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +@rem See the License for the specific language governing permissions and +@rem limitations under the License. +@rem +@rem SPDX-License-Identifier: Apache-2.0 +@rem + +@if "%DEBUG%"=="" @echo off +@rem ########################################################################## +@rem +@rem Gradle startup script for Windows +@rem +@rem ########################################################################## + +@rem Set local scope for the variables with windows NT shell +if "%OS%"=="Windows_NT" setlocal + +set DIRNAME=%~dp0 +if "%DIRNAME%"=="" set DIRNAME=. +@rem This is normally unused +set APP_BASE_NAME=%~n0 +set APP_HOME=%DIRNAME% + +@rem Resolve any "." and ".." in APP_HOME to make it shorter. +for %%i in ("%APP_HOME%") do set APP_HOME=%%~fi + +@rem Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script. +set DEFAULT_JVM_OPTS="-Xmx64m" "-Xms64m" + +@rem Find java.exe +if defined JAVA_HOME goto findJavaFromJavaHome + +set JAVA_EXE=java.exe +%JAVA_EXE% -version >NUL 2>&1 +if %ERRORLEVEL% equ 0 goto execute + +echo. 1>&2 +echo ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH. 1>&2 +echo. 1>&2 +echo Please set the JAVA_HOME variable in your environment to match the 1>&2 +echo location of your Java installation. 1>&2 + +goto fail + +:findJavaFromJavaHome +set JAVA_HOME=%JAVA_HOME:"=% +set JAVA_EXE=%JAVA_HOME%/bin/java.exe + +if exist "%JAVA_EXE%" goto execute + +echo. 1>&2 +echo ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME% 1>&2 +echo. 1>&2 +echo Please set the JAVA_HOME variable in your environment to match the 1>&2 +echo location of your Java installation. 1>&2 + +goto fail + +:execute +@rem Setup the command line + + + +@rem Execute Gradle +"%JAVA_EXE%" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% "-Dorg.gradle.appname=%APP_BASE_NAME%" -jar "%APP_HOME%\gradle\wrapper\gradle-wrapper.jar" %* + +:end +@rem End local scope for the variables with windows NT shell +if %ERRORLEVEL% equ 0 goto mainEnd + +:fail +rem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of +rem the _cmd.exe /c_ return code! +set EXIT_CODE=%ERRORLEVEL% +if %EXIT_CODE% equ 0 set EXIT_CODE=1 +if not ""=="%GRADLE_EXIT_CONSOLE%" exit %EXIT_CODE% +exit /b %EXIT_CODE% + +:mainEnd +if "%OS%"=="Windows_NT" endlocal + +:omega diff --git a/polaris-shell/polaris-sql-demo.properties.example b/polaris-shell/polaris-sql-demo.properties.example new file mode 100644 index 00000000..68813a09 --- /dev/null +++ b/polaris-shell/polaris-sql-demo.properties.example @@ -0,0 +1,29 @@ +# ── Polaris connection ──────────────────────────────────────────────────────── +polaris.uri=http://localhost:8181/api/catalog +polaris.warehouse=my-catalog + +# OAuth2 client credentials (used to obtain a bearer token at startup) +polaris.client.id=root +polaris.client.secret=s3cr3t + +# Token endpoint — defaults to {polaris.uri}/v1/oauth/tokens if omitted +# polaris.token.endpoint=http://localhost:8181/api/catalog/v1/oauth/tokens + +# ── Display options ─────────────────────────────────────────────────────────── +# Maximum rows to print for SELECT results (safety cap) +cli.max-display-rows=100 + +# ── S3 / FileIO properties ──────────────────────────────────────────────────── +# Any key not prefixed with polaris. or cli. is passed directly to the catalog. + +# For real AWS S3: +# s3.region=us-east-1 + +# For MinIO or other S3-compatible storage: +# s3.endpoint=http://localhost:9000 +# s3.path-style-access=true +# s3.access-key-id=minioadmin +# s3.secret-access-key=minioadmin + +# FileIO implementation (omit for real S3, set for MinIO/RustFS/etc.) +# io-impl=org.apache.iceberg.aws.s3.S3FileIO diff --git a/polaris-shell/settings.gradle.kts b/polaris-shell/settings.gradle.kts new file mode 100644 index 00000000..2808fa9e --- /dev/null +++ b/polaris-shell/settings.gradle.kts @@ -0,0 +1,33 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +pluginManagement { + repositories { + mavenCentral() + gradlePluginPortal() + } +} + +dependencyResolutionManagement { + repositories { + mavenCentral() + } +} + +rootProject.name = "polaris-shell" diff --git a/polaris-shell/src/main/antlr/IcebergSQL.g4 b/polaris-shell/src/main/antlr/IcebergSQL.g4 new file mode 100644 index 00000000..1bc302f5 --- /dev/null +++ b/polaris-shell/src/main/antlr/IcebergSQL.g4 @@ -0,0 +1,224 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +grammar IcebergSQL; + +// ─── Entry Point ───────────────────────────────────────────────────────────── + +query + : selectQuery # selectStmt + | showTablesQuery # showTablesStmt + | describeStatsQuery # describeStatsStmt + | showLocationQuery # showLocationStmt + | showPoliciesQuery # showPoliciesStmt + | diagnoseQuery # diagnoseStmt + | explainQuery # explainStmt + ; + +selectQuery + : SELECT columnList + FROM tableRef + (WHERE predicate)? + (ORDER BY orderByList)? + (LIMIT INTEGER_LITERAL)? + EOF + ; + +showTablesQuery + : SHOW TABLES IN namespaceRef EOF + ; + +describeStatsQuery + : DESCRIBE STATS tableRef EOF + ; + +showLocationQuery + : SHOW TABLE LOCATION tableRef EOF + ; + +showPoliciesQuery + : SHOW TABLE POLICIES tableRef EOF + ; + +diagnoseQuery + : DIAGNOSE TABLE tableRef EOF + ; + +explainQuery + : EXPLAIN selectQuery + ; + +// ─── ORDER BY ──────────────────────────────────────────────────────────────── + +orderByList + : orderByItem (COMMA orderByItem)* + ; + +orderByItem + : identifier (ASC | DESC)? + ; + +// ─── Namespace / Table References ──────────────────────────────────────────── + +namespaceRef + : identifier (DOT identifier)* + ; + +// ─── Column Projection ─────────────────────────────────────────────────────── + +columnList + : STAR # allColumns + | column (COMMA column)* # namedColumns + ; + +column + : identifier (DOT identifier)? # simpleColumn // e.g. region or t.region + ; + +// ─── Table Reference ───────────────────────────────────────────────────────── + +// Supports unqualified (events), single-namespace (logs.events), +// or multi-level namespace (prod.logs.events) +tableRef + : identifier (DOT identifier)* + ; + +// ─── Identifier (bare or backtick-quoted) ──────────────────────────────────── + +identifier + : ID + | QUOTED_ID + ; + +// ─── Predicates ────────────────────────────────────────────────────────────── + +predicate + : LPAREN predicate RPAREN # parenPred + | NOT predicate # notPred + | predicate AND predicate # andPred + | predicate OR predicate # orPred + | expression IS NOT? NULL # isNullPred + | expression IN + LPAREN literal (COMMA literal)* RPAREN # inPred + | expression NOT IN + LPAREN literal (COMMA literal)* RPAREN # notInPred + | expression op expression # comparisonPred + ; + +// ─── Expressions & Literals ────────────────────────────────────────────────── + +// An expression in a comparison is always a column reference or a literal +expression + : identifier # columnRef + | literal # literalExpr + ; + +literal + : INTEGER_LITERAL # intLiteral + | FLOAT_LITERAL # floatLiteral + | STRING_LITERAL # stringLiteral + | TRUE # trueLiteral + | FALSE # falseLiteral + ; + +// ─── Comparison Operators ──────────────────────────────────────────────────── + +op : EQ | NEQ | LT | LTE | GT | GTE ; + +// ─── Keywords (case-insensitive via fragment) ───────────────────────────────── + +SELECT : S E L E C T ; +FROM : F R O M ; +WHERE : W H E R E ; +LIMIT : L I M I T ; +ORDER : O R D E R ; +BY : B Y ; +ASC : A S C ; +DESC : D E S C ; +AND : A N D ; +OR : O R ; +NOT : N O T ; +IN : I N ; +IS : I S ; +NULL : N U L L ; +TRUE : T R U E ; +FALSE : F A L S E ; +SHOW : S H O W ; +TABLES : T A B L E S ; +TABLE : T A B L E ; +DESCRIBE : D E S C R I B E ; +STATS : S T A T S ; +LOCATION : L O C A T I O N ; +POLICIES : P O L I C I E S ; +DIAGNOSE : D I A G N O S E ; +EXPLAIN : E X P L A I N ; + +// ─── Operators & Punctuation ───────────────────────────────────────────────── + +EQ : '=' ; +NEQ : '!=' | '<>' ; +LT : '<' ; +LTE : '<=' ; +GT : '>' ; +GTE : '>=' ; +STAR : '*' ; +COMMA : ',' ; +DOT : '.' ; +LPAREN : '(' ; +RPAREN : ')' ; + +// ─── Identifiers & Literals ────────────────────────────────────────────────── + +ID + : [a-zA-Z_] [a-zA-Z_0-9]* + ; + +QUOTED_ID + : '`' ( ~'`' )+ '`' + ; + +INTEGER_LITERAL + : '-'? [0-9]+ + ; + +FLOAT_LITERAL + : '-'? [0-9]+ '.' [0-9]* + | '-'? '.' [0-9]+ + ; + +// Single-quoted strings: 'us-east-1' ('' = escaped single quote inside) +STRING_LITERAL + : '\'' ( ~'\'' | '\'\'' )* '\'' + ; + +// ─── Whitespace & Comments ─────────────────────────────────────────────────── + +WS : [ \t\r\n]+ -> skip ; +LINE_COMMENT : '--' ~[\r\n]* -> skip ; +BLOCK_COMMENT: '/*' .*? '*/' -> skip ; + +// ─── Case-insensitive letter fragments ─────────────────────────────────────── + +fragment A : [aA]; fragment B : [bB]; fragment C : [cC]; fragment D : [dD]; +fragment E : [eE]; fragment F : [fF]; fragment G : [gG]; fragment H : [hH]; +fragment I : [iI]; fragment J : [jJ]; fragment K : [kK]; fragment L : [lL]; +fragment M : [mM]; fragment N : [nN]; fragment O : [oO]; fragment P : [pP]; +fragment Q : [qQ]; fragment R : [rR]; fragment S : [sS]; fragment T : [tT]; +fragment U : [uU]; fragment V : [vV]; fragment W : [wW]; fragment X : [xX]; +fragment Y : [yY]; fragment Z : [zZ]; diff --git a/polaris-shell/src/main/java/org/apache/polaris/tools/cli/DemoDataSeeder.java b/polaris-shell/src/main/java/org/apache/polaris/tools/cli/DemoDataSeeder.java new file mode 100644 index 00000000..888f3e3b --- /dev/null +++ b/polaris-shell/src/main/java/org/apache/polaris/tools/cli/DemoDataSeeder.java @@ -0,0 +1,343 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.polaris.tools.cli; + +import org.apache.iceberg.DataFile; +import org.apache.iceberg.DataFiles; +import org.apache.iceberg.FileFormat; +import org.apache.iceberg.PartitionKey; +import org.apache.iceberg.PartitionSpec; +import org.apache.iceberg.Schema; +import org.apache.iceberg.Table; +import org.apache.iceberg.aws.s3.S3FileIO; +import org.apache.iceberg.catalog.Namespace; +import org.apache.iceberg.catalog.TableIdentifier; +import org.apache.iceberg.data.GenericRecord; +import org.apache.iceberg.data.Record; +import org.apache.iceberg.data.parquet.GenericParquetWriter; +import org.apache.iceberg.io.FileAppender; +import org.apache.iceberg.io.OutputFile; +import org.apache.iceberg.parquet.Parquet; +import org.apache.iceberg.rest.RESTCatalog; +import org.apache.iceberg.types.Types; + +import java.io.FileInputStream; +import java.net.URI; +import java.net.URLEncoder; +import java.net.http.HttpClient; +import java.net.http.HttpRequest; +import java.net.http.HttpResponse; +import java.nio.charset.StandardCharsets; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.Properties; +import java.util.Random; +import java.util.UUID; +import java.util.regex.Matcher; +import java.util.regex.Pattern; + +/** + * Seeds a local Polaris + MinIO demo environment with realistic Iceberg tables and data. + * + *

Usage: java -cp polaris-sql-engine-*-demo.jar \ + * org.apache.polaris.tools.cli.DemoDataSeeder [path/to/demo.properties] + * + *

Idempotent: re-running skips already-existing catalogs, namespaces, and tables. + */ +public class DemoDataSeeder { + + private static final Pattern ACCESS_TOKEN_PATTERN = + Pattern.compile("\"access_token\"\\s*:\\s*\"([^\"]+)\""); + + static final String CATALOG = "demo"; + static final String NAMESPACE = "retail"; + static final String BUCKET = "demo-bucket"; + + // Internal MinIO endpoint used in catalog storage config (Polaris → MinIO via Docker network). + // For a host-only demo both point to localhost, which is fine because + // SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION is enabled. + static final String MINIO_ENDPOINT = "http://localhost:9000"; + + public static void main(String[] args) throws Exception { + String propsPath = args.length > 0 ? args[0] : "demo.properties"; + Properties props = new Properties(); + try (var in = new FileInputStream(propsPath)) { props.load(in); } + + String polarisUri = props.getProperty("polaris.uri"); + String clientId = props.getProperty("polaris.client.id"); + String clientSecret = props.getProperty("polaris.client.secret"); + String s3Endpoint = props.getProperty("s3.endpoint", MINIO_ENDPOINT); + // Polaris runs inside Docker, so it must reach MinIO via the Docker-internal hostname. + // Falls back to s3.endpoint if not set (e.g. non-Docker deployments). + String s3InternalEndpoint = props.getProperty("s3.internal-endpoint", s3Endpoint); + String accessKey = props.getProperty("s3.access-key-id"); + String secretKey = props.getProperty("s3.secret-access-key"); + + System.out.println("Connecting to Polaris at " + polarisUri + " ..."); + String token = obtainToken(polarisUri, clientId, clientSecret); + System.out.println("Authenticated."); + + createCatalogIfAbsent(polarisUri, token, accessKey, secretKey, s3Endpoint, s3InternalEndpoint); + + // Build RESTCatalog + Map catalogProps = new HashMap<>(); + catalogProps.put("uri", polarisUri); + catalogProps.put("warehouse", CATALOG); + catalogProps.put("token", token); + catalogProps.put("io-impl", "org.apache.iceberg.aws.s3.S3FileIO"); + catalogProps.put("s3.endpoint", s3Endpoint); + catalogProps.put("s3.path-style-access", "true"); + catalogProps.put("s3.access-key-id", accessKey); + catalogProps.put("s3.secret-access-key", secretKey); + + try (RESTCatalog catalog = new RESTCatalog()) { + catalog.initialize("demo-seed", catalogProps); + + Namespace ns = Namespace.of(NAMESPACE); + if (!catalog.namespaceExists(ns)) { + catalog.createNamespace(ns); + System.out.println("Created namespace: " + NAMESPACE); + } + + // S3FileIO for writing Parquet files + S3FileIO fileIO = new S3FileIO(); + fileIO.initialize(Map.of( + "s3.endpoint", s3Endpoint, + "s3.path-style-access", "true", + "s3.access-key-id", accessKey, + "s3.secret-access-key", secretKey)); + + seedOrdersTable(catalog, fileIO); + seedProductsTable(catalog, fileIO); + seedRegionsTable(catalog, fileIO); + } + + System.out.println("\nDemo data ready. Try these queries:"); + System.out.println(" SHOW TABLES IN retail"); + System.out.println(" SELECT * FROM retail.orders WHERE amount > 100 LIMIT 10"); + System.out.println(" SELECT * FROM retail.orders ORDER BY amount DESC LIMIT 5"); + System.out.println(" DESCRIBE STATS retail.orders"); + System.out.println(" DIAGNOSE TABLE retail.orders"); + System.out.println(" EXPLAIN SELECT * FROM retail.orders WHERE region = 'us-east-1'"); + } + + // ── Table: retail.orders ────────────────────────────────────────────────── + + private static void seedOrdersTable(RESTCatalog catalog, S3FileIO fileIO) throws Exception { + TableIdentifier id = TableIdentifier.of(NAMESPACE, "orders"); + Schema schema = new Schema( + Types.NestedField.required(1, "order_id", Types.IntegerType.get()), + Types.NestedField.required(2, "customer", Types.StringType.get()), + Types.NestedField.required(3, "region", Types.StringType.get()), + Types.NestedField.required(4, "amount", Types.DoubleType.get()), + Types.NestedField.required(5, "status", Types.StringType.get())); + + PartitionSpec spec = PartitionSpec.builderFor(schema).identity("region").build(); + Table table = getOrCreateTable(catalog, id, schema, spec); + + // Write two data files — one per region — to exercise partition pruning in EXPLAIN + List usEast = new ArrayList<>(); + List euWest = new ArrayList<>(); + String[] statuses = {"COMPLETED", "PENDING", "SHIPPED", "CANCELLED"}; + Random rng = new Random(42); + for (int i = 1; i <= 200; i++) { + GenericRecord r = GenericRecord.create(schema); + r.setField("order_id", i); + r.setField("customer", "Customer-" + i); + r.setField("amount", Math.round(rng.nextDouble() * 500 * 100.0) / 100.0); + r.setField("status", statuses[rng.nextInt(statuses.length)]); + if (i % 2 == 0) { + r.setField("region", "us-east-1"); + usEast.add(r); + } else { + r.setField("region", "eu-west-1"); + euWest.add(r); + } + } + appendRecords(table, fileIO, schema, spec, usEast); + appendRecords(table, fileIO, schema, spec, euWest); + System.out.println("Seeded retail.orders (200 rows)"); + } + + // ── Table: retail.products ──────────────────────────────────────────────── + + private static void seedProductsTable(RESTCatalog catalog, S3FileIO fileIO) throws Exception { + TableIdentifier id = TableIdentifier.of(NAMESPACE, "products"); + Schema schema = new Schema( + Types.NestedField.required(1, "product_id", Types.IntegerType.get()), + Types.NestedField.required(2, "name", Types.StringType.get()), + Types.NestedField.required(3, "category", Types.StringType.get()), + Types.NestedField.required(4, "price", Types.DoubleType.get()), + Types.NestedField.required(5, "in_stock", Types.BooleanType.get())); + + Table table = getOrCreateTable(catalog, id, schema, PartitionSpec.unpartitioned()); + + String[] categories = {"Electronics", "Clothing", "Food", "Books", "Tools"}; + List rows = new ArrayList<>(); + Random rng = new Random(7); + for (int i = 1; i <= 50; i++) { + GenericRecord r = GenericRecord.create(schema); + r.setField("product_id", i); + r.setField("name", "Product-" + i); + r.setField("category", categories[i % categories.length]); + r.setField("price", Math.round(rng.nextDouble() * 200 * 100.0) / 100.0); + r.setField("in_stock", rng.nextBoolean()); + rows.add(r); + } + appendRecords(table, fileIO, schema, PartitionSpec.unpartitioned(), rows); + System.out.println("Seeded retail.products (50 rows)"); + } + + // ── Table: retail.regions ───────────────────────────────────────────────── + + private static void seedRegionsTable(RESTCatalog catalog, S3FileIO fileIO) throws Exception { + TableIdentifier id = TableIdentifier.of(NAMESPACE, "regions"); + Schema schema = new Schema( + Types.NestedField.required(1, "region_id", Types.StringType.get()), + Types.NestedField.required(2, "description", Types.StringType.get()), + Types.NestedField.required(3, "timezone", Types.StringType.get())); + + Table table = getOrCreateTable(catalog, id, schema, PartitionSpec.unpartitioned()); + + List rows = List.of( + row(schema, "us-east-1", "US East (N. Virginia)", "America/New_York"), + row(schema, "eu-west-1", "EU West (Ireland)", "Europe/Dublin"), + row(schema, "ap-south-1","Asia Pacific (Mumbai)", "Asia/Kolkata")); + + appendRecords(table, fileIO, schema, PartitionSpec.unpartitioned(), rows); + System.out.println("Seeded retail.regions (3 rows)"); + } + + // ── Helpers ─────────────────────────────────────────────────────────────── + + private static Table getOrCreateTable( + RESTCatalog catalog, TableIdentifier id, Schema schema, PartitionSpec spec) { + if (catalog.tableExists(id)) { + System.out.println("Table already exists, skipping: " + id); + return catalog.loadTable(id); + } + return catalog.createTable(id, schema, spec); + } + + private static void appendRecords( + Table table, S3FileIO fileIO, Schema schema, PartitionSpec spec, + List records) throws Exception { + if (records.isEmpty()) return; + + String location = table.locationProvider() + .newDataLocation(UUID.randomUUID() + ".parquet"); + OutputFile outputFile = fileIO.newOutputFile(location); + + FileAppender appender = Parquet.write(outputFile) + .schema(schema) + .createWriterFunc(GenericParquetWriter::create) + .build(); + appender.addAll(records); + appender.close(); + + DataFiles.Builder fileBuilder = DataFiles.builder(spec) + .withPath(location) + .withFileSizeInBytes(fileIO.newInputFile(location).getLength()) + .withMetrics(appender.metrics()) + .withFormat(FileFormat.PARQUET) + .withRecordCount(records.size()); + + if (spec.isPartitioned()) { + PartitionKey key = new PartitionKey(spec, schema); + key.partition(records.get(0)); + fileBuilder = fileBuilder.withPartition(key); + } + + DataFile dataFile = fileBuilder.build(); + + table.newAppend().appendFile(dataFile).commit(); + } + + private static GenericRecord row(Schema schema, Object... values) { + GenericRecord r = GenericRecord.create(schema); + List fields = schema.columns(); + for (int i = 0; i < values.length; i++) r.setField(fields.get(i).name(), values[i]); + return r; + } + + private static void createCatalogIfAbsent( + String polarisUri, String token, + String accessKey, String secretKey, + String s3Endpoint, String s3InternalEndpoint) throws Exception { + String json = String.format(""" + { + "catalog": { + "type": "INTERNAL", + "name": "%s", + "properties": { + "default-base-location": "s3://%s/warehouse", + "s3.endpoint": "%s", + "s3.path-style-access": "true", + "s3.access-key-id": "%s", + "s3.secret-access-key": "%s" + }, + "storageConfigInfo": { + "storageType": "S3", + "allowedLocations": ["s3://%s"], + "roleArn": "arn:aws:iam::000000000000:role/demo", + "endpoint": "%s", + "endpointInternal": "%s", + "pathStyleAccess": true, + "stsUnavailable": true + } + } + } + """, CATALOG, BUCKET, s3Endpoint, accessKey, secretKey, BUCKET, s3Endpoint, s3InternalEndpoint); + + HttpRequest req = HttpRequest.newBuilder() + .uri(URI.create(polarisUri.replace("/api/catalog", "") + "/api/management/v1/catalogs")) + .header("Content-Type", "application/json") + .header("Authorization", "Bearer " + token) + .POST(HttpRequest.BodyPublishers.ofString(json)) + .build(); + HttpResponse resp = HttpClient.newHttpClient() + .send(req, HttpResponse.BodyHandlers.ofString()); + if (resp.statusCode() == 409) { + System.out.println("Catalog already exists, skipping creation."); + } else if (resp.statusCode() >= 300) { + throw new RuntimeException("Failed to create catalog: " + resp.body()); + } + } + + private static String obtainToken(String uri, String clientId, String clientSecret) + throws Exception { + String body = "grant_type=client_credentials" + + "&client_id=" + URLEncoder.encode(clientId, StandardCharsets.UTF_8) + + "&client_secret=" + URLEncoder.encode(clientSecret, StandardCharsets.UTF_8) + + "&scope=PRINCIPAL_ROLE:ALL"; + HttpRequest req = HttpRequest.newBuilder() + .uri(URI.create(uri + "/v1/oauth/tokens")) + .header("Content-Type", "application/x-www-form-urlencoded") + .POST(HttpRequest.BodyPublishers.ofString(body)) + .build(); + HttpResponse resp = HttpClient.newHttpClient() + .send(req, HttpResponse.BodyHandlers.ofString()); + Matcher m = ACCESS_TOKEN_PATTERN.matcher(resp.body()); + if (!m.find()) throw new IllegalStateException("No access_token in: " + resp.body()); + return m.group(1); + } +} diff --git a/polaris-shell/src/main/java/org/apache/polaris/tools/cli/OAuthHelper.java b/polaris-shell/src/main/java/org/apache/polaris/tools/cli/OAuthHelper.java new file mode 100644 index 00000000..45b16a8e --- /dev/null +++ b/polaris-shell/src/main/java/org/apache/polaris/tools/cli/OAuthHelper.java @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.polaris.tools.cli; + +import java.net.URI; +import java.net.URLEncoder; +import java.net.http.HttpClient; +import java.net.http.HttpRequest; +import java.net.http.HttpResponse; +import java.nio.charset.StandardCharsets; +import java.util.regex.Matcher; +import java.util.regex.Pattern; + +/** + * Obtains an OAuth2 bearer token from a Polaris token endpoint using the + * client-credentials grant. + */ +class OAuthHelper { + + private static final Pattern ACCESS_TOKEN_PATTERN = + Pattern.compile("\"access_token\"\\s*:\\s*\"([^\"]+)\""); + + /** + * Performs a {@code client_credentials} exchange against {@code tokenEndpoint} + * and returns the resulting access token. + * + * @param tokenEndpoint full URL, e.g. {@code http://localhost:8181/api/catalog/v1/oauth/tokens} + * @param clientId OAuth client ID + * @param clientSecret OAuth client secret + * @return the access token string + * @throws Exception if the HTTP call fails or the response contains no {@code access_token} + */ + static String obtainToken(String tokenEndpoint, String clientId, String clientSecret) + throws Exception { + String body = "grant_type=client_credentials" + + "&client_id=" + URLEncoder.encode(clientId, StandardCharsets.UTF_8) + + "&client_secret=" + URLEncoder.encode(clientSecret, StandardCharsets.UTF_8) + + "&scope=PRINCIPAL_ROLE:ALL"; + + HttpRequest request = HttpRequest.newBuilder() + .uri(URI.create(tokenEndpoint)) + .header("Content-Type", "application/x-www-form-urlencoded") + .POST(HttpRequest.BodyPublishers.ofString(body)) + .build(); + + HttpResponse response = HttpClient.newHttpClient() + .send(request, HttpResponse.BodyHandlers.ofString()); + + if (response.statusCode() < 200 || response.statusCode() >= 300) { + throw new IllegalStateException( + "OAuth token request failed: HTTP " + response.statusCode() + + " — " + response.body()); + } + + Matcher matcher = ACCESS_TOKEN_PATTERN.matcher(response.body()); + if (!matcher.find()) { + throw new IllegalStateException( + "No access_token found in OAuth response: " + response.body()); + } + return matcher.group(1); + } + + private OAuthHelper() {} +} diff --git a/polaris-shell/src/main/java/org/apache/polaris/tools/cli/PolarisShell.java b/polaris-shell/src/main/java/org/apache/polaris/tools/cli/PolarisShell.java new file mode 100644 index 00000000..d67b6141 --- /dev/null +++ b/polaris-shell/src/main/java/org/apache/polaris/tools/cli/PolarisShell.java @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.polaris.tools.cli; + +import org.apache.polaris.tools.planner.IcebergRestQueryExecutor; +import org.apache.polaris.tools.planner.QueryExecutor; +import org.apache.polaris.tools.planner.QueryPlan; +import org.apache.polaris.tools.planner.SqlToQueryPlan; + +import java.io.FileInputStream; +import java.nio.charset.StandardCharsets; +import java.util.HashMap; +import java.util.Map; +import java.util.Properties; +import java.util.Scanner; + +/** + * Interactive SQL shell for querying an Iceberg catalog via Polaris. + * + *

Usage

+ *
+ *   java -jar sql-engine-demo.jar [path/to/polaris-sql-demo.properties]
+ * 
+ * + *

If no path is given the shell looks for {@code polaris-sql-demo.properties} in the + * current working directory. + * + *

Required properties

+ *
    + *
  • {@code polaris.uri} — Polaris REST catalog base URI, e.g. + * {@code http://localhost:8181/api/catalog}
  • + *
  • {@code polaris.warehouse} — warehouse / catalog name
  • + *
  • {@code polaris.client.id} — OAuth2 client ID
  • + *
  • {@code polaris.client.secret} — OAuth2 client secret
  • + *
+ * + *

Optional properties

+ *
    + *
  • {@code polaris.token.endpoint} — defaults to {@code {polaris.uri}/v1/oauth/tokens}
  • + *
  • {@code cli.max-display-rows} — row cap for SELECT output (default: 100)
  • + *
  • Any other key (not prefixed with {@code polaris.} or {@code cli.}) is passed + * directly to the Iceberg catalog as a catalog property, e.g. S3 / MinIO settings.
  • + *
+ */ +public class PolarisShell { + + public static void main(String[] args) throws Exception { + String propsPath = args.length > 0 ? args[0] : "polaris-sql-demo.properties"; + + // 1. Load properties + Properties props = new Properties(); + try (var in = new FileInputStream(propsPath)) { + props.load(in); + } catch (java.io.FileNotFoundException e) { + System.err.println("Properties file not found: " + propsPath); + System.err.println("Create it from the polaris-sql-demo.properties.example template."); + System.exit(1); + } + + String uri = required(props, "polaris.uri"); + String warehouse = required(props, "polaris.warehouse"); + String clientId = required(props, "polaris.client.id"); + String clientSecret = required(props, "polaris.client.secret"); + int maxRows = Integer.parseInt(props.getProperty("cli.max-display-rows", "100")); + String tokenEndpoint = props.getProperty("polaris.token.endpoint", uri + "/v1/oauth/tokens"); + + // 2. Obtain token + System.out.println("Connecting to Polaris at " + uri + " ..."); + String token; + try { + token = OAuthHelper.obtainToken(tokenEndpoint, clientId, clientSecret); + } catch (Exception e) { + System.err.println("Failed to obtain OAuth token: " + e.getMessage()); + System.exit(1); + return; + } + System.out.println("Authenticated. Type SQL statements or 'exit' to quit.\n"); + + // 3. Build extra catalog properties (everything not prefixed polaris.* or cli.*) + Map extraProps = new HashMap<>(); + for (String key : props.stringPropertyNames()) { + if (!key.startsWith("polaris.") && !key.startsWith("cli.")) { + extraProps.put(key, props.getProperty(key)); + } + } + + // 4. Start REPL + var translator = new SqlToQueryPlan(); + try (var restExecutor = new IcebergRestQueryExecutor(uri, warehouse, token, extraProps); + var scanner = new Scanner(System.in, StandardCharsets.UTF_8)) { + + var catalogExecutor = new QueryExecutor(restExecutor.getCatalog()); + + while (true) { + System.out.print("sql> "); + if (!scanner.hasNextLine()) break; // EOF (e.g. piped input) + String line = scanner.nextLine().trim(); + + if (line.isBlank()) continue; + if (line.equalsIgnoreCase("exit") || line.equalsIgnoreCase("quit")) break; + + try { + QueryPlan plan = translator.translate(line); + ResultPrinter.print(plan, restExecutor, catalogExecutor, maxRows); + } catch (IllegalArgumentException e) { + System.err.println("Parse error: " + e.getMessage()); + } catch (Exception e) { + System.err.println("Error: " + e.getMessage()); + } + } + } + + System.out.println("Bye."); + } + + private static String required(Properties props, String key) { + String value = props.getProperty(key); + if (value == null || value.isBlank()) { + throw new IllegalArgumentException("Missing required property: " + key); + } + return value.trim(); + } +} diff --git a/polaris-shell/src/main/java/org/apache/polaris/tools/cli/ResultPrinter.java b/polaris-shell/src/main/java/org/apache/polaris/tools/cli/ResultPrinter.java new file mode 100644 index 00000000..e75a0c7b --- /dev/null +++ b/polaris-shell/src/main/java/org/apache/polaris/tools/cli/ResultPrinter.java @@ -0,0 +1,163 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.polaris.tools.cli; + +import org.apache.iceberg.data.Record; +import org.apache.iceberg.io.CloseableIterable; +import org.apache.iceberg.types.Types; +import org.apache.polaris.tools.planner.IcebergRestQueryExecutor; +import org.apache.polaris.tools.planner.QueryExecutor; +import org.apache.polaris.tools.planner.QueryPlan; + +import java.io.IOException; +import java.util.List; +import java.util.Map; +import java.util.OptionalLong; + +/** + * Formats and prints the results of executing a {@link QueryPlan} to stdout. + * + *

SELECT plans are executed via {@link IcebergRestQueryExecutor} and rows are + * printed one per line. All other plans are executed via {@link QueryExecutor} and + * the returned {@code Map} is printed as aligned key-value pairs. + */ +class ResultPrinter { + + /** + * Executes the given plan and prints results to stdout. + * + * @param plan the plan to execute + * @param restExecutor executor for SELECT plans (reads actual table data) + * @param catalogExecutor executor for metadata plans (SHOW, DESCRIBE, DIAGNOSE, EXPLAIN) + * @param maxRows display cap for SELECT results; ignored if the plan already has a LIMIT + */ + static void print(QueryPlan plan, + IcebergRestQueryExecutor restExecutor, + QueryExecutor catalogExecutor, + int maxRows) throws Exception { + switch (plan) { + case QueryPlan.Select select -> printSelect(select, restExecutor, maxRows); + case QueryPlan.Explain ex -> printExplain(ex, catalogExecutor); + default -> printCatalogResult(catalogExecutor.execute(plan)); + } + } + + private static void printSelect(QueryPlan.Select plan, + IcebergRestQueryExecutor executor, + int maxRows) throws Exception { + // Honour an existing LIMIT; otherwise cap at maxRows to protect the terminal + QueryPlan.Select capped = plan.limit().isPresent() + ? plan + : new QueryPlan.Select( + plan.namespacedTable(), + plan.projectedColumns(), + plan.filter(), + plan.orderBy(), + OptionalLong.of(maxRows)); + + int count = 0; + if (!plan.orderBy().isEmpty()) { + List rows = executor.executeOrdered(capped); + for (Record r : rows) { + System.out.println(formatRecord(r)); + count++; + } + } else { + try (CloseableIterable records = executor.executeWithLimit(capped)) { + for (Record r : records) { + System.out.println(formatRecord(r)); + count++; + } + } + } + System.out.printf("(%d row%s)%n", count, count == 1 ? "" : "s"); + } + + private static void printExplain(QueryPlan.Explain plan, QueryExecutor catalogExecutor) { + @SuppressWarnings("unchecked") + Map result = (Map) catalogExecutor.execute(plan); + + int width = 70; + String border = "─".repeat(width); + System.out.println("┌" + border + "┐"); + System.out.printf("│ ICEBERG SCAN PLAN — %-49s│%n", result.get("table")); + System.out.println("├──────────────────────────────────────┬" + "─".repeat(32) + "┤"); + printRow("Snapshot ID", result.get("snapshotId")); + printRow("Snapshot timestamp (ms)", result.get("snapshotTimestampMs")); + printRow("Partition spec", result.get("partitionSpec")); + printRow("Schema columns", result.get("schemaColumnCount")); + printRow("Projected columns", result.get("projectedColumnCount")); + System.out.println("├──────────────────────────────────────┬" + "─".repeat(32) + "┤"); + long total = (long) result.get("totalDataFiles"); + long after = (long) result.get("dataFilesAfterFilter"); + double pct = total > 0 ? 100.0 * (total - after) / total : 0.0; + printRow("Total manifest files", result.get("totalManifestFiles")); + printRow("Manifests after pruning", result.get("manifestsAfterPruning")); + printRow("Data files total", total); + printRow("Data files after filter", String.format("%d (%.1f%% eliminated)", after, pct)); + printRow("Estimated bytes scanned", formatBytes((long) result.get("estimatedBytes"))); + printRow("Pushdown filter", result.get("pushdownFilter")); + @SuppressWarnings("unchecked") + List warnings = (List) result.get("warnings"); + if (!warnings.isEmpty()) { + System.out.println("├──────────────────────────────────────┴" + "─".repeat(32) + "┤"); + System.out.println("│ ⚠ Warnings" + " ".repeat(width - 12) + "│"); + for (String w : warnings) System.out.printf("│ • %-67s│%n", w); + } + System.out.println("└" + border + "┘"); + } + + private static void printRow(String label, Object value) { + System.out.printf("│ %-36s│ %-30s│%n", label, value); + } + + private static String formatBytes(long bytes) { + if (bytes < 1024) return bytes + " B"; + if (bytes < 1024 * 1024) return String.format("%.1f KiB", bytes / 1024.0); + if (bytes < 1024L*1024*1024) return String.format("%.1f MiB", bytes / (1024.0 * 1024)); + return String.format("%.2f GiB", bytes / (1024.0 * 1024 * 1024)); + } + + private static void printCatalogResult(Object result) { + if (result instanceof Map map) { + map.forEach((k, v) -> System.out.printf(" %-28s %s%n", k + ":", v)); + } else { + System.out.println(result); + } + System.out.println(); + } + + /** + * Formats a single record as {@code field1=value1, field2=value2, ...}. + * Uses positional field access to ensure ordering matches the schema. + */ + private static String formatRecord(Record r) { + StringBuilder sb = new StringBuilder(); + List fields = r.struct().fields(); + for (int i = 0; i < fields.size(); i++) { + if (i > 0) sb.append(", "); + String name = fields.get(i).name(); + sb.append(name).append("=").append(r.getField(name)); + } + return sb.toString(); + } + + private ResultPrinter() {} +} diff --git a/polaris-shell/src/main/java/org/apache/polaris/tools/planner/IcebergExpressionVisitor.java b/polaris-shell/src/main/java/org/apache/polaris/tools/planner/IcebergExpressionVisitor.java new file mode 100644 index 00000000..cefa8457 --- /dev/null +++ b/polaris-shell/src/main/java/org/apache/polaris/tools/planner/IcebergExpressionVisitor.java @@ -0,0 +1,117 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.polaris.tools.planner; + +import org.apache.iceberg.expressions.Expression; +import org.apache.iceberg.expressions.Expressions; +import org.apache.polaris.tools.grammar.IcebergSQLBaseVisitor; +import org.apache.polaris.tools.grammar.IcebergSQLParser; + +import java.util.List; + +/** + * ANTLR visitor that converts parsed SQL predicate nodes into Iceberg {@link Expression} + * objects, supporting AND, OR, NOT, comparisons, IN, NOT IN, and IS NULL predicates. + */ +public class IcebergExpressionVisitor extends IcebergSQLBaseVisitor { + @Override + public Expression visitAndPred(IcebergSQLParser.AndPredContext ctx) { + return Expressions.and(visit(ctx.predicate(0)), visit(ctx.predicate(1))); + } + + @Override + public Expression visitOrPred(IcebergSQLParser.OrPredContext ctx) { + return Expressions.or(visit(ctx.predicate(0)), visit(ctx.predicate(1))); + } + @Override + public Expression visitNotPred(IcebergSQLParser.NotPredContext ctx) { + return Expressions.not(visit(ctx.predicate())); + } + + @Override + public Expression visitInPred(IcebergSQLParser.InPredContext ctx) { + String col = SqlUtil.unquote(ctx.expression().getText()); + List values = ctx.literal().stream().map(this::parseLiteral).toList(); + return Expressions.in(col, values); + } + + @Override + public Expression visitNotInPred(IcebergSQLParser.NotInPredContext ctx) { + String col = SqlUtil.unquote(ctx.expression().getText()); + List values = ctx.literal().stream().map(this::parseLiteral).toList(); + return Expressions.notIn(col, values); + } + + @Override + public Expression visitIsNullPred(IcebergSQLParser.IsNullPredContext ctx) { + String col = SqlUtil.unquote(ctx.expression().getText()); + boolean notNull = ctx.NOT() != null; + return notNull ? Expressions.notNull(col) : Expressions.isNull(col); + } + + @Override + public Expression visitParenPred(IcebergSQLParser.ParenPredContext ctx) { + return visit(ctx.predicate()); + } + + @Override + public Expression visitComparisonPred(IcebergSQLParser.ComparisonPredContext ctx) { + String col = SqlUtil.unquote(ctx.expression(0).getText()); + Object value = parseLiteralExp(ctx.expression(1)); + return switch(ctx.op().getText()) { + case "<" -> Expressions.lessThan(col, value); + case "<=" -> Expressions.lessThanOrEqual(col, value); + case ">" -> Expressions.greaterThan(col, value); + case ">=" -> Expressions.greaterThanOrEqual(col, value); + case "=" -> Expressions.equal(col, value); + case "!=" -> Expressions.notEqual(col, value); + default -> throw new IllegalArgumentException("Unknown comparison operator: " + ctx.op().getText()); + }; + } + + private Object parseLiteralExp(IcebergSQLParser.ExpressionContext ctx) { + if (ctx instanceof IcebergSQLParser.LiteralExprContext literalCtx) { + return parseLiteral(literalCtx.literal()); + } + throw new IllegalArgumentException("Expected literal, got: " + ctx.getText()); + } + + private Object parseLiteral(IcebergSQLParser.LiteralContext ctx) { + switch (ctx) { + case IcebergSQLParser.IntLiteralContext intCtx -> { + return Long.parseLong(intCtx.getText()); + } + case IcebergSQLParser.FloatLiteralContext floatCtx -> { + return Double.parseDouble(floatCtx.getText()); + } + case IcebergSQLParser.StringLiteralContext stringCtx -> { + String raw = stringCtx.getText(); + return raw.substring(1, raw.length() - 1).replace("''", "'"); + } + case IcebergSQLParser.TrueLiteralContext tlc -> { + return true; + } + case IcebergSQLParser.FalseLiteralContext flc -> { + return false; + } + default -> throw new IllegalArgumentException("Unknown literal type: " + ctx.getText()); + } + } +} diff --git a/polaris-shell/src/main/java/org/apache/polaris/tools/planner/IcebergRestQueryExecutor.java b/polaris-shell/src/main/java/org/apache/polaris/tools/planner/IcebergRestQueryExecutor.java new file mode 100644 index 00000000..649dae6c --- /dev/null +++ b/polaris-shell/src/main/java/org/apache/polaris/tools/planner/IcebergRestQueryExecutor.java @@ -0,0 +1,203 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.polaris.tools.planner; + +import org.apache.iceberg.Schema; +import org.apache.iceberg.Table; +import org.apache.iceberg.catalog.Namespace; +import org.apache.iceberg.catalog.TableIdentifier; +import org.apache.iceberg.data.IcebergGenerics; +import org.apache.iceberg.data.Record; +import org.apache.iceberg.io.CloseableIterable; +import org.apache.iceberg.io.CloseableIterator; +import org.apache.iceberg.rest.RESTCatalog; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Comparator; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +/** + * Executes SELECT {@link QueryPlan.Select} plans against an Iceberg REST catalog (e.g. Polaris). + * Connects via the Iceberg {@link RESTCatalog}, applies column projection and predicate pushdown, + * and returns rows as a {@link CloseableIterable} of {@link Record}. + * Callers must close the returned iterable and this executor when done. + */ +public class IcebergRestQueryExecutor implements AutoCloseable { + + private final RESTCatalog catalog; + + /** + * @param uri Polaris REST catalog URI, e.g. {@code http://localhost:8181/api/catalog} + * @param warehouse Polaris warehouse / catalog name + * @param token OAuth2 bearer token; use {@code credential} property instead for + * automatic client-credentials exchange + */ + public IcebergRestQueryExecutor(String uri, String warehouse, String token) { + this(uri, warehouse, token, Map.of()); + } + + /** + * Constructor that accepts additional catalog properties, for example FileIO configuration + * needed to reach S3-compatible storage (MinIO, S3Mock, etc.) in integration tests. + * + * @param uri Polaris REST catalog URI + * @param warehouse Polaris warehouse / catalog name + * @param token OAuth2 bearer token + * @param extraProperties additional properties merged into the catalog configuration + */ + public IcebergRestQueryExecutor( + String uri, String warehouse, String token, Map extraProperties) { + Map properties = new HashMap<>(extraProperties); + properties.put("uri", uri); + properties.put("warehouse", warehouse); + properties.put("token", token); + this.catalog = new RESTCatalog(); + this.catalog.initialize("polaris", properties); + } + + /** + * Executes the SELECT plan and returns all matching records. + * Column projection and predicate pushdown are applied at the scan level. + * The caller is responsible for closing the returned iterable. + */ + public CloseableIterable execute(QueryPlan.Select plan) { + Table table = loadTable(plan.namespacedTable()); + IcebergGenerics.ScanBuilder scanBuilder = IcebergGenerics.read(table); + + if (!plan.projectedColumns().isEmpty()) { + Schema projected = table.schema().select(plan.projectedColumns()); + scanBuilder = scanBuilder.project(projected); + } + + if (plan.filter() != null) { + scanBuilder = scanBuilder.where(plan.filter()); + } + + return scanBuilder.build(); + } + + /** + * Executes the SELECT plan and returns at most {@code limit} records. + * If the plan has no LIMIT clause, all matching records are returned. + * The caller is responsible for closing the returned iterable. + */ + public CloseableIterable executeWithLimit(QueryPlan.Select plan) { + CloseableIterable all = execute(plan); + try { + if (plan.limit().isEmpty()) { + return all; + } + long limit = plan.limit().getAsLong(); + return new CloseableIterable<>() { + @Override + public CloseableIterator iterator() { + CloseableIterator delegate = all.iterator(); + return new CloseableIterator<>() { + private long count = 0; + + @Override + public boolean hasNext() { + return count < limit && delegate.hasNext(); + } + + @Override + public Record next() { + count++; + return delegate.next(); + } + + @Override + public void close() throws IOException { + delegate.close(); + } + }; + } + + @Override + public void close() throws IOException { + all.close(); + } + }; + } catch (Exception e) { + try { + all.close(); + } catch (IOException closeEx) { + e.addSuppressed(closeEx); + } + throw e; + } + } + + /** + * Executes the SELECT plan, applies ORDER BY sorting in-memory, and returns at most + * {@code limit} records. If the plan has no ORDER BY clause, rows are returned in + * scan order. If the plan has no LIMIT clause, all matching records are returned. + */ + public List executeOrdered(QueryPlan.Select plan) throws IOException { + List rows = new ArrayList<>(); + try (CloseableIterable all = execute(plan)) { + for (Record r : all) rows.add(r); + } + if (!plan.orderBy().isEmpty()) { + rows.sort(buildComparator(plan.orderBy())); + } + if (plan.limit().isPresent()) { + return rows.subList(0, (int) Math.min(rows.size(), plan.limit().getAsLong())); + } + return rows; + } + + @SuppressWarnings({"unchecked", "rawtypes"}) + private Comparator buildComparator(List orderBy) { + Comparator comp = null; + for (QueryPlan.OrderByItem item : orderBy) { + Comparator c = Comparator.comparing( + r -> (Comparable) r.getField(item.column()), + Comparator.nullsLast(Comparator.naturalOrder())); + if (!item.ascending()) c = c.reversed(); + comp = comp == null ? c : comp.thenComparing(c); + } + return comp; + } + + /** Exposes the underlying catalog for inspection or metadata operations. */ + public RESTCatalog getCatalog() { + return catalog; + } + + private Table loadTable(String namespacedTable) { + String[] parts = namespacedTable.split("\\."); + if (parts.length < 2) { + throw new IllegalArgumentException( + "Table name must be namespace-qualified (e.g. 'ns.table'), got: " + namespacedTable); + } + Namespace ns = Namespace.of(Arrays.copyOf(parts, parts.length - 1)); + return catalog.loadTable(TableIdentifier.of(ns, parts[parts.length - 1])); + } + + @Override + public void close() throws Exception { + catalog.close(); + } +} diff --git a/polaris-shell/src/main/java/org/apache/polaris/tools/planner/QueryExecutor.java b/polaris-shell/src/main/java/org/apache/polaris/tools/planner/QueryExecutor.java new file mode 100644 index 00000000..e2c498c6 --- /dev/null +++ b/polaris-shell/src/main/java/org/apache/polaris/tools/planner/QueryExecutor.java @@ -0,0 +1,226 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.polaris.tools.planner; + +import org.apache.iceberg.FileScanTask; +import org.apache.iceberg.Snapshot; +import org.apache.iceberg.Table; +import org.apache.iceberg.TableScan; +import org.apache.iceberg.catalog.Catalog; +import org.apache.iceberg.catalog.Namespace; +import org.apache.iceberg.catalog.TableIdentifier; +import org.apache.iceberg.io.CloseableIterable; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.stream.StreamSupport; + +/** + * Executes a {@link QueryPlan} against an Iceberg {@link Catalog}. + * Dispatches each plan type to the appropriate Iceberg API calls and returns + * the result as a plain Java object (typically a {@code Map}). + */ +public class QueryExecutor { + + private final Catalog catalog; + + public QueryExecutor(Catalog catalog) { + this.catalog = catalog; + } + + public Object execute(QueryPlan plan) { + return switch (plan) { + case QueryPlan.Select s -> executeSelect(s); + case QueryPlan.ShowTables st -> showTables(st); + case QueryPlan.DescribeStats d -> describeStats(d); + case QueryPlan.ShowLocation sl -> showLocation(sl); + case QueryPlan.ShowPolicies sp -> showPolicies(sp); + case QueryPlan.Diagnose diag -> diagnose(diag); + case QueryPlan.Explain ex -> explainSelect(ex.innerSelect()); + }; + } + + // Use-case 1: count + list tables under a namespace + private Object showTables(QueryPlan.ShowTables plan) { + Namespace ns = Namespace.of(plan.namespace().split("\\.")); + List tables = catalog.listTables(ns); + return Map.of( + "namespace", plan.namespace(), + "tableCount", tables.size(), + "tables", tables + ); + } + + // Use-case 2: snapshot count, current snapshot id, partition spec, schema + private Object describeStats(QueryPlan.DescribeStats plan) { + Table table = loadTable(plan.namespacedTable()); + var currentSnapshot = table.currentSnapshot(); + long snapshotCount = StreamSupport.stream(table.snapshots().spliterator(), false).count(); + return Map.of( + "snapshotCount", snapshotCount, + "currentSnapshotId", currentSnapshot != null ? currentSnapshot.snapshotId() : -1L, + "partitionSpec", table.spec().toString(), + "schema", table.schema().toString() + ); + } + + // Use-case 3: storage location + private Object showLocation(QueryPlan.ShowLocation plan) { + Table table = loadTable(plan.namespacedTable()); + return Map.of("location", table.location()); + } + + // Use-case 4: effective policies via table properties (polaris.policy.* prefix only) + private static final String POLICY_PREFIX = "polaris.policy."; + + private Object showPolicies(QueryPlan.ShowPolicies plan) { + Table table = loadTable(plan.namespacedTable()); + Map policies = new HashMap<>(); + table.properties().forEach((k, v) -> { + if (k.startsWith(POLICY_PREFIX)) { + policies.put(k, v); + } + }); + return policies; + } + + /** + * Files smaller than this threshold (128 MiB) are considered "small" by the diagnostics scan. + * This matches the default Iceberg target file size. + */ + private static final long SMALL_FILE_THRESHOLD_BYTES = 128 * 1024 * 1024L; + + // Use-case 5: small-file diagnostics via manifest scanning + private Object diagnose(QueryPlan.Diagnose plan) { + Table table = loadTable(plan.namespacedTable()); + long smallFileCount = 0; + if (table.currentSnapshot() != null) { + try (var tasks = table.newScan().planFiles()) { + for (var fileScanTask : tasks) { + if (fileScanTask.file().fileSizeInBytes() < SMALL_FILE_THRESHOLD_BYTES) { + smallFileCount++; + } + } + } catch (java.io.IOException e) { + throw new RuntimeException("Failed to close file scan tasks during diagnose", e); + } + } + return Map.of( + "smallFileThresholdBytes", SMALL_FILE_THRESHOLD_BYTES, + "smallFileCount", smallFileCount + ); + } + + // Use-case 6: EXPLAIN — scan plan introspection + private Object explainSelect(QueryPlan.Select plan) { + Table table = loadTable(plan.namespacedTable()); + Snapshot currentSnapshot = table.currentSnapshot(); + + long totalManifestFiles = 0; + long totalDataFiles = 0; + long dataFilesAfterFilter = 0; + long manifestsAfterPruning = 0; + long estimatedBytes = 0; + long smallFileCount = 0; + long noStatsCount = 0; + List warnings = new ArrayList<>(); + + if (currentSnapshot != null) { + totalManifestFiles = currentSnapshot.dataManifests(table.io()).size(); + + // Count total data files (unfiltered) + try (CloseableIterable allTasks = table.newScan().planFiles()) { + for (FileScanTask ignored : allTasks) totalDataFiles++; + } catch (IOException e) { + throw new RuntimeException("Failed to count total files during EXPLAIN", e); + } + + // Apply user filter — Iceberg performs partition pruning here + TableScan scan = table.newScan(); + if (plan.filter() != null) scan = scan.filter(plan.filter()); + if (!plan.projectedColumns().isEmpty()) scan = scan.select(plan.projectedColumns()); + + try (CloseableIterable tasks = scan.planFiles()) { + for (FileScanTask task : tasks) { + dataFilesAfterFilter++; + estimatedBytes += task.file().fileSizeInBytes(); + if (task.file().fileSizeInBytes() < SMALL_FILE_THRESHOLD_BYTES) smallFileCount++; + if (task.file().valueCounts() == null || task.file().valueCounts().isEmpty()) noStatsCount++; + } + } catch (IOException e) { + throw new RuntimeException("Failed to plan files during EXPLAIN", e); + } + + // Approximate manifests after pruning by the same ratio as files + manifestsAfterPruning = totalDataFiles > 0 + ? Math.max(1, (long) Math.ceil(totalManifestFiles * (double) dataFilesAfterFilter / totalDataFiles)) + : 0; + + if (smallFileCount > 0) + warnings.add(smallFileCount + " of " + dataFilesAfterFilter + " files are below 128 MiB — consider compaction"); + if (noStatsCount > 0) + warnings.add(noStatsCount + " data files have no column statistics"); + } + + Map result = new LinkedHashMap<>(); + result.put("table", plan.namespacedTable()); + result.put("snapshotId", currentSnapshot != null ? currentSnapshot.snapshotId() : -1L); + result.put("snapshotTimestampMs", currentSnapshot != null ? currentSnapshot.timestampMillis() : -1L); + result.put("partitionSpec", table.spec().toString()); + result.put("schemaColumnCount", table.schema().columns().size()); + result.put("projectedColumnCount", plan.projectedColumns().isEmpty() + ? table.schema().columns().size() + : plan.projectedColumns().size()); + result.put("totalManifestFiles", totalManifestFiles); + result.put("manifestsAfterPruning", manifestsAfterPruning); + result.put("totalDataFiles", totalDataFiles); + result.put("dataFilesAfterFilter", dataFilesAfterFilter); + result.put("estimatedBytes", estimatedBytes); + result.put("pushdownFilter", plan.filter() != null ? plan.filter().toString() : "none"); + result.put("warnings", warnings); + return result; + } + + /** + * Not supported: {@link QueryExecutor} handles metadata operations only. + * Use {@link IcebergRestQueryExecutor} to execute SELECT plans. + * + * @throws IllegalArgumentException always + */ + private Object executeSelect(QueryPlan.Select plan) { + throw new IllegalArgumentException( + "QueryExecutor does not support SELECT plans; use IcebergRestQueryExecutor"); + } + + private Table loadTable(String namespacedTable) { + String[] parts = namespacedTable.split("\\."); + if (parts.length < 2) { + throw new IllegalArgumentException( + "Table name must be namespace-qualified (e.g. 'ns.table'), got: " + namespacedTable); + } + Namespace ns = Namespace.of(Arrays.copyOf(parts, parts.length - 1)); + return catalog.loadTable(TableIdentifier.of(ns, parts[parts.length - 1])); + } +} diff --git a/polaris-shell/src/main/java/org/apache/polaris/tools/planner/QueryPlan.java b/polaris-shell/src/main/java/org/apache/polaris/tools/planner/QueryPlan.java new file mode 100644 index 00000000..fe23e985 --- /dev/null +++ b/polaris-shell/src/main/java/org/apache/polaris/tools/planner/QueryPlan.java @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.polaris.tools.planner; + +import org.apache.iceberg.expressions.Expression; + +import java.util.List; +import java.util.OptionalLong; + +/** + * Sealed interface representing the output of SQL-to-plan translation. + * Each permitted record type corresponds to one supported statement: + * SELECT queries, SHOW TABLES, DESCRIBE STATS, SHOW LOCATION, SHOW POLICIES, DIAGNOSE, and EXPLAIN. + */ +public sealed interface QueryPlan + permits QueryPlan.Select, + QueryPlan.ShowTables, + QueryPlan.DescribeStats, + QueryPlan.ShowLocation, + QueryPlan.ShowPolicies, + QueryPlan.Diagnose, + QueryPlan.Explain { + + record OrderByItem(String column, boolean ascending) {} + + record Select( + String namespacedTable, + List projectedColumns, + Expression filter, + List orderBy, + OptionalLong limit + ) implements QueryPlan {} + + record ShowTables(String namespace) implements QueryPlan {} + + record DescribeStats(String namespacedTable) implements QueryPlan {} + + record ShowLocation(String namespacedTable) implements QueryPlan {} + + record ShowPolicies(String namespacedTable) implements QueryPlan {} + + record Diagnose(String namespacedTable) implements QueryPlan {} + + record Explain(Select innerSelect) implements QueryPlan {} +} diff --git a/polaris-shell/src/main/java/org/apache/polaris/tools/planner/SqlToQueryPlan.java b/polaris-shell/src/main/java/org/apache/polaris/tools/planner/SqlToQueryPlan.java new file mode 100644 index 00000000..4a462660 --- /dev/null +++ b/polaris-shell/src/main/java/org/apache/polaris/tools/planner/SqlToQueryPlan.java @@ -0,0 +1,124 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.polaris.tools.planner; + +import org.antlr.v4.runtime.BaseErrorListener; +import org.antlr.v4.runtime.CharStreams; +import org.antlr.v4.runtime.CommonTokenStream; +import org.antlr.v4.runtime.RecognitionException; +import org.antlr.v4.runtime.Recognizer; +import org.apache.iceberg.expressions.Expression; +import org.apache.polaris.tools.grammar.IcebergSQLLexer; +import org.apache.polaris.tools.grammar.IcebergSQLParser; + +import java.util.ArrayList; +import java.util.List; +import java.util.OptionalLong; +import java.util.stream.Collectors; + +/** + * Translates a SQL string into a {@link QueryPlan} by parsing it with the IcebergSQL ANTLR grammar + * and dispatching each statement type to the corresponding plan record. + */ +public class SqlToQueryPlan { + + private static final BaseErrorListener THROWING_ERROR_LISTENER = new BaseErrorListener() { + @Override + public void syntaxError(Recognizer recognizer, Object offendingSymbol, + int line, int charPositionInLine, String msg, RecognitionException e) { + throw new IllegalArgumentException( + "SQL syntax error at line " + line + ":" + charPositionInLine + " – " + msg); + } + }; + + public QueryPlan translate(String sql) { + var lexer = new IcebergSQLLexer(CharStreams.fromString(sql)); + lexer.removeErrorListeners(); + lexer.addErrorListener(THROWING_ERROR_LISTENER); + var tokens = new CommonTokenStream(lexer); + var parser = new IcebergSQLParser(tokens); + parser.removeErrorListeners(); + parser.addErrorListener(THROWING_ERROR_LISTENER); + + IcebergSQLParser.QueryContext queryCtx = parser.query(); + + return switch (queryCtx) { + case IcebergSQLParser.SelectStmtContext ctx -> + translateSelect(ctx.selectQuery()); + case IcebergSQLParser.ShowTablesStmtContext ctx -> + new QueryPlan.ShowTables( + identifiersToString(ctx.showTablesQuery().namespaceRef().identifier())); + case IcebergSQLParser.DescribeStatsStmtContext ctx -> + new QueryPlan.DescribeStats( + identifiersToString(ctx.describeStatsQuery().tableRef().identifier())); + case IcebergSQLParser.ShowLocationStmtContext ctx -> + new QueryPlan.ShowLocation( + identifiersToString(ctx.showLocationQuery().tableRef().identifier())); + case IcebergSQLParser.ShowPoliciesStmtContext ctx -> + new QueryPlan.ShowPolicies( + identifiersToString(ctx.showPoliciesQuery().tableRef().identifier())); + case IcebergSQLParser.DiagnoseStmtContext ctx -> + new QueryPlan.Diagnose( + identifiersToString(ctx.diagnoseQuery().tableRef().identifier())); + case IcebergSQLParser.ExplainStmtContext ctx -> + new QueryPlan.Explain(translateSelect(ctx.explainQuery().selectQuery())); + default -> throw new IllegalArgumentException("Unrecognized statement: " + sql); + }; + } + + private QueryPlan.Select translateSelect(IcebergSQLParser.SelectQueryContext ctx) { + String table = identifiersToString(ctx.tableRef().identifier()); + + List columns = new ArrayList<>(); + if (ctx.columnList() instanceof IcebergSQLParser.NamedColumnsContext columnListCtx) { + columns = columnListCtx.column().stream() + .map(col -> (IcebergSQLParser.SimpleColumnContext) col) + .map(col -> identifiersToString(col.identifier())) + .collect(Collectors.toCollection(ArrayList::new)); + } + + Expression filter = null; + if (ctx.predicate() != null) { + filter = new IcebergExpressionVisitor().visit(ctx.predicate()); + } + + List orderBy = new ArrayList<>(); + if (ctx.orderByList() != null) { + for (var item : ctx.orderByList().orderByItem()) { + boolean asc = item.DESC() == null; + orderBy.add(new QueryPlan.OrderByItem( + SqlUtil.unquote(item.identifier().getText()), asc)); + } + } + + OptionalLong limit = OptionalLong.empty(); + if (ctx.INTEGER_LITERAL() != null) { + limit = OptionalLong.of(Long.parseLong(ctx.INTEGER_LITERAL().getText())); + } + + return new QueryPlan.Select(table, columns, filter, orderBy, limit); + } + + private static String identifiersToString(List identifiers) { + return identifiers.stream() + .map(id -> SqlUtil.unquote(id.getText())) + .collect(Collectors.joining(".")); + } +} diff --git a/polaris-shell/src/main/java/org/apache/polaris/tools/planner/SqlUtil.java b/polaris-shell/src/main/java/org/apache/polaris/tools/planner/SqlUtil.java new file mode 100644 index 00000000..170a972b --- /dev/null +++ b/polaris-shell/src/main/java/org/apache/polaris/tools/planner/SqlUtil.java @@ -0,0 +1,28 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.polaris.tools.planner; + +final class SqlUtil { + static String unquote(String raw) { + return raw.startsWith("`") ? raw.substring(1, raw.length() - 1) : raw; + } + + private SqlUtil() {} +} diff --git a/polaris-shell/src/main/resources/simplelogger.properties b/polaris-shell/src/main/resources/simplelogger.properties new file mode 100644 index 00000000..4b5687e0 --- /dev/null +++ b/polaris-shell/src/main/resources/simplelogger.properties @@ -0,0 +1,7 @@ +# slf4j-simple configuration for the demo fat jar. +# Suppress INFO/WARN/DEBUG chatter from Iceberg, Hadoop, and AWS SDK; +# only show ERROR-level messages so query results are not obscured. +org.slf4j.simpleLogger.defaultLogLevel=error +org.slf4j.simpleLogger.showDateTime=false +org.slf4j.simpleLogger.showThreadName=false +org.slf4j.simpleLogger.showLogName=false diff --git a/polaris-shell/src/test/java/org/apache/polaris/tools/planner/IcebergExpressionVisitorTest.java b/polaris-shell/src/test/java/org/apache/polaris/tools/planner/IcebergExpressionVisitorTest.java new file mode 100644 index 00000000..18e1ff6c --- /dev/null +++ b/polaris-shell/src/test/java/org/apache/polaris/tools/planner/IcebergExpressionVisitorTest.java @@ -0,0 +1,119 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.polaris.tools.planner; + +import static org.assertj.core.api.Assertions.assertThat; + +import org.apache.iceberg.expressions.Expression; +import org.apache.iceberg.expressions.UnboundPredicate; +import org.junit.jupiter.api.Test; + +/** + * Focused unit tests for {@link IcebergExpressionVisitor} literal parsing and predicate building, + * exercised through the full parse pipeline via {@link SqlToQueryPlan}. + */ +class IcebergExpressionVisitorTest { + + private final SqlToQueryPlan translator = new SqlToQueryPlan(); + + /** Parse a WHERE predicate string and return the resulting Iceberg Expression. */ + private Expression parse(String predicate) { + QueryPlan.Select plan = (QueryPlan.Select) translator.translate( + "SELECT * FROM ns.t WHERE " + predicate); + return plan.filter(); + } + + /** Cast the expression to UnboundPredicate for literal value access. */ + @SuppressWarnings("unchecked") + private UnboundPredicate unbound(String predicate) { + return (UnboundPredicate) parse(predicate); + } + + // ── String literal ──────────────────────────────────────────────────────── + + @Test + void stringLiteralStripsQuotes() { + UnboundPredicate pred = unbound("region = 'us-east-1'"); + assertThat(pred.literal().value()).isEqualTo("us-east-1"); + } + + @Test + void escapedSingleQuoteInStringLiteral() { + // SQL '' inside a string literal represents a single quote character + UnboundPredicate pred = unbound("name = 'it''s'"); + assertThat(pred.literal().value()).isEqualTo("it's"); + } + + // ── Numeric literals ────────────────────────────────────────────────────── + + @Test + void intLiteralParsedAsLong() { + UnboundPredicate pred = unbound("id > 42"); + assertThat(pred.literal().value()).isEqualTo(42L); + } + + @Test + void floatLiteralParsedAsDouble() { + UnboundPredicate pred = unbound("score >= 3.14"); + assertThat(pred.literal().value()).isEqualTo(3.14); + } + + // ── Boolean literals ────────────────────────────────────────────────────── + + @Test + void trueLiteralParsedAsBoolean() { + UnboundPredicate pred = unbound("active = true"); + assertThat(pred.literal().value()).isEqualTo(Boolean.TRUE); + } + + @Test + void falseLiteralParsedAsBoolean() { + UnboundPredicate pred = unbound("active = false"); + assertThat(pred.literal().value()).isEqualTo(Boolean.FALSE); + } + + // ── IN predicate ────────────────────────────────────────────────────────── + + @Test + void inPredicateWithMultipleValues() { + Expression e = parse("id IN (1, 2, 3)"); + assertThat(e.op()).isEqualTo(Expression.Operation.IN); + } + + @Test + void notInPredicateOperation() { + Expression e = parse("id NOT IN (1, 2)"); + assertThat(e.op()).isEqualTo(Expression.Operation.NOT_IN); + } + + // ── NULL checks ─────────────────────────────────────────────────────────── + + @Test + void isNullPredicateOperation() { + Expression e = parse("region IS NULL"); + assertThat(e.op()).isEqualTo(Expression.Operation.IS_NULL); + } + + @Test + void isNotNullPredicateOperation() { + Expression e = parse("region IS NOT NULL"); + assertThat(e.op()).isEqualTo(Expression.Operation.NOT_NULL); + } +} diff --git a/polaris-shell/src/test/java/org/apache/polaris/tools/planner/IcebergRestQueryExecutorIntegrationTest.java b/polaris-shell/src/test/java/org/apache/polaris/tools/planner/IcebergRestQueryExecutorIntegrationTest.java new file mode 100644 index 00000000..19b33754 --- /dev/null +++ b/polaris-shell/src/test/java/org/apache/polaris/tools/planner/IcebergRestQueryExecutorIntegrationTest.java @@ -0,0 +1,409 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.polaris.tools.planner; + +import static org.assertj.core.api.Assertions.assertThat; + +import java.net.URI; +import java.net.URLEncoder; +import java.net.http.HttpClient; +import java.net.http.HttpRequest; +import java.net.http.HttpResponse; +import java.nio.charset.StandardCharsets; +import java.time.Duration; +import java.util.regex.Matcher; +import java.util.regex.Pattern; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.UUID; +import org.apache.iceberg.DataFile; +import org.apache.iceberg.DataFiles; +import org.apache.iceberg.FileFormat; +import org.apache.iceberg.PartitionSpec; +import org.apache.iceberg.Schema; +import org.apache.iceberg.Table; +import org.apache.iceberg.catalog.Namespace; +import org.apache.iceberg.catalog.TableIdentifier; +import org.apache.iceberg.data.GenericRecord; +import org.apache.iceberg.data.Record; +import org.apache.iceberg.data.parquet.GenericParquetWriter; +import org.apache.iceberg.io.CloseableIterable; +import org.apache.iceberg.io.FileAppender; +import org.apache.iceberg.io.OutputFile; +import org.apache.iceberg.parquet.Parquet; +import org.apache.iceberg.aws.s3.S3FileIO; +import org.apache.iceberg.rest.RESTCatalog; +import org.apache.iceberg.types.Types; +import org.apache.polaris.test.minio.MinioContainer; +import org.junit.jupiter.api.BeforeAll; +import org.junit.jupiter.api.Test; +import org.junit.jupiter.api.TestInstance; +import org.testcontainers.containers.GenericContainer; +import org.testcontainers.containers.Network; +import org.testcontainers.containers.wait.strategy.Wait; +import org.testcontainers.junit.jupiter.Container; +import org.testcontainers.junit.jupiter.Testcontainers; + +@Testcontainers +@TestInstance(TestInstance.Lifecycle.PER_CLASS) +class IcebergRestQueryExecutorIntegrationTest { + + static final String CATALOG_NAME = "sqltestcatalog"; + static final String NAMESPACE = "testns"; + static final String TABLE_NAME = "events"; + static final String CLIENT_ID = "root"; + static final String CLIENT_SECRET = "s3cr3t"; + static final String MINIO_ACCESS_KEY = "sqltest-ak"; + static final String MINIO_SECRET_KEY = "sqltest-sk"; + static final String BUCKET = "sqltest-bucket"; + static final String MINIO_ALIAS = "minio"; + static final int MINIO_PORT = 9000; + + static final Network network = Network.newNetwork(); + + @Container + static final MinioContainer minio = new MinioContainer( + null, MINIO_ACCESS_KEY, MINIO_SECRET_KEY, BUCKET, "us-east-1") + .withNetwork(network) + .withNetworkAliases(MINIO_ALIAS); + + @Container + static final GenericContainer polaris = new GenericContainer<>("apache/polaris:latest") + .withNetwork(network) + .withExposedPorts(8181, 8182) + .withEnv("POLARIS_BOOTSTRAP_CREDENTIALS", "POLARIS," + CLIENT_ID + "," + CLIENT_SECRET) + .withEnv("quarkus.otel.sdk.disabled", "true") + .withEnv("AWS_REGION", "us-east-1") + .withEnv("AWS_ACCESS_KEY_ID", MINIO_ACCESS_KEY) + .withEnv("AWS_SECRET_ACCESS_KEY", MINIO_SECRET_KEY) + .withEnv("polaris.features.\"ALLOW_INSECURE_STORAGE_TYPES\"", "true") + .withEnv("polaris.features.\"SUPPORTED_CATALOG_STORAGE_TYPES\"", "[\"S3\"]") + .withEnv("polaris.features.\"SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION\"", "true") + .withEnv("polaris.readiness.ignore-severe-issues", "true") + .waitingFor( + Wait.forHttp("/q/health") + .forPort(8182) + .withStartupTimeout(Duration.ofMinutes(3))); + + static RESTCatalog restCatalog; + static Schema schema; + static String polarisBase; + static String token; + + @BeforeAll + void setUpCatalogAndData() throws Exception { + polarisBase = "http://localhost:" + polaris.getMappedPort(8181); + + token = obtainToken(polarisBase, CLIENT_ID, CLIENT_SECRET); + createPolarisS3Catalog(polarisBase, token, minio); + + // Build RESTCatalog pointed at Polaris with MinIO S3 FileIO properties + Map props = new HashMap<>(); + props.put("uri", polarisBase + "/api/catalog"); + props.put("warehouse", CATALOG_NAME); + props.put("token", token); + props.putAll(minio.icebergProperties()); + props.put("s3.path-style-access", "true"); + props.put("io-impl", "org.apache.iceberg.aws.s3.S3FileIO"); + + restCatalog = new RESTCatalog(); + restCatalog.initialize("polaris-inttest", props); + + schema = new Schema( + Types.NestedField.required(1, "id", Types.IntegerType.get()), + Types.NestedField.required(2, "name", Types.StringType.get()), + Types.NestedField.required(3, "value", Types.DoubleType.get())); + + Namespace ns = Namespace.of(NAMESPACE); + restCatalog.createNamespace(ns); + Table table = restCatalog.createTable( + TableIdentifier.of(ns, TABLE_NAME), + schema, + PartitionSpec.unpartitioned()); + + writeTestRecords(table); + } + + // ── SELECT via IcebergRestQueryExecutor ─────────────────────────────────── + + @Test + void selectAllColumnsReturnsAllRows() throws Exception { + SqlToQueryPlan translator = new SqlToQueryPlan(); + QueryPlan plan = translator.translate("SELECT * FROM " + NAMESPACE + "." + TABLE_NAME); + + Map s3Props = new HashMap<>(minio.icebergProperties()); + s3Props.put("s3.path-style-access", "true"); + s3Props.put("io-impl", "org.apache.iceberg.aws.s3.S3FileIO"); + + try (IcebergRestQueryExecutor executor = new IcebergRestQueryExecutor( + polarisBase + "/api/catalog", CATALOG_NAME, token, s3Props); + CloseableIterable records = executor.execute((QueryPlan.Select) plan)) { + + List rows = toList(records); + assertThat(rows).hasSize(3); + } + } + + @Test + void selectWithFilterReturnsMatchingRows() throws Exception { + SqlToQueryPlan translator = new SqlToQueryPlan(); + QueryPlan plan = translator.translate( + "SELECT id, name, value FROM " + NAMESPACE + "." + TABLE_NAME + " WHERE value > 50"); + + Map s3Props = new HashMap<>(minio.icebergProperties()); + s3Props.put("s3.path-style-access", "true"); + s3Props.put("io-impl", "org.apache.iceberg.aws.s3.S3FileIO"); + + try (IcebergRestQueryExecutor executor = new IcebergRestQueryExecutor( + polarisBase + "/api/catalog", CATALOG_NAME, token, s3Props); + CloseableIterable records = executor.executeWithLimit((QueryPlan.Select) plan)) { + + List rows = toList(records); + // Alice (75.0) and Carol (90.0) pass the filter; Bob (25.0) does not + assertThat(rows).hasSize(2); + assertThat(rows).extracting(r -> r.getField("name")) + .containsExactlyInAnyOrder("Alice", "Carol"); + // Verify the predicate actually filtered on value, not just row count + assertThat(rows).extracting(r -> r.getField("value")) + .allSatisfy(v -> assertThat((Double) v).isGreaterThan(50.0)); + } + } + + @Test + void selectWithLimitReturnsAtMostNRows() throws Exception { + SqlToQueryPlan translator = new SqlToQueryPlan(); + QueryPlan plan = translator.translate( + "SELECT * FROM " + NAMESPACE + "." + TABLE_NAME + " LIMIT 1"); + + Map s3Props = new HashMap<>(minio.icebergProperties()); + s3Props.put("s3.path-style-access", "true"); + s3Props.put("io-impl", "org.apache.iceberg.aws.s3.S3FileIO"); + + try (IcebergRestQueryExecutor executor = new IcebergRestQueryExecutor( + polarisBase + "/api/catalog", CATALOG_NAME, token, s3Props); + CloseableIterable records = executor.executeWithLimit((QueryPlan.Select) plan)) { + + List rows = toList(records); + assertThat(rows).hasSize(1); + } + } + + // ── Stats commands via QueryExecutor ────────────────────────────────────── + + @Test + void showTablesReturnsEventsTable() { + QueryExecutor executor = new QueryExecutor(restCatalog); + Object result = executor.execute(new QueryPlan.ShowTables(NAMESPACE)); + + @SuppressWarnings("unchecked") + Map map = (Map) result; + assertThat(map).containsEntry("namespace", NAMESPACE); + assertThat((int) map.get("tableCount")).isGreaterThanOrEqualTo(1); + } + + @Test + void describeStatsReturnsSnapshotInfo() { + QueryExecutor executor = new QueryExecutor(restCatalog); + Object result = executor.execute(new QueryPlan.DescribeStats(NAMESPACE + "." + TABLE_NAME)); + + @SuppressWarnings("unchecked") + Map map = (Map) result; + assertThat((long) map.get("snapshotCount")).isGreaterThanOrEqualTo(1L); + assertThat((long) map.get("currentSnapshotId")).isNotEqualTo(-1L); + } + + @Test + void showLocationReturnsS3Uri() { + QueryExecutor executor = new QueryExecutor(restCatalog); + Object result = executor.execute(new QueryPlan.ShowLocation(NAMESPACE + "." + TABLE_NAME)); + + @SuppressWarnings("unchecked") + Map map = (Map) result; + assertThat((String) map.get("location")).startsWith("s3://"); + } + + // ── EXPLAIN via QueryExecutor ───────────────────────────────────────────── + + @Test + void explainSelectShowsDataFileStats() { + SqlToQueryPlan translator = new SqlToQueryPlan(); + QueryPlan plan = translator.translate( + "EXPLAIN SELECT * FROM " + NAMESPACE + "." + TABLE_NAME + " WHERE value > 50"); + + QueryExecutor executor = new QueryExecutor(restCatalog); + @SuppressWarnings("unchecked") + Map result = (Map) executor.execute(plan); + + assertThat(result).containsKey("snapshotId"); + assertThat((long) result.get("totalDataFiles")).isGreaterThanOrEqualTo(1L); + assertThat((long) result.get("dataFilesAfterFilter")).isGreaterThanOrEqualTo(0L); + assertThat(result.get("warnings")).isInstanceOf(List.class); + } + + // ── Helpers ─────────────────────────────────────────────────────────────── + + /** POST to Polaris /oauth/tokens and extract the access_token value. */ + private static String obtainToken(String base, String clientId, String clientSecret) + throws Exception { + String body = "grant_type=client_credentials" + + "&client_id=" + URLEncoder.encode(clientId, StandardCharsets.UTF_8) + + "&client_secret=" + URLEncoder.encode(clientSecret, StandardCharsets.UTF_8) + + "&scope=PRINCIPAL_ROLE:ALL"; + + HttpRequest req = HttpRequest.newBuilder() + .uri(URI.create(base + "/api/catalog/v1/oauth/tokens")) + .header("Content-Type", "application/x-www-form-urlencoded") + .POST(HttpRequest.BodyPublishers.ofString(body)) + .build(); + + HttpResponse resp = HttpClient.newHttpClient() + .send(req, HttpResponse.BodyHandlers.ofString()); + assertThat(resp.statusCode()).as("OAuth token response").isIn(200, 201); + + String body2 = resp.body(); + Matcher matcher = Pattern.compile("\"access_token\"\\s*:\\s*\"([^\"]+)\"").matcher(body2); + if (!matcher.find()) { + throw new IllegalStateException("No access_token in OAuth response: " + body2); + } + return matcher.group(1); + } + + /** + * Creates a Polaris catalog backed by the MinIO bucket using the Management API. + * Uses a fake roleArn with ALLOW_INSECURE_STORAGE_TYPES enabled so Polaris accepts + * S3 storage without real AWS credential validation. + */ + private static void createPolarisS3Catalog( + String base, String token, MinioContainer minio) throws Exception { + String bucket = minio.bucket(); + // Internal endpoint: Polaris container → MinIO via Docker network alias. + // All catalog properties use this so that both Polaris server-side AND + // vended client-side configs can reach MinIO. Our test JVM overrides the + // endpoint in its own RESTCatalog init properties. + String serverEndpoint = "http://" + MINIO_ALIAS + ":" + MINIO_PORT + "/"; + + String json = String.format( + """ + { + "catalog": { + "type": "INTERNAL", + "name": "%s", + "properties": { + "default-base-location": "s3://%s/warehouse", + "s3.endpoint": "%s", + "s3.path-style-access": "true", + "s3.access-key-id": "%s", + "s3.secret-access-key": "%s", + "table-default.s3.endpoint": "%s", + "table-default.s3.path-style-access": "true", + "table-default.s3.access-key-id": "%s", + "table-default.s3.secret-access-key": "%s" + }, + "storageConfigInfo": { + "storageType": "S3", + "allowedLocations": ["s3://%s"], + "roleArn": "arn:aws:iam::123456789012:role/polaris-inttest", + "endpoint": "%s" + } + } + } + """, + CATALOG_NAME, + bucket, + serverEndpoint, + minio.accessKey(), + minio.secretKey(), + serverEndpoint, + minio.accessKey(), + minio.secretKey(), + bucket, + serverEndpoint); + + HttpRequest req = HttpRequest.newBuilder() + .uri(URI.create(base + "/api/management/v1/catalogs")) + .header("Content-Type", "application/json") + .header("Authorization", "Bearer " + token) + .POST(HttpRequest.BodyPublishers.ofString(json)) + .build(); + + HttpResponse resp = HttpClient.newHttpClient() + .send(req, HttpResponse.BodyHandlers.ofString()); + assertThat(resp.statusCode()) + .as("Create catalog response body: " + resp.body()) + .isIn(200, 201); + } + + /** + * Writes 3 test records to the Iceberg table as a Parquet data file. + * Uses a local S3FileIO configured with the external MinIO endpoint + * (localhost mapped port) because the table's own FileIO is configured + * with the Docker-internal endpoint (minio:9000) which isn't reachable + * from the test JVM. + */ + private static void writeTestRecords(Table table) throws Exception { + List records = List.of( + row(table.schema(), 1, "Alice", 75.0), + row(table.schema(), 2, "Bob", 25.0), + row(table.schema(), 3, "Carol", 90.0)); + + // Create a FileIO that reaches MinIO from the test JVM (localhost port) + S3FileIO localFileIO = new S3FileIO(); + Map ioProps = new HashMap<>(minio.icebergProperties()); + ioProps.put("s3.path-style-access", "true"); + localFileIO.initialize(ioProps); + + String location = table.locationProvider() + .newDataLocation(UUID.randomUUID() + ".parquet"); + OutputFile outputFile = localFileIO.newOutputFile(location); + + FileAppender writer = Parquet.write(outputFile) + .schema(table.schema()) + .createWriterFunc(GenericParquetWriter::create) + .build(); + writer.addAll(records); + writer.close(); + org.apache.iceberg.Metrics fileMetrics = writer.metrics(); + long fileSize = localFileIO.newInputFile(location).getLength(); + DataFile dataFile = DataFiles.builder(table.spec()) + .withPath(location) + .withFileSizeInBytes(fileSize) + .withMetrics(fileMetrics) + .withFormat(FileFormat.PARQUET) + .build(); + + table.newAppend().appendFile(dataFile).commit(); + } + + private static Record row(Schema schema, int id, String name, double value) { + GenericRecord record = GenericRecord.create(schema); + record.setField("id", id); + record.setField("name", name); + record.setField("value", value); + return record; + } + + private static List toList(CloseableIterable iterable) { + List list = new ArrayList<>(); + iterable.forEach(list::add); + return list; + } +} diff --git a/polaris-shell/src/test/java/org/apache/polaris/tools/planner/QueryExecutorTest.java b/polaris-shell/src/test/java/org/apache/polaris/tools/planner/QueryExecutorTest.java new file mode 100644 index 00000000..5321bda3 --- /dev/null +++ b/polaris-shell/src/test/java/org/apache/polaris/tools/planner/QueryExecutorTest.java @@ -0,0 +1,242 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.polaris.tools.planner; + +import static org.assertj.core.api.Assertions.assertThat; +import static org.assertj.core.api.Assertions.assertThatThrownBy; +import static org.mockito.Mockito.lenient; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +import java.util.Collections; +import java.util.List; +import java.util.Map; +import java.util.OptionalLong; +import org.apache.iceberg.DataFile; +import org.apache.iceberg.FileScanTask; +import org.apache.iceberg.ManifestFile; +import org.apache.iceberg.PartitionSpec; +import org.apache.iceberg.Schema; +import org.apache.iceberg.Snapshot; +import org.apache.iceberg.Table; +import org.apache.iceberg.TableScan; +import org.apache.iceberg.catalog.Catalog; +import org.apache.iceberg.catalog.Namespace; +import org.apache.iceberg.catalog.TableIdentifier; +import org.apache.iceberg.io.CloseableIterable; +import org.apache.iceberg.io.FileIO; +import org.apache.iceberg.types.Types; +import org.junit.jupiter.api.BeforeEach; +import org.junit.jupiter.api.Test; +import org.junit.jupiter.api.extension.ExtendWith; +import org.mockito.Mock; +import org.mockito.junit.jupiter.MockitoExtension; + +@ExtendWith(MockitoExtension.class) +class QueryExecutorTest { + + @Mock Catalog catalog; + @Mock Table table; + @Mock Snapshot snapshot; + @Mock TableScan tableScan; + + QueryExecutor executor; + + static final Namespace NS = Namespace.of("prod"); + static final TableIdentifier TABLE_ID = TableIdentifier.of(NS, "events"); + static final String NAMESPACED_TABLE = "prod.events"; + + @BeforeEach + void setUp() { + executor = new QueryExecutor(catalog); + lenient().when(catalog.loadTable(TABLE_ID)).thenReturn(table); + } + + @Test + void selectPlanThrowsIllegalArgumentException() { + QueryPlan.Select selectPlan = new QueryPlan.Select( + NAMESPACED_TABLE, List.of(), null, List.of(), OptionalLong.empty()); + assertThatThrownBy(() -> executor.execute(selectPlan)) + .isInstanceOf(IllegalArgumentException.class) + .hasMessageContaining("SELECT"); + } + + @Test + void showTablesReturnsNamespaceAndCount() { + List tables = List.of(TABLE_ID, TableIdentifier.of(NS, "logs")); + when(catalog.listTables(NS)).thenReturn(tables); + + Object result = executor.execute(new QueryPlan.ShowTables("prod")); + + assertThat(result).isInstanceOf(Map.class); + @SuppressWarnings("unchecked") + Map map = (Map) result; + assertThat(map).containsEntry("namespace", "prod"); + assertThat(map).containsEntry("tableCount", 2); + assertThat(map.get("tables")).isEqualTo(tables); + } + + @Test + void describeStatsReturnsSnapshotInfo() { + Schema schema = new Schema( + Types.NestedField.required(1, "id", Types.IntegerType.get())); + when(table.currentSnapshot()).thenReturn(snapshot); + when(table.snapshots()).thenReturn(List.of(snapshot, snapshot)); + when(table.spec()).thenReturn(PartitionSpec.unpartitioned()); + when(table.schema()).thenReturn(schema); + when(snapshot.snapshotId()).thenReturn(42L); + + Object result = executor.execute(new QueryPlan.DescribeStats(NAMESPACED_TABLE)); + + assertThat(result).isInstanceOf(Map.class); + @SuppressWarnings("unchecked") + Map map = (Map) result; + assertThat(map).containsEntry("snapshotCount", 2L); + assertThat(map).containsEntry("currentSnapshotId", 42L); + } + + @Test + void describeStatsHandlesNoSnapshot() { + when(table.currentSnapshot()).thenReturn(null); + when(table.snapshots()).thenReturn(Collections.emptyList()); + when(table.spec()).thenReturn(PartitionSpec.unpartitioned()); + when(table.schema()).thenReturn(new Schema()); + + Object result = executor.execute(new QueryPlan.DescribeStats(NAMESPACED_TABLE)); + + @SuppressWarnings("unchecked") + Map map = (Map) result; + assertThat(map).containsEntry("snapshotCount", 0L); + assertThat(map).containsEntry("currentSnapshotId", -1L); + } + + @Test + void showLocationReturnsTableLocation() { + when(table.location()).thenReturn("s3://my-bucket/warehouse/prod/events"); + + Object result = executor.execute(new QueryPlan.ShowLocation(NAMESPACED_TABLE)); + + @SuppressWarnings("unchecked") + Map map = (Map) result; + assertThat(map).containsEntry("location", "s3://my-bucket/warehouse/prod/events"); + } + + @Test + void showPoliciesReturnsPolicyPropertiesOnly() { + Map props = Map.of( + "write.format.default", "parquet", + "polaris.policy.retention", "30d"); + when(table.properties()).thenReturn(props); + + Object result = executor.execute(new QueryPlan.ShowPolicies(NAMESPACED_TABLE)); + + @SuppressWarnings("unchecked") + Map policies = (Map) result; + assertThat(policies).containsOnlyKeys("polaris.policy.retention"); + assertThat(policies).doesNotContainKey("write.format.default"); + } + + @Test + void diagnoseReturnsSmallFileCount() { + DataFile smallFile = mock(DataFile.class); + DataFile largeFile = mock(DataFile.class); + FileScanTask smallTask = mock(FileScanTask.class); + FileScanTask largeTask = mock(FileScanTask.class); + + long threshold = 128 * 1024 * 1024L; + when(table.currentSnapshot()).thenReturn(snapshot); + when(table.newScan()).thenReturn(tableScan); + when(tableScan.planFiles()).thenReturn(CloseableIterable.withNoopClose( + List.of(smallTask, largeTask))); + when(smallTask.file()).thenReturn(smallFile); + when(largeTask.file()).thenReturn(largeFile); + when(smallFile.fileSizeInBytes()).thenReturn(1024L); + when(largeFile.fileSizeInBytes()).thenReturn(threshold + 1); + + Object result = executor.execute(new QueryPlan.Diagnose(NAMESPACED_TABLE)); + + @SuppressWarnings("unchecked") + Map map = (Map) result; + assertThat(map).containsEntry("smallFileThresholdBytes", threshold); + assertThat(map).containsEntry("smallFileCount", 1L); + } + + @Test + void diagnoseWithNoSnapshotReturnsZeroCount() { + when(table.currentSnapshot()).thenReturn(null); + + Object result = executor.execute(new QueryPlan.Diagnose(NAMESPACED_TABLE)); + + @SuppressWarnings("unchecked") + Map map = (Map) result; + assertThat(map).containsEntry("smallFileCount", 0L); + } + + @Test + void explainReturnsDataFileStatsWithWarnings() { + long threshold = 128 * 1024 * 1024L; + long smallFileSize = 1024L; + long largeFileSize = threshold + 1; + + DataFile smallFile = mock(DataFile.class); + DataFile largeFile = mock(DataFile.class); + FileScanTask smallTask = mock(FileScanTask.class); + FileScanTask largeTask = mock(FileScanTask.class); + ManifestFile manifest1 = mock(ManifestFile.class); + ManifestFile manifest2 = mock(ManifestFile.class); + FileIO fileIO = mock(FileIO.class); + Schema schema = new Schema( + Types.NestedField.required(1, "id", Types.IntegerType.get())); + + when(table.currentSnapshot()).thenReturn(snapshot); + when(snapshot.snapshotId()).thenReturn(99L); + when(snapshot.timestampMillis()).thenReturn(1000L); + when(snapshot.dataManifests(fileIO)).thenReturn(List.of(manifest1, manifest2)); + when(table.io()).thenReturn(fileIO); + when(table.newScan()).thenReturn(tableScan); + when(tableScan.planFiles()).thenReturn(CloseableIterable.withNoopClose( + List.of(smallTask, largeTask))); + when(smallTask.file()).thenReturn(smallFile); + when(largeTask.file()).thenReturn(largeFile); + when(smallFile.fileSizeInBytes()).thenReturn(smallFileSize); + when(largeFile.fileSizeInBytes()).thenReturn(largeFileSize); + when(smallFile.valueCounts()).thenReturn(Map.of(1, 10L)); + when(largeFile.valueCounts()).thenReturn(Map.of(1, 100L)); + when(table.spec()).thenReturn(PartitionSpec.unpartitioned()); + when(table.schema()).thenReturn(schema); + + QueryPlan.Select select = new QueryPlan.Select( + NAMESPACED_TABLE, List.of(), null, List.of(), OptionalLong.empty()); + QueryPlan plan = new QueryPlan.Explain(select); + + Object result = executor.execute(plan); + + assertThat(result).isInstanceOf(Map.class); + @SuppressWarnings("unchecked") + Map map = (Map) result; + assertThat(map).containsEntry("dataFilesAfterFilter", 2L); + assertThat(map).containsEntry("totalDataFiles", 2L); + assertThat(map).containsEntry("snapshotId", 99L); + @SuppressWarnings("unchecked") + List warnings = (List) map.get("warnings"); + assertThat(warnings).isNotEmpty(); + assertThat(warnings.get(0)).contains("128 MiB"); + } +} diff --git a/polaris-shell/src/test/java/org/apache/polaris/tools/planner/SqlToQueryPlanTest.java b/polaris-shell/src/test/java/org/apache/polaris/tools/planner/SqlToQueryPlanTest.java new file mode 100644 index 00000000..5254212a --- /dev/null +++ b/polaris-shell/src/test/java/org/apache/polaris/tools/planner/SqlToQueryPlanTest.java @@ -0,0 +1,304 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.polaris.tools.planner; + +import static org.assertj.core.api.Assertions.assertThat; +import static org.assertj.core.api.Assertions.assertThatThrownBy; + +import org.apache.iceberg.expressions.Expression; +import org.junit.jupiter.api.Test; + +class SqlToQueryPlanTest { + + private final SqlToQueryPlan translator = new SqlToQueryPlan(); + + @Test + void selectStarProducesSelectPlan() { + QueryPlan plan = translator.translate("SELECT * FROM prod.events"); + + assertThat(plan).isInstanceOf(QueryPlan.Select.class); + QueryPlan.Select select = (QueryPlan.Select) plan; + assertThat(select.namespacedTable()).isEqualTo("prod.events"); + assertThat(select.projectedColumns()).isEmpty(); + assertThat(select.filter()).isNull(); + assertThat(select.orderBy()).isEmpty(); + assertThat(select.limit()).isEmpty(); + } + + @Test + void selectWithColumnsFilterAndLimit() { + QueryPlan plan = translator.translate( + "SELECT region, sales FROM prod.events WHERE sales > 1000 LIMIT 50"); + + assertThat(plan).isInstanceOf(QueryPlan.Select.class); + QueryPlan.Select select = (QueryPlan.Select) plan; + assertThat(select.namespacedTable()).isEqualTo("prod.events"); + assertThat(select.projectedColumns()).containsExactly("region", "sales"); + assertThat(select.filter()).isNotNull(); + assertThat(select.filter().op()).isEqualTo(Expression.Operation.GT); + assertThat(select.orderBy()).isEmpty(); + assertThat(select.limit()).hasValue(50L); + } + + @Test + void showTablesProducesShowTablesPlan() { + QueryPlan plan = translator.translate("SHOW TABLES IN prod.logs"); + + assertThat(plan).isInstanceOf(QueryPlan.ShowTables.class); + assertThat(((QueryPlan.ShowTables) plan).namespace()).isEqualTo("prod.logs"); + } + + @Test + void describeStatsProducesDescribeStatsPlan() { + QueryPlan plan = translator.translate("DESCRIBE STATS prod.events"); + + assertThat(plan).isInstanceOf(QueryPlan.DescribeStats.class); + assertThat(((QueryPlan.DescribeStats) plan).namespacedTable()).isEqualTo("prod.events"); + } + + @Test + void showLocationProducesShowLocationPlan() { + QueryPlan plan = translator.translate("SHOW TABLE LOCATION prod.events"); + + assertThat(plan).isInstanceOf(QueryPlan.ShowLocation.class); + assertThat(((QueryPlan.ShowLocation) plan).namespacedTable()).isEqualTo("prod.events"); + } + + @Test + void showPoliciesProducesShowPoliciesPlan() { + QueryPlan plan = translator.translate("SHOW TABLE POLICIES prod.events"); + + assertThat(plan).isInstanceOf(QueryPlan.ShowPolicies.class); + assertThat(((QueryPlan.ShowPolicies) plan).namespacedTable()).isEqualTo("prod.events"); + } + + @Test + void diagnoseTableProducesDiagnosePlan() { + QueryPlan plan = translator.translate("DIAGNOSE TABLE prod.events"); + + assertThat(plan).isInstanceOf(QueryPlan.Diagnose.class); + assertThat(((QueryPlan.Diagnose) plan).namespacedTable()).isEqualTo("prod.events"); + } + + @Test + void deeplyNestedNamespace() { + QueryPlan plan = translator.translate("SHOW TABLES IN catalog.schema.db"); + + assertThat(plan).isInstanceOf(QueryPlan.ShowTables.class); + assertThat(((QueryPlan.ShowTables) plan).namespace()).isEqualTo("catalog.schema.db"); + } + + @Test + void invalidSqlThrowsIllegalArgumentException() { + assertThatThrownBy(() -> translator.translate("NOT VALID SQL !!!")) + .isInstanceOf(IllegalArgumentException.class); + } + + @Test + void syntaxErrorThrowsIllegalArgumentException() { + assertThatThrownBy(() -> translator.translate("SELECT FROM")) + .isInstanceOf(IllegalArgumentException.class); + } + + @Test + void emptyStringThrows() { + assertThatThrownBy(() -> translator.translate("")) + .isInstanceOf(IllegalArgumentException.class); + } + + @Test + void selectWithAndPredicate() { + QueryPlan plan = translator.translate( + "SELECT id FROM ns.tbl WHERE id > 0 AND id < 100"); + + assertThat(plan).isInstanceOf(QueryPlan.Select.class); + QueryPlan.Select select = (QueryPlan.Select) plan; + assertThat(select.filter()).isNotNull(); + assertThat(select.filter().op()).isEqualTo(Expression.Operation.AND); + } + + @Test + void selectWithOrPredicate() { + QueryPlan.Select select = (QueryPlan.Select) translator.translate( + "SELECT id FROM ns.tbl WHERE id < 0 OR id > 100"); + assertThat(select.filter().op()).isEqualTo(Expression.Operation.OR); + } + + @Test + void selectWithNotPredicate() { + QueryPlan.Select select = (QueryPlan.Select) translator.translate( + "SELECT id FROM ns.tbl WHERE NOT id = 5"); + assertThat(select.filter().op()).isEqualTo(Expression.Operation.NOT); + } + + @Test + void selectWithInPredicate() { + QueryPlan.Select select = (QueryPlan.Select) translator.translate( + "SELECT id FROM ns.tbl WHERE region IN ('us-east-1', 'eu-west-1')"); + assertThat(select.filter().op()).isEqualTo(Expression.Operation.IN); + } + + @Test + void selectWithNotInPredicate() { + QueryPlan.Select select = (QueryPlan.Select) translator.translate( + "SELECT id FROM ns.tbl WHERE region NOT IN ('us-east-1')"); + assertThat(select.filter().op()).isEqualTo(Expression.Operation.NOT_IN); + } + + @Test + void selectWithIsNullPredicate() { + QueryPlan.Select select = (QueryPlan.Select) translator.translate( + "SELECT id FROM ns.tbl WHERE region IS NULL"); + assertThat(select.filter().op()).isEqualTo(Expression.Operation.IS_NULL); + } + + @Test + void selectWithIsNotNullPredicate() { + QueryPlan.Select select = (QueryPlan.Select) translator.translate( + "SELECT id FROM ns.tbl WHERE region IS NOT NULL"); + assertThat(select.filter().op()).isEqualTo(Expression.Operation.NOT_NULL); + } + + @Test + void selectWithAllComparisonOperators() { + assertThat(filterOp("SELECT id FROM ns.t WHERE id < 10")).isEqualTo(Expression.Operation.LT); + assertThat(filterOp("SELECT id FROM ns.t WHERE id <= 10")).isEqualTo(Expression.Operation.LT_EQ); + assertThat(filterOp("SELECT id FROM ns.t WHERE id >= 10")).isEqualTo(Expression.Operation.GT_EQ); + assertThat(filterOp("SELECT id FROM ns.t WHERE id != 10")).isEqualTo(Expression.Operation.NOT_EQ); + assertThat(filterOp("SELECT id FROM ns.t WHERE id = 10")).isEqualTo(Expression.Operation.EQ); + } + + @Test + void selectWithNestedParentheses() { + QueryPlan.Select select = (QueryPlan.Select) translator.translate( + "SELECT id FROM ns.tbl WHERE (id > 0 AND id < 10) OR id = 99"); + assertThat(select.filter()).isNotNull(); + assertThat(select.filter().op()).isEqualTo(Expression.Operation.OR); + } + + @Test + void selectWithStringLiteralFilter() { + QueryPlan.Select select = (QueryPlan.Select) translator.translate( + "SELECT id FROM ns.tbl WHERE region = 'us-east-1'"); + assertThat(select.filter().op()).isEqualTo(Expression.Operation.EQ); + } + + @Test + void selectWithFloatLiteralFilter() { + QueryPlan.Select select = (QueryPlan.Select) translator.translate( + "SELECT id FROM ns.tbl WHERE score >= 0.5"); + assertThat(select.filter().op()).isEqualTo(Expression.Operation.GT_EQ); + } + + @Test + void selectWithBooleanLiteralFilter() { + QueryPlan.Select select = (QueryPlan.Select) translator.translate( + "SELECT id FROM ns.tbl WHERE active = true"); + assertThat(select.filter().op()).isEqualTo(Expression.Operation.EQ); + } + + // ── LIMIT edge cases ────────────────────────────────────────────────────── + + @Test + void limitZeroIsValid() { + QueryPlan.Select select = (QueryPlan.Select) translator.translate( + "SELECT * FROM ns.t LIMIT 0"); + assertThat(select.limit()).hasValue(0L); + } + + @Test + void selectWithLimitButNoFilter() { + QueryPlan.Select select = (QueryPlan.Select) translator.translate( + "SELECT id FROM ns.t LIMIT 5"); + assertThat(select.filter()).isNull(); + assertThat(select.limit()).hasValue(5L); + } + + // ── EXPLAIN ────────────────────────────────────────────────────────────── + + @Test + void explainSelectProducesExplainPlan() { + QueryPlan plan = translator.translate("EXPLAIN SELECT * FROM ns.t"); + + assertThat(plan).isInstanceOf(QueryPlan.Explain.class); + QueryPlan.Explain explain = (QueryPlan.Explain) plan; + assertThat(explain.innerSelect().namespacedTable()).isEqualTo("ns.t"); + } + + // ── ORDER BY ───────────────────────────────────────────────────────────── + + @Test + void selectWithOrderByAsc() { + QueryPlan.Select select = (QueryPlan.Select) translator.translate( + "SELECT * FROM ns.t ORDER BY col"); + + assertThat(select.orderBy()).hasSize(1); + assertThat(select.orderBy().get(0).column()).isEqualTo("col"); + assertThat(select.orderBy().get(0).ascending()).isTrue(); + } + + @Test + void selectWithOrderByDesc() { + QueryPlan.Select select = (QueryPlan.Select) translator.translate( + "SELECT * FROM ns.t ORDER BY col DESC"); + + assertThat(select.orderBy()).hasSize(1); + assertThat(select.orderBy().get(0).column()).isEqualTo("col"); + assertThat(select.orderBy().get(0).ascending()).isFalse(); + } + + @Test + void selectWithMultiColumnOrderBy() { + QueryPlan.Select select = (QueryPlan.Select) translator.translate( + "SELECT * FROM ns.t ORDER BY a ASC, b DESC"); + + assertThat(select.orderBy()).hasSize(2); + assertThat(select.orderBy().get(0).column()).isEqualTo("a"); + assertThat(select.orderBy().get(0).ascending()).isTrue(); + assertThat(select.orderBy().get(1).column()).isEqualTo("b"); + assertThat(select.orderBy().get(1).ascending()).isFalse(); + } + + // ── Backtick identifiers ────────────────────────────────────────────────── + + @Test + void backtickTableRef() { + QueryPlan plan = translator.translate("SELECT * FROM `my-ns`.`my-table`"); + + assertThat(plan).isInstanceOf(QueryPlan.Select.class); + QueryPlan.Select select = (QueryPlan.Select) plan; + assertThat(select.namespacedTable()).isEqualTo("my-ns.my-table"); + } + + @Test + void backtickColumnInWhere() { + QueryPlan.Select select = (QueryPlan.Select) translator.translate( + "SELECT * FROM ns.tbl WHERE `event-type` = 'click'"); + + assertThat(select.filter()).isNotNull(); + assertThat(select.filter().op()).isEqualTo(Expression.Operation.EQ); + } + + // ── Helper ─────────────────────────────────────────────────────────────── + + private Expression.Operation filterOp(String sql) { + return ((QueryPlan.Select) translator.translate(sql)).filter().op(); + } +}