QUA-1632: Add required permissions for each connector on the connections page #1086

Merged
shindiogawa merged 4 commits into main from qua-1632-add-required-permissions-for-each-connector-on-the
Mar 27, 2026

@RafaelOsiro RafaelOsiro commented Mar 26, 2026

Overview

This PR adds detailed permissions documentation for all connectors on the Connections page, addressing QUA-1632. Each connector now documents the minimum database/cloud permissions required to configure it as a source and/or enrichment datastore in Qualytics, following the same standardized format established by the Athena connector (QUA-1634).

The documentation was cross-referenced against the controlplane (connection.py, specifications/), dataplane (JDBCStore.scala, DFSStore.scala, NativeStore.scala, EnrichmentConfig.scala), and frontend (data-models.ts, datastore-form.ts) codebases to ensure all permissions, authentication methods, and connection properties are accurately documented.

Key Changes

Permissions and SQL/IAM Examples (18 connectors)

  • PostgreSQL, MySQL, MariaDB, TimescaleDB: SQL GRANT examples for the PostgreSQL- and MySQL-family connectors, including read-only and read-write roles, with ALTER DEFAULT PRIVILEGES for future tables where PostgreSQL supports it
  • Microsoft SQL Server, Synapse: T-SQL GRANT examples with sys.schemas / sys.database_principals discovery notes and Service Principal authentication guidance
  • Oracle: CREATE SESSION and SELECT grants with both schema-level and role-based access control examples, plus TCP/TCPS protocol documentation
  • DB2: SYSCAT.SCHEMATA / SYSCAT.TABLES system catalog access, CREATEIN / ALTERIN / DROPIN grants for enrichment, and SSL toggle documentation
  • Redshift: Schema-level grants with ALTER DEFAULT PRIVILEGES for future tables
  • Databricks: Unity Catalog permissions (USAGE, SELECT, MODIFY, CREATE TABLE), CAN USE compute requirement, and OAuth M2M authentication documentation
  • Teradata: LOGON, SELECT, SHOW, and SELECT ON DBC.DatabasesV permissions with system database filtering details and LDAP authentication notes
  • Hive: SELECT permissions with Kerberos authentication and ZooKeeper HA toggle documentation
  • Presto, Trino: File-based access control (rules.json) and connector-level security model examples
  • Dremio: SQL GRANT examples with PAT and Basic authentication method details
  • Fabric Analytics: Service Principal permissions, Contributor role requirements, tenant setting prerequisites, and Azure CLI verification examples
  • Azure Datalake Storage: RBAC role requirements (Storage Blob Data Reader / Storage Blob Data Contributor) with Example IAM Role Assignment JSON and Azure CLI commands
  • Google Cloud Storage: IAM permissions with Example IAM Policy JSON (roles/storage.objectViewer / roles/storage.objectAdmin), GCS Roles Summary table, and gcloud CLI commands
  • BigQuery, Snowflake, Amazon S3: These already had permissions sections -- added Troubleshooting Common Errors and Detailed Troubleshooting Notes to match the standardized format
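
As an illustration of the read-only pattern described for the PostgreSQL-family connectors, here is a minimal sketch. The role name `qualytics_read`, the database `analytics`, the schema `public`, and the password are hypothetical placeholders, not values taken from the PR; the exact statements in the merged docs may differ.

```sql
-- Hypothetical names: role qualytics_read, database analytics, schema public.
-- Replace the placeholder password before use.
CREATE ROLE qualytics_read WITH LOGIN PASSWORD 'change_me';

-- Allow the role to connect and to resolve objects in the schema
GRANT CONNECT ON DATABASE analytics TO qualytics_read;
GRANT USAGE ON SCHEMA public TO qualytics_read;

-- Read access to tables that exist today
GRANT SELECT ON ALL TABLES IN SCHEMA public TO qualytics_read;

-- Read access to tables created later in this schema
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT ON TABLES TO qualytics_read;
```

Note that ALTER DEFAULT PRIVILEGES without a FOR ROLE clause only affects objects created by the role that runs the statement, so it is usually run as (or for) the role that owns future tables.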

Troubleshooting (all 21 connectors)

Every connector now includes:

  • Troubleshooting Common Errors: Table with Error | Likely Cause | Fix columns
  • Detailed Troubleshooting Notes: In-depth subsections for Authentication Errors, Permission Errors, and Connection Errors with bullet-point common causes and debugging tips

Enrichment Permissions Fix

Updated enrichment permissions tables to match actual dataplane write operations (JDBCStore.scala, EnrichmentConfig.scala):

  • Added DROP TABLE to PostgreSQL, SQL Server, Synapse
  • Added ALTER TABLE + DROP TABLE to Redshift, Trino
  • Added DROPIN ON SCHEMA to DB2
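
The DB2 schema privileges named above could be granted roughly as follows. This is a sketch under assumptions: the schema `ENRICHMENT` and the user `QUALYTICS_RW` are illustrative names, and the user is assumed to exist already.

```sql
-- Hypothetical names: schema ENRICHMENT, user QUALYTICS_RW.
-- CREATEIN: create objects in the schema
-- ALTERIN:  alter objects in the schema
-- DROPIN:   drop objects in the schema
GRANT CREATEIN, ALTERIN, DROPIN ON SCHEMA ENRICHMENT TO USER QUALYTICS_RW;
```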

Connection Properties from Controlplane

Documented missing connection properties found in connection.py:

  • PostgreSQL: track_commit_timestamp config for incremental profiling
  • Databricks: OAuth M2M authentication (Service Principal Application ID + OAuth Secret)
  • Oracle: TCP/TCPS protocol selector
  • Hive: ZooKeeper HA toggle
  • DB2: SSL toggle
  • Teradata: SELECT ON DBC.DatabasesV for catalog discovery
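
For the PostgreSQL property in the list above, `track_commit_timestamp` is a server-level setting. Here is a minimal sketch of checking and enabling it, assuming superuser access on a self-managed server; how Qualytics consumes commit timestamps is not described here.

```sql
-- Check the current value (off by default)
SHOW track_commit_timestamp;

-- Persist the setting; requires superuser and only takes effect
-- after the server is restarted
ALTER SYSTEM SET track_commit_timestamp = on;
```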

Greptile Review Fix

  • MySQL, MariaDB: Added missing GRANT PROCESS ON *.* to both source and enrichment SQL examples -- PROCESS is a global-level privilege that cannot be granted at database scope

Spell Check

  • Added CREATEIN and ALTERIN to .typos.toml dictionary

Screenshots

N/A -- text-only documentation changes.

…connections page

Add detailed Setup Guide sections with minimum permissions, SQL grant
examples, and troubleshooting tables for 18 connectors: PostgreSQL,
MySQL, MariaDB, TimescaleDB, Microsoft SQL Server, Synapse, Oracle,
DB2, Redshift, Databricks, Teradata, Hive, Presto, Trino, Dremio,
Fabric Analytics, Azure Datalake Storage, and Google Cloud Storage.

Each connector now documents:
- Minimum permissions for source datastore (read-only)
- Additional permissions for enrichment datastore (read-write) where supported
- Ready-to-use SQL scripts or IAM policies
- Troubleshooting common errors table
@RafaelOsiro RafaelOsiro self-assigned this Mar 26, 2026
@RafaelOsiro RafaelOsiro added the documentation Improvements or additions to documentation label Mar 26, 2026
…pts for all connectors

Standardize all 18 connector permission sections to match the Athena
documentation pattern. Each connector now includes:

- Example scripts: Added ready-to-copy code blocks for Presto
  (rules.json), Trino (rules.json), Dremio (SQL GRANT), Fabric
  Analytics (Azure CLI), Azure Data Lake Storage (az role assignment),
  and Google Cloud Storage (gsutil iam ch)
- Detailed Troubleshooting Notes: Added subsections for Authentication
  Errors, Permission Errors, and Connection Errors with bullet-point
  common causes and debugging tips for all 18 connectors
- Admonitions: Added missing notes for DB2 (SYSCAT system catalogs)
  and Trino (connector-level security)
@RafaelOsiro RafaelOsiro marked this pull request as ready for review March 26, 2026 21:58

greptile-apps bot commented Mar 26, 2026

Greptile Summary

This PR adds detailed permissions documentation and troubleshooting guides for 18 connector types that were previously missing Setup Guide sections. Each connector now documents minimum required privileges for source and/or enrichment datastores, includes copy-paste-ready SQL/CLI grant examples, and provides a structured troubleshooting table plus detailed notes for authentication, permission, and connection errors.

Key observations:

- The documentation is comprehensive, consistently formatted, and technically accurate across the vast majority of connectors.
- MySQL and MariaDB: The PROCESS privilege is listed as a minimum required permission in both files' permission tables, but it is absent from all example GRANT statements. In MySQL and MariaDB, PROCESS is a global (administrative) privilege that must be granted at the *.* scope (GRANT PROCESS ON *.* TO 'user'@'%'); it cannot be included in a database-scoped grant. The examples need a separate global GRANT PROCESS statement added to be complete and actionable.
- Connectors correctly marked as source-only (no enrichment): Oracle, Hive, TimescaleDB, Teradata, Dremio, Presto, Fabric Analytics.
- Trino correctly supports enrichment and documents the additional CREATE TABLE / INSERT / DELETE permissions needed.
- The SQL Server and Synapse docs appropriately share the same permission model while calling out Synapse-specific nuances (e.g., paused SQL pool).
- The .typos.toml update correctly adds CREATEIN and ALTERIN as known DB2 keywords.

Confidence Score: 4/5

Safe to merge after fixing the PROCESS privilege omission in the MySQL and MariaDB grant examples.

The PR is a large, well-structured documentation addition that is accurate across 16 of 18 connectors. The one concrete fix needed is adding a global GRANT PROCESS ON *.* TO ... statement to the MySQL and MariaDB grant examples, since PROCESS is listed as required but absent from the copy-paste examples. Without it, users would be left with an incomplete setup for those two connectors. All other content (permissions tables, SQL/CLI examples, troubleshooting tables, and detailed notes) is technically sound and consistently formatted.

docs/source-datastore/add-datastores/mysql.md and docs/source-datastore/add-datastores/maria-db.md — both are missing the global PROCESS privilege grant in their example SQL blocks.

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/source-datastore/add-datastores/mysql.md | Adds MySQL permissions documentation, but the PROCESS privilege (listed as required) is absent from the example GRANT statements; PROCESS is a global privilege and must be granted separately at the *.* scope. |
| docs/source-datastore/add-datastores/maria-db.md | Same PROCESS privilege omission as mysql.md: the minimum permissions table lists PROCESS as required but the example GRANT statements omit it; PROCESS must be granted globally with ON *.* in MariaDB. |
| docs/source-datastore/add-datastores/postgresql.md | Adds complete PostgreSQL permissions with ALTER DEFAULT PRIVILEGES for future table coverage; includes both read-only and read-write role examples. |
| docs/source-datastore/add-datastores/databricks.md | Adds Unity Catalog permissions (USAGE, SELECT, MODIFY, CREATE TABLE) and compute access requirements with PAT and OAuth M2M notes; troubleshooting covers authentication, permission, and compute errors. |
| docs/source-datastore/add-datastores/db2.md | Adds DB2-specific permissions including SYSCAT catalog access and CREATEIN/ALTERIN schema grants for enrichment; SQL examples and error code troubleshooting are accurate. |
| docs/source-datastore/add-datastores/microsoft-sql-server.md | Adds T-SQL GRANT examples for both SQL auth and Service Principal auth, with sys.schemas discovery notes and detailed troubleshooting for login vs. user model errors. |
| docs/source-datastore/add-datastores/synapse.md | Adds Synapse-specific permissions mirroring SQL Server's model; correctly notes paused SQL pool as a connection error cause unique to Synapse. |
| docs/source-datastore/add-datastores/azure-datalake-storage.md | Adds comprehensive RBAC role documentation for both Access Key and Service Principal authentication, with CLI examples and detailed troubleshooting for authentication, permission, and connection errors. |
| docs/source-datastore/add-datastores/google-cloud-storage.md | Adds IAM permissions with objectViewer/objectAdmin role summary, gcloud CLI examples, and uniform bucket-level access note; also removes a trailing whitespace from the existing warning block. |
| docs/source-datastore/add-datastores/redshift.md | Adds Redshift permissions with ALTER DEFAULT PRIVILEGES for future tables; correctly notes VPC/security-group as the key connection barrier distinct from PostgreSQL. |
| docs/source-datastore/add-datastores/fabric-analytics.md | Adds Fabric Analytics Contributor role requirements and tenant setting prerequisites; correctly notes enrichment is not supported and covers Service Principal setup verification via Azure CLI. |
| docs/source-datastore/add-datastores/oracle.md | Adds CREATE SESSION and SELECT permissions with both broad (SELECT ANY TABLE) and restrictive (per-table) options, plus a role-based PL/SQL example; correctly notes Oracle is source-only. |
| docs/source-datastore/add-datastores/trino.md | Adds Trino permissions for all three security models including enrichment write permissions; consistent with Presto treatment except Trino supports enrichment datastores. |
| docs/source-datastore/add-datastores/presto.md | Adds Presto permissions covering all three security models (none, file-based, connector-level) with a rules.json example; correctly notes enrichment is not supported. |
| docs/source-datastore/add-datastores/hive.md | Adds Hive SELECT permissions with Kerberos authentication notes and Ranger/Sentry policy callout; correctly notes enrichment is not supported. |
| docs/source-datastore/add-datastores/teradata.md | Adds Teradata LOGON, SELECT, and SHOW permissions with LDAP authentication notes and system database filtering details; correctly notes enrichment is not supported. |
| docs/source-datastore/add-datastores/timescale-db.md | Adds TimescaleDB permissions following PostgreSQL conventions with hypertable-specific callouts; correctly notes enrichment is not supported and filters timescaledb_* internal schemas. |
| docs/source-datastore/add-datastores/dremio.md | Adds Dremio SELECT permissions with PAT and Basic auth methods; correctly notes enrichment is not supported and covers Cloud-specific Project ID requirement. |
| .typos.toml | Adds CREATEIN and ALTERIN to the spellcheck dictionary to prevent false positives for valid DB2 privilege keywords. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Qualytics Connector] --> B{Auth Method}
    B -->|SQL Auth / Basic| C[Username + Password]
    B -->|Service Principal / PAT / OAuth| D[Token / Client Credentials]
    B -->|Access Key| E[Account Key / HMAC Key]

    C --> F{Check Permissions}
    D --> F
    E --> F

    F -->|Source Datastore| G[Read-Only Grants\nSELECT / USAGE / CONNECT\nObjectViewer / Data Reader]
    F -->|Enrichment Datastore| H[Read-Write Grants\n+ CREATE TABLE / INSERT\n+ UPDATE / DELETE\n+ Contributor Role]

    G --> I{Supported?}
    H --> J{Supported?}

    I -->|Yes - all connectors| K[Source Datastore Ready]
    J -->|No - Oracle, Hive, TimescaleDB\nTeradata, Dremio, Presto, Fabric| L[Use Different Enrichment Datastore]
    J -->|Yes - PostgreSQL, MySQL, MariaDB\nSQL Server, Synapse, Redshift\nDatabricks, DB2, Trino\nAzure ADLS, GCS| M[Enrichment Datastore Ready]

Comments Outside Diff (1)

  1. docs/source-datastore/add-datastores/mysql.md, line 1033-1070

    P1 PROCESS privilege missing from example GRANT statements

    The minimum permissions table lists PROCESS as a required permission, but neither the source nor enrichment example GRANT statements include it. This means users following the examples will end up with an incomplete setup.

    Critically, PROCESS is a global-level privilege in MySQL — it cannot be granted using ON <database_name>.* syntax. It must be granted separately at the *.* scope:

    GRANT PROCESS ON *.* TO 'qualytics_read'@'%';

    The example should be updated to include this grant. For instance, the source datastore example should be:

    -- Create a dedicated read-only user
    CREATE USER 'qualytics_read'@'%' IDENTIFIED BY '<password>';
    
    -- Grant read access to all tables and views
    GRANT SELECT, SHOW VIEW ON <database_name>.* TO 'qualytics_read'@'%';
    
    -- Grant the global PROCESS privilege (required by the JDBC driver)
    GRANT PROCESS ON *.* TO 'qualytics_read'@'%';
    
    -- Apply the changes
    FLUSH PRIVILEGES;

    The same fix applies to the enrichment datastore example. This same issue exists in docs/source-datastore/add-datastores/maria-db.md at the equivalent grant examples.
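
As a quick check after applying the grants above, the account's privileges can be listed. This is a sketch that assumes the `qualytics_read` user from the example; the PROCESS privilege should appear in a grant line scoped to `*.*` rather than to the database.

```sql
-- List every privilege held by the read-only account;
-- PROCESS is expected on the *.* (global) grant line
SHOW GRANTS FOR 'qualytics_read'@'%';
```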

Reviews (1): Last reviewed commit: "docs(connectors): add detailed troublesh..."

…nrichment permissions, and document missing connection properties

Add Troubleshooting Common Errors and Detailed Troubleshooting Notes
sections to BigQuery, Snowflake, and Amazon S3 to match the Athena
documentation pattern.

Fix enrichment permissions tables with missing DROP TABLE (PostgreSQL,
SQL Server, Synapse), ALTER TABLE + DROP TABLE (Redshift, Trino), and
DROPIN (DB2) based on actual dataplane write operations.

Document missing connection properties found in controlplane code:
- Databricks: OAuth M2M authentication (Service Principal + OAuth Secret)
- Oracle: TCP/TCPS protocol selector
- Hive: ZooKeeper HA toggle
- DB2: SSL toggle
- Teradata: SELECT ON DBC.DatabasesV permission for catalog discovery
- PostgreSQL: track_commit_timestamp config for incremental profiling

Add Example IAM Policy JSON sections to Azure Data Lake Storage and
Google Cloud Storage with ready-to-copy role assignments.
…GRANT examples

Add GRANT PROCESS ON *.* to both source and enrichment SQL examples
in MySQL and MariaDB. PROCESS is a global-level privilege that cannot
be granted at database scope — it was listed in the permissions table
but missing from the copy-paste examples, leaving users with an
incomplete setup.
@shindiogawa shindiogawa merged commit b6b2e1f into main Mar 27, 2026
1 check passed
@RafaelOsiro RafaelOsiro deleted the qua-1632-add-required-permissions-for-each-connector-on-the branch March 27, 2026 18:45