Describe the bug
1. Issue Summary
We are experiencing a critical performance degradation where the create_namespace API operation consistently times out (>30s), causing cascading failures in our downstream workload infrastructure (Traefik 504 Gateway Timeouts and pyiceberg have an unexpected behaviour).
2. Remediation Status
To restore production stability, we are manually applying the following schema changes:
- Applied:
idx_entities_catalog_id_id on entities. (Result: Latency dropped from ~36s to ~17s).
- Applying Now:
idx_grants_realm_grantee and idx_grants_realm_securable on grant_records to resolve 100% CPU usage during permission checks and GC.
3. Root Cause Analysis (Database Layer)
Our investigation confirmed that the default schema lacks secondary indexes required for the application's actual query patterns.
A. The "Grant Records" Bottleneck (High Frequency)
The application performs thousands of permission checks per operation.
- Observation: 100% Sequential Scans.
- Metrics: 65,000+ executions of
SELECT ... WHERE grantee_id = ....
- Impact: Without an index starting with
grantee_id or realm_id, the database reads 13.7 Billion rows to serve these requests.
- Fix: We are adding indexes to support lookups by
grantee_id and securable_id.
B. The "Entities" Bottleneck (Bulk Fetch)
- Observation: Namespace create operation triggers a query with a massive
IN clause containing ~7,800 tuples: WHERE (catalog_id, id) IN ((?,?), (?,?)...).
- Impact: The query planner could not use the PK (starting with
realm_id) efficiently, resulting in 17s+ execution time.
- Fix: We added
idx_entities_catalog_id_id to optimize this lookup.
4. Request for Engineering: Query Optimization
While the indexes mitigate the immediate outage, the underlying query patterns appear inefficient and likely need refactoring in the Polaris codebase.
1. Review "N+1" Permission Checks
We observed 65,415 executions of the permission check query during a short window.
- Question: Is the application iterating through a list and checking permissions one-by-one instead of performing a bulk permission check? This volume suggests an "N+1" query issue that indexes can only partially band-aid.
2. Review Bulk Entity Fetch Strategy
The query SELECT ... FROM entities WHERE (catalog_id, id) IN (...) is passing ~15,000 bind parameters (7,800 pairs).
- Recommendation: This pattern scales poorly. Please consider paginating this operation or refactoring the logic to avoid passing thousands of IDs in a single SQL statement.
5. SQL for Upstream Merge
We strongly recommend including these indexes in the next Polaris release to prevent similar degradation for other users:
-- 1. Optimizes High-Volume Permission Checks (Fixes API CPU load)
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_grants_realm_grantee
ON polaris_schema.grant_records (realm_id, grantee_id);
-- 2. Optimizes Garbage Collection (Fixes background job hangs)
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_grants_realm_securable
ON polaris_schema.grant_records (realm_id, securable_id);
-- 3. Optimizes Bulk Entity Lookups (Fixes Namespace listing latency)
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_entities_catalog_id_id
ON polaris_schema.entities (catalog_id, id);
To Reproduce
- Environment: Deploy Apache Polaris backed by PostgreSQL with a moderate dataset (~170k grant records, ~160k entities).
- Action: Execute the
create_namespace API operation on a catalog with ~22k namespaces.
- Conditions: Ensure standard concurrency; the issue manifests even without much load due to table scans.
- Observation: Monitor API response time and database activity via
pg_stat_statements.
Actual Behavior
- Performance: The operation consistently takes 33s - 36s to complete.
- Database Execution: Permission checks trigger a Sequential Scan of the entire
grant_records table for every single lookup. We observed queries reading ~174,000 rows per execution (100% of the table) thousands of times per minute.
- Client Experience: The API request times out. Proxies (e.g., Traefik) return
504 Gateway Timeout at the 30s mark, and client applications (crawlers) hang or fail silently.
Expected Behavior
- Performance: The
create_namespace operation should complete in sub-second time to ~2 seconds.
- Database Execution: Permission checks filtering by
grantee_id should utilize an Index Scan, reading only the specific rows relevant to the user (typically <100 rows).
- Client Experience: The request should return
200 OK well within standard proxy/client timeout windows (30s).
Additional context
All this information was very consistent along the research we did, after applying the first index idx_entities_catalog_id_id we seen a consistent drop on the response time from ~31s into ~16s.
After the index was applied the stats were reset and and pg statements cleaned, we kept the system running for +-24h and the data we see in the ticket is from this period after the resets.
System information
OS: Linux
Polaris Catalog Version: Docker image 1.3.0 (initial symptoms were present in 1.0.1 and we upgraded to validate if this was caused by the version, the problem persisted)
Object storage & setup: AWS S3
Describe the bug
1. Issue Summary
We are experiencing a critical performance degradation where the
create_namespaceAPI operation consistently times out (>30s), causing cascading failures in our downstream workload infrastructure (Traefik 504 Gateway Timeouts and pyiceberg have an unexpected behaviour).2. Remediation Status
To restore production stability, we are manually applying the following schema changes:
idx_entities_catalog_id_idonentities. (Result: Latency dropped from ~36s to ~17s).idx_grants_realm_granteeandidx_grants_realm_securableongrant_recordsto resolve 100% CPU usage during permission checks and GC.3. Root Cause Analysis (Database Layer)
Our investigation confirmed that the default schema lacks secondary indexes required for the application's actual query patterns.
A. The "Grant Records" Bottleneck (High Frequency)
The application performs thousands of permission checks per operation.
SELECT ... WHERE grantee_id = ....grantee_idorrealm_id, the database reads 13.7 Billion rows to serve these requests.grantee_idandsecurable_id.B. The "Entities" Bottleneck (Bulk Fetch)
INclause containing ~7,800 tuples:WHERE (catalog_id, id) IN ((?,?), (?,?)...).realm_id) efficiently, resulting in 17s+ execution time.idx_entities_catalog_id_idto optimize this lookup.4. Request for Engineering: Query Optimization
While the indexes mitigate the immediate outage, the underlying query patterns appear inefficient and likely need refactoring in the Polaris codebase.
1. Review "N+1" Permission Checks
We observed 65,415 executions of the permission check query during a short window.
2. Review Bulk Entity Fetch Strategy
The query
SELECT ... FROM entities WHERE (catalog_id, id) IN (...)is passing ~15,000 bind parameters (7,800 pairs).5. SQL for Upstream Merge
We strongly recommend including these indexes in the next Polaris release to prevent similar degradation for other users:
To Reproduce
create_namespaceAPI operation on a catalog with ~22k namespaces.pg_stat_statements.Actual Behavior
grant_recordstable for every single lookup. We observed queries reading ~174,000 rows per execution (100% of the table) thousands of times per minute.504 Gateway Timeoutat the 30s mark, and client applications (crawlers) hang or fail silently.Expected Behavior
create_namespaceoperation should complete in sub-second time to ~2 seconds.grantee_idshould utilize an Index Scan, reading only the specific rows relevant to the user (typically <100 rows).200 OKwell within standard proxy/client timeout windows (30s).Additional context
All this information was very consistent along the research we did, after applying the first index
idx_entities_catalog_id_idwe seen a consistent drop on the response time from ~31s into ~16s.After the index was applied the stats were reset and and pg statements cleaned, we kept the system running for +-24h and the data we see in the ticket is from this period after the resets.
System information
OS: Linux
Polaris Catalog Version: Docker image 1.3.0 (initial symptoms were present in 1.0.1 and we upgraded to validate if this was caused by the version, the problem persisted)
Object storage & setup: AWS S3