Summary
We're seeing a repeatable initial checkpoint delay on self-hosted PowerSync with Postgres storage and many parameterized bucket definitions.
The hot query shape in the storage database is a SELECT DISTINCT ON (...) against powersync.bucket_parameters. On our staging deployment, the default planner path fell back to a large scan + sort, and a local composite index changed the plan from ~384ms to ~33ms for the same representative lookup.
This did not eliminate all startup latency on its own, but it materially improved the specific bucket_parameters lookup path and looks like something PowerSync may want to handle upstream in the storage schema or query shape.
Environment
- PowerSync service:
powersync-js/1.48.0 clients hitting self-hosted powersync-service
- Storage backend: Postgres (
powersync_storage)
- Active sync rule: 1
- Bucket definitions: 89 total, 78 parameterized
Observed runtime pattern
On the service side, the repeated timing pattern was:
Sync stream started
- ~5.5s to ~6.5s later:
New checkpoint
The active checkpoint log included:
buckets: 988
param_results: 975
That suggested bucket-parameter fan-out, so we inspected the storage DB.
Dominant storage query
From pg_stat_statements on the storage DB, the dominant lookup was:
SELECT DISTINCT ON (lookup, source_table, source_key)
lookup, source_table, source_key, id, bucket_parameters
FROM bucket_parameters
WHERE group_id = $1
AND lookup = ANY (...)
AND id <= $3
ORDER BY lookup, source_table, source_key, id DESC
Representative aggregated stats before the workaround:
- calls:
47073
- mean_exec_time:
56.1ms
- max_exec_time:
1206.0ms
Because the active rule had 89 bucket definitions (78 parameterized), that fan-out alone was already enough to explain several seconds of checkpoint setup time.
Representative EXPLAIN before index
For one real live lookup key, EXPLAIN (ANALYZE, BUFFERS) showed:
Unique (actual time=275.878..382.311 rows=480 loops=1)
-> Gather Merge
-> Sort
Sort Method: external merge Disk: ~7MB per worker
-> Parallel Seq Scan on bucket_parameters
Execution Time: 384.211 ms
Additional shape details:
- raw rows matched:
102894
- distinct
(source_table, source_key) pairs returned: 480
- temp spill occurred
Existing indexes on powersync.bucket_parameters were:
(group_id, lookup, id DESC)
(group_id, source_table, source_key)
Even with enable_seqscan = off, Postgres still chose a broad bitmap path + sort instead of a path that matched the full dedup/order shape.
Local workaround index
We added this index locally as an operational workaround:
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_bucket_parameters_lookup_source_latest
ON powersync.bucket_parameters (group_id, lookup, source_table, source_key, id DESC);
Representative EXPLAIN after index
The same representative lookup changed to:
Result (actual time=0.032..32.928 rows=480 loops=1)
-> Unique
-> Index Only Scan using idx_bucket_parameters_lookup_source_latest
Execution Time: 32.999 ms
Notable differences:
- execution time:
384ms -> 33ms
- no external merge sort
- no parallel seq scan
- index-only scan chosen
Important caveat
This did not eliminate all initial checkpoint latency for us. There is still residual startup work after this lookup path is improved.
So this report is not claiming "this one index fixes all startup delay".
What it does show is:
bucket_parameters lookups can dominate startup wall clock through fan-out even when no single query looks catastrophically slow in isolation.
- The current default index shape on
bucket_parameters does not support this DISTINCT ON ... ORDER BY ... id DESC access pattern well enough on our dataset.
- A composite index on
(group_id, lookup, source_table, source_key, id DESC) materially improves that hot path.
Question
Would PowerSync accept either:
- an upstream storage-schema index along these lines, or
- a query/storage change that avoids the large scan + sort path for
bucket_parameters lookups under many parameterized buckets?
Happy to provide more detail if useful.
Summary
We're seeing a repeatable initial checkpoint delay on self-hosted PowerSync with Postgres storage and many parameterized bucket definitions.
The hot query shape in the storage database is a
SELECT DISTINCT ON (...)againstpowersync.bucket_parameters. On our staging deployment, the default planner path fell back to a large scan + sort, and a local composite index changed the plan from ~384ms to ~33ms for the same representative lookup.This did not eliminate all startup latency on its own, but it materially improved the specific
bucket_parameterslookup path and looks like something PowerSync may want to handle upstream in the storage schema or query shape.Environment
powersync-js/1.48.0clients hitting self-hostedpowersync-servicepowersync_storage)Observed runtime pattern
On the service side, the repeated timing pattern was:
Sync stream startedNew checkpointThe active checkpoint log included:
buckets: 988param_results: 975That suggested bucket-parameter fan-out, so we inspected the storage DB.
Dominant storage query
From
pg_stat_statementson the storage DB, the dominant lookup was:Representative aggregated stats before the workaround:
4707356.1ms1206.0msBecause the active rule had 89 bucket definitions (78 parameterized), that fan-out alone was already enough to explain several seconds of checkpoint setup time.
Representative EXPLAIN before index
For one real live lookup key,
EXPLAIN (ANALYZE, BUFFERS)showed:Additional shape details:
102894(source_table, source_key)pairs returned:480Existing indexes on
powersync.bucket_parameterswere:(group_id, lookup, id DESC)(group_id, source_table, source_key)Even with
enable_seqscan = off, Postgres still chose a broad bitmap path + sort instead of a path that matched the full dedup/order shape.Local workaround index
We added this index locally as an operational workaround:
Representative EXPLAIN after index
The same representative lookup changed to:
Notable differences:
384ms -> 33msImportant caveat
This did not eliminate all initial checkpoint latency for us. There is still residual startup work after this lookup path is improved.
So this report is not claiming "this one index fixes all startup delay".
What it does show is:
bucket_parameterslookups can dominate startup wall clock through fan-out even when no single query looks catastrophically slow in isolation.bucket_parametersdoes not support thisDISTINCT ON ... ORDER BY ... id DESCaccess pattern well enough on our dataset.(group_id, lookup, source_table, source_key, id DESC)materially improves that hot path.Question
Would PowerSync accept either:
bucket_parameterslookups under many parameterized buckets?Happy to provide more detail if useful.