perf: speed up v0 /posts/ listing for filtered queries #1133

Merged
odesenfans merged 1 commit into main from od/optimize-merged-posts-query on May 13, 2026
Conversation

@odesenfans
Collaborator

Summary

A (owner, type, channel) filter on /api/v0/posts.json was showing up in the slow log at ~3 s on a cold cache (~80 ms hot). Two changes together bring both figures down:

  • Partial composite index ix_posts_owner_type_channel ON posts (owner, type, channel) WHERE amends IS NULL (deployment/migrations/versions/0058_4f1e8d2a6c3b_*.py). Built CONCURRENTLY so it ships without taking a write lock on posts.
  • Restructured get_matching_posts_legacy: filter + sort + LIMIT the originals+amend subquery first, then join messages on the bounded result. Previously the planner materialised the full posts+messages product before applying LIMIT, doing 5000+ messages probes to keep 100 rows.
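
To illustrate what the partial composite index buys, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table layout and sample rows are assumptions made up for this example, and SQLite has no equivalent of CONCURRENTLY, so only the `(owner, type, channel) WHERE amends IS NULL` shape of the index is reproduced, not the migration itself.

```python
import sqlite3

FILTER_SQL = (
    "SELECT item_hash FROM posts "
    "WHERE owner = ? AND type = ? AND channel = ? AND amends IS NULL"
)
PARAMS = ("0xabc", "blog", "main")


def make_posts_db() -> sqlite3.Connection:
    """Build a tiny in-memory posts table (hypothetical, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts ("
        "item_hash TEXT PRIMARY KEY, owner TEXT, type TEXT, "
        "channel TEXT, amends TEXT)"
    )
    # Same column list and predicate as the index in this PR.
    conn.execute(
        "CREATE INDEX ix_posts_owner_type_channel "
        "ON posts (owner, type, channel) WHERE amends IS NULL"
    )
    conn.executemany(
        "INSERT INTO posts VALUES (?, ?, ?, ?, ?)",
        [
            ("orig-1", "0xabc", "blog", "main", None),       # original post
            ("amend-1", "0xabc", "blog", "main", "orig-1"),  # amend, not indexed
            ("other-1", "0xdef", "blog", "main", None),      # different owner
        ],
    )
    return conn


def plan_for_filter(conn: sqlite3.Connection) -> list:
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + FILTER_SQL, PARAMS).fetchall()
    return [row[-1] for row in rows]


def fetch_originals(conn: sqlite3.Connection) -> list:
    """Run the filtered listing and return the matching item hashes."""
    return [row[0] for row in conn.execute(FILTER_SQL, PARAMS)]


if __name__ == "__main__":
    db = make_posts_db()
    print(plan_for_filter(db))
    print(fetch_originals(db))
```

Because the query repeats the index predicate (`amends IS NULL`), the planner can use the partial index directly and never visits amend rows for this filter. On PostgreSQL the equivalent check is the EXPLAIN ANALYZE step listed in the test plan.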

Diagnosis

From EXPLAIN (ANALYZE, BUFFERS) on the slow query (hot cache, 80ms total):

| Step | Buffers | % of buffers | Time |
| --- | --- | --- | --- |
| Bitmap heap scan on posts_1 (5,015 rows after recheck) | 7,971 | 24% | 25 ms |
| messages_1 join × 5,015 loops | 25,075 | 76% | 35 ms |
| Top-N sort 5,015 → 100 | - | - | 3 ms |

The bitmap scan was BitmapAnd(ix_posts_owner=212k rows, ix_posts_type=9k rows), then heap recheck for channel and amends IS NULL. The new partial composite goes straight to the 5,015-row set. The messages joins now run on 100 rows instead of 5,015.

Sort is fine at ~3ms for 5,015 rows; not worth indexing for.
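
The buffer figures in the diagnosis above can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in this PR. The projected figure for the 100-row case is an assumption: it supposes the per-probe buffer cost of the messages join stays the same after the restructuring, which has not been measured here.

```python
def buffer_breakdown() -> dict:
    """Recompute the buffer shares from the EXPLAIN figures quoted in the PR."""
    posts_scan_buffers = 7_971      # bitmap heap scan on posts_1
    messages_join_buffers = 25_075  # messages_1 join, 5,015 loops
    rows_before_limit = 5_015
    page_size = 100                 # rows kept after LIMIT

    total = posts_scan_buffers + messages_join_buffers
    per_probe = messages_join_buffers / rows_before_limit
    return {
        "total_buffers": total,
        "posts_share_pct": round(100 * posts_scan_buffers / total),
        "messages_share_pct": round(100 * messages_join_buffers / total),
        "buffers_per_probe": per_probe,
        # Assumption: per-probe cost is unchanged once only page_size rows are joined.
        "projected_messages_buffers": per_probe * page_size,
    }


if __name__ == "__main__":
    for name, value in buffer_breakdown().items():
        print(f"{name}: {value}")
```

The join costs 5 buffers per probe in the trace, so probing for 100 rows instead of 5,015 would touch roughly 500 buffers rather than 25,075 under that assumption.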

Implementation notes

  • make_select_merged_post_with_message_info_stmt is gone; replaced by private _make_select_merged_post_v0_base_stmt that returns only post-side columns plus latest_amend so callers can join messages after LIMIT.
  • v1 path (get_matching_posts) is unchanged.
  • ORDER BY is re-applied on the outer query so that wrapping the LIMITed subquery does not lose the inner sort. The TX_TIME branch re-joins chain_tx confirmations on the LIMITed set (cheap, since that set is at most a few hundred rows).
  • The hash column was previously aliased on top of item_hash in the v0 SELECT but never read (the controller uses original_item_hash as hash). It is dropped from the projection.
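
The "LIMIT first, join afterwards, then re-apply ORDER BY" shape described in these notes can be sketched in plain SQL. The example below runs on sqlite3 with a hypothetical, heavily simplified schema (the column names `creation_datetime` and `size` are assumptions for illustration). It does not reproduce the project's actual query builder, only the structure of the two query forms.

```python
import sqlite3

# Old shape: join messages for every matching post, then sort and LIMIT.
JOIN_THEN_LIMIT = """
SELECT p.item_hash, m.size
FROM posts AS p
JOIN messages AS m ON m.item_hash = p.item_hash
WHERE p.owner = ? AND p.type = ? AND p.channel = ? AND p.amends IS NULL
ORDER BY p.creation_datetime DESC
LIMIT ?
"""

# New shape: bound the post set first, join messages on the page only,
# and repeat ORDER BY outside because a subquery's order is not guaranteed
# to survive the outer join.
LIMIT_THEN_JOIN = """
SELECT page.item_hash, m.size
FROM (
    SELECT item_hash, creation_datetime
    FROM posts
    WHERE owner = ? AND type = ? AND channel = ? AND amends IS NULL
    ORDER BY creation_datetime DESC
    LIMIT ?
) AS page
JOIN messages AS m ON m.item_hash = page.item_hash
ORDER BY page.creation_datetime DESC
"""


def build_demo_db() -> sqlite3.Connection:
    """Create 50 matching posts, 20 posts from another owner, and their messages."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE posts (
            item_hash TEXT PRIMARY KEY, owner TEXT, type TEXT,
            channel TEXT, amends TEXT, creation_datetime INTEGER
        );
        CREATE TABLE messages (item_hash TEXT PRIMARY KEY, size INTEGER);
        """
    )
    posts = [(f"h{i}", "0xabc", "blog", "main", None, i) for i in range(50)]
    posts += [(f"x{i}", "0xdef", "blog", "main", None, 1000 + i) for i in range(20)]
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?, ?, ?)", posts)
    conn.executemany(
        "INSERT INTO messages VALUES (?, ?)",
        [(row[0], 10 * row[5]) for row in posts],
    )
    return conn


def run(conn: sqlite3.Connection, sql: str, limit: int = 10) -> list:
    """Execute one of the two query shapes for the demo filter."""
    return conn.execute(sql, ("0xabc", "blog", "main", limit)).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(run(db, JOIN_THEN_LIMIT) == run(db, LIMIT_THEN_JOIN))
```

Both forms return the same page in the same order. The difference is that the second only needs one messages lookup per row in the page, which is the effect the PR is after.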

Test plan

  • tests/db/test_posts.py (9 tests pass)
  • tests/api/test_posts.py (5 tests pass)
  • tests/message_processing/test_process_posts.py (11 tests pass)
  • Apply migration on staging and EXPLAIN ANALYZE the original slow query to confirm:
    • Bitmap heap scan replaced by an Index Scan on ix_posts_owner_type_channel
    • Messages joins run with loops=100 instead of loops=5015

🤖 Generated with Claude Code

@odesenfans odesenfans force-pushed the od/optimize-merged-posts-query branch from 8df7385 to 75f0d1b on May 13, 2026 13:09

@foxpatch-aleph foxpatch-aleph left a comment

Well-structured performance optimization that defers messages joins until after LIMIT, combined with a new partial composite index to eliminate the bitmap scan bottleneck. The migration correctly uses CREATE INDEX CONCURRENTLY with proper transaction management. The subquery wrapping is properly handled with an outer re-ORDER BY. All protocol fields remain satisfied, test coverage is adequate, and no correctness or security issues were found.

@foxpatch-aleph foxpatch-aleph left a comment

Sound performance optimization that restructures the v0 /posts/ query to apply filters and LIMIT on posts before joining messages, reducing message table lookups from thousands to the paginated row count. The new partial composite index (CONCURRENTLY, no lock) covers the common (owner, type, channel) filter pattern. Code is correct, tests pass, migration is safe. A minor observation: the TX_TIME sort path introduces a redundant inner join on the confirmation subquery via filter_post_select_stmt, but this produces correct results and the planner may optimize it away.

src/aleph/db/accessors/posts.py (line 349): When sort_by=SortBy.TX_TIME, filter_post_select_stmt on line 349 already joins the select_earliest_confirmation subquery on the inner limited query. Then lines 404-416 re-join it on the outer query. The inner join is unnecessary since the outer query re-applies ordering and the inner order is discarded by the subquery wrap. Consider passing sort_by=None to filter_post_select_stmt to avoid this redundant join, or handle TX_TIME sorting only in the outer query.

Two related changes that together turn a 3-second cold-cache query
for ``(owner, type, channel)`` into a sub-second one:

1. New partial composite index
   ``ix_posts_owner_type_channel ON posts (owner, type, channel)
   WHERE amends IS NULL``. The planner was previously stuck doing a
   BitmapAnd of ``ix_posts_owner`` and ``ix_posts_type`` and
   rechecking ``channel`` from the heap, which on high-volume owners
   scans many thousands of unrelated rows.

2. Restructure ``get_matching_posts_legacy`` so the originals + amend
   subquery is filtered, sorted, and LIMITed before the ``messages``
   joins fire. Previously the planner materialised the full
   posts+messages product, then sorted and LIMITed, fetching ~5000
   message rows just to keep 100. Joining ``messages`` on the
   bounded result drops 76% of the buffer reads in the trace
   collected from the production slow log.

ORDER BY is re-applied on the outer wrap so the wrap-around does
not lose the inner sort, including for ``sort_by=TX_TIME`` which
re-joins ``chain_tx`` confirmations on the LIMITed set.

@odesenfans odesenfans force-pushed the od/optimize-merged-posts-query branch from 75f0d1b to 6480a08 on May 13, 2026 21:40

@foxpatch-aleph foxpatch-aleph left a comment

This PR applies a well-motivated optimization to the v0 /posts/ endpoint by (1) adding a partial composite index on posts(owner, type, channel) WHERE amends IS NULL, and (2) restructuring get_matching_posts_legacy to defer message-table joins until after LIMIT is applied. The index uses CONCURRENTLY to avoid write locks. The query restructuring correctly preserves all output columns, sort orders (including TX_TIME), cursor pagination, and filtering semantics. The redundant re-join of select_earliest_confirmation in the TX_TIME branch is harmless and operates on the already-limited set. No correctness, security, or testing issues found.

src/aleph/db/accessors/posts.py (line 412): The TX_TIME sort re-joins select_earliest_confirmation here even though filter_post_select_stmt already joined it (on the pre-limited data). This is harmless since both are outer joins and the second join operates on the limited set, but it's worth noting the redundancy for future readers.

@odesenfans odesenfans merged commit 69e3d8a into main May 13, 2026
4 checks passed
@odesenfans odesenfans deleted the od/optimize-merged-posts-query branch May 13, 2026 21:50