
No reranker between Qdrant top-k and document-rag synthesis — chunk ordering is raw cosine #910

@aviyashchin

Observation

trustgraph/retrieval/document_rag/document_rag.py:78-136 does:

results = await asyncio.gather(*[query_concept(v) for v in vectors])
# dedupe by chunk_id, fetch from Garage
# ... pass deduped chunks straight to document_prompt synthesis

There is no reranker stage. The raw cosine top-k results from Qdrant are deduplicated and passed straight to the synthesis LLM, in cosine order. No MMR, no diversity penalty, no token-budget cap, no cross-encoder rerank.
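
To make the missing MMR step concrete, here is a minimal sketch of Maximal Marginal Relevance over embedding vectors. It is not TrustGraph code: `mmr_select`, `cosine`, and the `lambda_` weight are hypothetical names, and it assumes the chunk embeddings are available alongside the query vector.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mmr_select(
    query_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    k: int,
    lambda_: float = 0.7,  # assumption: 1.0 = pure relevance, lower = more diversity
) -> List[int]:
    """Return indices of up to k chunks, chosen greedily by MMR."""
    relevance = [cosine(query_vec, v) for v in chunk_vecs]
    selected: List[int] = []
    remaining = list(range(len(chunk_vecs)))

    while remaining and len(selected) < k:
        def score(i: int) -> float:
            # Penalise similarity to the closest chunk already selected.
            redundancy = max(
                (cosine(chunk_vecs[i], chunk_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_ * relevance[i] - (1 - lambda_) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return selected
```

With `lambda_=1.0` this reduces to plain cosine ordering, which is the current behaviour; lower values trade some relevance for less repetition among the selected chunks.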

Why this matters

Cosine top-k is approximate-and-topical, not answer-aware. For executive-synthesis questions ("Who are X's main competitors?"), the top-3 cosine matches may all be the same paragraph rephrased, or all from the same source document, when the answer needs diversity across sources to be trustworthy.

Issue #878 (open, 2026-05-07) raises the cross-encoder reranking concern. This issue is the same concern, scoped specifically to the document-rag synthesis path rather than the general retrieval surface.

Measured impact

In our Sizzl deployment, raising --doc-limit from 3 to 30 moved the rubric score by only +0.46 points, which suggests the additional chunks at the tail of the top-30 were not materially improving synthesis. A reranker that surfaces 10 diverse, high-relevance chunks out of 30 retrieved would likely beat 30 unreranked chunks on both quality and latency (fewer tokens to synthesize).
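
The "fewer tokens to synthesize" point can be illustrated with a simple budget cap applied to the chunks after ranking. This is a sketch only: `cap_to_token_budget` is a hypothetical helper, and it approximates token count by whitespace-separated words rather than using the synthesis model's tokenizer.

```python
from typing import List


def cap_to_token_budget(chunks: List[str], max_tokens: int) -> List[str]:
    """Keep chunks in ranked order until the approximate token budget is spent."""
    kept: List[str] = []
    used = 0
    for text in chunks:
        cost = len(text.split())  # assumption: one whitespace-delimited word ~ one token
        if used + cost > max_tokens:
            break  # stop at the first chunk that would exceed the budget
        kept.append(text)
        used += cost
    return kept
```

Because the cap keeps a prefix of the list, it only helps if the ordering in front of it is good, which is the argument for reranking first.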

Proposal

Add a reranker stage between get_docs() and document_prompt() in document_rag.py. Pluggable design:

  • Cohere Rerank (API)
  • BGE-reranker (local)
  • Cross-encoder (local, e.g. ms-marco-MiniLM)
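
One way the pluggable design could look, as a sketch under stated assumptions: the `Reranker` protocol, `Chunk` type, and `make_reranker` factory are all hypothetical names, not existing TrustGraph APIs. The keyword-overlap backend is a toy stand-in so the example runs without external services; a Cohere, BGE, or cross-encoder backend would implement the same `rerank` method.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol, Type


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    text: str


class Reranker(Protocol):
    """Interface each backend would implement (hypothetical)."""

    def rerank(self, query: str, chunks: List[Chunk], top_n: int) -> List[Chunk]: ...


class PassthroughReranker:
    """Keeps the incoming (cosine) order; equivalent to having no reranker."""

    def rerank(self, query: str, chunks: List[Chunk], top_n: int) -> List[Chunk]:
        return chunks[:top_n]


class KeywordOverlapReranker:
    """Toy backend: scores by shared lowercase words. A stand-in, not a real model."""

    def rerank(self, query: str, chunks: List[Chunk], top_n: int) -> List[Chunk]:
        query_words = set(query.lower().split())

        def score(chunk: Chunk) -> int:
            return len(query_words & set(chunk.text.lower().split()))

        # sorted() is stable, so ties keep their original cosine order.
        return sorted(chunks, key=score, reverse=True)[:top_n]


def make_reranker(name: str) -> Reranker:
    """Select a backend by config name; unknown names raise KeyError."""
    backends: Dict[str, Type] = {
        "none": PassthroughReranker,
        "keyword": KeywordOverlapReranker,
    }
    return backends[name]()
```

Keeping a passthrough backend as the default would let the stage ship without changing current behaviour until a real backend is configured.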

Insertion point: inside get_docs() after the Qdrant gather and before fetch_chunk() — rerank the chunk_id list, fetch fewer chunks, send leaner context to synthesis.
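
The insertion point described above can be sketched as follows. This is not the actual `get_docs()` body: `get_docs_sketch` and its parameters are hypothetical, the `query_concept` and `fetch_chunk` callables are injected stand-ins for the real ones, and it assumes each Qdrant hit is available as a `(chunk_id, score)` pair.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Sequence, Tuple

Candidate = Tuple[str, float]  # (chunk_id, cosine score); assumed hit shape


async def get_docs_sketch(
    vectors: Sequence[object],
    query_concept: Callable[[object], Awaitable[List[Candidate]]],
    rerank: Callable[[List[Candidate]], List[Candidate]],
    fetch_chunk: Callable[[str], Awaitable[str]],
    keep: int,
) -> List[str]:
    # 1. Query the vector store for every concept vector, as today.
    results = await asyncio.gather(*[query_concept(v) for v in vectors])

    # 2. Dedupe by chunk_id, keeping the best score seen for each chunk.
    best: Dict[str, float] = {}
    for hits in results:
        for chunk_id, score in hits:
            if chunk_id not in best or score > best[chunk_id]:
                best[chunk_id] = score

    # 3. New step: rerank the candidate list and keep only the top `keep`.
    ranked = rerank(list(best.items()))[:keep]

    # 4. Fetch only the surviving chunks, so fewer reads and a leaner context.
    docs = await asyncio.gather(*[fetch_chunk(cid) for cid, _ in ranked])
    return list(docs)
```

Note that this sketch reranks on IDs and scores only. A reranker that scores on chunk text (such as a cross-encoder) would need that text available before step 3, which this sketch does not model.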

Estimated latency cost: <500ms for local cross-encoder, ~100-300ms for Cohere API.
Estimated quality lift: 5-15% on synthesis rubric metrics (anecdotal, varies by corpus).

Related

Stack

TrustGraph 2.3.21.


Labels: enhancement
