Inject retrieval into document generation; add RetrievalService and hybrid retriever #8

Open
delisha02 wants to merge 3 commits into main from codex/clarify-fact-extraction-processes-and-algorithms-ydgtbd

@delisha02

Motivation

  • Improve legal grounding of generated drafts by retrieving jurisdictional and citation context before generation.
  • Centralize retriever construction so research and generation pipelines share a single facade and can evolve retrieval strategy consistently.
  • Provide a hybrid (dense + BM25) retriever implementation to enable more robust RAG behavior for legal queries.

Description

  • Add a RetrievalService facade at backend/app/services/retrieval_service.py that exposes the get_persistent_retriever and get_hybrid_retriever abstractions.
  • Implement get_hybrid_retriever(...) in backend/app/agents/legal_research/retrievers.py, which fuses a Chroma dense retriever and a BM25Retriever through an EnsembleRetriever.
  • Update LegalResearchAgent (backend/app/agents/legal_research/agent.py) to acquire its retriever via RetrievalService instead of importing a module-level function.
  • Extend generation prompt assembly (backend/app/agents/document_generator/prompt_templates.py) to accept retrieved_legal_context, exclude it from the rendered facts, and inject it into the Grounded Legal Context block of the prompt.
  • Integrate retrieval into the document generation route (backend/app/api/v1/endpoints/documents.py): build a compact retrieval query, fetch grounded legal snippets and source metadata through the new service, and attach them to merged_facts as retrieved_legal_context and retrieved_legal_sources before calling assembly_engine.assemble_document.
  • Add design and rollout docs (docs/rag_upgrade_proposal.md and docs/upgrade_execution_plan.md) describing the RAG/validation upgrade plan and the phase 0 work completed so far.
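
For reviewers unfamiliar with hybrid fusion, here is a minimal, dependency-free sketch of the idea behind the dense + BM25 combination. It assumes EnsembleRetriever refers to the LangChain class of that name, which documents its fusion as weighted Reciprocal Rank Fusion with a constant c defaulting to 60. The function weighted_rrf, the document IDs, and the equal weights are hypothetical and are not taken from this PR's code.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def weighted_rrf(
    rankings: Sequence[Sequence[str]],
    weights: Sequence[float],
    c: int = 60,
) -> List[str]:
    """Fuse ranked lists of document IDs with weighted Reciprocal Rank Fusion.

    Hypothetical helper for illustration only; not the PR's implementation.
    """
    if len(rankings) != len(weights):
        raise ValueError("rankings and weights must have the same length")

    scores: Dict[str, float] = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        # Ranks are 1-based; a higher rank contributes a larger share of the weight.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (rank + c)

    # Highest fused score first; ties are broken by ID so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up ranked results for a single query (assumed data).
dense_hits = ["statute_s12", "case_x", "statute_s14"]
bm25_hits = ["statute_s14", "statute_s12", "rule_7"]

fused = weighted_rrf([dense_hits, bm25_hits], weights=[0.5, 0.5])
print(fused)
```

Documents that both retrievers rank highly rise to the top, while a document found by only one retriever is still kept, which is the robustness property the Motivation section is after.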

Testing

  • Ran the existing test suite with pytest -q and confirmed no regressions in unrelated modules.
  • Executed focused unit checks for the new retriever helpers and the RetrievalService facade, which passed locally.
  • Ran an API integration smoke test for POST /documents/generate, which passed; it verifies that the retrieval step runs and that retrieved_legal_context is injected into the assembled prompt.
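
The injection behavior that the smoke test checks can be sketched roughly as below. This is an illustration under stated assumptions, not the code in prompt_templates.py: render_prompt, the "Facts:" layout, and the sample values are hypothetical. Only the key name retrieved_legal_context and the Grounded Legal Context block name come from the description above.

```python
from typing import Any, Dict

# Key name taken from the PR description; everything else here is assumed.
CONTEXT_KEY = "retrieved_legal_context"


def render_prompt(merged_facts: Dict[str, Any]) -> str:
    """Render ordinary facts, then a separate Grounded Legal Context block."""
    # The retrieved context is skipped here so it is not repeated as a plain fact.
    fact_lines = [
        f"- {key}: {value}"
        for key, value in merged_facts.items()
        if key != CONTEXT_KEY
    ]
    # Fall back to a placeholder when retrieval returned nothing.
    context = merged_facts.get(CONTEXT_KEY) or "(none retrieved)"
    return "\n".join(
        ["Facts:", *fact_lines, "", "Grounded Legal Context:", str(context)]
    )


# Made-up input resembling merged_facts after the retrieval step.
merged = {"party": "Acme", CONTEXT_KEY: "Snippet one."}
prompt = render_prompt(merged)
print(prompt)
```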
