feat(gemini): Add support for RAG documents in dynamic context #1205
joshua-mo-143
left a comment
Hey, apologies for the late review - I meant to get around to this but I wasn't able to do so.
Needs cargo fmt then lgtm
This commit adds support for passing RAG documents to the Gemini API when using dynamic_context() with agents. Previously, documents retrieved from vector stores were ignored by the Gemini provider, causing RAG systems to fail silently.

Changes:
- Modified create_request_body() to inject documents as user messages at the beginning of chat history
- Added special handling for DocumentMediaType::TXT to convert text documents to plain text parts instead of base64-encoded inline data
- Preserved existing behavior for other document types (images, PDFs, etc.)

Fixes:
- Documents from dynamic_context() now properly reach the Gemini model
- RAG-based agents can now successfully retrieve and use context from vector stores

Testing:
- Tested with a note-taking RAG system using Obsidian vault notes
- Verified that documents are correctly passed and interpreted by Gemini
- Confirmed backward compatibility with existing document handling
Added comprehensive tests for the new document handling functionality:

1. test_txt_document_conversion_to_text_part: Verifies that TXT documents are converted to PartKind::Text instead of InlineData
2. test_create_request_body_with_documents: Tests that documents are correctly injected into chat history at the beginning
3. test_create_request_body_without_documents: Ensures backward compatibility when no documents are present

All tests pass successfully.
f960195 to c0c6b31
updated, thank you!
@atellou glad to hear it helped. @joshua-mo-143 let me know please if there is anything else to be done before we can merge
Hey - sorry been meaning to get around to this but haven't kept my eye on it. I'm currently travelling but will have a look when I'm available - doesn't look like a lot needs to be done if anything iirc
…xPlaygrounds#1205
- Convert text-based documents (TXT, Markdown, HTML, etc.) to plain text parts instead of inline data
- Inject documents as user message at beginning of chat history in create_request_body
- Add base64 decoding for text documents when needed
- Fix TypedPromptResponse to use `usage` field instead of `total_usage`
Includes comprehensive tests for:
- TXT and Markdown document conversion to text parts
- Document injection into request
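The "base64 decoding for text documents when needed" step in the commit above could look roughly like the sketch below. It is an illustration only, not the code from this PR: `decode_base64`, `text_from_document`, and the `is_base64` flag are hypothetical names, and the decoder is a small std-only stand-in for whatever base64 library the real provider uses. It is lenient about padding and falls back to the raw string when decoding fails.

```rust
/// Decodes standard-alphabet base64 using only std.
/// Returns None if the input contains characters outside the alphabet.
/// Assumption: trailing '=' padding is simply stripped, not validated.
fn decode_base64(input: &str) -> Option<Vec<u8>> {
    fn value(c: u8) -> Option<u32> {
        match c {
            b'A'..=b'Z' => Some((c - b'A') as u32),
            b'a'..=b'z' => Some((c - b'a') as u32 + 26),
            b'0'..=b'9' => Some((c - b'0') as u32 + 52),
            b'+' => Some(62),
            b'/' => Some(63),
            _ => None,
        }
    }

    let bytes = input.trim_end_matches('=').as_bytes();
    // A single leftover character cannot encode a whole byte.
    if bytes.len() % 4 == 1 {
        return None;
    }

    let mut out = Vec::with_capacity(bytes.len() * 3 / 4);
    for chunk in bytes.chunks(4) {
        // Pack up to four 6-bit values into one accumulator.
        let mut acc: u32 = 0;
        for &c in chunk {
            acc = (acc << 6) | value(c)?;
        }
        // Left-align a short final chunk so it occupies the top of 24 bits.
        acc <<= 6 * (4 - chunk.len()) as u32;
        // 4 chars -> 3 bytes, 3 chars -> 2 bytes, 2 chars -> 1 byte.
        let byte_count = chunk.len() * 3 / 4;
        for i in 0..byte_count {
            out.push((acc >> (16 - 8 * i)) as u8);
        }
    }
    Some(out)
}

/// Returns the text to send as a plain text part.
/// `is_base64` is a hypothetical flag saying how the document data is encoded.
fn text_from_document(data: &str, is_base64: bool) -> String {
    if !is_base64 {
        return data.to_string();
    }
    // Fall back to the raw string if it is not valid base64 or not valid UTF-8.
    decode_base64(data)
        .and_then(|decoded| String::from_utf8(decoded).ok())
        .unwrap_or_else(|| data.to_string())
}
```

With this sketch, `text_from_document("aGVsbG8=", true)` yields the decoded text, while data that is already plain text passes through unchanged.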
I created a version of this PR up to date with the main branch and included my change: #1456
I'll be merging this PR first. Trying to solve multiple issues with a single PR is generally not a great idea.
Works for me |
Finally merged! @snaumov Thanks for your hard work on the initial implementation of this, I know it's taken a while to get this merged. I have made necessary changes and merged this PR as this was supposed to be merged at an earlier point but I got distracted and kept forgetting.
@joshua-mo-143 thank you for pushing it across the finish line! Sorry, I got distracted myself and didn't rebase in time.
Add RAG Documents Support to Gemini Provider
Summary
This PR adds support for RAG (Retrieval-Augmented Generation) documents in the Gemini provider's dynamic_context() functionality. Previously, documents retrieved from vector stores were silently ignored, causing RAG agents to fail.
Problem
When using .dynamic_context() with Gemini agents, the retrieved documents were never passed to the Gemini API. The create_request_body() function in the Gemini provider ignored the documents field from CompletionRequest, making it impossible to use RAG with Gemini.
Solution
Changes Made
Document Injection: Modified create_request_body() to check for documents and inject them as a user message at the beginning of chat history using the existing normalized_documents() helper method.
Text Document Handling: Added special handling for DocumentMediaType::TXT and other text-based documents to convert them to plain text parts (PartKind::Text) instead of trying to send them as base64-encoded inline data.
Backward Compatibility: Preserved existing behavior for other document types (images, PDFs, etc.) that use InlineData or FileData.
Code Changes
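As a rough illustration of the three changes listed above, here is a minimal sketch using simplified stand-in types. It is not the code from this PR: `MediaType`, `Document`, `Part`, `Role`, `Message`, `document_to_part`, and `build_contents` are all hypothetical names, and the real types in rig-core (such as PartKind and CompletionRequest) have different shapes. The sketch also assumes all documents are grouped into a single user message.

```rust
// Hypothetical, simplified stand-ins for the provider's types.
#[derive(Debug, Clone, PartialEq)]
enum MediaType {
    Txt,
    Pdf,
}

#[derive(Debug, Clone, PartialEq)]
struct Document {
    media_type: MediaType,
    // Plain text for Txt, base64-encoded bytes for Pdf (an assumption).
    data: String,
}

#[derive(Debug, Clone, PartialEq)]
enum Part {
    Text(String),
    InlineData { mime_type: String, data: String },
}

#[derive(Debug, Clone, PartialEq)]
enum Role {
    User,
    Model,
}

#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    parts: Vec<Part>,
}

/// Text documents become plain text parts; other types keep the
/// inline-data representation they had before.
fn document_to_part(doc: &Document) -> Part {
    match doc.media_type {
        MediaType::Txt => Part::Text(doc.data.clone()),
        MediaType::Pdf => Part::InlineData {
            mime_type: "application/pdf".to_string(),
            data: doc.data.clone(),
        },
    }
}

/// Puts the documents, if any, in one user message ahead of the chat history.
/// With no documents the history is returned unchanged.
fn build_contents(documents: &[Document], chat_history: Vec<Message>) -> Vec<Message> {
    let mut contents = Vec::with_capacity(chat_history.len() + 1);
    if !documents.is_empty() {
        contents.push(Message {
            role: Role::User,
            parts: documents.iter().map(document_to_part).collect(),
        });
    }
    contents.extend(chat_history);
    contents
}
```

Leaving the history untouched when there are no documents is what keeps existing, non-RAG callers working as before.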
File: rig/rig-core/src/providers/gemini/completion.rs
Testing
Tested with a real-world RAG application:
Before Fix
After Fix
Impact
normalized_documents())
Additional Notes
The fix aligns Gemini's behavior with other providers like Cohere and Anthropic that already support documents in RAG scenarios.