A full-featured web application for managing File Search Stores and running Retrieval-Augmented Generation (RAG) with Google's Gemini File Search API, with a bilingual (Spanish/English) interface.
- Bilingual Support: Complete Spanish/English translation system
- 3 Main Tabs: Chat, File Stores Management, Documentation & HTTP Generator
- Real-time Chat: Interactive RAG queries with document citations
- Enhanced File Upload:
  - 60+ file extensions supported (PDF, DOCX, XLSX, CSV, code files, and more)
  - Smart MIME type detection with 70+ pre-mapped formats
  - Dual-method upload system (direct + fallback) for maximum reliability
  - Automatic handling of problematic file formats (CSV, large files)
- AI-Powered Metadata (NEW):
  - Automatic metadata extraction using Gemini models
  - Bilingual metadata generation that respects your interface language
  - DOCX/XLSX analysis support for extracting metadata from Office documents
  - Smart text extraction for unsupported file types
  - Quality over quantity (5-12 metadata fields)
- Create and manage multiple File Search Stores
- Upload documents with automatic chunking and embedding
- View documents with state tracking (PENDING, ACTIVE, FAILED)
- Edit metadata with visual tag system
- Delete stores and documents with safety confirmations
- 6 Output Formats: cURL, Python, JavaScript, n8n, Make.com, Postman
- 8 Operations: Create Store, List Stores, Upload Document, Chat RAG, List Documents, Delete Document, Delete Store, Get Operation Status
- Dynamic parameter forms for each operation
- Ready-to-copy code with explanations
- Fully translated interface
- Complete translation system (Spanish/English)
- Persistent language preference
- All UI elements translated, including:
  - Forms and labels
  - Buttons and messages
  - Code explanations
  - Error messages
- Smart Citations: Automatic grounding metadata extraction
- Conversation History: Full chat history with timestamps
- Model Selection: Support for multiple Gemini models
- Metadata Editor: Visual interface for custom metadata
- Store Selector: Easy switching between File Search Stores
- Responsive Design: Works on desktop and mobile devices
Modern bilingual interface with upload, chat, file stores, and settings tabs
Automatic metadata extraction powered by Gemini AI
Interactive RAG chat with real-time responses
Manage multiple file search stores with document tracking
Configure API keys and model preferences
Built-in HTTP request generator with 6 output formats
- Python 3.8 or higher
- Google Gemini API key (available from Google AI Studio)
- Clone the repository

      git clone https://github.com/webcomunicasolutions/gemini-rag.git
      cd gemini-rag

- Create and activate a virtual environment

      cd web_app
      python -m venv venv
      # On Windows:
      venv\Scripts\activate
      # On Linux/Mac:
      source venv/bin/activate

- Install dependencies

      pip install -r requirements.txt

- Set up your API key
  - Copy .env.example to .env
  - Add your Gemini API key:

        GEMINI_API_KEY=your_api_key_here

- Run the application

      python app.py

- Open in browser

      http://localhost:5001
Go to the Settings tab and enter your Gemini API key. The app remembers it for future sessions.
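As a minimal sketch of how a backend could read the key configured above, the snippet below looks up GEMINI_API_KEY and rejects the placeholder value from .env.example. The helper name `load_api_key` is hypothetical and this is not the application's actual configuration code; it uses only the standard library and assumes the .env file has already been loaded into the environment (the project lists python-dotenv for that).

```python
import os


def load_api_key(env=None):
    """Return the Gemini API key from the environment or raise a clear error.

    Hypothetical helper for illustration; the application's real
    configuration code may differ.
    """
    env = os.environ if env is None else env
    key = (env.get("GEMINI_API_KEY") or "").strip()
    # Treat the placeholder from .env.example the same as a missing key.
    if not key or key == "your_api_key_here":
        raise RuntimeError(
            "GEMINI_API_KEY is not set; add it to .env or enter it in the Settings tab"
        )
    return key
```

Failing early with an explicit message is usually easier to debug than an authentication error returned later by the API.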
- Create a new File Search Store in the File Stores tab
- Or select an existing store from the dropdown
- Click Browse Files or drag & drop
- 60+ file extensions supported, including:
  - Documents: PDF, DOC, DOCX, ODT, RTF
  - Spreadsheets: CSV, XLSX, XLS, XLSM, ODS
  - Presentations: PPTX, PPT, ODP
  - Data: JSON, XML, YAML, SQL
  - Code: 30+ programming languages (Python, JavaScript, Java, C++, Go, Rust, TypeScript, PHP, Ruby, Swift, Kotlin, Scala, and more)
  - Markup: HTML, Markdown, LaTeX, Jupyter Notebooks
- Smart MIME type detection with automatic fallback
- Robust upload system: direct upload with Files API fallback for problematic files
- See the complete list in SUPPORTED_FORMATS.md: 200+ MIME types officially supported by Gemini
- Add custom metadata (optional)
- Select AI model for metadata generation (optional)
- Go to the Chat tab
- Ask questions about your uploaded documents
- Get answers with citations and source references
- View full conversation history
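To illustrate how citations might be pulled out of a chat answer, here is a small sketch that walks a response in dictionary form and collects unique source titles. The field names (`groundingMetadata`, `groundingChunks`, `retrievedContext`, `title`) are assumptions based on the public Gemini REST response format and should be checked against the current API reference; `extract_citations` and the sample data are hypothetical, not the application's code.

```python
def extract_citations(response):
    """Collect unique source titles from a generateContent-style response dict.

    Field names are assumed, not taken from this project; missing keys are
    tolerated so a response without grounding data yields an empty list.
    """
    citations = []
    for candidate in response.get("candidates", []):
        metadata = candidate.get("groundingMetadata") or {}
        for chunk in metadata.get("groundingChunks", []):
            context = chunk.get("retrievedContext") or {}
            title = context.get("title")
            if title and title not in citations:
                citations.append(title)
    return citations


# Hand-written sample response, for illustration only.
sample = {
    "candidates": [
        {
            "content": {"parts": [{"text": "The warranty lasts two years."}]},
            "groundingMetadata": {
                "groundingChunks": [
                    {"retrievedContext": {"title": "manual.pdf", "text": "..."}},
                    {"retrievedContext": {"title": "faq.md", "text": "..."}},
                    {"retrievedContext": {"title": "manual.pdf", "text": "..."}},
                ]
            },
        }
    ]
}

print(extract_citations(sample))  # ['manual.pdf', 'faq.md']
```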
- Go to the Docs & Help tab
- Select an operation
- Configure parameters
- Choose output format (cURL, Python, JavaScript, etc.)
- Copy ready-to-use code
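The kind of output the generator produces for the Chat RAG operation can be sketched as follows. This is an illustrative assumption, not the application's generator: `build_chat_curl` and the store name are hypothetical, and the endpoint, header, and `file_search` tool fields follow the public Gemini REST documentation as best understood here, so verify them before relying on the result.

```python
import json
import shlex


def build_chat_curl(store_name, question, model="gemini-2.5-flash"):
    """Return a cURL command string for a RAG query against one store.

    The API key is left as a shell variable so it never appears in the
    generated snippet.
    """
    url = (
        "https://generativelanguage.googleapis.com/v1beta/"
        f"models/{model}:generateContent"
    )
    payload = {
        "contents": [{"parts": [{"text": question}]}],
        "tools": [{"file_search": {"file_search_store_names": [store_name]}}],
    }
    return " ".join(
        [
            "curl",
            "-X",
            "POST",
            shlex.quote(url),
            # Double quotes so the shell expands $GEMINI_API_KEY at run time.
            "-H",
            '"x-goog-api-key: $GEMINI_API_KEY"',
            "-H",
            '"Content-Type: application/json"',
            # shlex.quote protects the JSON body, including any quotes in the question.
            "-d",
            shlex.quote(json.dumps(payload)),
        ]
    )


print(build_chat_curl("fileSearchStores/demo-store", "What's in the report?"))
```

Building the body with `json.dumps` and quoting it with `shlex.quote` avoids hand-escaping, which is where copy-paste snippets tend to break.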
┌─────────────────────────────────────────────┐
│               User (Browser)                │
└──────────────────────┬──────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────┐
│            Flask Web Application            │
│  ┌───────────────────────────────────────┐  │
│  │        Frontend (HTML/CSS/JS)         │  │
│  │  - Bilingual UI                       │  │
│  │  - Real-time chat                     │  │
│  │  - File upload                        │  │
│  │  - HTTP generator                     │  │
│  └───────────────────────────────────────┘  │
│  ┌───────────────────────────────────────┐  │
│  │        Backend (Python/Flask)         │  │
│  │  - API endpoints                      │  │
│  │  - State management                   │  │
│  │  - File processing                    │  │
│  │  - Smart MIME detection               │  │
│  │  - Upload fallback system             │  │
│  └───────────────────────────────────────┘  │
└──────────────────────┬──────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────┐
│        Google Gemini File Search API        │
│  - Document vectorization                   │
│  - Semantic search                          │
│  - RAG responses                            │
│  - Citations & grounding                    │
└─────────────────────────────────────────────┘
The application implements a robust dual-method upload system to ensure maximum compatibility:
Primary Method: Direct Upload
- Uses the uploadToFileSearchStore API
- Single-step process
- Automatic MIME type detection by API
- Fastest and most efficient
Fallback Method: Files API + Import
- Activates when direct upload fails
- Two-step process: upload to Files API → import to store
- Explicit MIME type specification
- Handles edge cases (CSV, large files, etc.)
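The control flow of the two methods above can be sketched as a try-then-fallback wrapper. This is a simplified illustration, not the application's code: `upload_with_fallback` is a hypothetical name, and the two callables stand in for whatever SDK or HTTP calls perform the direct upload and the Files API import.

```python
import logging

logger = logging.getLogger(__name__)


def upload_with_fallback(path, store_name, direct_upload, files_api_import):
    """Try the single-step upload first; on any failure use the two-step path.

    direct_upload and files_api_import are caller-supplied callables
    (hypothetical here) taking (path, store_name). Returns a tuple of
    (method_used, result). An error in the fallback itself propagates.
    """
    try:
        return "direct", direct_upload(path, store_name)
    except Exception as exc:  # broad on purpose: any direct failure triggers the fallback
        logger.warning(
            "Direct upload of %s failed (%s); using Files API fallback", path, exc
        )
    return "fallback", files_api_import(path, store_name)


# Stub callables for demonstration only.
def failing_direct(path, store_name):
    raise ValueError("unsupported MIME type")


def stub_import(path, store_name):
    return f"imported {path} into {store_name}"


print(upload_with_fallback("data.csv", "fileSearchStores/demo", failing_direct, stub_import))
```

Passing the two upload functions in as arguments keeps the fallback decision separate from the API calls, so it can be exercised without network access.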
Smart MIME Type Detection
- 70+ pre-mapped file extensions
- Fallback to Python's mimetypes module
- Safe default for unknown types
- Optimized for Gemini File Search API
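The lookup order described above (pre-mapped table, then the standard mimetypes module, then a default) could look roughly like this. The table holds only a few illustrative entries and the `text/plain` default is an assumption; the application's actual mapping and default are not shown in this document.

```python
import mimetypes
from pathlib import Path

# A few illustrative entries; the real table is described as covering 70+ extensions.
MIME_OVERRIDES = {
    ".csv": "text/csv",
    ".md": "text/markdown",
    ".py": "text/x-python",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
}
DEFAULT_MIME = "text/plain"  # assumed safe default, not confirmed by the project


def detect_mime_type(filename):
    """Resolve a MIME type: explicit table, then mimetypes, then the default."""
    extension = Path(filename).suffix.lower()
    if extension in MIME_OVERRIDES:
        return MIME_OVERRIDES[extension]
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or DEFAULT_MIME


print(detect_mime_type("Report.CSV"))    # text/csv
print(detect_mime_type("notes.zzzz123")) # text/plain
```

Checking the explicit table first makes results independent of the host system's MIME registry for the formats that matter most.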
Supported Extensions:
Documents: txt, pdf, doc, docx, odt, rtf, md
Spreadsheets: csv, tsv, xlsx, xls, xlsm, xlsb, ods
Presentations: pptx, ppt, odp
Data Formats: json, xml, yaml, yml, sql
Code Files: py, js, jsx, ts, tsx, java, c, cpp, cs, go, rs, php, rb, swift, kt, scala, pl, r, hs, erl, lua, sh, dart
Web: html, htm, css, scss, sass
Scientific: ipynb, bib, tex
Archives: zip
... and more (60+ total)
This dual-method approach ensures files are uploaded successfully even when encountering API-specific issues with certain formats.
- Backend: Flask (Python 3.8+)
- Frontend: Vanilla JavaScript, HTML5, CSS3
- AI/ML: Google Gemini 2.5 Flash API
- Storage: JSON-based state persistence
- Dependencies:
  - google-genai: Official Gemini SDK
  - flask: Web framework
  - python-dotenv: Environment management
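JSON-based state persistence can be sketched with the standard library alone. The version below writes to a temporary file and then renames it, so an interrupted write does not leave a half-written state file. Function names and the state keys are hypothetical; the application's own storage layout may differ.

```python
import json
import os
import tempfile


def save_state(state, path):
    """Write state as JSON via a temp file, then atomically replace the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        # The with-block closes the handle before the rename, which matters on Windows.
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle, ensure_ascii=False, indent=2)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_state(path, default=None):
    """Read state from JSON, returning a default when the file does not exist."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {} if default is None else default
```

A typical call would be `save_state({"language": "es"}, "state.json")` followed later by `load_state("state.json")`.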
- Installation Guide: Detailed setup instructions
- Features: Complete feature list
- API Reference: Backend API documentation
- Contributing: How to contribute
- CLAUDE.md: Gemini File Search API reference
Contributions are welcome! Please see CONTRIBUTING.md for details.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Original Inspiration: This project was inspired by the YouTube video "Gemini's File Search API Makes RAG Easy (and CHEAP!)" by Mark Kashef - Thank you for showing the potential of Gemini's File Search API!
- Google Gemini Team: For the amazing File Search API
- Flask Community: For the excellent web framework
- Built with ❤️ by Webcomunica Solutions & Optimizaconia
- Website: webcomunica.solutions
- GitHub: @webcomunicasolutions
- LinkedIn: Juan José Sánchez Bernal
- Instagram: @webcomunica_soluciones
If you find this project useful, please consider giving it a star on GitHub!
Last Updated: November 21, 2025 | Version: 1.2.0 | Status: Production Ready ✅
- ✅ Bilingual Metadata Generation: AI now generates metadata in your interface language (Spanish/English)
- ✅ DOCX/XLSX Metadata Analysis: Extract metadata from Word and Excel documents with AI
- ✅ Text Extraction: Smart local text extraction for Office documents
- ✅ Windows File Lock Fix: Proper file handle management prevents locking issues
- ✅ Enhanced File Upload System: Dual-method upload with automatic fallback
- ✅ Smart MIME Detection: Comprehensive MIME type mapping for 70+ formats
- ✅ Extended Format Support: 60+ file extensions supported
- ✅ CSV/Excel Upload Fix: Resolved upload issues with spreadsheet formats

