diff --git a/CHANGELOG.md b/CHANGELOG.md
index 104b884..42af325 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/).
 
 ## [Unreleased]
 
+## [0.0.5] - 2026-05-19
+
 ### Breaking
 
 - Removed `epsilon` from `RedisVectorQueryConfig`. `EPSILON` is a
@@ -21,14 +23,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/).
   statements (with optional `params` for placeholders) against a bound
   Redis index. Installed via the new `adk-redis[sql]` extra
   (`redisvl[sql-redis]>=0.18.2`).
-- `create_redisvl_mcp_toolset(...)` (`adk_redis.tools.mcp_search`): helper
-  that returns an ADK `McpToolset` wired to RedisVL's own MCP server
-  (`rvl mcp`). Supports `stdio`, `sse`, and `streamable-http` transports;
-  bearer auth on HTTP transports; `--read-only` default for stdio.
-  Installed via the new `adk-redis[mcp-search]` extra
-  (`redisvl[mcp]>=0.18.2`).
-- Module constants `REDISVL_MCP_TOOL_SEARCH` and `REDISVL_MCP_TOOL_UPSERT`
-  for symbolic tool filtering.
+- New `examples/redisvl_mcp_search/`: the MCP-path mirror of
+  `examples/redis_search_tools/`. Same knowledge-base corpus, served by
+  a `rvl mcp` server in hybrid (BM25 + vector) mode; the agent connects
+  via ADK's standard `McpToolset`. No adk-redis wrapper is needed; users
+  wire `StdioConnectionParams` / `SseConnectionParams` /
+  `StreamableHTTPConnectionParams` directly, matching the pattern used
+  by every catalog MCP integration page.
 - `make redis-up` / `make redis-down` / `make test-integration` targets
   for the new `tests/integration/` suite. Integration tests skip
   cleanly when no Redis with the RediSearch module is reachable at
@@ -49,8 +50,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/).
   (`tests/tools/test_vector_search.py::TestRedisVectorQueryConfigEpsilonRemoval`).
 - New cache provider tests (`tests/cache/test_provider.py`) including
   a no-`DeprecationWarning` assertion on the import path.
-- New unit tests for `RedisSQLSearchTool` (`tests/tools/test_sql_search.py`)
-  and `create_redisvl_mcp_toolset` (`tests/tools/test_mcp_search.py`).
+- New unit tests for `RedisSQLSearchTool`
+  (`tests/tools/test_sql_search.py`).
 - New integration suite under `tests/integration/` that round-trips
   vector, text, range, native hybrid, and SQL queries plus a cache
   round-trip against a real Redis 8.4 container, and confirms tools
diff --git a/README.md b/README.md
index 85979eb..b5a319d 100644
--- a/README.md
+++ b/README.md
@@ -21,520 +21,326 @@
 
 ---
 
-## Introduction
-
-**adk-redis** provides Redis integrations for Google's Agent Development Kit (ADK). Implements ADK's `BaseMemoryService`, `BaseSessionService`, tool interfaces, and semantic caching using Redis Agent Memory Server and RedisVL.
-
-<div align="center">
-
-| 🔌 [**ADK Services**](#memory-services) | 🔧 [**Agent Tools**](#search-tools) | ⚡ [**Semantic Caching**](#semantic-caching) |
-|:---:|:---:|:---:|
-| **Memory Service**<br/>*Long-term memory via Agent Memory Server* | **Memory Tools**<br/>*LLM-controlled memory operations* | **LLM Response Cache**<br/>*Reduce latency & costs* |
-| Semantic search & auto-extraction | REST API or MCP protocol | Similarity-based cache lookup |
-| Cross-session knowledge retrieval | search, create, update, delete | Configurable distance threshold |
-| Recency-boosted search | Namespace & user isolation | TTL-based expiration |
-| **Session Service**<br/>*Working memory via Agent Memory Server* | **Search Tools**<br/>*RAG via RedisVL* | **Tool Cache**<br/>*Avoid redundant calls* |
-| Context window management | Vector, hybrid, text, range search | Cache tool execution results |
-| Auto-summarization | Multiple vectorizers supported | Reduce API calls |
-| Background memory promotion | **MCP Tools**<br/>*Model Context Protocol* | **LangCache**<br/>*Managed semantic cache* |
-| | SSE-based tool discovery | Cloud-hosted, no local vectorizer |
-
-</div>
+## What it does
 
+`adk-redis` is the Redis layer for [Google ADK](https://github.com/google/adk-python) agents. It implements ADK's `BaseMemoryService`, `BaseSessionService`, and `BaseTool` interfaces against Redis, [RedisVL](https://docs.redisvl.com), and the [Redis Agent Memory Server](https://github.com/redis/agent-memory-server). It also ships an MCP toolset helper for Agent Memory Server (`create_memory_mcp_toolset`) and semantic-cache providers.
 
+| Surface | What you get | Backed by |
+|---|---|---|
+| **Sessions** (`RedisWorkingMemorySessionService`) | `BaseSessionService` with auto-summarization and context-window management | Agent Memory Server (REST) |
+| **Long-term memory** (`RedisLongTermMemoryService`) | `BaseMemoryService` with semantic search and recency boosting | Agent Memory Server (REST) |
+| **Memory tools** (`SearchMemoryTool`, `CreateMemoryTool`, ...) | LLM-controlled memory operations | Agent Memory Server (REST) |
+| **AMS MCP toolset** (`create_memory_mcp_toolset`) | Exposes `search_long_term_memory`, `create_long_term_memories`, `edit_long_term_memory`, `delete_long_term_memories`, `get_long_term_memory`, `memory_prompt`, and `set_working_memory` over SSE | Agent Memory Server (MCP) |
+| **RedisVL MCP** (native `McpToolset` against `rvl mcp`) | <ul><li>Tools exposed: `search-records`, `upsert-records` (gate writes with `--read-only`).</li><li>Search modes (one per server, chosen via YAML): `vector` KNN, `fulltext` BM25, or `hybrid` (LINEAR or RRF fusion).</li><li>Server-side query embedding via a configured RedisVL vectorizer; agents never load one locally.</li><li>Schema-aware tool descriptions: filter and return-field hints derived from the bound `IndexSchema`.</li><li>JSON filter language with tag, text, and numeric operators (`eq`, `in`, `between`, `gt`, `lt`, `ne`).</li><li>Transports: stdio, sse, streamable-http. Bearer auth on HTTP. Pagination via `limit` / `offset`.</li></ul> | `rvl mcp` server (`redisvl[mcp]`) |
+| **Search tools** (5 in-process tools) | Vector, hybrid, range, text, and SQL search as `BaseTool` subclasses | RedisVL (Python) |
+| **Semantic cache** (`RedisVLCacheProvider`, `LangCacheProvider`) | Skip repeat LLM calls and tool calls by semantic similarity | RedisVL `SemanticCache` or [Redis LangCache](https://redis.io/langcache) |
 
 ---
 
 ## Installation
 
-### Install from PyPI
-
 ```bash
 pip install adk-redis
 ```
 
-### Optional Dependencies
-
-Install with optional features based on your use case:
+Optional extras (combine as needed):
 
 ```bash
-# Memory & session services (Redis Agent Memory Server integration)
-pip install adk-redis[memory]
-
-# Search tools (RedisVL integration)
-pip install adk-redis[search]
-
-# LangCache (managed semantic cache service)
-pip install adk-redis[langcache]
-
-# SQL-to-Redis search tool (RedisSQLSearchTool, requires sql-redis)
-pip install adk-redis[sql]
-
-# RedisVL MCP toolset helper (`create_redisvl_mcp_toolset`)
-pip install adk-redis[mcp-search]
-
-# All library features
-pip install adk-redis[all]
-
-# Running the examples (adds python-dotenv and other example dependencies)
-pip install adk-redis[all,examples]
+pip install 'adk-redis[memory]'      # sessions + long-term memory services
+pip install 'adk-redis[search]'      # RedisVL-backed search tools
+pip install 'adk-redis[sql]'         # RedisSQLSearchTool (sql-redis)
+pip install 'adk-redis[langcache]'   # managed semantic cache provider
+pip install 'adk-redis[all]'         # all of the above
+pip install 'adk-redis[all,examples]'  # plus dotenv etc. for running examples
+
+# For the RedisVL MCP server (used with ADK's native McpToolset):
+pip install 'redisvl[mcp]>=0.18.2'
 ```
 
-### Verify Installation
+### Verify
 
 ```bash
 python -c "from adk_redis import __version__; print(__version__)"
 ```
 
-### Development Installation
-
-For contributors or those who want the latest unreleased changes:
+### Development install
 
 ```bash
-# Clone the repository
 git clone https://github.com/redis-developer/adk-redis.git
 cd adk-redis
-
-# Install with uv (recommended for development)
 pip install uv
 uv sync --all-extras
-
-# Or install directly from GitHub
-pip install git+https://github.com/redis-developer/adk-redis.git@main
-```
-
----
-
-## Getting Started
-
-### Prerequisites
-
-**For memory/session services:**
-- [Redis Agent Memory Server](https://github.com/redis/agent-memory-server) (port 8088)
-- Redis 8.4+ or Redis Cloud (backend for Agent Memory Server)
-
-**For search tools:**
-- Redis 8.4+ or Redis Cloud with Search capability
-
-**Quick start:**
-
-#### 1. Start Redis 8.4
-
-Redis is required for all examples in this repository. Choose one of the following options:
-
-**Option A: Automated setup (recommended)**
-
-```bash
-# Run from the repository root
-./scripts/start-redis.sh
-```
-
-This script will:
-- Check if Docker is installed and running
-- Check if Redis is already running on port 6379
-- Start Redis 8.4 in a Docker container with health checks
-- Verify the Redis container is healthy and accepting connections
-- Provide helpful commands for managing Redis
-
-**Option B: Manual setup**
-
-```bash
-docker run -d --name redis -p 6379:6379 redis:8.4-alpine
-```
-
-> **Note**: Redis 8.4 includes the Redis Query Engine (evolved from RediSearch) with native support for vector search, full-text search, and JSON operations. Docker will automatically download the image (~40MB) on first run.
-
-**Verify Redis is running:**
-
-```bash
-# Check container status
-docker ps | grep redis
-
-# Test connection
-docker exec redis redis-cli ping
-# Should return: PONG
-
-# Or if you have redis-cli installed locally
-redis-cli -p 6379 ping
-```
-
-**Common Redis commands:**
-
-```bash
-# View logs
-docker logs redis
-docker logs -f redis  # Follow logs in real-time
-
-# Stop Redis
-docker stop redis
-
-# Restart Redis
-docker restart redis
-
-# Remove Redis (stops and deletes container)
-docker rm -f redis
-```
-
-**Troubleshooting:**
-
-- **Port 6379 already in use**: Another process is using the port. Find it with `lsof -i :6379` or use a different port: `docker run -d --name redis -p 6380:6379 redis:8.4-alpine`
-- **Docker not running**: Start Docker Desktop or the Docker daemon
-- **Permission denied**: Run with `sudo` or add your user to the docker group
-- **Container won't start**: Check logs with `docker logs redis`
-
-#### 2. Start Agent Memory Server
-
-```bash
-docker run -d --name agent-memory-server -p 8088:8088 \
-  -e REDIS_URL=redis://host.docker.internal:6379 \
-  -e GEMINI_API_KEY=your-gemini-api-key \
-  -e GENERATION_MODEL=gemini/gemini-2.0-flash \
-  -e EMBEDDING_MODEL=gemini/text-embedding-004 \
-  -e FAST_MODEL=gemini/gemini-2.0-flash \
-  -e SLOW_MODEL=gemini/gemini-2.0-flash \
-  -e EXTRACTION_DEBOUNCE_SECONDS=5 \
-  redislabs/agent-memory-server:0.13.2 \
-  agent-memory api --host 0.0.0.0 --port 8088 --task-backend=asyncio
 ```
 
-> **Configuration Options:**
-> - **LLM Provider**: Agent Memory Server uses [LiteLLM](https://docs.litellm.ai/) and supports 100+ providers (OpenAI, Gemini, Anthropic, AWS Bedrock, Ollama, etc.). Set the appropriate environment variables for your provider (e.g., `GEMINI_API_KEY`, `GENERATION_MODEL=gemini/gemini-2.0-flash`). See the [Agent Memory Server LLM Providers docs](https://redis.github.io/agent-memory-server/llm-providers/) for details.
-> - **Model Configuration**: Set `GENERATION_MODEL`, `FAST_MODEL` (for quick tasks like extraction), and `SLOW_MODEL` (for complex tasks) to your preferred models. All default to OpenAI models if not specified.
-> - **Memory Extraction Debounce**: `EXTRACTION_DEBOUNCE_SECONDS` controls how long to wait before extracting memories from a conversation (default: 300 seconds). Lower values (e.g., 5) provide faster memory extraction, while higher values reduce API calls.
-> - **Embedding Models**: Agent Memory Server also uses LiteLLM for embeddings. For local/offline embeddings, use Ollama (e.g., `EMBEDDING_MODEL=ollama/nomic-embed-text`, `REDISVL_VECTOR_DIMENSIONS=768`). See [Embedding Providers docs](https://redis.github.io/agent-memory-server/embedding-providers/) for all options.
-
-**See detailed setup guides:**
-- [Redis Setup Guide](docs/redis-setup.md) - All Redis deployment options
-- [Agent Memory Server Setup](docs/agent-memory-server-setup.md) - Complete configuration
-- [Integration Guide](docs/integration-guide.md) - End-to-end setup with code examples
-
 ---
 
 ## Quick Start
 
-### Two-Tier Memory Architecture
+### Prerequisites
 
-Uses both working memory (session-scoped) and long-term memory (persistent):
+- Python 3.10+
+- Redis 8.4+ with the Redis Query Engine (Search). Local Docker:
+  ```bash
+  docker run -d --name redis -p 6379:6379 redis:8.4
+  docker exec redis redis-cli ping   # -> PONG
+  ```
+- For session / memory services: a running [Agent Memory Server](https://github.com/redis/agent-memory-server) (default port 8088). Quick start:
+  ```bash
+  docker run -d --name agent-memory-server -p 8088:8088 \
+    -e REDIS_URL=redis://host.docker.internal:6379 \
+    -e GEMINI_API_KEY=YOUR_KEY \
+    -e GENERATION_MODEL=gemini/gemini-2.5-flash \
+    -e EMBEDDING_MODEL=gemini/text-embedding-004 \
+    redislabs/agent-memory-server:0.13.2 \
+    agent-memory api --host 0.0.0.0 --port 8088 --task-backend=asyncio
+  ```
+  On Linux, `host.docker.internal` is not routable by default; use `--network=host` and `REDIS_URL=redis://127.0.0.1:6379`, or set `REDIS_URL` to the Docker-bridge gateway (typically `redis://172.17.0.1:6379`). AMS supports 100+ LLM and embedding providers via [LiteLLM](https://docs.litellm.ai/). See [Agent Memory Server setup](docs/user_guide/how_to_guides/memory_server_setup.md) for the full configuration matrix.
+
+For Redis Cloud, Redis Enterprise, or troubleshooting, see [Redis setup](docs/user_guide/how_to_guides/redis_setup.md).
+
+### Sessions + long-term memory
+
+Two-tier memory: working memory (per session) and long-term memory (cross-session). Both implement ADK's service interfaces and slot into any `Runner`.
 
 ```python
 from google.adk import Agent
 from google.adk.runners import Runner
 
-from adk_redis.memory import RedisLongTermMemoryService, RedisLongTermMemoryServiceConfig
-from adk_redis.sessions import (
+from adk_redis import (
+    RedisLongTermMemoryService,
+    RedisLongTermMemoryServiceConfig,
     RedisWorkingMemorySessionService,
     RedisWorkingMemorySessionServiceConfig,
 )
 
-# Configure session service (Tier 1: Working Memory)
-session_config = RedisWorkingMemorySessionServiceConfig(
-    api_base_url="http://localhost:8088",  # Agent Memory Server URL
-    default_namespace="my_app",
-    model_name="gpt-4o",  # Model for auto-summarization
-    context_window_max=8000,  # Trigger summarization at this token count
+session_service = RedisWorkingMemorySessionService(
+    config=RedisWorkingMemorySessionServiceConfig(
+        api_base_url="http://localhost:8088",
+        default_namespace="my_app",
+        model_name="gpt-4o",
+        context_window_max=8000,
+    ),
 )
-session_service = RedisWorkingMemorySessionService(config=session_config)
-
-# Configure memory service (Tier 2: Long-Term Memory)
-memory_config = RedisLongTermMemoryServiceConfig(
-    api_base_url="http://localhost:8088",
-    default_namespace="my_app",
-    extraction_strategy="discrete",  # Extract individual facts
-    recency_boost=True,  # Prioritize recent memories in search
+memory_service = RedisLongTermMemoryService(
+    config=RedisLongTermMemoryServiceConfig(
+        api_base_url="http://localhost:8088",
+        default_namespace="my_app",
+        extraction_strategy="discrete",
+        recency_boost=True,
+    ),
 )
-memory_service = RedisLongTermMemoryService(config=memory_config)
 
-# Create agent
 agent = Agent(
+    model="gemini-2.5-flash",
     name="memory_agent",
-    model="gemini-2.0-flash",
     instruction="You are a helpful assistant with long-term memory.",
 )
 
-# Create runner with both services
 runner = Runner(
-    agent=agent,
     app_name="my_app",
+    agent=agent,
     session_service=session_service,
     memory_service=memory_service,
 )
 ```
 
-**How it works:**
-
-1. **Working Memory**: Stores session messages, state, and handles auto-summarization
-2. **Background Extraction**: Automatically promotes important information to long-term memory
-3. **Long-Term Memory**: Provides semantic search across all sessions for relevant context
-4. **Recency Boosting**: Prioritizes recent memories while maintaining access to historical knowledge
+How it works: the session service stores conversation events in working memory and auto-summarizes once the session reaches `context_window_max` tokens; the memory service extracts facts to long-term memory in the background and exposes recency-boosted semantic search.
 
-### Vector Search Tools
-
-RAG with semantic search using RedisVL:
+### Search over a Redis index (in-process)
 
 ```python
 from google.adk import Agent
 from redisvl.index import SearchIndex
 from redisvl.utils.vectorize import HFTextVectorizer
 
-from adk_redis.tools import RedisVectorSearchTool, RedisVectorQueryConfig
-
-# Create a vectorizer (HuggingFace, OpenAI, Cohere, Mistral, Voyage AI, etc.)
-vectorizer = HFTextVectorizer(model="sentence-transformers/all-MiniLM-L6-v2")
+from adk_redis import RedisVectorQueryConfig, RedisVectorSearchTool
 
-# Connect to existing search index
+vectorizer = HFTextVectorizer(model="redis/langcache-embed-v2")
 index = SearchIndex.from_existing("products", redis_url="redis://localhost:6379")
 
-# Create the search tool with custom name and description
 search_tool = RedisVectorSearchTool(
     index=index,
     vectorizer=vectorizer,
-    config=RedisVectorQueryConfig(
-        vector_field_name="embedding",
-        return_fields=["name", "description", "price"],
-        num_results=5,
-    ),
-    # Customize the tool name and description for your domain
+    config=RedisVectorQueryConfig(num_results=5),
+    return_fields=["name", "description", "price"],
     name="search_product_catalog",
-    description="Search to find relevant products in the product catalog by description semantic similarity",
+    description="Find products by semantic similarity to the user's query.",
 )
 
-# Use with an ADK agent
 agent = Agent(
+    model="gemini-2.5-flash",
     name="search_agent",
-    model="gemini-2.0-flash",
-    instruction="Help users find products using semantic search.",
+    instruction="Help users find products.",
     tools=[search_tool],
 )
 ```
 
-**Customizing Tool Prompts:**
+All five search tools accept custom `name` / `description` so the LLM sees a domain-specific tool rather than a generic search helper.
 
-All search tools (`RedisVectorSearchTool`, `RedisHybridSearchTool`, `RedisTextSearchTool`, `RedisRangeSearchTool`, `RedisSQLSearchTool`) support custom `name` and `description` parameters to make them domain-specific:
+### Search over a Redis index (MCP)
 
-```python
-# Example: Medical knowledge base
-medical_search = RedisVectorSearchTool(
-    index=medical_index,
-    vectorizer=vectorizer,
-    name="search_medical_knowledge",
-    description="Search medical literature and clinical guidelines for relevant information",
-)
+With the `stdio` transport shown here, ADK's standard `McpToolset` launches `rvl mcp --config mcp_config.yaml` as a subprocess, so no separately running server is needed:
 
-# Example: Customer support FAQ
-faq_search = RedisTextSearchTool(
-    index=faq_index,
-    name="search_support_articles",
-    description="Search customer support articles and FAQs by keywords",
-)
+```python
+from google.adk import Agent
+from google.adk.tools.mcp_tool import McpToolset
+from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
+from mcp import StdioServerParameters
 
-# Example: Legal document search
-legal_search = RedisHybridSearchTool(
-    index=legal_index,
-    vectorizer=vectorizer,
-    name="search_legal_documents",
-    description="Search legal documents using both semantic similarity and keyword matching",
+agent = Agent(
+    model="gemini-2.5-flash",
+    name="mcp_search_agent",
+    instruction="Use the search-records tool to answer questions.",
+    tools=[
+        McpToolset(
+            connection_params=StdioConnectionParams(
+                server_params=StdioServerParameters(
+                    command="rvl",
+                    args=[
+                        "mcp",
+                        "--config",
+                        "/path/to/mcp_config.yaml",
+                        "--read-only",
+                    ],
+                ),
+                timeout=30,
+            ),
+            tool_filter=["search-records"],
+        ),
+    ],
 )
 ```
 
-> **Note:** RedisVL supports many vectorizers including OpenAI, HuggingFace, Cohere, Mistral, Voyage AI, and more. See [RedisVL documentation](https://docs.redisvl.com/) for the full list.
+For an already-running remote server, swap `StdioConnectionParams` for `StreamableHTTPConnectionParams(url="http://localhost:8765/mcp", headers={"Authorization": "Bearer ..."})`.
 
-> **Future Enhancement:** We plan to add native support for ADK embeddings classes through a union type or wrapper, allowing seamless integration with ADK's embedding infrastructure alongside RedisVL vectorizers.
+See [examples/redisvl_mcp_search/](examples/redisvl_mcp_search/) for a runnable demo (knowledge-base corpus, hybrid mode, paired with the in-process [examples/redis_search_tools/](examples/redis_search_tools/) example).
 
----
+### Semantic cache
+
+```python
+from google.adk import Agent
+from redisvl.utils.vectorize import HFTextVectorizer
 
-## Features Overview
+from adk_redis import (
+    LLMResponseCache,
+    RedisVLCacheProvider,
+    RedisVLCacheProviderConfig,
+    create_llm_cache_callbacks,
+)
+
+provider = RedisVLCacheProvider(
+    config=RedisVLCacheProviderConfig(
+        redis_url="redis://localhost:6379",
+        ttl=3600,
+        distance_threshold=0.1,
+    ),
+    vectorizer=HFTextVectorizer(model="redis/langcache-embed-v2"),
+)
+llm_cache = LLMResponseCache(provider=provider)
+before_cb, after_cb = create_llm_cache_callbacks(llm_cache)
 
-### Memory Services
+agent = Agent(
+    model="gemini-2.5-flash",
+    name="cached_agent",
+    before_model_callback=before_cb,
+    after_model_callback=after_cb,
+)
+```
 
-Implements ADK's `BaseMemoryService` interface for persistent agent memory:
+For a managed alternative that needs no local vectorizer, swap in `LangCacheProvider` / `LangCacheProviderConfig` from `adk_redis`.
 
-| Feature | Description |
-|---------|-------------|
-| **Semantic Search** | Vector-based similarity search across all sessions |
-| **Recency Boosting** | Prioritize recent memories while maintaining historical access |
-| **Auto-Extraction** | LLM-based extraction of facts, preferences, and episodic memories |
-| **Cross-Session Retrieval** | Access knowledge from any previous conversation |
-| **Background Processing** | Non-blocking memory promotion and indexing |
+---
 
-**Implementation:** `RedisLongTermMemoryService`
+## Search tools
 
-### Session Services
+Two parallel paths for RAG over a Redis index. Pick by deployment shape.
 
-Implements ADK's `BaseSessionService` interface for conversation management:
+| Path | Use when |
+|---|---|
+| **In-process** | Single ADK process, fast onboarding, Python-side `FilterExpression` composition, per-tool customization. |
+| **MCP** (ADK's `McpToolset` against `rvl mcp`) | One Redis index served to multiple agents (Python, JS, Claude Desktop). Server-side `--read-only` / bearer auth. Schema-aware tool descriptions. |
 
-| Feature | Description |
-|---------|-------------|
-| **Message Storage** | Persist conversation messages and session state |
-| **Auto-Summarization** | Automatic summarization when context window limits are exceeded |
-| **Memory Promotion** | Trigger background extraction to long-term memory |
-| **State Management** | Store and retrieve arbitrary session state |
-| **Token Tracking** | Monitor context window usage |
+### In-process tools
 
-**Implementation:** `RedisWorkingMemorySessionService`
+| Tool | Best for | Notes |
+|---|---|---|
+| `RedisVectorSearchTool` | Semantic similarity | KNN vector search with metadata filters |
+| `RedisHybridSearchTool` | Combined search | Vector + BM25; native `FT.HYBRID` on Redis 8.4+, aggregation fallback on older versions |
+| `RedisRangeSearchTool` | Threshold retrieval | Distance-bounded vector search. **No MCP equivalent.** |
+| `RedisTextSearchTool` | Keyword search | BM25 full-text; no embeddings needed |
+| `RedisSQLSearchTool` | SQL-style filters | `SELECT ... WHERE` with `:param` placeholders. Requires `adk-redis[sql]`. **No MCP equivalent.** |
 
-### Search Tools
+The tools that embed the query (vector, hybrid, range) accept any vectorizer supported by RedisVL (OpenAI, HuggingFace, Cohere, Mistral, Voyage AI, custom); metadata filters are `FilterExpression` objects from `redisvl.query.filter`.
 
-Five specialized search tools for different RAG use cases:
+### MCP
 
-| Tool | Best For | Key Features |
-|------|----------|--------------|
-| **`RedisVectorSearchTool`** | Semantic similarity | Vector embeddings, KNN search, metadata filtering |
-| **`RedisHybridSearchTool`** | Combined search | Vector + text search, Redis 8.4+ native support, aggregation fallback |
-| **`RedisRangeSearchTool`** | Threshold-based retrieval | Distance-based filtering, similarity radius |
-| **`RedisTextSearchTool`** | Keyword search | Full-text search, no embeddings required |
-| **`RedisSQLSearchTool`** | SQL-style filters | `SELECT ... WHERE` against a bound index, parameterized queries (requires `adk-redis[sql]`) |
+Use ADK's standard `McpToolset` against a running [RedisVL MCP server](https://docs.redisvl.com) (`rvl mcp`). The server is configured per index via YAML and exposes:
 
-> All search tools support multiple vectorizers (OpenAI, HuggingFace, Cohere, Mistral, Voyage AI, etc.) and advanced filtering.
+- `search-records`: `vector`, `fulltext`, or `hybrid` (chosen at server start). Tool description includes filter and return-field hints derived from the bound index schema.
+- `upsert-records`: write path (suppress with `--read-only`).
 
-### RedisVL MCP toolset
+Supports `stdio`, `sse`, and `streamable-http` transports; bearer auth on HTTP. Requires `redisvl[mcp]` and a `rvl mcp` server. See the [Quick Start MCP snippet](#search-over-a-redis-index-mcp) above for the wiring.
 
-`create_redisvl_mcp_toolset(...)` returns an ADK `McpToolset` wired to RedisVL's own MCP server (`rvl mcp`). The server exposes schema-aware `search-records` and `upsert-records` tools whose descriptions include filter and return-field hints derived from the bound index. Use it when you want one Redis index served to multiple agents (Python, JS, Claude Desktop) over `stdio`, `sse`, or `streamable-http`. Requires `adk-redis[mcp-search]` and a Redis-side `rvl mcp` server (or YAML config). See [the search-tools guide](docs/user_guide/how_to_guides/search_tools.md) for the decision matrix vs the in-process tools above.
+For the full decision matrix and runnable demo, see [docs/user_guide/how_to_guides/search_tools.md](docs/user_guide/how_to_guides/search_tools.md).
 
-### Semantic Caching
+---
 
-Reduce latency and costs with similarity-based caching:
+## Memory backends
 
-| Feature | Description |
-|---------|-------------|
-| **LLM Response Cache** | Cache LLM responses and return similar cached results |
-| **Tool Result Cache** | Cache tool execution results to avoid redundant calls |
-| **Similarity Threshold** | Configurable distance threshold for cache hits |
-| **TTL Support** | Time-based cache expiration |
-| **Multiple Vectorizers** | Support for OpenAI, HuggingFace, local embeddings, etc. |
-| **LangCache (Managed)** | Cloud-hosted semantic cache — no local vectorizer needed |
+Three ways to ingest, store, and retrieve memory with Agent Memory Server, all interoperable:
 
-**Cache Providers:**
+| Approach | What it is | Reach for it when |
+|---|---|---|
+| **ADK Services** (`RedisLongTermMemoryService`, `RedisWorkingMemorySessionService`) | The `BaseMemoryService` and `BaseSessionService` implementations. ADK calls AMS for you. | You want framework-managed sessions and automatic memory extraction. Most production cases. |
+| **REST tools** (`MemoryPromptTool`, `SearchMemoryTool`, `CreateMemoryTool`, `UpdateMemoryTool`, `DeleteMemoryTool`, `GetMemoryTool`) | `BaseTool` subclasses that call AMS REST directly. The LLM decides when to invoke them. | You want the agent to control when memory is read or written. |
+| **MCP toolset** (`create_memory_mcp_toolset`) | Same memory operations surfaced over MCP/SSE. Standard MCP tool discovery. | You want one AMS instance shared across many agents, or you prefer MCP wiring over Python tool wrappers. |
 
-| Provider | Description | Vectorizer Required |
-|----------|-------------|:-------------------:|
-| `RedisVLCacheProvider` | Self-hosted semantic cache using RedisVL | Yes |
-| `LangCacheProvider` | Managed semantic cache via [Redis LangCache](https://redis.io/langcache) | No (server-side) |
+The first two approaches talk to AMS over REST; the third uses MCP over SSE. All three operate on the same underlying memory.
 
-Both providers implement `BaseCacheProvider` and work with `LLMResponseCache` and `ToolCache`.
+---
 
-**LangCache Quick Start:**
+## Semantic cache
 
-```python
-from adk_redis import LangCacheProvider, LangCacheProviderConfig, LLMResponseCache
+Two providers, both implementing `BaseCacheProvider`. Pair either with `LLMResponseCache` or `ToolCache` and wire via `create_llm_cache_callbacks` / `create_tool_cache_callbacks`.
 
-# Configure LangCache (managed — no local embeddings needed)
-langcache_config = LangCacheProviderConfig(
-    cache_id="your-cache-id",
-    api_key="your-api-key",
-    ttl=3600,
-)
-cache_provider = LangCacheProvider(config=langcache_config)
+| Provider | Hosted | Vectorizer | Best for |
+|---|---|---|---|
+| `RedisVLCacheProvider` | Self-hosted Redis | Required (any RedisVL vectorizer) | Full control, your data stays in your Redis |
+| `LangCacheProvider` | Managed via [Redis LangCache](https://redis.io/langcache) | Server-side (none needed locally) | Zero-infra cache; no local Redis needed |
 
-# Use with LLM response caching
-llm_cache = LLMResponseCache(provider=cache_provider)
-```
+Both honor a configurable `distance_threshold` and per-entry `ttl`.
 
 ---
 
 ## Requirements
 
-- **Python** 3.10, 3.11, 3.12, or 3.13
-- **Google ADK** 1.0.0+ (validated against 1.23.0; 2.0.0 GA support tracked in the next release)
-- **RedisVL** 0.18.2+ (when the `search`, `langcache`, `sql`, or `mcp-search` extra is installed)
-- **For memory/session services:** [Redis Agent Memory Server](https://github.com/redis/agent-memory-server)
-- **For search tools:** Redis 8.4+ or Redis Cloud with Search capability
-- **For `RedisSQLSearchTool`:** `sql-redis` (installed by `adk-redis[sql]`)
-- **For RedisVL MCP toolset:** `redisvl[mcp]` and the `rvl mcp` CLI
+- Python 3.10, 3.11, 3.12, or 3.13
+- Google ADK 1.0+ (tested through 2.0 GA)
+- RedisVL 0.18.2+ when the `search`, `langcache`, or `sql` extra is installed
+- Redis 8.4+ (or Redis Cloud with Search) when using the search tools or `RedisVLCacheProvider`
+- For session / memory services: a running [Agent Memory Server](https://github.com/redis/agent-memory-server)
+- For `RedisSQLSearchTool`: `sql-redis` (installed by `adk-redis[sql]`)
+- For the RedisVL MCP server: install `redisvl[mcp]>=0.18.2` and use the `rvl mcp` CLI; connect from ADK with `McpToolset`
 
 ---
 
 ## Examples
 
-Complete working examples with ADK web runner integration:
-
-| Example | Description | Features |
-|---------|-------------|----------|
-| **[simple_redis_memory](examples/simple_redis_memory/)** | Agent with two-tier memory architecture | Working memory, long-term memory, auto-summarization, semantic search |
-| **[semantic_cache](examples/semantic_cache/)** | Semantic caching for LLM responses | Vector-based cache, reduced latency, cost optimization, local embeddings |
-| **[langcache_cache](examples/langcache_cache/)** | Managed semantic caching via LangCache | Cloud-hosted cache, no local vectorizer, no Redis instance needed |
-| **[redis_search_tools](examples/redis_search_tools/)** | RAG with search tools | Vector search, hybrid search, range search, text search |
-| **[travel_agent_memory_hybrid](examples/travel_agent_memory_hybrid/)** | Travel agent with framework-managed memory | Redis session + memory services, automatic memory extraction, web search, calendar export, itinerary planning |
-| **[travel_agent_memory_tools](examples/travel_agent_memory_tools/)** | Travel agent with LLM-controlled memory | REST memory tools (search/create/update/delete), in-memory session, web search, calendar export, itinerary planning |
-| **[fitness_coach_mcp](examples/fitness_coach_mcp/)** | Fitness coach with MCP memory tools | MCP-based memory via SSE, semantic + episodic memory, workout tracking, injury awareness |
-
-### Memory Integration Approaches
-
-There are **three ways** to integrate memory with ADK agents using Redis Agent Memory Server:
+Most examples run via `adk web`; `travel_agent_memory_hybrid` uses a custom FastAPI runner instead. Each ships with a README and `.env.example`.
 
-| Approach | Example | Protocol | Best For |
-|----------|---------|----------|----------|
-| **ADK Services** | `simple_redis_memory`, `travel_agent_memory_hybrid` | REST | Full framework integration (`BaseSessionService` + `BaseMemoryService`) |
-| **REST Tools** | `travel_agent_memory_tools` | REST | LLM-controlled memory with explicit tool calls |
-| **MCP Tools** | `fitness_coach_mcp` | SSE | Standard MCP protocol, automatic tool discovery |
+| Example | Demonstrates |
+|---|---|
+| [`simple_redis_memory`](examples/simple_redis_memory/) | Two-tier memory + auto-summarization |
+| [`travel_agent_memory_hybrid`](examples/travel_agent_memory_hybrid/) | Framework-managed memory: `RedisWorkingMemorySessionService` + `RedisLongTermMemoryService` in a custom FastAPI runner |
+| [`travel_agent_memory_tools`](examples/travel_agent_memory_tools/) | LLM-controlled memory: REST `SearchMemoryTool` / `CreateMemoryTool` / etc. in a default ADK runner |
+| [`fitness_coach_mcp`](examples/fitness_coach_mcp/) | AMS memory exposed over MCP via `create_memory_mcp_toolset` |
+| [`redis_search_tools`](examples/redis_search_tools/) | In-process RAG with vector + text + range search tools |
+| [`redis_sql_search`](examples/redis_sql_search/) | `RedisSQLSearchTool` answering catalog questions via parameterized SQL |
+| [`redisvl_mcp_search`](examples/redisvl_mcp_search/) | Same knowledge base as `redis_search_tools/`, served via `rvl mcp` over MCP |
+| [`semantic_cache`](examples/semantic_cache/) | Self-hosted semantic cache via `RedisVLCacheProvider` |
+| [`langcache_cache`](examples/langcache_cache/) | Managed semantic cache via `LangCacheProvider` |
 
-### MCP Integration
-
-MCP (Model Context Protocol) provides a standardized way to connect agents to tools. The `fitness_coach_mcp` example demonstrates this approach:
-
-```python
-from adk_redis import create_memory_mcp_toolset
-
-memory_tools = create_memory_mcp_toolset(
-    server_url="http://localhost:8088",
-    tool_filter=["search_long_term_memory", "create_long_term_memories"],
-)
-
-agent = Agent(model="gemini-2.0-flash", tools=[memory_tools])
-```
-
-**Available MCP Tools:**
-- `search_long_term_memory` - Semantic search across memories
-- `create_long_term_memories` - Store new memories (semantic or episodic)
-- `get_long_term_memory` - Retrieve memory by ID
-- `edit_long_term_memory` - Update existing memories
-- `delete_long_term_memories` - Remove memories
-- `memory_prompt` - Get context-enriched prompts
-- `set_working_memory` - Update working memory
-
-For a complete MCP example, see the [fitness_coach_mcp example](examples/fitness_coach_mcp/).
-
-#### RedisVL MCP server
-
-In addition to Agent Memory Server MCP tools, you can connect an agent to RedisVL's own MCP server (one Redis index per server) with `create_redisvl_mcp_toolset(...)`:
-
-```python
-from pydantic import SecretStr
-from adk_redis import create_redisvl_mcp_toolset
-
-# Remote server, streamable-http (default), read-only by default.
-search_tools = create_redisvl_mcp_toolset(
-    url="http://localhost:8000/mcp",
-    auth_token=SecretStr("..."),  # optional bearer
-)
-
-# Or spawn `rvl mcp --config <path>` in-process over stdio.
-search_tools = create_redisvl_mcp_toolset(
-    transport="stdio",
-    config_path="/etc/redisvl/mcp.yaml",
-    read_only=True,
-)
-
-agent = Agent(model="gemini-2.0-flash", tools=[search_tools])
-```
-
-Available tool names: `search-records`, `upsert-records` (also exported as `REDISVL_MCP_TOOL_SEARCH` / `REDISVL_MCP_TOOL_UPSERT`).
-
-### Travel Agent Examples Comparison
-
-Both travel agent examples use **Redis Agent Memory Server** for long-term memory. The difference is in how they integrate with ADK:
-
-| Aspect | `travel_agent_memory_hybrid` | `travel_agent_memory_tools` |
-|--------|------------------------------|----------------------------|
-| **How to Run** | `python main.py` (custom FastAPI) | `adk web .` (standard ADK CLI) |
-| **Session Service** | `RedisWorkingMemorySessionService` (Redis-backed) | ADK default (in-memory) |
-| **Memory Service** | `RedisLongTermMemoryService` (ADK interface) | REST tools only |
-| **Best For** | Full ADK service integration | Tool-based integration |
-
-Each example includes:
-- Complete runnable code
-- ADK web runner integration
-- Configuration examples
-- Setup instructions
+The two travel-agent examples use the same Agent Memory Server backend; the difference is whether the agent talks to AMS through framework services (`hybrid`) or LLM-driven tool calls (`tools`).
 
 ---
 
@@ -542,68 +348,53 @@ Each example includes:
 
 This project follows the [Google Python Style Guide](https://google.github.io/styleguide/pyguide.html), matching the [ADK-Python core](https://github.com/google/adk-python) project conventions.
 
-### Quick Start
+### Setup
 
 ```bash
-# Clone the repository
 git clone https://github.com/redis-developer/adk-redis.git
 cd adk-redis
-
-# Install development dependencies
-make dev
-
-# Run all checks (format, lint, type-check, test)
-make check
+make dev               # install with all extras + dev deps
+make check             # format-check + lint + type-check + test
 ```
 
-### Available Commands
+### Targets
 
 ```bash
-make format      # Format code with pyink and isort
-make lint        # Run ruff linter
-make type-check  # Run mypy type checker
-make test        # Run pytest test suite
-make coverage    # Generate coverage report
+make format            # apply pyink + isort
+make lint              # ruff check
+make type-check        # mypy
+make test              # all tests (unit + integration)
+make test-unit         # unit tests only (no Redis required)
+make test-integration  # integration suite (needs Redis 8.4+ at REDIS_URL)
+make redis-up          # start a redis:8.4 container on :6399 for integration tests
+make redis-down        # stop and remove that container
+make test-cov          # coverage report
 ```
 
-### Code Quality
-See **[CONTRIBUTING.md](CONTRIBUTING.md)** for coding style, type hints, testing, and PR guidelines.
+See [CONTRIBUTING.md](CONTRIBUTING.md) for testing, style, and PR conventions.
 
 ---
 
 ## Contributing
 
-Please help us by contributing PRs, opening GitHub issues for bugs or new feature ideas, improving documentation, or increasing test coverage. See the following steps for contributing:
-
-1. [Open an issue](https://github.com/redis-developer/adk-redis/issues) for bugs or feature requests
-2. Read [CONTRIBUTING.md](CONTRIBUTING.md) and submit a pull request
-3. Improve documentation and examples
+Open an [issue](https://github.com/redis-developer/adk-redis/issues) for bugs and feature requests, or submit a PR following [CONTRIBUTING.md](CONTRIBUTING.md). Documentation and example contributions are equally welcome.
 
 ---
 
 ## License
 
-Apache 2.0 - See [LICENSE](LICENSE) for details.
+Apache 2.0. See [LICENSE](LICENSE).
 
 ---
 
-## Helpful Links
-
-### Documentation & Resources
-- **[PyPI Package](https://pypi.org/project/adk-redis/)** - Install with `pip install adk-redis`
-- **[GitHub Repository](https://github.com/redis-developer/adk-redis)** - Source code and issue tracking
-- **[Examples](examples/)** - Complete working examples with ADK web runner
-- **[Contributing Guide](CONTRIBUTING.md)** - How to contribute to the project
-
-### Setup Guides
-- **[Redis Setup Guide](docs/redis-setup.md)** - All Redis deployment options
-- **[Agent Memory Server Setup](docs/agent-memory-server-setup.md)** - Complete configuration
-- **[Integration Guide](docs/integration-guide.md)** - End-to-end setup with code examples
-
-### Related Projects
-- **[Google ADK](https://github.com/google/adk-python)** - Agent Development Kit framework
-- **[Redis Agent Memory Server](https://github.com/redis/agent-memory-server)** - Memory layer for AI agents
-- **[RedisVL](https://docs.redisvl.com/)** - Redis Vector Library documentation
-- **[Redis](https://redis.io/)** - Redis 8.4+ with Search, JSON, and vector capabilities
-
----
+## Helpful links
+
+- [PyPI](https://pypi.org/project/adk-redis/): install with `pip install adk-redis`
+- [Redis setup](docs/user_guide/how_to_guides/redis_setup.md): local, Docker, and Redis Cloud
+- [Agent Memory Server setup](docs/user_guide/how_to_guides/memory_server_setup.md): full AMS configuration
+- [Integration walkthrough](docs/user_guide/01_integration.md): end-to-end wiring
+- [Search tools guide](docs/user_guide/how_to_guides/search_tools.md): in-process vs MCP, decision matrix
+- [Google ADK](https://github.com/google/adk-python): agent framework
+- [Agent Memory Server](https://github.com/redis/agent-memory-server): memory backend
+- [RedisVL](https://docs.redisvl.com/): Redis Vector Library
+- [Redis LangCache](https://redis.io/langcache): managed semantic cache
diff --git a/SKILL.md b/SKILL.md
index bb100c3..52b176a 100644
--- a/SKILL.md
+++ b/SKILL.md
@@ -27,9 +27,9 @@ links:
   hybrid, range, BM25 text, or SQL `SELECT` over a RedisVL index).
 - The user wants persistent ADK sessions or long-term memory and is willing
   to run [Redis Agent Memory Server](https://github.com/redis/agent-memory-server).
-- The user wants to expose a Redis index to ADK via MCP, either through
-  the `rvl mcp` server (`create_redisvl_mcp_toolset`) or Agent Memory
-  Server's MCP endpoint (`create_memory_mcp_toolset`).
+- The user wants to expose a Redis index to ADK via MCP. For the index
+  itself, point ADK's native `McpToolset` at a `rvl mcp` server. For
+  Agent Memory Server's MCP endpoint, use `create_memory_mcp_toolset`.
 - The user wants semantic caching for an ADK agent (self-hosted via
   RedisVL or managed via Redis LangCache).
 
@@ -50,7 +50,6 @@ Optional extras (combine as needed):
 pip install 'adk-redis[memory]'      # sessions + long-term memory services
 pip install 'adk-redis[search]'      # RedisVL-backed search tools
 pip install 'adk-redis[sql]'         # RedisSQLSearchTool (sql-redis)
-pip install 'adk-redis[mcp-search]'  # create_redisvl_mcp_toolset helper
 pip install 'adk-redis[langcache]'   # managed semantic cache provider
 pip install 'adk-redis[all]'         # all of the above
 ```
@@ -124,18 +123,25 @@ runner = Runner(
 )
 ```
 
-### 4. RedisVL MCP toolset
+### 4. RedisVL MCP (native McpToolset)
 
 ```python
-from pydantic import SecretStr
-from adk_redis import create_redisvl_mcp_toolset
-
-mcp_tools = create_redisvl_mcp_toolset(
-    url="http://localhost:8000/mcp",
-    auth_token=SecretStr("..."),
-    read_only=True,
+from google.adk.tools.mcp_tool import McpToolset
+from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
+from mcp import StdioServerParameters
+
+mcp_tools = McpToolset(
+    connection_params=StdioConnectionParams(
+        server_params=StdioServerParameters(
+            command="rvl",
+            args=["mcp", "--config", "/path/to/mcp_config.yaml", "--read-only"],
+        ),
+        timeout=30,
+    ),
+    tool_filter=["search-records"],
 )
 ```
+For a remote server, swap in `StreamableHTTPConnectionParams(url=..., headers={"Authorization": "Bearer ..."})`.
 
 ### 5. Semantic cache for LLM responses
 
@@ -177,8 +183,10 @@ root_agent = Agent(
   (KNN). It lives on `RedisRangeQueryConfig`.
 - **Stopwords**: `RedisTextQueryConfig.stopwords` defaults to `"english"`
   which requires `nltk`. Set to `None` if `nltk` is unavailable.
-- **MCP transports**: `create_redisvl_mcp_toolset` accepts only
-  `"stdio"`, `"sse"`, `"streamable-http"`; unknown values raise.
+- **MCP transports**: ADK's `McpToolset` accepts
+  `StdioConnectionParams`, `SseConnectionParams`, or
+  `StreamableHTTPConnectionParams`. Pick the connection-params class
+  for your transport rather than passing a string.
 - **Vector dtype**: must match the index schema. Default is `"float32"`.
 - **Async loops**: the session service builds a new `MemoryAPIClient` per
   call to avoid event-loop bleed across `Runner.run` invocations; do not
@@ -191,9 +199,10 @@ When this skill is loaded:
 1. Confirm whether the user already has a Redis index. If not, walk them
    through `IndexSchema.from_yaml(...)` + `SearchIndex.create(overwrite=True)`
    before introducing any search tool.
-2. Prefer the helper (`create_redisvl_mcp_toolset`) over hand-rolled
-   `StdioConnectionParams` for the RedisVL MCP path. The helper does
-   transport validation, bearer auth, and `--read-only` defaults.
+2. For the RedisVL MCP path, use ADK's native `McpToolset` with the
+   appropriate `*ConnectionParams` class. Set `tool_filter=["search-records"]`
+   to suppress writes, or pass `--read-only` to the `rvl mcp` invocation
+   in stdio mode.
 3. Never invent class or method names. Only those documented at
    `links.docs`.
 4. For breaking-change questions, consult `CHANGELOG.md` in the repo.
diff --git a/blog_post.docx b/blog_post.docx
deleted file mode 100644
index 541f150..0000000
Binary files a/blog_post.docx and /dev/null differ
diff --git a/blog_post.md b/blog_post.md
deleted file mode 100644
index db2e074..0000000
--- a/blog_post.md
+++ /dev/null
@@ -1,540 +0,0 @@
-# Redis as the Memory Layer for Google ADK Agents
-
-Google's Agent Development Kit (ADK) provides clean abstractions for building AI agents. It defines interfaces for memory, sessions, tools, and callbacks. But the default implementations store everything in process memory, which means state disappears on restart and there is no path from prototype to production without replacing the storage layer.
-
-We built `adk-redis` to be that storage layer. It is a Python package that implements ADK's `BaseMemoryService`, `BaseSessionService`, and tool interfaces using Redis, giving agents persistent two-tier memory, semantic search for RAG, and response caching, all without requiring changes to agent logic. The package connects to the Redis Agent Memory Server for long-term memory extraction and to RedisVL, the Redis Vector Library that provides a Python client for vector similarity, hybrid search, and semantic caching on top of Redis, for all retrieval and caching operations. Together, these let agents built with ADK move from a local demo to a deployed system by swapping a few service configurations.
-
-For teams already running Redis, this means first-class ADK integration on top of infrastructure you already operate and understand. And for teams that want to avoid tying their agent stack to a single cloud provider, `adk-redis` provides a production-quality alternative that runs anywhere Redis runs, whether that is Docker on a laptop, a managed Redis service on any cloud, or bare-metal servers in your own data center.
-
-In this post, we walk through the architecture and implementation of `adk-redis`. We cover how the memory system works, the four search tools available, how semantic caching works, and the three distinct integration patterns for connecting agents to memory. We build up from individual components to a fully wired travel planning agent that uses all of them together.
-
-## What `adk-redis` Actually Provides
-
-Before diving into implementation, it helps to see the full landscape. The package is organized around six capabilities.
-
-**Memory Services** implement ADK's `BaseMemoryService`. This is long-term memory. The service connects to the Redis Agent Memory Server, which handles semantic search, automatic fact extraction, and recency-boosted retrieval across all of your agent's past conversations.
-
-**Session Services** implement ADK's `BaseSessionService`. This is working memory. Sessions store the current conversation, manage session state, and automatically summarize older messages when the context window gets too large.
-
-**Memory Tools** give the LLM explicit, fine-grained control over long-term memory through ADK-compatible tool calls. Rather than relying on the framework to manage memory automatically, the agent can search, create, update, and delete memories on its own. These tools communicate with the Agent Memory Server over REST and include `SearchMemoryTool`, `CreateMemoryTool`, `UpdateMemoryTool`, `DeleteMemoryTool`, `GetMemoryTool`, and `MemoryPromptTool` (which enriches a prompt with relevant memories before sending it to the model).
-
-**MCP Tools** expose the same memory operations through the Model Context Protocol (MCP) instead of REST. You point ADK's `McpToolset` at the Agent Memory Server's SSE endpoint, and tool discovery happens automatically. This is useful when you want a standardized protocol layer between your agent and its memory backend, or when you are already using MCP for other tool integrations.
-
-**Search Tools** wrap RedisVL (the Redis Vector Library) into ADK-compatible tools that your agent's LLM can call directly. There are four variants covering vector search, hybrid search, text search, and range search.
-
-**Semantic Caching** intercepts LLM calls and tool executions, checking whether a semantically similar prompt has been seen before. If so, it returns the cached response instead of making a new API call. This works through ADK's callback system, so enabling it requires no changes to your agent's core logic.
-
-The package is modular. You install only what you need.
-
-```bash
-pip install adk-redis[memory]     # Memory and session services
-pip install adk-redis[search]     # Search tools via RedisVL
-pip install adk-redis[langcache]  # Managed semantic caching
-pip install adk-redis[all]        # Everything
-```
-
-## The Two-Tier Memory Architecture
-
-The central design idea behind `adk-redis` is a two-tier memory system that mirrors how human memory works. There is a fast, limited working memory for the current conversation, and a slower, persistent long-term memory for facts and preferences that should survive across sessions.
-
-### Tier 1 (Working Memory via `RedisWorkingMemorySessionService`)
-
-Working memory handles the current session. Every message exchanged between the user and the agent is stored in the Redis Agent Memory Server. When the conversation grows long enough to approach the model's context window limit, the service automatically summarizes older messages, compressing them into a summary while preserving the most recent exchanges in full.
-
-This is a surprisingly important feature. Without it, you face a hard tradeoff. Either you truncate old messages and lose context, or you send the full conversation and hit token limits (and costs). Auto-summarization gives you a middle path.
-
-Here is how you configure it.
-
-```python
-from adk_redis.sessions import (
-    RedisWorkingMemorySessionService,
-    RedisWorkingMemorySessionServiceConfig,
-)
-
-session_config = RedisWorkingMemorySessionServiceConfig(
-    api_base_url="http://localhost:8088",
-    default_namespace="my_app",
-    model_name="gpt-4o",
-    context_window_max=8000,
-)
-session_service = RedisWorkingMemorySessionService(config=session_config)
-```
-
-The `context_window_max` parameter is what triggers summarization. When the token count of stored messages crosses this threshold, the Agent Memory Server uses the model specified in `model_name` to summarize older turns. The `default_namespace` isolates your application's data from other applications sharing the same Redis instance.
-
-Under the hood, the session service implements all of ADK's required methods. `create_session`, `get_session`, `list_sessions`, `delete_session`, and `append_event`. The `append_event` method is particularly worth noting. Rather than re-sending the entire conversation on every turn, it uses an incremental append API, sending only the new message. This keeps network overhead proportional to the message size, not the conversation length.
-
-### Tier 2 (Long-Term Memory via `RedisLongTermMemoryService`)
-
-Long-term memory is where the real intelligence lives. After each conversation (or on a configurable debounce), the Agent Memory Server extracts structured information from the dialogue. "The user prefers window seats." "The user is allergic to shellfish." "The user visited Tokyo last March." These extracted memories are embedded as vectors and stored in Redis, where they become searchable across all past sessions.
-
-```python
-from adk_redis.memory import (
-    RedisLongTermMemoryService,
-    RedisLongTermMemoryServiceConfig,
-)
-
-memory_config = RedisLongTermMemoryServiceConfig(
-    api_base_url="http://localhost:8088",
-    default_namespace="my_app",
-    extraction_strategy="discrete",
-    recency_boost=True,
-    semantic_weight=0.7,
-    recency_weight=0.3,
-)
-memory_service = RedisLongTermMemoryService(config=memory_config)
-```
-
-
-The `extraction_strategy` parameter controls how the server breaks down conversations into storable facts. The `"discrete"` strategy extracts individual facts as separate memories, which makes them independently searchable. Other options include `"summary"` (a narrative summary of the conversation) and `"preferences"` (focused on user preferences).
-
-Recency boosting deserves a closer look. When searching memories, raw semantic similarity alone often isn't enough. A user might have said "I love Italian food" three years ago, and "Actually, I've been getting into Japanese cuisine lately" last week. Both are semantically relevant to a query about food preferences, but the recent one matters more.
-
-The recency boosting system addresses this by combining two scores. The `semantic_weight` controls how much the vector similarity matters, while `recency_weight` controls how much recency matters. Within the recency score itself, `freshness_weight` favors memories that were recently accessed, and `novelty_weight` favors memories that were recently created. The `half_life_last_access_days` and `half_life_created_days` parameters control how quickly each signal decays. A half-life of 7 days means that a memory's freshness score drops to 50% after a week of not being accessed.
-
-This is a thoughtful design. It avoids the common failure mode of semantic search systems that return stale information with high confidence.
-
-### Wiring Both Tiers Together
-
-With both services configured, you connect them to an ADK `Runner`.
-
-```python
-from google.adk import Agent
-from google.adk.runners import Runner
-
-agent = Agent(
-    name="memory_agent",
-    model="gemini-2.5-flash",
-    instruction="You are a helpful assistant with long-term memory.",
-)
-
-runner = Runner(
-    agent=agent,
-    app_name="my_app",
-    session_service=session_service,
-    memory_service=memory_service,
-)
-```
-
-The flow is now automatic. Messages are stored in working memory as the conversation happens. When the agent finishes a turn, a callback can trigger `add_session_to_memory()`, which pushes the conversation to the Agent Memory Server for background extraction. On subsequent sessions, the memory service's `search_memory` method retrieves relevant facts from across all past conversations.
-
-
-## Three Ways to Integrate Memory
-
-One of the more interesting design decisions in `adk-redis` is that it offers three distinct approaches for connecting agents to memory. Each approach has different tradeoffs around control, complexity, and standardization.
-
-### Approach 1. ADK Services (Framework-Managed)
-
-This is what we covered in the two-tier memory section. You configure `RedisWorkingMemorySessionService` and `RedisLongTermMemoryService`, pass them to the `Runner`, and the framework handles everything automatically. Memory extraction happens in the background. Search happens before each agent turn. The agent code itself never directly interacts with memory.
-
-This approach is the simplest to implement and the hardest to customize. The agent has no explicit control over *what* gets stored or *when* it searches. It is best for applications where you want memory to be invisible infrastructure.
-
-### Approach 2. REST Tools (LLM-Controlled)
-
-Instead of (or in addition to) framework-managed services, you can give the agent explicit memory tools. These are ADK tools that the LLM calls like any other function.
-
-```python
-from adk_redis.tools.memory import (
-    SearchMemoryTool, CreateMemoryTool,
-    UpdateMemoryTool, DeleteMemoryTool,
-    MemoryToolConfig,
-)
-
-memory_config = MemoryToolConfig(
-    api_base_url="http://localhost:8088",
-    default_namespace="my_app",
-    recency_boost=True,
-)
-
-tools = [
-    SearchMemoryTool(config=memory_config),
-    CreateMemoryTool(config=memory_config),
-    UpdateMemoryTool(config=memory_config),
-    DeleteMemoryTool(config=memory_config),
-]
-```
-
-With this approach, the LLM decides when to search memory, what to store, and what to update. The agent prompt needs to instruct the LLM on memory management strategy. This requires more prompt engineering, but it gives the agent genuine autonomy over its own memory.
-
-The travel agent example in the repo uses a hybrid of both approaches. Framework services handle session persistence and automatic background extraction. Memory tools give the LLM explicit CRUD control over long-term memories. This is arguably the most powerful configuration, because the agent gets both automatic memory management and the ability to deliberately store or retrieve specific facts.
-
-### Approach 3. MCP Tools (Model Context Protocol)
-
-MCP is a standardized protocol for connecting agents to tools via Server-Sent Events (SSE). Instead of REST-based tool implementations, you point the agent at the Agent Memory Server's MCP endpoint and let ADK's `McpToolset` handle tool discovery automatically.
-
-```python
-from adk_redis.tools.mcp_memory import create_memory_mcp_toolset
-
-memory_tools = create_memory_mcp_toolset(
-    server_url="http://localhost:9000",
-    tool_filter=["search_long_term_memory", "create_long_term_memories"],
-)
-
-agent = Agent(
-    model="gemini-2.5-flash",
-    name="fitness_coach",
-    tools=[memory_tools],
-)
-```
-
-The `tool_filter` parameter controls which MCP tools are exposed to the LLM. The Agent Memory Server exposes seven tools through MCP, including `search_long_term_memory`, `create_long_term_memories`, `get_long_term_memory`, `edit_long_term_memory`, `delete_long_term_memories`, `memory_prompt`, and `set_working_memory`.
-
-The fitness coach example in the repo demonstrates this approach. It connects to memory via MCP and stores both semantic memories (user profile, injuries, equipment) and episodic memories (workouts with timestamps, milestones). The distinction between semantic and episodic memory types is particularly useful. Semantic memories represent timeless facts ("user has a knee injury"), while episodic memories represent events ("user completed 3x12 rows on March 9th").
-
-MCP is the most standardized approach and makes it easy to swap memory backends without changing agent code. The tradeoff is that it requires running the Agent Memory Server with MCP support enabled on a separate port.
-
-
-## Search Tools for RAG
-
-Memory services handle what the agent remembers from past conversations. But what about external knowledge? Product catalogs, documentation, knowledge bases? This is the domain of retrieval-augmented generation (RAG), and `adk-redis` provides four search tools that plug directly into ADK's tool system.
-
-Each tool wraps a RedisVL query type and exposes itself as a function the LLM can call. The LLM sees a function declaration with a `query` parameter, decides when to use it, and gets back structured results.
-
-### RedisVectorSearchTool
-
-The most straightforward option. It embeds the query using a vectorizer, performs K-nearest-neighbor search against a Redis index, and returns the top results.
-
-```python
-from redisvl.index import SearchIndex
-from redisvl.utils.vectorize import HFTextVectorizer
-from adk_redis.tools import RedisVectorSearchTool, RedisVectorQueryConfig
-
-vectorizer = HFTextVectorizer(model="redis/langcache-embed-v2")
-index = SearchIndex.from_existing("products", redis_url="redis://localhost:6379")
-
-search_tool = RedisVectorSearchTool(
-    index=index,
-    vectorizer=vectorizer,
-    config=RedisVectorQueryConfig(
-        vector_field_name="embedding",
-        return_fields=["name", "description", "price"],
-        num_results=5,
-    ),
-    name="search_product_catalog",
-    description="Find products by semantic similarity to a description.",
-)
-```
-
-The `name` and `description` parameters matter more than they might seem. These are what the LLM reads to decide whether and when to call the tool. A vague description like "search documents" will lead to the LLM calling it at the wrong times. A specific one like "Find products by semantic similarity to a description" gives the LLM the context it needs.
-
-### RedisHybridSearchTool
-
-Hybrid search combines vector similarity with BM25 keyword matching. This is valuable when queries contain specific terms (product IDs, technical acronyms, exact names) that semantic search alone might miss.
-
-The tool auto-detects whether your Redis server and RedisVL version support native hybrid search (Redis 8.4+ with RedisVL 0.13+). If they do, it uses the server-side `FT.HYBRID` command. If not, it falls back to a client-side aggregation approach. This version detection happens at initialization, so you don't need to think about it.
-
-```python
-from adk_redis.tools import RedisHybridSearchTool, RedisHybridQueryConfig
-
-hybrid_tool = RedisHybridSearchTool(
-    index=index,
-    vectorizer=vectorizer,
-    config=RedisHybridQueryConfig(
-        text_field_name="content",
-        combination_method="LINEAR",
-        linear_alpha=0.7,
-    ),
-    name="search_legal_documents",
-    description="Search legal documents using both semantic and keyword matching.",
-)
-```
-
-### RedisTextSearchTool and RedisRangeSearchTool
-
-`RedisTextSearchTool` performs pure BM25 keyword search. No embeddings, no vectorizer needed. It is the right choice when the query is about exact terms, error messages, or API names.
-
-`RedisRangeSearchTool` is a less common but useful variant. Instead of returning the top-K results, it returns all documents within a distance threshold. This is useful for exhaustive retrieval, such as "find everything related to authentication in our documentation," where you want comprehensive coverage rather than a ranked list.
-
-Here is a concrete example from the `redis_search_tools` example in the repo, which wires all three search modalities into a single agent.
-
-```python
-from adk_redis.tools import (
-    RedisVectorSearchTool, RedisVectorQueryConfig,
-    RedisTextSearchTool, RedisTextQueryConfig,
-    RedisRangeSearchTool, RedisRangeQueryConfig,
-)
-
-tools = [
-    RedisVectorSearchTool(
-        name="semantic_search",
-        description="Semantic similarity search for conceptual queries.",
-        index=index, vectorizer=vectorizer,
-        config=RedisVectorQueryConfig(num_results=5),
-        return_fields=["title", "content", "category"],
-    ),
-    RedisTextSearchTool(
-        name="keyword_search",
-        description="Keyword search for exact terms and phrases.",
-        index=index,
-        config=RedisTextQueryConfig(
-            text_field_name="content", text_scorer="BM25STD"
-        ),
-        return_fields=["title", "content", "category"],
-    ),
-    RedisRangeSearchTool(
-        name="range_search",
-        description="Returns ALL documents within a semantic distance threshold.",
-        index=index, vectorizer=vectorizer,
-        config=RedisRangeQueryConfig(distance_threshold=0.5),
-        return_fields=["title", "content", "category"],
-    ),
-]
-
-agent = Agent(
-    model="gemini-2.5-flash",
-    name="search_agent",
-    instruction=(
-        "You have three search tools. Use semantic_search for conceptual "
-        "queries, keyword_search for exact terms, range_search for exhaustive "
-        "retrieval."
-    ),
-    tools=tools,
-)
-```
-
-The instruction prompt is doing real work here. It teaches the LLM when to use each tool and what to expect from each. This kind of prompt engineering is not optional. Without it, the LLM will default to calling whichever tool appears first or whichever has the most generic description.
-
-## Semantic Caching
-
-LLM API calls are slow and expensive. If your agent handles support queries, a significant fraction of incoming questions will be semantically similar. "How do I reset my password?" and "I need to change my password" should produce the same response, and there is no reason to pay for two LLM calls.
-
-`adk-redis` provides semantic caching at two levels, LLM response caching and tool result caching, both backed by Redis.
-
-### LLM Response Cache
-
-The LLM cache intercepts calls to the language model through ADK's callback system. Before each model call, it checks whether a semantically similar prompt already exists in Redis. If it does, it returns the cached response immediately, skipping the LLM entirely. If it doesn't, it lets the call proceed and stores the response for future lookups.
-
-```python
-from redisvl.utils.vectorize import HFTextVectorizer
-from adk_redis.cache import (
-    RedisVLCacheProvider, RedisVLCacheProviderConfig,
-    LLMResponseCache, LLMResponseCacheConfig,
-    create_llm_cache_callbacks,
-)
-
-vectorizer = HFTextVectorizer(model="redis/langcache-embed-v1")
-
-provider = RedisVLCacheProvider(
-    config=RedisVLCacheProviderConfig(
-        redis_url="redis://localhost:6379",
-        name="my_llm_cache",
-        ttl=3600,
-        distance_threshold=0.1,
-    ),
-    vectorizer=vectorizer,
-)
-
-llm_cache = LLMResponseCache(
-    provider=provider,
-    config=LLMResponseCacheConfig(
-        first_message_only=True,
-        include_app_name=True,
-        include_user_id=True,
-    ),
-)
-
-before_cb, after_cb = create_llm_cache_callbacks(llm_cache)
-
-agent = Agent(
-    name="cached_agent",
-    model="gemini-2.0-flash",
-    instruction="You are a helpful assistant.",
-    before_model_callback=before_cb,
-    after_model_callback=after_cb,
-)
-```
-
-A few design decisions are worth noting here.
-
-The `distance_threshold` parameter (set to 0.1 in this example) controls how similar two prompts need to be for a cache hit. A value of 0.0 means exact match only. A value of 0.1 allows small variations in phrasing. Going much higher risks returning cached responses for genuinely different questions. Tuning this threshold is application-specific and worth experimenting with.
-
-The `first_message_only` option is a practical default. In a multi-turn conversation, later messages depend heavily on prior context, making semantic cache hits unreliable. Caching only the first message (which is typically a standalone question) avoids returning contextually wrong responses.
-
-The cache is also smart about what it does *not* cache. Function call responses (where the LLM is invoking a tool) are skipped, as are error responses. This prevents caching intermediate steps that shouldn't be reused.
-
-### Managed Caching with LangCache
-
-If you'd rather not manage your own Redis instance and embedding model for caching, `adk-redis` also supports LangCache, a managed semantic caching service from Redis. With LangCache, embeddings are generated server-side, so you don't need a local vectorizer at all.
-
-```python
-from adk_redis.cache import LangCacheProvider, LangCacheProviderConfig
-
-provider = LangCacheProvider(
-    config=LangCacheProviderConfig(
-        cache_id="your-cache-id",
-        api_key="your-api-key",
-        ttl=3600,
-    )
-)
-```
-
-The same `LLMResponseCache` and `ToolCache` classes work with either provider. You just swap the backend.
-
-### Tool Result Cache
-
-The tool cache follows the same pattern but for tool executions rather than LLM calls. If your agent calls an external API with the same arguments repeatedly, the tool cache can short-circuit the call and return the cached result.
-
-```python
-from adk_redis.cache import ToolCache, ToolCacheConfig, create_tool_cache_callbacks
-
-tool_cache = ToolCache(
-    provider=provider,
-    config=ToolCacheConfig(
-        tool_names={"web_search", "get_weather"},
-    ),
-)
-
-before_tool_cb, after_tool_cb = create_tool_cache_callbacks(tool_cache)
-```
-
-The `tool_names` set lets you specify exactly which tools should be cached. This is important because not all tools are idempotent. You probably want to cache `get_weather` (same city, same hour, same result) but not `send_email` (same arguments, but each call should actually execute).
-
-## Walking Through the Travel Agent
-
-To make all of this concrete, let's trace through the `travel_agent_memory_hybrid` example, which is the most complete example in the repo. It combines framework-managed services, LLM-controlled memory tools, web search, itinerary planning, and calendar export into a single agent.
-
-### The Entrypoint
-
-The `main.py` file sets up the infrastructure. It registers custom service factories with ADK's service registry, creates both the session service and memory service, and launches a FastAPI app with the ADK web runner.
-
-```python
-from adk_redis.memory import RedisLongTermMemoryService, RedisLongTermMemoryServiceConfig
-from adk_redis.sessions import RedisWorkingMemorySessionService, RedisWorkingMemorySessionServiceConfig
-
-# Register factories so ADK can instantiate them from URIs
-registry = get_service_registry()
-registry.register_session_service("redis-working-memory", redis_session_factory)
-registry.register_memory_service("redis-long-term-memory", redis_memory_factory)
-
-# Build URIs and create the FastAPI app
-app = get_fast_api_app(
-    agents_dir=".",
-    session_service_uri="redis-working-memory://localhost:8088",
-    memory_service_uri="redis-long-term-memory://localhost:8088",
-    web=True,
-)
-```
-
-The URI-based factory pattern is worth noting. ADK's service registry lets you register custom service implementations behind URI schemes. This means you can switch between in-memory and Redis-backed services by changing a URI string, without modifying any agent code.
-
-### The Agent
-
-The agent itself is defined in `agent.py`. It assembles a rich set of tools spanning memory, search, and planning.
-
-```python
-from adk_redis.tools.memory import (
-    SearchMemoryTool, CreateMemoryTool,
-    UpdateMemoryTool, DeleteMemoryTool,
-    MemoryToolConfig,
-)
-from google.adk.tools import preload_memory, load_memory
-
-memory_config = MemoryToolConfig(
-    api_base_url="http://localhost:8088",
-    default_namespace="travel_agent_memory_hybrid",
-    recency_boost=True,
-    search_top_k=10,
-)
-
-tools = [
-    SearchMemoryTool(config=memory_config),
-    CreateMemoryTool(config=memory_config),
-    UpdateMemoryTool(config=memory_config),
-    DeleteMemoryTool(config=memory_config),
-    preload_memory,
-    load_memory,
-    CalendarExportTool(),
-    ItineraryPlannerTool(),
-]
-```
-
-Notice the layered memory strategy. `preload_memory` and `load_memory` are ADK's built-in tools that hook into the `RedisLongTermMemoryService` we configured in `main.py`. These provide automatic, framework-controlled memory retrieval. The `SearchMemoryTool`, `CreateMemoryTool`, and friends give the LLM explicit control on top of that.
-
-The agent also has an `after_agent_callback` that calls `add_session_to_memory()` after each turn. This is what triggers background extraction of facts and preferences into long-term memory.
-
-```python
-async def after_agent(callback_context: CallbackContext):
-    await callback_context.add_session_to_memory()
-
-root_agent = Agent(
-    model="gemini-2.5-flash",
-    name="travel_agent",
-    tools=tools,
-    after_agent_callback=after_agent,
-    instruction="...",  # Detailed prompt with memory management strategy
-)
-```
-
-### What Happens at Runtime
-
-When a user starts a conversation, the following sequence plays out.
-
-1. ADK creates (or retrieves) a session via `RedisWorkingMemorySessionService`. The session is stored in the Agent Memory Server.
-2. The agent's `preload_memory` tool automatically searches long-term memory for context relevant to the current conversation.
-3. The user sends a message. The message is appended to working memory via the incremental append API.
-4. The LLM generates a response. If it needs travel information, it can call web search tools. If it wants to check the user's preferences, it calls `SearchMemoryTool`. If the user shares a new preference, the LLM calls `CreateMemoryTool`.
-5. The response is appended to working memory.
-6. The `after_agent_callback` fires, sending the conversation to the Agent Memory Server for background extraction. The server pulls out facts like "user prefers direct flights" or "user wants to visit Japan in spring" and stores them as searchable long-term memories.
-7. If the conversation grows long, the working memory service automatically summarizes older turns to stay within the context window.
-
-All of this happens with a `pip install adk-redis[memory]`, a running Redis instance, and a running Agent Memory Server. The agent's Python code is clean, focused on domain logic rather than infrastructure plumbing.
-
-## Getting Started
-
-To run any of the examples, you need two things running.
-
-**Redis 8.4** provides the storage backend for everything: vector indices, session data, and cache entries.
-
-```bash
-docker run -d --name redis -p 6379:6379 redis:8.4-alpine
-```
-
-**Redis Agent Memory Server** handles memory extraction, summarization, and the working memory API. It sits between your agent and Redis, adding the intelligence layer.
-
-```bash
-docker run -d --name agent-memory-server -p 8088:8088 \
-  -e REDIS_URL=redis://host.docker.internal:6379 \
-  -e GEMINI_API_KEY=your-key \
-  -e GENERATION_MODEL=gemini/gemini-2.0-flash \
-  -e EMBEDDING_MODEL=gemini/text-embedding-004 \
-  redislabs/agent-memory-server:0.13.2 \
-  agent-memory api --host 0.0.0.0 --port 8088 --task-backend=asyncio
-```
-
-The Agent Memory Server uses LiteLLM under the hood, which means it supports 100+ LLM providers. You can swap in OpenAI, Anthropic, AWS Bedrock, or even local models via Ollama.
-
-Then install the package and run an example.
-
-```bash
-pip install adk-redis[all]
-cd examples/simple_redis_memory
-python main.py
-```
-
-## Conclusion
-
-`adk-redis` is a focused library that solves a specific problem well. It takes the interfaces that ADK defines (`BaseMemoryService`, `BaseSessionService`, the tool system, and the callback system) and provides Redis-backed implementations that are production-grade rather than toy-grade.
-
-The key ideas worth taking away from this are the following.
-
-The **two-tier memory architecture** (working memory for sessions, long-term memory for persistent facts) is a pattern that scales well. It mirrors how real applications need to manage state, keeping the current context fast and small while maintaining a durable knowledge base.
-
-The **three integration approaches** (framework services, REST tools, MCP tools) give you a spectrum from fully automatic to fully LLM-controlled memory management. The hybrid approach, combining framework services with LLM-controlled tools, is particularly effective.
-
-**Semantic caching** is a straightforward way to reduce costs and latency, and `adk-redis` makes it easy to enable without changing your agent's core logic.
-
-The **search tools** provide a clean abstraction over RedisVL's query types, making it simple to add RAG capabilities to any ADK agent.
-
-All of this runs on Redis, a system that most teams already know how to operate, monitor, and scale.
-
-The [GitHub repository](https://github.com/redis-developer/adk-redis) includes seven complete examples, each focused on a different capability described in this post.
-
-- **`simple_redis_memory`** is the minimal starting point. It wires up `RedisWorkingMemorySessionService` and `RedisLongTermMemoryService` with a basic conversational agent, demonstrating two-tier memory with no other moving parts.
-- **`travel_agent_memory_hybrid`** is the most complete example. It combines framework-managed memory services with LLM-controlled memory tools, web search, itinerary planning, and calendar export into a single agent (this is the example we walked through above).
-- **`travel_agent_memory_tools`** uses the REST-based memory tools exclusively, without framework-managed services. The LLM has full control over when to search, create, update, and delete memories.
-- **`fitness_coach_mcp`** demonstrates MCP-based memory integration. The agent connects to the Agent Memory Server via SSE and manages semantic and episodic memories for workout tracking.
-- **`redis_search_tools`** shows all four RedisVL search tools (vector, hybrid, text, and range) plugged into a single agent with a product catalog dataset.
-- **`semantic_cache`** demonstrates local semantic caching using RedisVL, including both LLM response caching and tool result caching with ADK callbacks.
-- **`langcache_cache`** uses the managed LangCache service for semantic caching, with server-side embeddings and no local vectorizer required.
-
-The [Redis Agent Memory Server documentation](https://github.com/redis/agent-memory-server) covers the memory backend in detail, and the [RedisVL documentation](https://docs.redisvl.com) covers the vector search and caching capabilities that power the tools and cache providers.
\ No newline at end of file
diff --git a/blog_post_0.md b/blog_post_0.md
deleted file mode 100644
index 6f38b29..0000000
--- a/blog_post_0.md
+++ /dev/null
@@ -1,527 +0,0 @@
-# Give Your AI Agents a Brain with Redis
-
-## How `adk-redis` Brings Persistent Memory, Semantic Search, and Caching to Google's Agent Development Kit
-
-AI agents are only as useful as what they can remember. An agent that forgets your name between sessions, re-fetches the same data on every call, or can't search its own knowledge base isn't really an agent. It's a stateless function with a chat interface.
-
-Google's Agent Development Kit (ADK) provides strong abstractions for building agents, but it leaves a critical question unanswered out of the box. Where does the state actually live? ADK defines interfaces like `BaseMemoryService` and `BaseSessionService`, but the default implementations store everything in memory. Restart the process, and everything is gone.
-
-`adk-redis` is a Python package that fills this gap. It implements ADK's core interfaces using Redis as the storage backbone, giving your agents persistent memory, intelligent session management, production-grade search, and semantic caching. The result is that you can go from a toy demo to a production-ready agent by swapping in a few Redis-backed services, without changing your agent logic at all.
-
-This post walks through the full surface area of `adk-redis`. We will cover its two-tier memory architecture, the four search tools it provides for RAG, how semantic caching can cut your LLM costs, and the three distinct approaches for integrating memory into your agents. Along the way, we will build up from simple examples to a fully wired travel planning agent.
-
-## What `adk-redis` Actually Provides
-
-Before diving into implementation, it helps to see the full landscape. The package is organized around four pillars.
-
-**Memory Services** implement ADK's `BaseMemoryService`. This is long-term memory. The service connects to the Redis Agent Memory Server, which handles semantic search, automatic fact extraction, and recency-boosted retrieval across all of your agent's past conversations.
-
-**Session Services** implement ADK's `BaseSessionService`. This is working memory. Sessions store the current conversation, manage session state, and automatically summarize older messages when the context window gets too large.
-
-**Search Tools** wrap RedisVL (the Redis Vector Library) into ADK-compatible tools that your agent's LLM can call directly. There are four variants covering vector search, hybrid search, text search, and range search.
-
-**Semantic Caching** intercepts LLM calls and tool executions, checking whether a semantically similar prompt has been seen before. If so, it returns the cached response instead of making a new API call. This works through ADK's callback system, so enabling it requires no changes to your agent's core logic.
-
-The package is modular. You install only what you need.
-
-```bash
-pip install adk-redis[memory]     # Memory and session services
-pip install adk-redis[search]     # Search tools via RedisVL
-pip install adk-redis[langcache]  # Managed semantic caching
-pip install adk-redis[all]        # Everything
-```
-
-## The Two-Tier Memory Architecture
-
-The central design idea behind `adk-redis` is a two-tier memory system that mirrors how human memory works. There is a fast, limited working memory for the current conversation, and a slower, persistent long-term memory for facts and preferences that should survive across sessions.
-
-### Tier 1 (Working Memory via `RedisWorkingMemorySessionService`)
-
-Working memory handles the current session. Every message exchanged between the user and the agent is stored in the Redis Agent Memory Server. When the conversation grows long enough to approach the model's context window limit, the service automatically summarizes older messages, compressing them into a summary while preserving the most recent exchanges in full.
-
-This is a surprisingly important feature. Without it, you face a hard tradeoff. Either you truncate old messages and lose context, or you send the full conversation and hit token limits (and costs). Auto-summarization gives you a middle path.
-
-Here is how you configure it.
-
-```python
-from adk_redis.sessions import (
-    RedisWorkingMemorySessionService,
-    RedisWorkingMemorySessionServiceConfig,
-)
-
-session_config = RedisWorkingMemorySessionServiceConfig(
-    api_base_url="http://localhost:8088",
-    default_namespace="my_app",
-    model_name="gpt-4o",
-    context_window_max=8000,
-)
-session_service = RedisWorkingMemorySessionService(config=session_config)
-```
-
-The `context_window_max` parameter is what triggers summarization. When the token count of stored messages crosses this threshold, the Agent Memory Server uses the model specified in `model_name` to summarize older turns. The `default_namespace` isolates your application's data from other applications sharing the same Redis instance.
-
-Under the hood, the session service implements all of ADK's required methods: `create_session`, `get_session`, `list_sessions`, `delete_session`, and `append_event`. The `append_event` method is particularly worth noting. Rather than re-sending the entire conversation on every turn, it uses an incremental append API, sending only the new message. This keeps network overhead proportional to the message size, not the conversation length.
-
-### Tier 2 (Long-Term Memory via `RedisLongTermMemoryService`)
-
-Long-term memory is where the real intelligence lives. After each conversation (or on a configurable debounce), the Agent Memory Server extracts structured information from the dialogue. "The user prefers window seats." "The user is allergic to shellfish." "The user visited Tokyo last March." These extracted memories are embedded as vectors and stored in Redis, where they become searchable across all past sessions.
-
-```python
-from adk_redis.memory import (
-    RedisLongTermMemoryService,
-    RedisLongTermMemoryServiceConfig,
-)
-
-memory_config = RedisLongTermMemoryServiceConfig(
-    api_base_url="http://localhost:8088",
-    default_namespace="my_app",
-    extraction_strategy="discrete",
-    recency_boost=True,
-    semantic_weight=0.7,
-    recency_weight=0.3,
-)
-memory_service = RedisLongTermMemoryService(config=memory_config)
-```
-
-
-The `extraction_strategy` parameter controls how the server breaks down conversations into storable facts. The `"discrete"` strategy extracts individual facts as separate memories, which makes them independently searchable. Other options include `"summary"` (a narrative summary of the conversation) and `"preferences"` (focused on user preferences).
-
-Recency boosting deserves a closer look. When searching memories, raw semantic similarity alone often isn't enough. A user might have said "I love Italian food" three years ago, and "Actually, I've been getting into Japanese cuisine lately" last week. Both are semantically relevant to a query about food preferences, but the recent one matters more.
-
-The recency boosting system addresses this by combining two scores. The `semantic_weight` controls how much the vector similarity matters, while `recency_weight` controls how much recency matters. Within the recency score itself, `freshness_weight` favors memories that were recently accessed, and `novelty_weight` favors memories that were recently created. The `half_life_last_access_days` and `half_life_created_days` parameters control how quickly each signal decays. A half-life of 7 days means that a memory's freshness score drops to 50% after a week of not being accessed.
-
-This is a thoughtful design. It avoids the common failure mode of semantic search systems that return stale information with high confidence.
-
-### Wiring Both Tiers Together
-
-With both services configured, you connect them to an ADK `Runner`.
-
-```python
-from google.adk import Agent
-from google.adk.runners import Runner
-
-agent = Agent(
-    name="memory_agent",
-    model="gemini-2.5-flash",
-    instruction="You are a helpful assistant with long-term memory.",
-)
-
-runner = Runner(
-    agent=agent,
-    app_name="my_app",
-    session_service=session_service,
-    memory_service=memory_service,
-)
-```
-
-The flow is now automatic. Messages are stored in working memory as the conversation happens. When the agent finishes a turn, a callback can trigger `add_session_to_memory()`, which pushes the conversation to the Agent Memory Server for background extraction. On subsequent sessions, the memory service's `search_memory` method retrieves relevant facts from across all past conversations.
-
-
-## Search Tools for RAG
-
-Memory services handle what the agent remembers from past conversations. But what about external knowledge, such as product catalogs, documentation, and knowledge bases? This is the domain of retrieval-augmented generation (RAG), and `adk-redis` provides four search tools that plug directly into ADK's tool system.
-
-Each tool wraps a RedisVL query type and exposes itself as a function the LLM can call. The LLM sees a function declaration with a `query` parameter, decides when to use it, and gets back structured results.
-
-### RedisVectorSearchTool
-
-The most straightforward option. It embeds the query using a vectorizer, performs K-nearest-neighbor search against a Redis index, and returns the top results.
-
-```python
-from redisvl.index import SearchIndex
-from redisvl.utils.vectorize import HFTextVectorizer
-from adk_redis.tools import RedisVectorSearchTool, RedisVectorQueryConfig
-
-vectorizer = HFTextVectorizer(model="redis/langcache-embed-v2")
-index = SearchIndex.from_existing("products", redis_url="redis://localhost:6379")
-
-search_tool = RedisVectorSearchTool(
-    index=index,
-    vectorizer=vectorizer,
-    config=RedisVectorQueryConfig(
-        vector_field_name="embedding",
-        return_fields=["name", "description", "price"],
-        num_results=5,
-    ),
-    name="search_product_catalog",
-    description="Find products by semantic similarity to a description.",
-)
-```
-
-The `name` and `description` parameters matter more than they might seem. These are what the LLM reads to decide whether and when to call the tool. A vague description like "search documents" will lead to the LLM calling it at the wrong times. A specific one like "Find products by semantic similarity to a description" gives the LLM the context it needs.
-
-### RedisHybridSearchTool
-
-Hybrid search combines vector similarity with BM25 keyword matching. This is valuable when queries contain specific terms (product IDs, technical acronyms, exact names) that semantic search alone might miss.
-
-The tool auto-detects whether your Redis server and RedisVL version support native hybrid search (Redis 8.4+ with RedisVL 0.13+). If they do, it uses the server-side `FT.HYBRID` command. If not, it falls back to a client-side aggregation approach. This version detection happens at initialization, so you don't need to think about it.
-
-```python
-from adk_redis.tools import RedisHybridSearchTool, RedisHybridQueryConfig
-
-hybrid_tool = RedisHybridSearchTool(
-    index=index,
-    vectorizer=vectorizer,
-    config=RedisHybridQueryConfig(
-        text_field_name="content",
-        combination_method="LINEAR",
-        linear_alpha=0.7,
-    ),
-    name="search_legal_documents",
-    description="Search legal documents using both semantic and keyword matching.",
-)
-```
-
-### RedisTextSearchTool and RedisRangeSearchTool
-
-`RedisTextSearchTool` performs pure BM25 keyword search. No embeddings, no vectorizer needed. It is the right choice when the query is about exact terms, error messages, or API names.
-
-`RedisRangeSearchTool` is a less common but useful variant. Instead of returning the top-K results, it returns all documents within a distance threshold. This is useful for exhaustive retrieval, such as "find everything related to authentication in our documentation," where you want comprehensive coverage rather than a ranked list.
-
-Here is a concrete example from the `redis_search_tools` example in the repo, which wires all three search modalities into a single agent.
-
-```python
-from adk_redis.tools import (
-    RedisVectorSearchTool, RedisVectorQueryConfig,
-    RedisTextSearchTool, RedisTextQueryConfig,
-    RedisRangeSearchTool, RedisRangeQueryConfig,
-)
-
-tools = [
-    RedisVectorSearchTool(
-        name="semantic_search",
-        description="Semantic similarity search for conceptual queries.",
-        index=index, vectorizer=vectorizer,
-        config=RedisVectorQueryConfig(num_results=5),
-        return_fields=["title", "content", "category"],
-    ),
-    RedisTextSearchTool(
-        name="keyword_search",
-        description="Keyword search for exact terms and phrases.",
-        index=index,
-        config=RedisTextQueryConfig(
-            text_field_name="content", text_scorer="BM25STD"
-        ),
-        return_fields=["title", "content", "category"],
-    ),
-    RedisRangeSearchTool(
-        name="range_search",
-        description="Returns ALL documents within a semantic distance threshold.",
-        index=index, vectorizer=vectorizer,
-        config=RedisRangeQueryConfig(distance_threshold=0.5),
-        return_fields=["title", "content", "category"],
-    ),
-]
-
-agent = Agent(
-    model="gemini-2.5-flash",
-    name="search_agent",
-    instruction=(
-        "You have three search tools. Use semantic_search for conceptual "
-        "queries, keyword_search for exact terms, range_search for exhaustive "
-        "retrieval."
-    ),
-    tools=tools,
-)
-```
-
-The instruction prompt is doing real work here. It teaches the LLM when to use each tool and what to expect from each. This kind of prompt engineering is not optional. Without it, the LLM will default to calling whichever tool appears first or whichever has the most generic description.
-
-## Semantic Caching
-
-LLM API calls are slow and expensive. If your agent handles support queries, a significant fraction of incoming questions will be semantically similar. "How do I reset my password?" and "I need to change my password" should produce the same response, and there is no reason to pay for two LLM calls.
-
-`adk-redis` provides semantic caching at two levels, LLM response caching and tool result caching, both backed by Redis.
-
-### LLM Response Cache
-
-The LLM cache intercepts calls to the language model through ADK's callback system. Before each model call, it checks whether a semantically similar prompt already exists in Redis. If it does, it returns the cached response immediately, skipping the LLM entirely. If it doesn't, it lets the call proceed and stores the response for future lookups.
-
-```python
-from redisvl.utils.vectorize import HFTextVectorizer
-from adk_redis.cache import (
-    RedisVLCacheProvider, RedisVLCacheProviderConfig,
-    LLMResponseCache, LLMResponseCacheConfig,
-    create_llm_cache_callbacks,
-)
-
-vectorizer = HFTextVectorizer(model="redis/langcache-embed-v1")
-
-provider = RedisVLCacheProvider(
-    config=RedisVLCacheProviderConfig(
-        redis_url="redis://localhost:6379",
-        name="my_llm_cache",
-        ttl=3600,
-        distance_threshold=0.1,
-    ),
-    vectorizer=vectorizer,
-)
-
-llm_cache = LLMResponseCache(
-    provider=provider,
-    config=LLMResponseCacheConfig(
-        first_message_only=True,
-        include_app_name=True,
-        include_user_id=True,
-    ),
-)
-
-before_cb, after_cb = create_llm_cache_callbacks(llm_cache)
-
-agent = Agent(
-    name="cached_agent",
-    model="gemini-2.0-flash",
-    instruction="You are a helpful assistant.",
-    before_model_callback=before_cb,
-    after_model_callback=after_cb,
-)
-```
-
-A few design decisions are worth noting here.
-
-The `distance_threshold` parameter (set to 0.1 in this example) controls how similar two prompts need to be for a cache hit. A value of 0.0 means exact match only. A value of 0.1 allows small variations in phrasing. Going much higher risks returning cached responses for genuinely different questions. Tuning this threshold is application-specific and worth experimenting with.
-
-The `first_message_only` option is a practical default. In a multi-turn conversation, later messages depend heavily on prior context, making semantic cache hits unreliable. Caching only the first message (which is typically a standalone question) avoids returning contextually wrong responses.
-
-The cache is also smart about what it does *not* cache. Function call responses (where the LLM is invoking a tool) are skipped, as are error responses. This prevents caching intermediate steps that shouldn't be reused.
-
-### Managed Caching with LangCache
-
-If you'd rather not manage your own Redis instance and embedding model for caching, `adk-redis` also supports LangCache, a managed semantic caching service from Redis. With LangCache, embeddings are generated server-side, so you don't need a local vectorizer at all.
-
-```python
-from adk_redis.cache import LangCacheProvider, LangCacheProviderConfig
-
-provider = LangCacheProvider(
-    config=LangCacheProviderConfig(
-        cache_id="your-cache-id",
-        api_key="your-api-key",
-        ttl=3600,
-    )
-)
-```
-
-The same `LLMResponseCache` and `ToolCache` classes work with either provider. You just swap the backend.
-
-### Tool Result Cache
-
-The tool cache follows the same pattern but for tool executions rather than LLM calls. If your agent calls an external API with the same arguments repeatedly, the tool cache can short-circuit the call and return the cached result.
-
-```python
-from adk_redis.cache import ToolCache, ToolCacheConfig, create_tool_cache_callbacks
-
-tool_cache = ToolCache(
-    provider=provider,
-    config=ToolCacheConfig(
-        tool_names={"web_search", "get_weather"},
-    ),
-)
-
-before_tool_cb, after_tool_cb = create_tool_cache_callbacks(tool_cache)
-```
-
-The `tool_names` set lets you specify exactly which tools should be cached. This is important because not all tools are idempotent. You probably want to cache `get_weather` (same city, same hour, same result) but not `send_email` (same arguments, but each call should actually execute).
-
-## Three Ways to Integrate Memory
-
-One of the more interesting design decisions in `adk-redis` is that it offers three distinct approaches for connecting agents to memory. Each approach has different tradeoffs around control, complexity, and standardization.
-
-### Approach 1. ADK Services (Framework-Managed)
-
-This is what we covered in the two-tier memory section. You configure `RedisWorkingMemorySessionService` and `RedisLongTermMemoryService`, pass them to the `Runner`, and the framework handles everything automatically. Memory extraction happens in the background. Search happens before each agent turn. The agent code itself never directly interacts with memory.
-
-This approach is the simplest to implement and the hardest to customize. The agent has no explicit control over *what* gets stored or *when* it searches. It is best for applications where you want memory to be invisible infrastructure.
-
-### Approach 2. REST Tools (LLM-Controlled)
-
-Instead of (or in addition to) framework-managed services, you can give the agent explicit memory tools. These are ADK tools that the LLM calls like any other function.
-
-```python
-from adk_redis.tools.memory import (
-    SearchMemoryTool, CreateMemoryTool,
-    UpdateMemoryTool, DeleteMemoryTool,
-    MemoryToolConfig,
-)
-
-memory_config = MemoryToolConfig(
-    api_base_url="http://localhost:8088",
-    default_namespace="my_app",
-    recency_boost=True,
-)
-
-tools = [
-    SearchMemoryTool(config=memory_config),
-    CreateMemoryTool(config=memory_config),
-    UpdateMemoryTool(config=memory_config),
-    DeleteMemoryTool(config=memory_config),
-]
-```
-
-With this approach, the LLM decides when to search memory, what to store, and what to update. The agent prompt needs to instruct the LLM on memory management strategy. This requires more prompt engineering, but it gives the agent genuine autonomy over its own memory.
-
-The travel agent example in the repo uses a hybrid of both approaches. Framework services handle session persistence and automatic background extraction. Memory tools give the LLM explicit CRUD control over long-term memories. This is arguably the most powerful configuration, because the agent gets both automatic memory management and the ability to deliberately store or retrieve specific facts.
-
-### Approach 3. MCP Tools (Model Context Protocol)
-
-MCP is a standardized protocol for connecting agents to tools via Server-Sent Events (SSE). Instead of REST-based tool implementations, you point the agent at the Agent Memory Server's MCP endpoint and let ADK's `McpToolset` handle tool discovery automatically.
-
-```python
-from adk_redis.tools.mcp_memory import create_memory_mcp_toolset
-
-memory_tools = create_memory_mcp_toolset(
-    server_url="http://localhost:9000",
-    tool_filter=["search_long_term_memory", "create_long_term_memories"],
-)
-
-agent = Agent(
-    model="gemini-2.5-flash",
-    name="fitness_coach",
-    tools=[memory_tools],
-)
-```
-
-The `tool_filter` parameter controls which MCP tools are exposed to the LLM. The Agent Memory Server exposes seven tools through MCP, including `search_long_term_memory`, `create_long_term_memories`, `get_long_term_memory`, `edit_long_term_memory`, `delete_long_term_memories`, `memory_prompt`, and `set_working_memory`.
-
-The fitness coach example in the repo demonstrates this approach. It connects to memory via MCP and stores both semantic memories (user profile, injuries, equipment) and episodic memories (workouts with timestamps, milestones). The distinction between semantic and episodic memory types is particularly useful. Semantic memories represent timeless facts ("user has a knee injury"), while episodic memories represent events ("user completed 3x12 rows on March 9th").
-
-MCP is the most standardized approach and makes it easy to swap memory backends without changing agent code. The tradeoff is that it requires running the Agent Memory Server with MCP support enabled on a separate port.
-
-## Walking Through the Travel Agent
-
-To make all of this concrete, let's trace through the `travel_agent_memory_hybrid` example, which is the most complete example in the repo. It combines framework-managed services, LLM-controlled memory tools, web search, itinerary planning, and calendar export into a single agent.
-
-### The Entrypoint
-
-The `main.py` file sets up the infrastructure. It registers custom service factories with ADK's service registry, creates both the session service and memory service, and launches a FastAPI app with the ADK web runner.
-
-```python
-from adk_redis.memory import RedisLongTermMemoryService, RedisLongTermMemoryServiceConfig
-from adk_redis.sessions import RedisWorkingMemorySessionService, RedisWorkingMemorySessionServiceConfig
-
-# Register factories so ADK can instantiate them from URIs
-registry = get_service_registry()
-registry.register_session_service("redis-working-memory", redis_session_factory)
-registry.register_memory_service("redis-long-term-memory", redis_memory_factory)
-
-# Build URIs and create the FastAPI app
-app = get_fast_api_app(
-    agents_dir=".",
-    session_service_uri="redis-working-memory://localhost:8088",
-    memory_service_uri="redis-long-term-memory://localhost:8088",
-    web=True,
-)
-```
-
-The URI-based factory pattern is worth noting. ADK's service registry lets you register custom service implementations behind URI schemes. This means you can switch between in-memory and Redis-backed services by changing a URI string, without modifying any agent code.
-
-### The Agent
-
-The agent itself is defined in `agent.py`. It assembles a rich set of tools spanning memory, search, and planning.
-
-```python
-from adk_redis.tools.memory import (
-    SearchMemoryTool, CreateMemoryTool,
-    UpdateMemoryTool, DeleteMemoryTool,
-    MemoryToolConfig,
-)
-from google.adk.tools import preload_memory, load_memory
-
-memory_config = MemoryToolConfig(
-    api_base_url="http://localhost:8088",
-    default_namespace="travel_agent_memory_hybrid",
-    recency_boost=True,
-    search_top_k=10,
-)
-
-tools = [
-    SearchMemoryTool(config=memory_config),
-    CreateMemoryTool(config=memory_config),
-    UpdateMemoryTool(config=memory_config),
-    DeleteMemoryTool(config=memory_config),
-    preload_memory,
-    load_memory,
-    CalendarExportTool(),
-    ItineraryPlannerTool(),
-]
-```
-
-Notice the layered memory strategy. `preload_memory` and `load_memory` are ADK's built-in tools that hook into the `RedisLongTermMemoryService` we configured in `main.py`. These provide automatic, framework-controlled memory retrieval. The `SearchMemoryTool`, `CreateMemoryTool`, and friends give the LLM explicit control on top of that.
-
-The agent also has an `after_agent_callback` that calls `add_session_to_memory()` after each turn. This is what triggers background extraction of facts and preferences into long-term memory.
-
-```python
-async def after_agent(callback_context: CallbackContext):
-    await callback_context.add_session_to_memory()
-
-root_agent = Agent(
-    model="gemini-2.5-flash",
-    name="travel_agent",
-    tools=tools,
-    after_agent_callback=after_agent,
-    instruction="...",  # Detailed prompt with memory management strategy
-)
-```
-
-### What Happens at Runtime
-
-When a user starts a conversation, the following sequence plays out.
-
-1. ADK creates (or retrieves) a session via `RedisWorkingMemorySessionService`. The session is stored in the Agent Memory Server.
-2. The agent's `preload_memory` tool automatically searches long-term memory for context relevant to the current conversation.
-3. The user sends a message. The message is appended to working memory via the incremental append API.
-4. The LLM generates a response. If it needs travel information, it can call web search tools. If it wants to check the user's preferences, it calls `SearchMemoryTool`. If the user shares a new preference, the LLM calls `CreateMemoryTool`.
-5. The response is appended to working memory.
-6. The `after_agent_callback` fires, sending the conversation to the Agent Memory Server for background extraction. The server pulls out facts like "user prefers direct flights" or "user wants to visit Japan in spring" and stores them as searchable long-term memories.
-7. If the conversation grows long, the working memory service automatically summarizes older turns to stay within the context window.
-
-All of this happens with a `pip install adk-redis[memory]`, a running Redis instance, and a running Agent Memory Server. The agent's Python code is clean, focused on domain logic rather than infrastructure plumbing.
-
-## Getting Started
-
-To run any of the examples, you need two things running.
-
-**Redis 8.4** provides the storage backend for everything. Vector indices, session data, cache entries.
-
-```bash
-docker run -d --name redis -p 6379:6379 redis:8.4-alpine
-```
-
-**Redis Agent Memory Server** handles memory extraction, summarization, and the working memory API. It sits between your agent and Redis, adding the intelligence layer.
-
-```bash
-docker run -d --name agent-memory-server -p 8088:8088 \
-  -e REDIS_URL=redis://host.docker.internal:6379 \
-  -e GEMINI_API_KEY=your-key \
-  -e GENERATION_MODEL=gemini/gemini-2.0-flash \
-  -e EMBEDDING_MODEL=gemini/text-embedding-004 \
-  redislabs/agent-memory-server:0.13.2 \
-  agent-memory api --host 0.0.0.0 --port 8088 --task-backend=asyncio
-```
-
-The Agent Memory Server uses LiteLLM under the hood, which means it supports 100+ LLM providers. You can swap in OpenAI, Anthropic, AWS Bedrock, or even local models via Ollama.
-
-Then install the package and run an example.
-
-```bash
-pip install adk-redis[all]
-cd examples/simple_redis_memory
-python main.py
-```
-
-## Conclusion
-
-`adk-redis` is a focused library that solves a specific problem well. It takes the interfaces that ADK defines, `BaseMemoryService`, `BaseSessionService`, the tool system, the callback system, and provides Redis-backed implementations that are production-grade rather than toy-grade.
-
-The key ideas worth taking away from this are the following.
-
-The **two-tier memory architecture** (working memory for sessions, long-term memory for persistent facts) is a pattern that scales well. It mirrors how real applications need to manage state, keeping the current context fast and small while maintaining a durable knowledge base.
-
-The **three integration approaches** (framework services, REST tools, MCP tools) give you a spectrum from fully automatic to fully LLM-controlled memory management. The hybrid approach, combining framework services with LLM-controlled tools, is particularly effective.
-
-**Semantic caching** is a straightforward way to reduce costs and latency, and `adk-redis` makes it easy to enable without changing your agent's core logic.
-
-The **search tools** provide a clean abstraction over RedisVL's query types, making it simple to add RAG capabilities to any ADK agent.
-
-All of this runs on Redis, a system that most teams already know how to operate, monitor, and scale.
-
-If you want to dive deeper, the [GitHub repository](https://github.com/redis-developer/adk-redis) has seven complete examples covering every feature described here. The [Redis Agent Memory Server documentation](https://github.com/redis/agent-memory-server) covers the memory backend in detail, and the [RedisVL documentation](https://docs.redisvl.com) covers the vector search and caching capabilities that power the tools and cache providers.
\ No newline at end of file
diff --git a/docs/concepts/search.md b/docs/concepts/search.md
index c03d130..a385c7c 100644
--- a/docs/concepts/search.md
+++ b/docs/concepts/search.md
@@ -22,7 +22,7 @@ The search tools use RedisVL to perform vector similarity search over a Redis in
 | `redis_text_search` | Keyword full-text search via BM25 |
 | `redis_sql_search` | SQL `SELECT` against a bound index via `redisvl.query.SQLQuery`. Requires the `adk-redis[sql]` extra. |
 
-In addition to the in-process Python tools, you can connect an agent to RedisVL's own MCP server (one index per server) with `create_redisvl_mcp_toolset(...)`. The server exposes schema-aware `search-records` and `upsert-records` tools and is useful when the same index needs to be served to multiple agents or non-Python clients. See the [search tools how-to](../user_guide/how_to_guides/search_tools.md) for the decision matrix.
+In addition to the in-process Python tools, you can connect an agent to RedisVL's own MCP server (one index per server) using ADK's standard `McpToolset` pointed at a running `rvl mcp` instance. The server exposes schema-aware `search-records` and `upsert-records` tools and is useful when the same index needs to be served to multiple agents or non-Python clients. See the [search tools how-to](../user_guide/how_to_guides/search_tools.md) for the decision matrix and code samples.
 
 ## Indexing
 
diff --git a/docs/for-ais-only/FAILURE_MODES.md b/docs/for-ais-only/FAILURE_MODES.md
index 6f41f98..a3e8ff0 100644
--- a/docs/for-ais-only/FAILURE_MODES.md
+++ b/docs/for-ais-only/FAILURE_MODES.md
@@ -41,12 +41,16 @@ The `redisvl.extensions.llmcache` path still works but emits a
 to the old path; the regression test in
 `tests/cache/test_provider.py` asserts no `DeprecationWarning` fires.
 
-## Two MCP toolset helpers exist on purpose
-
-`create_memory_mcp_toolset(...)` targets Agent Memory Server; it is the
-memory surface. `create_redisvl_mcp_toolset(...)` targets RedisVL's own
-MCP server (`rvl mcp`); it is the index/search surface. They are not
-interchangeable. Do not merge them.
+## One MCP helper, not two
+
+`create_memory_mcp_toolset(...)` exists for Agent Memory Server because
+AMS's MCP URL has a non-trivial `/sse` suffix and the tool-name vocabulary
+is bespoke. There is **no** matching helper for the RedisVL MCP server
+(`rvl mcp`): users wire it with ADK's native `McpToolset` plus
+`StdioConnectionParams` / `SseConnectionParams` /
+`StreamableHTTPConnectionParams`. The maintainers chose this on purpose
+to keep the MCP wiring story aligned with every other ADK catalog
+integration. Do not reintroduce a `create_redisvl_mcp_toolset` wrapper.
 
 ## Two cache providers exist on purpose
 
diff --git a/docs/for-ais-only/REPOSITORY_MAP.md b/docs/for-ais-only/REPOSITORY_MAP.md
index 151c253..aa7fba9 100644
--- a/docs/for-ais-only/REPOSITORY_MAP.md
+++ b/docs/for-ais-only/REPOSITORY_MAP.md
@@ -35,9 +35,6 @@ src/adk_redis/
                           UpdateMemoryTool).
     mcp_memory.py         MCP tool surface for the same memory operations
                           (Agent Memory Server).
-    mcp_search.py         create_redisvl_mcp_toolset(...) for RedisVL's
-                          own MCP server (rvl mcp). Supports stdio, sse,
-                          streamable-http; bearer auth on HTTP transports.
   cache/
     __init__.py           Re-exports the cache providers.
     _provider.py          Provider protocol and base class.
@@ -64,8 +61,6 @@ tests/
     test_range_search.py       RedisRangeSearchTool.
     test_text_search.py        RedisTextSearchTool.
     test_sql_search.py         RedisSQLSearchTool.
-    test_mcp_search.py         create_redisvl_mcp_toolset (validation,
-                               three transports, bearer auth, tool filter).
   cache/
     test_provider.py           RedisVLCacheProvider (incl. no-DeprecationWarning
                                regression for the cache.llm import path).
@@ -87,7 +82,7 @@ tests/
 | Long-term memory + Memory Server proxy | `memory/long_term_memory.py`, `memory/_utils.py` |
 | ADK Memory tools (FunctionTool wrappers) | `tools/memory/` |
 | MCP memory tool surface (Agent Memory Server) | `tools/mcp_memory.py` |
-| MCP search tool surface (RedisVL MCP server) | `tools/mcp_search.py` |
+| MCP search (RedisVL `rvl mcp` server) | Use ADK's native `McpToolset` directly; no adk-redis wrapper. |
 | Vector / Hybrid / Range / Text / SQL search tools | `tools/search/` |
 | LLM and tool semantic caching | `cache/llm_cache.py`, `cache/tool_cache.py` |
 | Cache provider abstraction (RedisVL / LangCache) | `cache/_provider.py` |
diff --git a/docs/llms.txt b/docs/llms.txt
index 1281ed8..17d6e6a 100644
--- a/docs/llms.txt
+++ b/docs/llms.txt
@@ -56,9 +56,10 @@ When generating code that uses adk-redis:
   (e.g., `from adk_redis import RedisVectorSearchTool`).
 - Construct services with their `Config` dataclasses
   (`RedisLongTermMemoryServiceConfig`, etc.) so options are explicit.
-- Prefer `create_redisvl_mcp_toolset(...)` over hand-rolled
-  `StdioConnectionParams`; the helper does transport validation and
-  bearer-auth wiring.
+- For the RedisVL MCP path, use ADK's native `McpToolset` with the
+  appropriate `*ConnectionParams` class. Set
+  `tool_filter=["search-records"]` to suppress writes, or pass
+  `--read-only` to `rvl mcp` in stdio mode.
 - Set `stopwords=None` on text-search configs if `nltk` is not installed.
 - Pin `redisvl>=0.18.2` when documenting installs; the deprecated
   `redisvl.extensions.llmcache` path is on the way out.
diff --git a/docs/user_guide/how_to_guides/search_tools.md b/docs/user_guide/how_to_guides/search_tools.md
index f51895a..c4a2bbb 100644
--- a/docs/user_guide/how_to_guides/search_tools.md
+++ b/docs/user_guide/how_to_guides/search_tools.md
@@ -114,41 +114,57 @@ WHERE category = 'electronics' AND price < :max_price
 
 with `params={"max_price": 50}`. Install with `pip install 'adk-redis[sql]'`.
 
-## RedisVL MCP toolset
+## RedisVL MCP server
 
-`create_redisvl_mcp_toolset(...)` returns an ADK `McpToolset` that talks to RedisVL's own MCP server (`rvl mcp`). The server exposes two tools per index:
+Connect an ADK agent to RedisVL's own MCP server (`rvl mcp`) using ADK's standard `McpToolset`. The server exposes two tools per index:
 
-- `search-records`: schema-aware search with filter hints embedded in the tool description.
-- `upsert-records`: write path. Suppress with `read_only=True` (the default for stdio mode).
+- `search-records`: schema-aware search (vector / fulltext / hybrid, chosen at server start). Filter and return-field hints come from the bound index schema.
+- `upsert-records`: write path. Suppress with `--read-only` on the server, or with `tool_filter=["search-records"]` on the toolset.
 
 ```python
-from pydantic import SecretStr
-from adk_redis import create_redisvl_mcp_toolset
-
-# Remote server (default transport is streamable-http).
-toolset = create_redisvl_mcp_toolset(
-    url="http://localhost:8000/mcp",
-    auth_token=SecretStr("..."),  # optional bearer
+from google.adk import Agent
+from google.adk.tools.mcp_tool import McpToolset
+from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
+from mcp import StdioServerParameters
+
+# Stdio transport: spawn `rvl mcp --config <path>` as a child process of the agent.
+agent = Agent(
+    model="gemini-2.5-flash",
+    name="redis_mcp_agent",
+    tools=[
+        McpToolset(
+            connection_params=StdioConnectionParams(
+                server_params=StdioServerParameters(
+                    command="rvl",
+                    args=[
+                        "mcp",
+                        "--config",
+                        "/etc/redisvl/mcp.yaml",
+                        "--read-only",
+                    ],
+                ),
+                timeout=30,
+            ),
+            tool_filter=["search-records"],
+        ),
+    ],
 )
+```
 
-# In-process stdio: spawn `rvl mcp --config <path>`.
-toolset = create_redisvl_mcp_toolset(
-    transport="stdio",
-    config_path="/etc/redisvl/mcp.yaml",
-    read_only=True,
-)
+For an already-running remote server, swap `StdioConnectionParams` for `StreamableHTTPConnectionParams(url=..., headers={"Authorization": "Bearer ..."})` (or `SseConnectionParams`).
 
-agent = Agent(model="gemini-2.0-flash", tools=[toolset])
-```
+Install the MCP CLI with `pip install 'redisvl[mcp]>=0.18.2'` and start the server with `rvl mcp --config <path> --transport streamable-http` (or `stdio`).
 
-Install with `pip install 'adk-redis[mcp-search]'`. The server-side dependency is shipped by `redisvl[mcp]` and started with `rvl mcp --config <path> --transport streamable-http`.
+!!! note
+    For other ADK language SDKs (TypeScript, etc.), see
+    [Custom MCP Tools](https://adk.dev/tools-custom/mcp-tools/).
 
 ## When to use which
 
 | Path | Use when |
 |---|---|
 | In-process tools (`RedisVectorSearchTool`, etc.) | Single Python agent, fine-grained control, no extra service to operate. |
-| `create_redisvl_mcp_toolset(...)` | Multiple agents (Python, JS, Claude Desktop) share one index, schema-aware tool descriptions, you want `--read-only` as a deployment-level guardrail. |
+| `McpToolset` against `rvl mcp` | Multiple agents (Python, JS, Claude Desktop) share one index, schema-aware tool descriptions, you want `--read-only` as a deployment-level guardrail. |
 
 ## Notes
 
diff --git a/examples/redisvl_mcp_search/README.md b/examples/redisvl_mcp_search/README.md
index 6bcbf04..617d32d 100644
--- a/examples/redisvl_mcp_search/README.md
+++ b/examples/redisvl_mcp_search/README.md
@@ -1,44 +1,44 @@
 # RedisVL MCP Search Agent
 
-This sample shows an ADK agent that talks to a separately-running
-**RedisVL MCP server** (`rvl mcp`) via the new
-`create_redisvl_mcp_toolset(...)` helper. The MCP server is configured
-to expose BM25 fulltext search over a small corpus of Redis articles.
+The **MCP-path counterpart** of [`redis_search_tools/`](../redis_search_tools/).
+It uses a similar Redis knowledge-base corpus and the same kinds of prompts,
+but search is served by a separately running `rvl mcp` server, and the
+agent calls it through ADK's standard `McpToolset`. No adk-redis wrapper is
+involved; this is the same pattern every MCP integration in the ADK
+catalog uses.
+
+(The MCP corpus is curated for hybrid demos and includes a couple of
+MCP-specific articles, so it overlaps with the dataset in
+`redis_search_tools/load_data.py` rather than matching it exactly.)
+
+Use this example to compare the two deployment shapes side by side:
+
+| | `redis_search_tools/` | `redisvl_mcp_search/` (this) |
+|---|---|---|
+| Topology | One process: agent + index in-process | Two processes: agent connects to `rvl mcp` over MCP |
+| Tool count | 3 (semantic / keyword / range) | 1 (`search-records`, configured for hybrid) |
+| Search modes covered | vector, BM25, range | vector + BM25 fused via FT.HYBRID |
+| Where the vectorizer runs | In the agent process | In the `rvl mcp` server process |
+| Filter shape | Python `FilterExpression` | JSON filter object parsed server-side |
+| Use when | Single agent, fast onboarding, complex filters | Multi-agent / polyglot, server-side ops gates |
 
 ## What this sample shows
 
-- Configuring `rvl mcp` for a Redis search index with a YAML file.
-- Connecting ADK to that server with `create_redisvl_mcp_toolset(...)`
-  over the `streamable-http` transport.
+- Configuring `rvl mcp` for hybrid search via a YAML config.
+- Connecting ADK to that server with ADK's native `McpToolset` + one of
+  `StdioConnectionParams` / `StreamableHTTPConnectionParams`.
 - Using a `tool_filter` to expose only `search-records` (no upserts).
 - Reading the schema-aware tool description that RedisVL produces.
 
-## Architecture
-
-```
-                +-------------------+
-   "search-     |  rvl mcp server   |
-   records"     |  (streamable-http |
-       ^^^^^^^^>|   on :8765)       |---->  Redis 8.4+ (RediSearch)
-       MCP      +-------------------+
-       protocol             ^
-                            |
-                +-------------------+
-                | ADK agent         |
-                | (`adk web`)       |
-                | create_redisvl_   |
-                | mcp_toolset(...)  |
-                +-------------------+
-```
-
 ## Prerequisites
 
-1. **Redis 8.4** running locally or in Redis Cloud. The repo root has
-   `./scripts/start-redis.sh` for a one-shot start.
+1. **Redis 8.4** running locally (or Redis Cloud with the RediSearch
+   module enabled). Native `FT.HYBRID` requires 8.4+.
 2. **A Gemini API key**. Get one at
    [aistudio.google.com](https://aistudio.google.com/app/apikey).
-3. **The `mcp-search` extra** so the helper and `rvl mcp` CLI are
-   installed.
+3. **`redisvl[mcp]>=0.18.2`** for the `rvl mcp` CLI, plus
+   `sentence-transformers` (the loader and the MCP server both embed
+   docs / queries with a HuggingFace vectorizer).
 
 ## Setup
 
@@ -47,12 +47,9 @@ to expose BM25 fulltext search over a small corpus of Redis articles.
 From the repository root:
 
 ```bash
-uv pip install 'adk-redis[mcp-search,examples]'
+uv pip install 'adk-redis[examples]' 'redisvl[mcp]>=0.18.2' sentence-transformers
 ```
 
-The `mcp-search` extra pulls in `redisvl[mcp]>=0.18.2`, which provides
-the `rvl mcp` CLI and the FastMCP server.
-
 ### 2. Start Redis 8.4
 
 ```bash
@@ -62,19 +59,23 @@ docker exec redis redis-cli ping   # -> PONG
 
 ### 3. Set your Gemini API key
 
-Copy `.env.example` to `.env` and fill in `GOOGLE_API_KEY`. Optionally
-set `REDISVL_MCP_URL` if you plan to run the MCP server somewhere other
-than `http://127.0.0.1:8765/mcp`.
+Copy `.env.example` to `.env` and fill in `GOOGLE_API_KEY`. Optional:
+
+- `REDIS_URL` to point the loader at a non-default Redis.
+- `REDISVL_MCP_URL` if you run the MCP server somewhere other than
+  `http://127.0.0.1:8765/mcp`.
+- `REDISVL_MCP_AUTH_TOKEN` to attach a bearer token to MCP requests.
 
-### 4. Load the article index
+### 4. Load the knowledge base
 
 ```bash
 cd examples/redisvl_mcp_search
 python load_data.py
 ```
 
-This creates the `adk_mcp_articles` index and loads six short articles
-about Redis search, MCP, semantic caching, and agent memory.
+The loader creates the `adk_mcp_knowledge_base` index, embeds the
+documents with `redis/langcache-embed-v2` (768 dims), and writes them
+to Redis with stable keys so re-running is idempotent.
 
 ### 5. Start the RedisVL MCP server
 
@@ -87,9 +88,9 @@ rvl mcp --config mcp_config.yaml \
   --host 127.0.0.1 --port 8765
 ```
 
-The server inspects the configured index, registers its `search-records`
-tool with schema-aware filter hints, and starts listening on
-`http://127.0.0.1:8765/mcp`.
+The server inspects the configured index, registers a single hybrid
+`search-records` tool with schema-aware filter and return-field hints,
+and listens on `http://127.0.0.1:8765/mcp`.
 
 ### 6. Run the agent
 
@@ -100,35 +101,89 @@ adk web redisvl_mcp_search_agent
 ADK web opens at `http://127.0.0.1:8000`. Pick the
 `redisvl_mcp_search_agent` app from the dropdown.
 
-## Try these prompts
+## Example queries
+
+These mirror the prompts from `redis_search_tools/`, so you can check that
+the MCP path returns analogous results:
 
-- "Find articles about FT.HYBRID."
-- "What does the MCP server expose?"
-- "Explain semantic caching."
-- "Tell me about HNSW runtime parameters."
+- **Semantic-leaning:** "What is Redis?", "How does RAG work?", "What is
+  a vector database?"
+- **Keyword-leaning:** "Tell me about HNSW.", "Explain BM25 scoring.",
+  "FT.HYBRID command."
+- **Mixed:** "What are RAG best practices?", "How do I build an
+  intelligent assistant?"
 
-The agent decides on a keyword phrase, calls `search-records` over MCP,
-and summarizes the matches with title and URL citations.
+Because the server is configured for hybrid mode, a single query
+exercises both the BM25 path (term matches in `content`) and the vector
+path (semantic similarity to the query embedding); the results are then
+fused with `LINEAR` weighting (50% text, 50% vector by default).
+
+## Files
+
+| File | Purpose |
+|------|---------|
+| `schema.yaml` | RedisVL index schema (text + tag + vector fields). |
+| `load_data.py` | Embeds and loads the knowledge-base corpus. |
+| `mcp_config.yaml` | `rvl mcp` server configuration: hybrid search + vectorizer + runtime field names. |
+| `redisvl_mcp_search_agent/agent.py` | The ADK agent. |
+| `.env.example` | Template for `GOOGLE_API_KEY` and optional overrides. |
 
 ## How it works
 
-`create_redisvl_mcp_toolset(...)` returns an ADK `McpToolset` with the
-right connection-params type for the transport you choose:
+1. **Agent constructs an MCP toolset.** ADK's `McpToolset` is wired to
+   the running `rvl mcp` server with either `StdioConnectionParams`
+   (default in this example, spawns `rvl mcp --config <path>`) or
+   `StreamableHTTPConnectionParams` (`REDISVL_MCP_URL` env var).
+   `tool_filter=["search-records"]` hides `upsert-records` so the agent
+   cannot write.
+2. **Agent emits a query.** The LLM calls `search-records({"query":
+   "...", "limit": 5})`. ADK relays the call to the MCP server.
+3. **MCP server runs hybrid search.** The server embeds the query with
+   `redis/langcache-embed-v2`, builds a `HybridQuery` against the
+   configured index, runs `FT.HYBRID` on Redis, normalizes scores, and
+   returns structured results with `{title, content, url, ...}` per
+   match.
+4. **Agent summarizes.** The LLM summarizes the matches and cites each title and URL.
+
+## Customization
+
+### Switch fusion method
+
+Edit `mcp_config.yaml`:
+
+```yaml
+search:
+  type: hybrid
+  params:
+    combination_method: RRF
+    rrf_window: 20
+    rrf_constant: 60
+```
+
+### Add a bearer token
 
-- `transport="stdio"` (passes a `config_path`): spawns
-  `rvl mcp --config <path> --read-only` over stdio.
-- `transport="streamable-http"` (default, passes a `url`): connects to
-  a long-running server. Bearer auth is added to headers when
-  `auth_token` is set.
-- `transport="sse"` (passes a `url`): same as streamable-http but over
-  the SSE transport.
+Run the server behind a proxy that enforces auth, then set
+`REDISVL_MCP_URL` and `REDISVL_MCP_AUTH_TOKEN`. The example agent reads
+both and attaches `Authorization: Bearer <token>` to every MCP request
+via `StreamableHTTPConnectionParams(headers=...)`.
 
-The agent in this sample uses the streamable-http path so the MCP server
-can stay up between agent invocations. Switch to stdio if you prefer a
-single process; the helper handles it.
+### Connect to Redis Cloud
+
+Set `REDIS_URL` before running both the loader and the MCP server. The
+config YAML uses `${REDIS_URL:-redis://localhost:6379}` so the override
+flows through automatically.
 
 ## Cleanup
 
 ```bash
 docker stop redis && docker rm redis
 ```
+
+## See also
+
+- [`redis_search_tools/`](../redis_search_tools/) for the in-process
+  Python version of the same demo.
+- [`redis_sql_search/`](../redis_sql_search/) for SQL-style filters
+  (in-process only; no MCP equivalent today).
+- [Search tools how-to](../../docs/user_guide/how_to_guides/search_tools.md)
+  for the full decision matrix.
diff --git a/examples/redisvl_mcp_search/load_data.py b/examples/redisvl_mcp_search/load_data.py
index b41f950..2eb9ee7 100644
--- a/examples/redisvl_mcp_search/load_data.py
+++ b/examples/redisvl_mcp_search/load_data.py
@@ -12,91 +12,169 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-"""Load sample articles for the redisvl_mcp_search demo.
+"""Load a Redis knowledge base for the MCP-path search demo.
 
-The rvl mcp server is configured to expose BM25 fulltext search over
-`content`, so the dataset is short prose suited to keyword matches.
+The corpus is curated for the MCP demo and overlaps with
+`examples/redis_search_tools/load_data.py` so the in-process and MCP
+demos answer similar questions, with MCP-specific docs added here
+(e.g., the "RedisVL MCP Server" entry). Documents are embedded with
+`redis/langcache-embed-v2` (768 dims) so the configured `rvl mcp`
+server can run vector or hybrid search against them.
 """
 
 import os
 from pathlib import Path
 
 from redisvl.index import SearchIndex
+from redisvl.utils.vectorize import HFTextVectorizer
 
-SAMPLE_ARTICLES = [
+SAMPLE_DOCS = [
+    # === SEMANTIC SEARCH DEMOS ===
     {
-        "title": "Vector Similarity Search in Redis",
+        "title": "Introduction to Redis",
         "content": (
-            "Redis supports approximate nearest neighbor search via FLAT and "
-            "HNSW indexes. HNSW trades index size and build time for sub-linear "
-            "query latency at high recall. Each algorithm has runtime parameters "
-            "such as EF for HNSW that tune the accuracy-latency tradeoff."
+            "Redis is a lightning-fast in-memory data store. It excels at"
+            " caching, session management, and real-time analytics. Think of"
+            " it as a Swiss Army knife for data: versatile, quick, and"
+            " reliable."
+        ),
+        "url": "https://redis.io/docs/about/",
+        "category": "redis",
+        "doc_type": "reference",
+        "difficulty": "beginner",
+    },
+    {
+        "title": "Understanding Vector Databases",
+        "content": (
+            "Vector databases store numerical representations of data called"
+            " embeddings. These embeddings capture semantic meaning, enabling"
+            " similarity search. Applications include recommendation engines,"
+            " image search, and chatbots."
         ),
-        "topic": "vectors",
         "url": "https://redis.io/docs/vectors/",
+        "category": "concepts",
+        "doc_type": "reference",
+        "difficulty": "intermediate",
+    },
+    {
+        "title": "Building Intelligent Assistants",
+        "content": (
+            "Modern AI assistants combine language models with external"
+            " knowledge. They can search databases, call APIs, and maintain"
+            " conversation context. The key is giving them the right tools"
+            " for each task."
+        ),
+        "url": "https://google.github.io/adk-docs/agents/",
+        "category": "adk",
+        "doc_type": "tutorial",
+        "difficulty": "intermediate",
+    },
+    # === KEYWORD-FRIENDLY DEMOS ===
+    {
+        "title": "HNSW Algorithm Deep Dive",
+        "content": (
+            "HNSW (Hierarchical Navigable Small World) is the algorithm Redis"
+            " uses for approximate nearest neighbor search. It builds a"
+            " multi-layer graph where each layer has exponentially fewer"
+            " nodes. Search starts at the top layer and navigates down."
+            " Parameters: M (connections per node), EF (search width)."
+        ),
+        "url": "https://redis.io/docs/hnsw/",
+        "category": "redis",
+        "doc_type": "reference",
+        "difficulty": "advanced",
+    },
+    {
+        "title": "BM25 Scoring Explained",
+        "content": (
+            "BM25 (Best Matching 25) is a ranking function for full-text"
+            " search. It improves on TF-IDF by adding document length"
+            " normalization and term frequency saturation. Redis supports"
+            " BM25STD and BM25 scorers."
+        ),
+        "url": "https://redis.io/docs/bm25/",
+        "category": "redis",
+        "doc_type": "reference",
+        "difficulty": "advanced",
     },
     {
         "title": "Hybrid Search with FT.HYBRID",
         "content": (
-            "Hybrid search combines BM25 text scoring with vector similarity. "
-            "Redis 8.4 introduced the FT.HYBRID command for server-side fusion "
-            "using either LINEAR weighting or Reciprocal Rank Fusion (RRF). "
-            "Older Redis versions can fall back to client-side aggregation."
+            "Hybrid search combines BM25 text scoring with vector similarity."
+            " Redis 8.4 introduced the FT.HYBRID command for server-side"
+            " fusion using either LINEAR weighting or Reciprocal Rank Fusion"
+            " (RRF). Older Redis versions fall back to client-side"
+            " aggregation."
         ),
-        "topic": "search",
         "url": "https://redis.io/docs/hybrid/",
+        "category": "redis",
+        "doc_type": "reference",
+        "difficulty": "advanced",
     },
+    # === RAG-FOCUSED ===
     {
-        "title": "Semantic Caching for LLMs",
+        "title": "RAG Architecture Overview",
         "content": (
-            "A semantic cache stores prompt-response pairs keyed by the prompt "
-            "embedding. On a cache lookup the new prompt is embedded and the "
-            "nearest stored entry is returned when its distance is below the "
-            "configured threshold. This skips the LLM call for repeated or "
-            "near-duplicate requests."
+            "Retrieval-Augmented Generation (RAG) enhances LLMs with external"
+            " knowledge. Step 1: embed the user query. Step 2: search the"
+            " vector database for relevant documents. Step 3: include"
+            " retrieved context in the LLM prompt. Step 4: generate a"
+            " grounded response."
         ),
-        "topic": "caching",
-        "url": "https://redis.io/langcache",
+        "url": "https://redis.io/solutions/rag/",
+        "category": "concepts",
+        "doc_type": "tutorial",
+        "difficulty": "intermediate",
     },
     {
-        "title": "Long-Term Memory for Agents",
+        "title": "RAG Best Practices",
         "content": (
-            "Agent memory layers working memory and long-term memory. Working "
-            "memory holds the active conversation; promoted facts move to "
-            "long-term memory where recency-boosted semantic search retrieves "
-            "them on demand. Background extraction keeps the layers in sync."
+            "Tips for effective RAG: chunk documents appropriately (512 to"
+            " 1024 tokens), use hybrid search for better recall, rerank"
+            " results before prompting, include metadata for filtering, and"
+            " monitor retrieval quality metrics."
         ),
-        "topic": "memory",
-        "url": "https://github.com/redis/agent-memory-server",
+        "url": "https://redis.io/solutions/rag/best-practices/",
+        "category": "concepts",
+        "doc_type": "tutorial",
+        "difficulty": "intermediate",
     },
+    # === MCP-FOCUSED ===
     {
         "title": "RedisVL MCP Server",
         "content": (
-            "The RedisVL MCP server exposes a configured Redis index over the "
-            "Model Context Protocol. Search and upsert tools are wired with "
-            "schema-aware descriptions so agents see allowed filters and "
-            "return fields. The server supports stdio, SSE, and "
-            "streamable-http transports and ships a read-only flag."
+            "The RedisVL MCP server exposes a configured Redis index over the"
+            " Model Context Protocol. Search and upsert tools are wired with"
+            " schema-aware descriptions so agents see allowed filters and"
+            " return fields. The server supports stdio, SSE, and"
+            " streamable-http transports and ships a read-only flag."
+        ),
+        "url": (
+            "https://docs.redisvl.com/en/stable/user_guide/how_to_guides/mcp.html"
         ),
-        "topic": "mcp",
-        "url": "https://docs.redisvl.com/en/stable/user_guide/how_to_guides/mcp.html",
+        "category": "redis",
+        "doc_type": "reference",
+        "difficulty": "intermediate",
     },
+    # === FAQ STYLE ===
     {
-        "title": "Index Schemas in RedisVL",
+        "title": "FAQ: Embedding Dimensions Mismatch",
         "content": (
-            "An IndexSchema declares the fields stored in a Redis search index. "
-            "Fields include text, tag, numeric, geo, and vector types. The "
-            "schema drives how documents are loaded, how filters are parsed, "
-            "and which fields are projected by default."
+            "Q: Dimension mismatch error? A: Ensure query embeddings match"
+            " index dimensions. Common dimensions: OpenAI ada-002 (1536),"
+            " langcache-embed-v2 (768), sentence-transformers (384 to 768)."
+            " Check the schema dims field."
         ),
-        "topic": "schemas",
-        "url": "https://docs.redisvl.com/en/stable/user_guide/schemas.html",
+        "url": "https://redis.io/docs/faq/vectors/",
+        "category": "redis",
+        "doc_type": "faq",
+        "difficulty": "beginner",
     },
 ]
 
 
 def load_data() -> None:
-  """Create the article index and load sample documents."""
+  """Build the index and load documents with embeddings."""
   schema_path = Path(__file__).parent / "schema.yaml"
   redis_url = os.getenv("REDIS_URL", "redis://localhost:6379")
 
@@ -107,19 +185,35 @@ def load_data() -> None:
   print("Creating index (will overwrite if exists)...")
   index.create(overwrite=True, drop=True)
 
-  print(f"Loading {len(SAMPLE_ARTICLES)} articles...")
-  keys = [f"{index.prefix}:{i:04d}" for i in range(len(SAMPLE_ARTICLES))]
-  index.load(SAMPLE_ARTICLES, keys=keys)
+  print("Generating embeddings (redis/langcache-embed-v2)...")
+  vectorizer = HFTextVectorizer(model="redis/langcache-embed-v2")
+
+  docs_with_embeddings = []
+  for doc in SAMPLE_DOCS:
+    embedding = vectorizer.embed(doc["content"], as_buffer=True)
+    docs_with_embeddings.append({**doc, "embedding": embedding})
+    print(f"  [{doc['doc_type']:9}] {doc['title']}")
+
+  print(f"\nLoading {len(SAMPLE_DOCS)} docs into Redis...")
+  # Stable keys so re-running the loader overwrites instead of duplicating.
+  keys = [f"{index.prefix}:{i:04d}" for i in range(len(SAMPLE_DOCS))]
+  index.load(docs_with_embeddings, keys=keys)
 
   print(
       """
-Loaded articles. Next:
+Loaded knowledge base. Next:
 
   1. Start the rvl mcp server in another terminal:
        rvl mcp --config mcp_config.yaml \\
          --transport streamable-http --host 127.0.0.1 --port 8765
   2. Run the agent:
        adk web redisvl_mcp_search_agent
+
+Try prompts like those in the in-process redis_search_tools demo:
+  - "What is Redis?"           (semantic)
+  - "Tell me about HNSW"       (keyword)
+  - "How does RAG work?"       (semantic)
+  - "FT.HYBRID command"        (mixed semantic + keyword)
 """
   )
 
diff --git a/examples/redisvl_mcp_search/mcp_config.yaml b/examples/redisvl_mcp_search/mcp_config.yaml
index 98a3324..f435cb1 100644
--- a/examples/redisvl_mcp_search/mcp_config.yaml
+++ b/examples/redisvl_mcp_search/mcp_config.yaml
@@ -3,23 +3,32 @@
 #   rvl mcp --config mcp_config.yaml --transport streamable-http \
 #     --host 127.0.0.1 --port 8765 --read-only
 #
-# The MCP server inspects an existing Redis search index (adk_mcp_articles)
-# and serves `search-records` over BM25 fulltext on the `content` field.
+# Configured for hybrid search (BM25 text + vector similarity) over the
+# `adk_mcp_knowledge_base` index built by load_data.py. The server embeds
+# user queries via the vectorizer block and runs FT.HYBRID on Redis 8.4+.
 
 server:
   # Override with REDIS_URL env var if your Redis is not on localhost:6379.
   redis_url: ${REDIS_URL:-redis://localhost:6379}
 
 indexes:
-  articles:
-    redis_name: adk_mcp_articles
+  knowledge_base:
+    redis_name: adk_mcp_knowledge_base
+    vectorizer:
+      class: HFTextVectorizer
+      model: redis/langcache-embed-v2
     search:
-      type: fulltext
+      type: hybrid
       params:
-        text_scorer: BM25STD
-        # Disable nltk-based stopword removal so the server runs without nltk.
+        # LINEAR fusion: 50% BM25 text, 50% vector. Adjust linear_text_weight
+        # to taste, or set combination_method to RRF (Reciprocal Rank Fusion).
+        combination_method: LINEAR
+        linear_text_weight: 0.5
+        # Disable nltk-based stopword removal so the server runs without
+        # the optional nltk dependency.
         stopwords: null
     runtime:
       text_field_name: content
+      vector_field_name: embedding
       default_limit: 5
       max_limit: 20
diff --git a/examples/redisvl_mcp_search/redisvl_mcp_search_agent/agent.py b/examples/redisvl_mcp_search/redisvl_mcp_search_agent/agent.py
index 5f9dc66..0b8fe27 100644
--- a/examples/redisvl_mcp_search/redisvl_mcp_search_agent/agent.py
+++ b/examples/redisvl_mcp_search/redisvl_mcp_search_agent/agent.py
@@ -14,11 +14,16 @@
 
 """RedisVL MCP search agent.
 
-Connects an ADK agent to a running `rvl mcp` server via
-`create_redisvl_mcp_toolset(...)` over the streamable-http transport.
-The MCP server (configured by `../mcp_config.yaml`) exposes one
-`search-records` tool over BM25 fulltext on the `content` field of the
-`adk_mcp_articles` index.
+The MCP-path counterpart of `examples/redis_search_tools/`. It targets
+a similar Redis knowledge-base corpus (overlapping, with MCP-specific
+docs added in the loader) and routes search through a
+separately running `rvl mcp` server via ADK's native `McpToolset`.
+The server is configured for hybrid (BM25 + vector) search, so a
+single MCP tool covers both semantic and keyword retrieval.
+
+The agent does not depend on any adk-redis MCP wrapper; it uses the
+standard ADK MCP pattern shown by every catalog integration page so the
+same shape works against any MCP server.
 """
 
 import os
@@ -26,44 +31,88 @@
 
 from dotenv import load_dotenv
 from google.adk import Agent
-from pydantic import SecretStr
-
-from adk_redis import create_redisvl_mcp_toolset
-
-INSTRUCTION = """You are a Redis docs assistant. You have a single MCP tool,
-`search-records`, that runs BM25 fulltext search over a Redis index of
-articles about Redis search, caching, memory, and the MCP server itself.
-
-For any question:
-
-1. Decide which keywords from the user's question are most likely to appear
-   in a relevant article (e.g., "HNSW", "FT.HYBRID", "semantic cache").
-2. Call `search-records` with that query. You can pass `limit` (default 5).
-3. Summarize the top matches and cite each article's title and URL.
-
-If the tool returns no matches, say so plainly. Do not fabricate articles.
+from google.adk.tools.mcp_tool import McpToolset
+from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
+from google.adk.tools.mcp_tool.mcp_session_manager import (
+    StreamableHTTPConnectionParams,
+)
+from mcp import StdioServerParameters
+
+INSTRUCTION = """You are a helpful assistant with a technical knowledge base
+served via an MCP server. You have one tool: `search-records`, configured
+for hybrid search (BM25 text + vector similarity) over the
+`adk_mcp_knowledge_base` index.
+
+## When to call search-records
+
+- Conceptual questions ("how does RAG work?") -> hybrid will lean on vector
+  similarity.
+- Technical terms / acronyms ("HNSW", "FT.HYBRID", "BM25") -> hybrid keeps
+  exact keyword matches via the BM25 component.
+- Comparative or "everything about X" questions -> hybrid combines both
+  paths and ranks by the configured fusion method (LINEAR by default).
+
+Pass a natural-language query in the `query` argument. The MCP server
+embeds it server-side using `redis/langcache-embed-v2`. Optionally pass
+`limit` (default 5, maximum 20).
+
+## Response style
+
+After calling `search-records`, summarize the matches for the user. Cite
+each document's title and url. If the tool returns no matches, say so
+plainly; do not fabricate results.
+
+Available document categories: redis, adk, concepts.
+Document types: reference, tutorial, faq.
+Difficulty levels: beginner, intermediate, advanced.
 """
 
+DEFAULT_MCP_CONFIG_PATH = str(Path(__file__).parent.parent / "mcp_config.yaml")
+
+
+def _build_toolset() -> McpToolset:
+  """Pick stdio or streamable-http based on env vars.
+
+  - If `REDISVL_MCP_URL` is set, connect to the running server over
+    streamable-http. Optional `REDISVL_MCP_AUTH_TOKEN` becomes a bearer
+    header.
+  - Otherwise, spawn `rvl mcp --config <path> --read-only` over stdio.
+    `REDISVL_MCP_CONFIG` overrides the default config path.
+  """
+  remote_url = os.getenv("REDISVL_MCP_URL")
+  if remote_url:
+    auth_token = os.getenv("REDISVL_MCP_AUTH_TOKEN")
+    headers = {"Authorization": f"Bearer {auth_token}"} if auth_token else None
+    return McpToolset(
+        connection_params=StreamableHTTPConnectionParams(
+            url=remote_url,
+            headers=headers,
+            timeout=30,
+        ),
+        tool_filter=["search-records"],
+    )
+
+  config_path = os.getenv("REDISVL_MCP_CONFIG", DEFAULT_MCP_CONFIG_PATH)
+  return McpToolset(
+      connection_params=StdioConnectionParams(
+          server_params=StdioServerParameters(
+              command="rvl",
+              args=["mcp", "--config", config_path, "--read-only"],
+          ),
+          timeout=30,
+      ),
+      tool_filter=["search-records"],
+  )
+
 
 def create_agent() -> Agent:
   """Create the RedisVL MCP search agent."""
   load_dotenv(Path(__file__).parent.parent / ".env")
-
-  mcp_url = os.getenv("REDISVL_MCP_URL", "http://127.0.0.1:8765/mcp")
-  auth_token = os.getenv("REDISVL_MCP_AUTH_TOKEN")
-
-  toolset = create_redisvl_mcp_toolset(
-      url=mcp_url,
-      transport="streamable-http",
-      auth_token=SecretStr(auth_token) if auth_token else None,
-      tool_filter=["search-records"],
-  )
-
   return Agent(
       model="gemini-2.5-flash",
       name="redisvl_mcp_search_agent",
       instruction=INSTRUCTION,
-      tools=[toolset],
+      tools=[_build_toolset()],
   )
 
 
@@ -72,5 +121,6 @@ def create_agent() -> Agent:
 
 if __name__ == "__main__":
   print(
-      f"Agent '{root_agent.name}' loaded with {len(root_agent.tools)} toolset(s)"
+      f"Agent '{root_agent.name}' loaded with"
+      f" {len(root_agent.tools)} toolset(s)"
   )
diff --git a/examples/redisvl_mcp_search/schema.yaml b/examples/redisvl_mcp_search/schema.yaml
index 7668b10..1366610 100644
--- a/examples/redisvl_mcp_search/schema.yaml
+++ b/examples/redisvl_mcp_search/schema.yaml
@@ -12,15 +12,19 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-# Article index for the RedisVL MCP search demo. Plain text + tag fields
-# so the rvl mcp server can serve BM25 fulltext search without needing a
-# vectorizer or embedding model at the server.
+# Redis index schema for the redisvl_mcp_search sample.
+#
+# Parallels `examples/redis_search_tools/schema.yaml` so users can
+# compare the in-process Python tool path with the MCP toolset path on
+# a similar (overlapping) knowledge base. The `rvl mcp` server reads
+# the live index for its search-records description; the schema below
+# shapes those hints.
 
 version: "0.1.0"
 
 index:
-  name: adk_mcp_articles
-  prefix: article
+  name: adk_mcp_knowledge_base
+  prefix: mcp_doc
 
 fields:
   - name: title
@@ -29,8 +33,22 @@ fields:
   - name: content
     type: text
 
-  - name: topic
-    type: tag
-
   - name: url
     type: tag
+
+  - name: category
+    type: tag  # redis, adk, concepts, tutorials
+
+  - name: doc_type
+    type: tag  # reference, tutorial, faq, api
+
+  - name: difficulty
+    type: tag  # beginner, intermediate, advanced
+
+  - name: embedding
+    type: vector
+    attrs:
+      algorithm: hnsw
+      dims: 768
+      distance_metric: cosine
+      datatype: float32
diff --git a/pyproject.toml b/pyproject.toml
index 20bce38..0f52b1c 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "adk-redis"
-version = "0.0.4"
+version = "0.0.5"
 description = "Redis integrations for Google's Agent Development Kit (ADK)"
 readme = "README.md"
 license = "Apache-2.0"
@@ -61,11 +61,6 @@ sql = [
     "redisvl[sql-redis]>=0.18.2",
 ]
 
-# RedisVL MCP server toolset (create_redisvl_mcp_toolset)
-mcp-search = [
-    "redisvl[mcp]>=0.18.2",
-]
-
 # Example runner dependencies
 examples = [
     "python-dotenv",
@@ -73,7 +68,7 @@ examples = [
 
 # All Redis integrations
 all = [
-    "adk-redis[memory,search,langcache,sql,mcp-search]",
+    "adk-redis[memory,search,langcache,sql]",
 ]
 
 # Development dependencies
diff --git a/scripts/smoke_adk_mcp_runner.py b/scripts/smoke_adk_mcp_runner.py
index 3d081d7..47d576e 100644
--- a/scripts/smoke_adk_mcp_runner.py
+++ b/scripts/smoke_adk_mcp_runner.py
@@ -44,9 +44,9 @@
 from redisvl_mcp_search_agent.agent import root_agent  # noqa: E402
 
 PROMPTS = [
-    # Use a keyword-friendly prompt so the BM25 tokenizer matches cleanly.
-    # The corpus uses words like 'hybrid', 'caching', 'memory'.
-    "Find articles about hybrid search.",
+    # Hybrid search: semantic + keyword. The corpus overlaps the knowledge
+    # base from redis_search_tools, embedded for vector + BM25 fusion.
+    "How does hybrid search work in Redis?",
 ]
 
 
diff --git a/src/adk_redis/__init__.py b/src/adk_redis/__init__.py
index 1f13774..2d5d3e7 100644
--- a/src/adk_redis/__init__.py
+++ b/src/adk_redis/__init__.py
@@ -97,10 +97,6 @@
 # Memory tools (MCP-based)
 from adk_redis.tools.mcp_memory import ALL_MCP_TOOLS
 from adk_redis.tools.mcp_memory import create_memory_mcp_toolset
-from adk_redis.tools.mcp_search import ALL_REDISVL_MCP_TOOLS
-from adk_redis.tools.mcp_search import REDISVL_MCP_TOOL_SEARCH
-from adk_redis.tools.mcp_search import REDISVL_MCP_TOOL_UPSERT
-from adk_redis.tools.mcp_search import create_redisvl_mcp_toolset
 # Semantic caching
 from adk_redis.cache import BaseCacheProvider
 from adk_redis.cache import CacheEntry
@@ -150,10 +146,6 @@
     # MCP tools
     "create_memory_mcp_toolset",
     "ALL_MCP_TOOLS",
-    "create_redisvl_mcp_toolset",
-    "ALL_REDISVL_MCP_TOOLS",
-    "REDISVL_MCP_TOOL_SEARCH",
-    "REDISVL_MCP_TOOL_UPSERT",
     # Semantic caching
     "BaseCacheProvider",
     "CacheEntry",
diff --git a/src/adk_redis/tools/__init__.py b/src/adk_redis/tools/__init__.py
index a0dc0e6..a1ae2ca 100644
--- a/src/adk_redis/tools/__init__.py
+++ b/src/adk_redis/tools/__init__.py
@@ -23,10 +23,6 @@
 from adk_redis.tools.memory import UpdateMemoryTool
 from adk_redis.tools.mcp_memory import ALL_MCP_TOOLS
 from adk_redis.tools.mcp_memory import create_memory_mcp_toolset
-from adk_redis.tools.mcp_search import ALL_REDISVL_MCP_TOOLS
-from adk_redis.tools.mcp_search import REDISVL_MCP_TOOL_SEARCH
-from adk_redis.tools.mcp_search import REDISVL_MCP_TOOL_UPSERT
-from adk_redis.tools.mcp_search import create_redisvl_mcp_toolset
 from adk_redis.tools.search import BaseRedisSearchTool
 from adk_redis.tools.search import RedisAggregatedHybridQueryConfig
 from adk_redis.tools.search import RedisHybridQueryConfig
@@ -67,8 +63,4 @@
     # MCP tools
     "create_memory_mcp_toolset",
     "ALL_MCP_TOOLS",
-    "create_redisvl_mcp_toolset",
-    "ALL_REDISVL_MCP_TOOLS",
-    "REDISVL_MCP_TOOL_SEARCH",
-    "REDISVL_MCP_TOOL_UPSERT",
 ]
diff --git a/src/adk_redis/tools/mcp_search.py b/src/adk_redis/tools/mcp_search.py
deleted file mode 100644
index c1dbd7f..0000000
--- a/src/adk_redis/tools/mcp_search.py
+++ /dev/null
@@ -1,171 +0,0 @@
-# Copyright 2025 Redis, Inc.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-"""Helper for binding ADK to a RedisVL MCP server (`rvl mcp`).
-
-This module exposes ``create_redisvl_mcp_toolset(...)``, which returns an
-``McpToolset`` wired to a RedisVL MCP server. The server is shipped by the
-``redisvl[mcp]`` extra and exposes index-aware ``search-records`` and
-``upsert-records`` tools whose descriptions include filter and return-field
-hints derived from the bound index schema.
-
-Three transport modes are supported:
-
-- ``stdio``: spawn ``rvl mcp --config <path>`` in-process. Pass ``config_path``.
-- ``streamable-http``: connect to a remote server. Pass ``url``. (default)
-- ``sse``: connect to a remote SSE server. Pass ``url``.
-
-Read-only mode is on by default and is the safer choice for agents that
-should not write.
-"""
-
-from __future__ import annotations
-
-from typing import Any, Literal, TYPE_CHECKING
-
-from pydantic import SecretStr
-
-if TYPE_CHECKING:
-  from google.adk.tools.mcp_tool import McpToolset
-
-
-REDISVL_MCP_TOOL_SEARCH = "search-records"
-REDISVL_MCP_TOOL_UPSERT = "upsert-records"
-
-ALL_REDISVL_MCP_TOOLS = [
-    REDISVL_MCP_TOOL_SEARCH,
-    REDISVL_MCP_TOOL_UPSERT,
-]
-
-
-def create_redisvl_mcp_toolset(
-    *,
-    url: str | None = None,
-    config_path: str | None = None,
-    transport: Literal["stdio", "sse", "streamable-http"] = "streamable-http",
-    read_only: bool = True,
-    auth_token: SecretStr | None = None,
-    tool_filter: list[str] | None = None,
-    timeout: float = 5.0,
-) -> "McpToolset":
-  """Create an MCP toolset pointed at a RedisVL MCP server.
-
-  Args:
-      url: URL of a running RedisVL MCP server. Required for ``sse`` and
-          ``streamable-http`` transports. Mutually exclusive with
-          ``config_path``.
-      config_path: Path to a RedisVL MCP YAML config. When set, the helper
-          spawns ``rvl mcp --config <path>`` over stdio. Required for the
-          ``stdio`` transport. Mutually exclusive with ``url``.
-      transport: Transport to use. Defaults to ``streamable-http``.
-      read_only: Whether to pass ``--read-only`` to the spawned server.
-          Only relevant in stdio mode. Default ``True``.
-      auth_token: Optional bearer token for HTTP transports. Sent as
-          ``Authorization: Bearer <token>``.
-      tool_filter: Optional list of MCP tool names to expose. Use
-          ``REDISVL_MCP_TOOL_SEARCH`` / ``REDISVL_MCP_TOOL_UPSERT`` for
-          symbolic filtering.
-      timeout: Connection timeout in seconds.
-
-  Returns:
-      A configured ``McpToolset``.
-
-  Raises:
-      ValueError: If ``url`` and ``config_path`` are both set or both unset,
-          or if a transport / param combination is invalid.
-      ImportError: If ``google-adk`` was installed without MCP support.
-
-  Example:
-      ```python
-      from google.adk import Agent
-      from adk_redis.tools.mcp_search import create_redisvl_mcp_toolset
-
-      # Remote server, read-only.
-      toolset = create_redisvl_mcp_toolset(
-          url="http://localhost:8000/mcp",
-      )
-
-      # Local in-process stdio.
-      toolset = create_redisvl_mcp_toolset(
-          transport="stdio",
-          config_path="/etc/redisvl/mcp.yaml",
-      )
-
-      agent = Agent(model="gemini-2.5-flash", tools=[toolset])
-      ```
-  """
-  _VALID_TRANSPORTS = ("stdio", "sse", "streamable-http")
-  if transport not in _VALID_TRANSPORTS:
-    raise ValueError(
-        f"Unknown transport {transport!r}. "
-        f"Expected one of: {', '.join(_VALID_TRANSPORTS)}."
-    )
-  if url is None and config_path is None:
-    raise ValueError(
-        "create_redisvl_mcp_toolset requires either url or config_path."
-    )
-  if url is not None and config_path is not None:
-    raise ValueError(
-        "url and config_path are mutually exclusive: stdio uses config_path,"
-        " HTTP/SSE transports use url."
-    )
-  if transport == "stdio" and config_path is None:
-    raise ValueError("stdio transport requires config_path.")
-  if transport in ("sse", "streamable-http") and url is None:
-    raise ValueError(f"{transport} transport requires url.")
-
-  try:
-    from google.adk.tools.mcp_tool import McpToolset
-    from google.adk.tools.mcp_tool.mcp_session_manager import (
-        SseConnectionParams,
-    )
-    from google.adk.tools.mcp_tool.mcp_session_manager import (
-        StdioConnectionParams,
-    )
-    from google.adk.tools.mcp_tool.mcp_session_manager import (
-        StreamableHTTPConnectionParams,
-    )
-    from mcp import StdioServerParameters
-  except ImportError as e:
-    raise ImportError(
-        "google-adk with MCP support is required. Install it with: "
-        "pip install 'google-adk[mcp]'"
-    ) from e
-
-  connection_params: Any
-  if transport == "stdio":
-    args = ["mcp", "--config", str(config_path)]
-    if read_only:
-      args.append("--read-only")
-    connection_params = StdioConnectionParams(
-        server_params=StdioServerParameters(command="rvl", args=args),
-        timeout=timeout,
-    )
-  else:
-    headers: dict[str, str] | None = None
-    if auth_token is not None:
-      headers = {"Authorization": f"Bearer {auth_token.get_secret_value()}"}
-    if transport == "sse":
-      connection_params = SseConnectionParams(
-          url=str(url), headers=headers, timeout=timeout
-      )
-    else:
-      connection_params = StreamableHTTPConnectionParams(
-          url=str(url), headers=headers, timeout=timeout
-      )
-
-  return McpToolset(
-      connection_params=connection_params,
-      tool_filter=tool_filter,
-  )
diff --git a/tests/integration/test_adk_agent_registration.py b/tests/integration/test_adk_agent_registration.py
index d636a2b..07343f5 100644
--- a/tests/integration/test_adk_agent_registration.py
+++ b/tests/integration/test_adk_agent_registration.py
@@ -14,9 +14,10 @@
 
 """End-to-end ADK Agent registration tests.
 
-These tests confirm the new and existing search tools register cleanly
-with ``google.adk.Agent`` and surface a usable ``FunctionDeclaration``.
-They do not call any LLM, so they run without API keys.
+These tests confirm search tools register cleanly with
+``google.adk.Agent`` and surface a usable ``FunctionDeclaration``. They
+also confirm a native ADK ``McpToolset`` pointed at the RedisVL MCP
+server attaches to an Agent. No LLM calls; runs without API keys.
 """
 
 from __future__ import annotations
@@ -32,11 +33,28 @@
 from google.adk.agents.readonly_context import ReadonlyContext
 from redisvl.index import SearchIndex
 
-from adk_redis import create_redisvl_mcp_toolset
 from adk_redis import RedisSQLSearchTool
 from adk_redis import RedisTextQueryConfig
 from adk_redis import RedisTextSearchTool
 
+# google-adk exposes its MCP surface from google.adk.tools.mcp_tool only when
+# the optional `mcp` dependency is importable. The package's __init__.py
+# silently swallows ImportError on partial installs, so test it by attempting
+# the actual symbol import; gate the MCP-specific tests on that.
+try:
+  from google.adk.tools.mcp_tool import McpToolset
+  from google.adk.tools.mcp_tool.mcp_session_manager import (
+      StdioConnectionParams,
+  )
+  from google.adk.tools.mcp_tool.mcp_session_manager import (
+      StreamableHTTPConnectionParams,
+  )
+  from mcp import StdioServerParameters
+
+  _MCP_AVAILABLE = True
+except ImportError:
+  _MCP_AVAILABLE = False
+
 
 class TestSearchToolsRegisterWithAgent:
   """Agent.canonical_tools surfaces every search tool it was handed."""
@@ -76,24 +94,42 @@ async def test_text_search_tool_registers(self):
     assert "redis_text_search" in names
 
 
+@pytest.mark.skipif(
+    not _MCP_AVAILABLE,
+    reason="google-adk MCP support not available in this install",
+)
 class TestRedisVLMcpToolsetRegistersWithAgent:
-  """The MCP toolset registers as an Agent tool source."""
+  """A native ADK McpToolset pointed at `rvl mcp` attaches to an Agent."""
 
   def test_streamable_http_toolset_registers(self):
-    toolset = create_redisvl_mcp_toolset(url="http://localhost:8000/mcp")
+    toolset = McpToolset(
+        connection_params=StreamableHTTPConnectionParams(
+            url="http://localhost:8000/mcp",
+        ),
+        tool_filter=["search-records"],
+    )
     agent = Agent(
         model="gemini-2.5-flash",
         name="test_agent",
         tools=[toolset],
     )
-    # The toolset is held on the agent (no LLM dispatch required).
     assert toolset in agent.tools
 
   def test_stdio_toolset_registers(self):
-    toolset = create_redisvl_mcp_toolset(
-        transport="stdio",
-        config_path="/etc/redisvl/mcp.yaml",
-        read_only=True,
+    toolset = McpToolset(
+        connection_params=StdioConnectionParams(
+            server_params=StdioServerParameters(
+                command="rvl",
+                args=[
+                    "mcp",
+                    "--config",
+                    "/etc/redisvl/mcp.yaml",
+                    "--read-only",
+                ],
+            ),
+            timeout=30,
+        ),
+        tool_filter=["search-records"],
     )
     agent = Agent(
         model="gemini-2.5-flash",
diff --git a/tests/tools/test_mcp_search.py b/tests/tools/test_mcp_search.py
deleted file mode 100644
index 2968337..0000000
--- a/tests/tools/test_mcp_search.py
+++ /dev/null
@@ -1,170 +0,0 @@
-# Copyright 2025 Redis, Inc.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-"""Tests for create_redisvl_mcp_toolset."""
-
-from __future__ import annotations
-
-from pydantic import SecretStr
-import pytest
-
-pytest.importorskip("google.adk.tools.mcp_tool")
-
-from google.adk.tools.mcp_tool import McpToolset
-from google.adk.tools.mcp_tool.mcp_session_manager import SseConnectionParams
-from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
-from google.adk.tools.mcp_tool.mcp_session_manager import (
-    StreamableHTTPConnectionParams,
-)
-
-from adk_redis.tools.mcp_search import create_redisvl_mcp_toolset
-from adk_redis.tools.mcp_search import REDISVL_MCP_TOOL_SEARCH
-from adk_redis.tools.mcp_search import REDISVL_MCP_TOOL_UPSERT
-
-
-class TestCreateRedisVLMcpToolsetValidation:
-  """Argument validation."""
-
-  def test_requires_url_or_config_path(self):
-    with pytest.raises(ValueError, match="url.*config_path"):
-      create_redisvl_mcp_toolset()
-
-  def test_url_and_config_path_are_mutually_exclusive(self):
-    with pytest.raises(ValueError, match="mutually exclusive"):
-      create_redisvl_mcp_toolset(
-          url="http://localhost:8000",
-          config_path="/etc/redisvl.yaml",
-      )
-
-  def test_stdio_requires_config_path(self):
-    with pytest.raises(ValueError, match="config_path"):
-      create_redisvl_mcp_toolset(transport="stdio", url="http://x")
-
-  def test_url_transports_reject_config_path(self):
-    with pytest.raises(ValueError):
-      create_redisvl_mcp_toolset(
-          transport="sse", config_path="/etc/redisvl.yaml"
-      )
-
-  def test_unknown_transport_raises_value_error(self):
-    """Regression: typo in `transport` must fail loudly, not silently fall through."""
-    with pytest.raises(ValueError, match="transport"):
-      create_redisvl_mcp_toolset(
-          url="http://localhost:8000/mcp",
-          transport="stdioo",  # type: ignore[arg-type]
-      )
-
-
-class TestCreateRedisVLMcpToolsetStdio:
-  """Stdio transport: spawn `rvl mcp --config <path>`."""
-
-  def test_stdio_returns_mcp_toolset(self):
-    toolset = create_redisvl_mcp_toolset(
-        transport="stdio",
-        config_path="/etc/redisvl.yaml",
-    )
-    assert isinstance(toolset, McpToolset)
-
-  def test_stdio_connection_params_shape(self):
-    toolset = create_redisvl_mcp_toolset(
-        transport="stdio",
-        config_path="/etc/redisvl.yaml",
-    )
-    params = toolset._connection_params
-    assert isinstance(params, StdioConnectionParams)
-    assert params.server_params.command == "rvl"
-    assert "mcp" in params.server_params.args
-    assert "--config" in params.server_params.args
-    assert "/etc/redisvl.yaml" in params.server_params.args
-
-  def test_stdio_read_only_flag_propagates(self):
-    toolset = create_redisvl_mcp_toolset(
-        transport="stdio",
-        config_path="/etc/redisvl.yaml",
-        read_only=True,
-    )
-    params = toolset._connection_params
-    assert "--read-only" in params.server_params.args
-
-  def test_stdio_no_read_only_when_false(self):
-    toolset = create_redisvl_mcp_toolset(
-        transport="stdio",
-        config_path="/etc/redisvl.yaml",
-        read_only=False,
-    )
-    params = toolset._connection_params
-    assert "--read-only" not in params.server_params.args
-
-
-class TestCreateRedisVLMcpToolsetStreamableHttp:
-  """Streamable-HTTP transport: connect to a remote server."""
-
-  def test_streamable_http_default(self):
-    toolset = create_redisvl_mcp_toolset(url="http://localhost:8000/mcp")
-    params = toolset._connection_params
-    assert isinstance(params, StreamableHTTPConnectionParams)
-    assert params.url == "http://localhost:8000/mcp"
-
-  def test_streamable_http_bearer_token_in_headers(self):
-    toolset = create_redisvl_mcp_toolset(
-        url="http://localhost:8000/mcp",
-        auth_token=SecretStr("s3cret"),
-    )
-    params = toolset._connection_params
-    assert params.headers is not None
-    assert params.headers.get("Authorization") == "Bearer s3cret"
-
-  def test_streamable_http_no_headers_without_token(self):
-    toolset = create_redisvl_mcp_toolset(url="http://localhost:8000/mcp")
-    params = toolset._connection_params
-    assert params.headers is None or "Authorization" not in (
-        params.headers or {}
-    )
-
-
-class TestCreateRedisVLMcpToolsetSse:
-  """SSE transport: connect to a remote SSE server."""
-
-  def test_sse_returns_correct_params(self):
-    toolset = create_redisvl_mcp_toolset(
-        url="http://localhost:8000/sse",
-        transport="sse",
-    )
-    params = toolset._connection_params
-    assert isinstance(params, SseConnectionParams)
-    assert params.url == "http://localhost:8000/sse"
-
-  def test_sse_bearer_token_in_headers(self):
-    toolset = create_redisvl_mcp_toolset(
-        url="http://localhost:8000/sse",
-        transport="sse",
-        auth_token=SecretStr("t"),
-    )
-    params = toolset._connection_params
-    assert params.headers.get("Authorization") == "Bearer t"
-
-
-class TestCreateRedisVLMcpToolsetFilterAndConstants:
-  """Tool filter and exported tool-name constants."""
-
-  def test_tool_filter_passthrough(self):
-    toolset = create_redisvl_mcp_toolset(
-        url="http://localhost:8000/mcp",
-        tool_filter=[REDISVL_MCP_TOOL_SEARCH],
-    )
-    assert toolset.tool_filter == [REDISVL_MCP_TOOL_SEARCH]
-
-  def test_exports_known_tool_constants(self):
-    assert REDISVL_MCP_TOOL_SEARCH == "search-records"
-    assert REDISVL_MCP_TOOL_UPSERT == "upsert-records"