redis-developer · nkanu17 · May 20, 2026
diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml
@@ -0,0 +1,63 @@
+name: Docs
+
+on:
+  push:
+    branches: [main]
+    paths:
+      - "docs/**"
+      - "mkdocs.yml"
+      - "src/adk_redis/**"
+      - "pyproject.toml"
+      - ".github/workflows/docs.yml"
+  workflow_dispatch:
+
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+concurrency:
+  group: pages
+  cancel-in-progress: false
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v4
+        with:
+          version: "latest"
+
+      - name: Set up Python
+        run: uv python install 3.12
+
+      - name: Install docs dependencies
+        # docs extra pulls in mkdocs-material, mkdocstrings, llmstxt, etc.
+        # The `all` extra is included so mkdocstrings can import the package
+        # surface (e.g., redisvl) when rendering API reference pages.
+        run: uv sync --extra docs --extra all
+
+      - name: Build site
+        # No --strict: pre-existing griffe warnings in tools/memory/*
+        # docstrings would block deploys. Tracked separately.
+        run: uv run mkdocs build
+
+      - name: Upload site
+        uses: actions/upload-pages-artifact@v3
+        with:
+          path: site
+
+  deploy:
+    if: github.ref == 'refs/heads/main'
+    needs: build
+    runs-on: ubuntu-latest
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    steps:
+      - name: Deploy to GitHub Pages
+        id: deployment
+        uses: actions/deploy-pages@v4
diff --git a/docs/concepts/adk_overview.md b/docs/concepts/adk_overview.md
@@ -1,30 +1,53 @@
 # ADK Overview
 
-The [Google Agent Development Kit (ADK)](https://github.com/google/adk-python) is a framework for building AI agents with Google's Gemini models. `adk-redis` provides Redis-backed implementations of ADK's service interfaces.
+The [Google Agent Development Kit (ADK)](https://github.com/google/adk-python) is a framework for building AI agents with Google's Gemini models. `adk-redis` provides Redis-backed implementations of ADK's service interfaces so you can move from prototype to production without rewriting your agent.
 
-## ADK abstractions
+## Architecture
 
-| Abstraction | What it does | Redis implementation |
-|-------------|-------------|---------------------|
-| **Agent** | The reasoning core: plans, calls tools, responds | No change (ADK provides this) |
-| **Session** | Conversation state across turns | `RedisSessionService` |
-| **Memory** | Persistent knowledge across sessions | `RedisMemoryService` |
-| **Tool** | Functions the agent can call | RedisVL search tools |
+```mermaid
+flowchart TD
+    subgraph Agent [Your ADK Agent]
+        SS[Session Service<br/>working memory]
+        MS[Memory Service<br/>long-term]
+        ST[Search Tools<br/>vector · hybrid · SQL]
+        SC[Semantic Cache<br/>before/after callbacks]
+    end
 
-## Where Redis fits
+    SS & MS -->|REST / MCP| AMS
+    ST -->|RedisVL / MCP| R
+    SC -->|RedisVL / LangCache| R
 
-Redis replaces the default in-memory implementations with durable, scalable alternatives:
+    subgraph AMS [Agent Memory Server]
+        WM[Working Memory API]
+        LTM[Long-Term Memory API]
+    end
 
-- **Sessions** are stored as Redis JSON documents with optional TTL
-- **Memory** is proxied to the Redis Agent Memory Server for two-tier storage
-- **Search tools** use RedisVL for vector similarity search
-- **Caching** uses Redis for semantic LLM response caching
+    AMS --> R
 
-## When to use adk-redis
+    subgraph R [Redis 8.4+]
+        JSON[(JSON storage)]
+        VEC[(Vector index)]
+        FTS[(Full-text index)]
+    end
+```
 
-Use `adk-redis` when you are building a Google ADK agent and need:
+## ADK Interfaces
 
-- Session persistence across process restarts
-- Long-term memory that survives beyond a single conversation
-- Vector search over your own documents
-- Production deployment with Redis as the data layer
+`adk-redis` implements four ADK extension points. Each one maps to a concept page with full details.
+
+| ADK Interface | `adk-redis` implementation | Concept page |
+|---------------|---------------------------|-------------|
+| `BaseSessionService` | `RedisWorkingMemorySessionService` | [Sessions + Memory Services](sessions.md) |
+| `BaseMemoryService` | `RedisLongTermMemoryService` | [Sessions + Memory Services](sessions.md) |
+| `BaseTool` | Search tools (`RedisVectorSearchTool`, `RedisHybridSearchTool`, etc.) and memory tools (`SearchMemoryTool`, `CreateMemoryTool`, etc.) | [RedisVL MCP + Search Tools](search.md), [Sessions + Memory MCP + Tools](memory.md) |
+| Model callbacks | `LLMResponseCache` with `RedisVLCacheProvider` or `LangCacheProvider` | [Semantic Caching](caching.md) |
+
+## Running Your Agent
+
+ADK provides several ways to run and test agents:
+
+- **`adk web`**: browser-based UI for interactive development and debugging.
+- **`adk run`**: terminal-based interaction.
+- **`adk api_server`**: RESTful API for production deployment.
+
+See the [ADK runtime documentation](https://google.github.io/adk-docs/runtime/) for details.
diff --git a/docs/concepts/caching.md b/docs/concepts/caching.md
@@ -0,0 +1,144 @@
+# Semantic Caching
+
+`adk-redis` provides semantic caching that skips LLM calls when a user sends a prompt that is similar (or identical) to one already answered. This reduces latency and cost without changing agent behavior.
+
+## Quick Reference
+
+| Feature | Details |
+|---------|---------|
+| **What it caches** | LLM responses keyed by prompt similarity |
+| **Similarity** | Vector distance between prompt embeddings |
+| **Providers** | `RedisVLCacheProvider` (self-hosted) or `LangCacheProvider` (managed) |
+| **TTL** | Configurable per-entry expiration |
+| **Integration** | ADK `before_model_callback` / `after_model_callback` hooks |
+
+## How It Works
+
+```mermaid
+flowchart TD
+    U([User prompt]) --> BC[before_model_callback<br/>embed prompt, search cache]
+    BC --> D{Cache hit?}
+    D -->|Yes| CR([Return cached response<br/>no LLM call])
+    D -->|No| LLM[Call LLM]
+    LLM --> AC[after_model_callback<br/>store response in cache]
+    AC --> R([Return LLM response])
+
+    subgraph Cache [Redis Cache]
+        SE[(Semantic index<br/>prompt embeddings)]
+    end
+
+    BC <--> Cache
+    AC --> Cache
+```
+
+1. Before the LLM is called, `LLMResponseCache` embeds the prompt and searches for a semantically similar entry in the cache.
+2. If the vector distance to the nearest cached entry is below the configured `distance_threshold`, the cached response is returned immediately (no LLM call).
+3. If no match is found, the LLM runs normally and the response is stored in the cache for future hits.
+
+## Two Provider Options
+
+### Self-Hosted (RedisVL)
+
+Use `RedisVLCacheProvider` when you run your own Redis instance and want full control over the vectorizer and cache index.
+
+```python
+from redisvl.utils.vectorize import HFTextVectorizer
+
+from adk_redis.cache import (
+    LLMResponseCache,
+    LLMResponseCacheConfig,
+    RedisVLCacheProvider,
+    RedisVLCacheProviderConfig,
+)
+
+vectorizer = HFTextVectorizer(model="redis/langcache-embed-v1")
+
+provider = RedisVLCacheProvider(
+    config=RedisVLCacheProviderConfig(
+        redis_url="redis://localhost:6379",
+        name="my_cache",
+        ttl=3600,
+        distance_threshold=0.1,
+    ),
+    vectorizer=vectorizer,
+)
+```
+
+**Requirements**: `pip install 'adk-redis[search]'` and a running Redis instance.
+
+### Managed (LangCache)
+
+Use `LangCacheProvider` with [Redis LangCache](https://redis.io/langcache) for a fully managed service. No local vectorizer needed; embeddings are handled server-side.
+
+```python
+from adk_redis.cache import (
+    LLMResponseCache,
+    LLMResponseCacheConfig,
+    LangCacheProvider,
+    LangCacheProviderConfig,
+)
+
+provider = LangCacheProvider(
+    config=LangCacheProviderConfig(
+        cache_id="your-cache-id",
+        api_key="your-api-key",
+        server_url="https://aws-us-east-1.langcache.redis.io",
+        ttl=3600,
+    ),
+)
+```
+
+**Requirements**: `pip install 'adk-redis[langcache]'` and a LangCache account.
+
+## Wiring Into an Agent
+
+Both providers use the same `LLMResponseCache` wrapper, which produces ADK-compatible callbacks:
+
+```python
+from google.adk.agents import Agent
+from adk_redis.cache import create_llm_cache_callbacks
+llm_cache = LLMResponseCache(
+    provider=provider,
+    config=LLMResponseCacheConfig(
+        first_message_only=True,   # only cache the first user message
+        include_app_name=True,     # scope cache keys by app
+        include_user_id=True,      # scope cache keys by user
+    ),
+)
+
+before_cb, after_cb = create_llm_cache_callbacks(llm_cache)
+
+agent = Agent(
+    model="gemini-2.0-flash",
+    name="my_agent",
+    before_model_callback=before_cb,
+    after_model_callback=after_cb,
+)
+```
+
+## When to Use Which
+
+| Provider | Use when |
+|----------|----------|
+| **RedisVL** | You already run Redis, want local embeddings, or need full control over the cache index schema. |
+| **LangCache** | You want a managed service with no infrastructure, server-side embeddings, and built-in analytics. |
+
+## Configuration Options
+
+| Option | Provider | Default | Description |
+|--------|----------|---------|-------------|
+| `distance_threshold` | Both | `0.1` | Max vector distance for a cache hit (lower = stricter) |
+| `ttl` | Both | `None` | Time-to-live in seconds for cache entries |
+| `name` | RedisVL | `llmcache` | Redis index name for the cache |
+| `redis_url` | RedisVL | `redis://localhost:6379` | Redis connection string |
+| `cache_id` | LangCache | Required | LangCache instance identifier |
+| `api_key` | LangCache | Required | LangCache API key |
+| `use_exact_search` | LangCache | `True` | Enable exact (hash) matching in addition to semantic |
+| `use_semantic_search` | LangCache | `True` | Enable semantic (vector) matching |
+
+## Next Steps
+
+- [Semantic cache example](https://github.com/redis-developer/adk-redis/tree/main/examples/semantic_cache) for a runnable self-hosted demo.
+- [LangCache example](https://github.com/redis-developer/adk-redis/tree/main/examples/langcache_cache) for a runnable managed demo.
+- [Sessions + Memory Services](sessions.md) and [Sessions + Memory MCP + Tools](memory.md) for the other Redis-backed features.
+- [ADK runtime options](https://google.github.io/adk-docs/runtime/) for `adk web`, `adk run`, and `adk api_server`.
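Illustrative sketch, not part of the patch: the three-step flow under "How It Works" in `docs/concepts/caching.md` can be approximated in memory, without Redis. This is a minimal sketch that assumes a toy bag-of-words embedding in place of a real vectorizer. `toy_embed`, `ToySemanticCache`, and `answer` are hypothetical names used only for illustration and are not `adk-redis` APIs; only the `distance_threshold` option and its `0.1` default come from the docs above.

```python
import math
from collections import Counter
from typing import Callable, Optional


def toy_embed(text: str) -> Counter:
    """Stand-in embedding: bag of lowercase words (assumption, not a real vectorizer)."""
    return Counter(text.lower().split())


def cosine_distance(a: Counter, b: Counter) -> float:
    """Return 1 - cosine similarity; values near 0.0 mean near-identical vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 if norm == 0 else 1.0 - dot / norm


class ToySemanticCache:
    """Hypothetical in-memory cache keyed by prompt similarity."""

    def __init__(self, distance_threshold: float = 0.1) -> None:
        self.distance_threshold = distance_threshold
        self._entries: list[tuple[Counter, str]] = []

    def lookup(self, prompt: str) -> Optional[str]:
        # Steps 1 and 2: embed the prompt, find the nearest entry,
        # and return it only if it is closer than the threshold.
        query = toy_embed(prompt)
        best = min(
            self._entries,
            key=lambda entry: cosine_distance(query, entry[0]),
            default=None,
        )
        if best is not None and cosine_distance(query, best[0]) < self.distance_threshold:
            return best[1]
        return None

    def store(self, prompt: str, response: str) -> None:
        # Step 3: remember the response for future similar prompts.
        self._entries.append((toy_embed(prompt), response))


def answer(prompt: str, cache: ToySemanticCache, call_llm: Callable[[str], str]) -> str:
    """Mimic the before/after callback pair: check the cache, else call the model."""
    cached = cache.lookup(prompt)
    if cached is not None:
        return cached  # cache hit: the model is never called
    response = call_llm(prompt)
    cache.store(prompt, response)
    return response
```

Calling `answer` twice with prompts that embed identically invokes the model callable once; an unrelated prompt misses the cache and triggers a second call.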
diff --git a/docs/concepts/index.md b/docs/concepts/index.md
@@ -4,32 +4,55 @@ description: Foundational concepts for adk-redis.
 
 # Concepts
 
-How adk-redis maps Google ADK service interfaces onto Redis.
+`adk-redis` maps Google ADK service interfaces onto Redis, the Agent Memory
+Server, and RedisVL. These pages explain the **what** and **why** behind each
+feature. For step-by-step setup instructions, see the
+[User Guide](../user_guide/index.md).
+
+There are four ways to use this integration. Pick the page that matches your goal.
 
 <div class="grid cards" markdown>
 
 -   :material-google:{ .lg .middle } **[ADK overview](adk_overview.md)**
 
     ---
 
-    A short tour of the Google Agent Development Kit interfaces this package implements.
+    Architecture diagram and the ADK interfaces this package implements.
+
+-   :material-brain:{ .lg .middle } **[Sessions + Memory Services](sessions.md)**
+
+    ---
+
+    Framework-managed sessions and memory. The ADK Runner handles everything automatically.
 
--   :material-account-multiple:{ .lg .middle } **[Sessions](sessions.md)**
+-   :material-tools:{ .lg .middle } **[Sessions + Memory MCP + Tools](memory.md)**
 
     ---
 
-    Session storage model and ADK session-service contract.
+    LLM-controlled memory via MCP or REST-based tools. The agent decides when to remember and recall.
 
--   :material-brain:{ .lg .middle } **[Memory](memory.md)**
+-   :material-database-search:{ .lg .middle } **[RedisVL MCP + Search Tools](search.md)**
 
     ---
 
-    Short-term and long-term memory layered over Agent Memory Server.
+    Vector, hybrid, range, text, and SQL search over your own data via in-process tools or MCP.
 
--   :material-database-search:{ .lg .middle } **[Search](search.md)**
+-   :material-cached:{ .lg .middle } **[Semantic Caching](caching.md)**
 
     ---
 
-    Vector and lexical search backing the ADK tool surface.
+    Skip repeat LLM calls with self-hosted (RedisVL) or managed (LangCache) semantic caching.
 
 </div>
+
+## Where to Start
+
+| Goal | Read this |
+|------|-----------|
+| Understand the big picture | [ADK overview](adk_overview.md) |
+| Let the framework manage sessions and memory | [Sessions + Memory Services](sessions.md) |
+| Give the LLM explicit memory tools | [Sessions + Memory MCP + Tools](memory.md) |
+| Search your own knowledge base | [RedisVL MCP + Search Tools](search.md) |
+| Reduce LLM latency and cost | [Semantic Caching](caching.md) |
+| Get a working agent running | [Quickstart](../user_guide/01_integration.md) |
+| Run and test your agent | [ADK runtime](https://google.github.io/adk-docs/runtime/) |