AscendAI

Multi-provider AI orchestrator with MCP, RAG, and semantic memory. Built on Spring AI.

graph TB
    User["User"]

    subgraph "AscendAI Platform"
        Agent["AscendAgent<br/>REST API :9917<br/>Spring Boot · Java 21"]

        subgraph "MCP Tool Services"
            AudioScribe["AudioScribe<br/>:7017<br/>Audio Transcription"]
            Weather["WeatherMCP<br/>:9998<br/>Weather Data"]
            WebSearch["AscendWebSearch<br/>:7021<br/>Web Search"]
            PaddleOCR["PaddleOCR<br/>:7022<br/>OCR"]
        end

        Memory["AscendMemory<br/>:7020<br/>Semantic Memory"]
    end

    subgraph "AI Providers"
        Local["LM Studio<br/>(local, default)"]
        Cloud["OpenAI · Anthropic<br/>Gemini · MiniMax"]
    end

    subgraph "Data Layer"
        PG["PostgreSQL<br/>Chat · Metadata"]
        RD["Redis<br/>Chat cache"]
        QD["Qdrant<br/>Vector DB"]
        S3["MinIO<br/>Documents"]
    end

    User -->|"REST"| Agent
    Agent -->|"MCP"| AudioScribe
    Agent -->|"MCP"| Weather
    Agent -->|"MCP"| WebSearch
    Agent -->|"MCP"| PaddleOCR
    Agent -->|"REST"| Memory
    Agent -.-> Local
    Agent -.-> Cloud
    Agent --> PG
    Agent --> RD
    Agent --> QD
    Agent --> S3
    Memory --> QD

Why this exists

I built AscendAI because off-the-shelf orchestrators don't let you swap providers per request, run a privacy-respecting search backend you fully control, and persist semantic memory across sessions in one coherent platform. AscendAI does all three. Each prompt routes to the model you pick at call time (local LM Studio, OpenAI, Anthropic, Gemini, MiniMax). MCP tool servers handle audio, web, weather, and OCR. Long-term context lives in a Mem0-backed Qdrant store so conversations actually accumulate knowledge.

Features

Per-request provider routing. Pick LM Studio, OpenAI, Anthropic, Gemini, or MiniMax on every API call without restarts or config changes.
RAG pipeline with Qdrant. Thresholded soft-retrieval over ingested documents using provider-matched embedding dimensions (768 / 1536).
Semantic memory via Mem0. Long-lived, user-scoped memories searchable across sessions through the AscendMemory service.
MCP tool servers. First-class integrations for audio transcription (AudioScribe), web search (AscendWebSearch + SearXNG), weather (WeatherMCP), and OCR (PaddleOCR).
Document ingestion to MinIO. Drop files (Markdown, PDF, DOCX) into a bucket and the pipeline parses them via Docling / Unstructured and indexes them automatically.
Hybrid chat history. Redis for the active context window, PostgreSQL for durable long-term archives and analytics.
Privacy-respecting web. SearXNG meta-search plus FlareSolverr for Cloudflare-protected pages, all self-hosted.
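
The thresholded retrieval described in the RAG bullet can be sketched roughly as below. This is a minimal Python illustration, not the agent's actual Java code. The function names, the 0.75 threshold, and the default k are assumptions made for the example; only the ascendai-768 / ascendai-1536 collection names come from this README. "Soft" is read here as: if no chunk clears the threshold, the prompt simply proceeds without document context.

```python
import math

# Collection names follow the ascendai-768 / ascendai-1536 convention in this README.
COLLECTION_BY_DIM = {768: "ascendai-768", 1536: "ascendai-1536"}


def collection_for(embedding):
    """Pick the collection whose dimension matches the query embedding."""
    dim = len(embedding)
    if dim not in COLLECTION_BY_DIM:
        raise ValueError(f"no collection for embedding dimension {dim}")
    return COLLECTION_BY_DIM[dim]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def soft_retrieve(query, chunks, k=3, threshold=0.75):
    """Return up to k (score, text) pairs whose score is >= threshold.

    chunks is a list of (text, vector) pairs. An empty result is a valid
    outcome: the caller then sends the prompt without RAG context.
    The threshold and k values are illustrative assumptions.
    """
    scored = [(cosine(query, vec), text) for text, vec in chunks]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:k]
```

In a real deployment the similarity search runs inside Qdrant; the in-memory scoring here only shows where the threshold and the top-k cut apply.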

How it compares

The honest peer set is other deployable AI orchestration backends that bundle multi-provider routing, RAG, memory, and tools into one self-hosted service: not chat UIs, not low-code workflow builders, not pure router proxies, not libraries. The four peers in the table below (R2R, Letta, Onyx, Quivr) are mature and well known in this niche; LangChain is included only as a reference point.

Aspect	AscendAI	R2R	Letta	Onyx	Quivr	LangChain
Shape	Deployable service	Deployable service	Deployable service	Deployable service	Deployable service	Library / framework
Stack	Java 21 / Spring AI	Python	Python	Python	Python	Python (TS port)
API-first (no UI shipped)	Yes	Yes	Yes (server on `:8283`)	UI bundled, API-driven	UI bundled, API exposed	N/A, you build it
Per-request provider switch	Built-in	Built-in	Built-in	Built-in	Built-in	Possible via chain rebuild
RAG over uploaded docs	Built-in (Qdrant + threshold)	Built-in (multimodal, hybrid, KGs)	Lighter, agent-state focused	Built-in (40+ connectors)	Built-in (pluggable stores)	Many backends, you wire it
Persistent semantic memory	Mem0 + Qdrant	Add-on	Native (OS-style hierarchical)	Add-on	Built-in	Roll-your-own
Tool integration model	MCP-native (Spring AI MCP client)	Function tools	Function tools	Function tools + connectors	Function tools	Tools + MCP via adapters
Single docker compose deploy	Yes	Yes	Yes	Yes	Yes	Bring-your-own

LangChain isn't strictly a peer. It's a framework, not a deployable service. It's in the table because it's the most likely thing readers reach for when they think "AI orchestration", and the honest answer is "if you're already wiring your own service in LangChain, you don't need AscendAI."

Where AscendAI is honestly distinctive

JVM-native. Every credible peer in this niche is Python or TypeScript. If you live in Spring Boot already, AscendAI drops in alongside the rest of your services without a polyglot deploy.
MCP-first tool model. Onyx and Letta do tool use; AscendAI is built around MCP from day one with multiple bundled MCP servers (audio, OCR, web search, weather). Add new tools by pointing the agent at another MCP server, no code changes.
Breadth of integration in one stack. RAG, semantic memory, MCP tools, multi-provider routing, hot / archive chat history. All present, no add-ons.

Where it loses

No UI. Onyx and Quivr ship one. AscendAI is a backend you put behind your own client.
Smaller community. All four peers above have more stars, more contributors, more battle testing.
RAG depth. R2R has a more sophisticated RAG pipeline (knowledge graphs, multimodal). AscendAI's RAG is solid but plain.
Memory depth. Letta's memory architecture is more advanced than the Mem0-based memory here.

If you're already happy in Python with R2R or Letta, you don't need this. AscendAI exists because I wanted these capabilities in a Spring-native deployment.

Demo

Send a prompt with per-request provider and model selection. The endpoint accepts multipart/form-data (so you can attach an optional image or document).

Bash:

curl -X POST http://localhost:9917/api/v1/ai/prompt \
  -H "X-User-Id: luksarna" \
  -F "prompt=Summarize my notes on Spring AI and MCP." \
  -F "provider=anthropic" \
  -F "model=claude-sonnet-4-6" \
  -F "embeddingProvider=lmstudio"

PowerShell 7+ (-Form supports multipart):

Invoke-RestMethod -Uri http://localhost:9917/api/v1/ai/prompt -Method Post -Headers @{ "X-User-Id" = "luksarna" } -Form @{ prompt = "Summarize my notes on Spring AI and MCP."; provider = "anthropic"; model = "claude-sonnet-4-6"; embeddingProvider = "lmstudio" }

Sample response (AiResponse: content plus an unwrapped Spring AI ChatResponseMetadata and the list of MCP tools invoked during the turn):

{
  "content": "Your notes describe AscendAI as a Spring AI orchestrator that routes prompts across providers and uses MCP for tool calls. Per-request model selection happens via /api/v1/ai/prompt; RAG runs over Qdrant collections (ascendai-768 / -1536); semantic memory is backed by Mem0…",
  "id": "msg_01ABcDEf…",
  "model": "claude-sonnet-4-6",
  "usage": { "promptTokens": 1842, "completionTokens": 312, "totalTokens": 2154 },
  "toolsUsed": ["ascend_memory_search", "web_search"]
}
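
For callers that are not using curl or PowerShell, the same request can be sketched with the Python standard library. This is an illustration under assumptions: the helper names (encode_multipart, send_prompt) and the AiResponse dataclass below are hypothetical, and the fields read from the response are only the ones visible in the sample above. The endpoint path, the X-User-Id header, and the form field names are taken from the demo.

```python
import json
import urllib.request
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


def encode_multipart(fields):
    """Encode plain text fields as multipart/form-data.

    Returns (body_bytes, content_type_header). File parts are not handled.
    """
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")
        lines.append(value)
    lines.append(f"--{boundary}--")
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


@dataclass
class AiResponse:
    """Hypothetical client-side view of the sample response fields."""
    content: str
    model: Optional[str] = None
    total_tokens: Optional[int] = None
    tools_used: List[str] = field(default_factory=list)

    @classmethod
    def from_json(cls, text):
        data = json.loads(text)
        return cls(
            content=data["content"],
            model=data.get("model"),
            total_tokens=(data.get("usage") or {}).get("totalTokens"),
            tools_used=list(data.get("toolsUsed") or []),
        )


def send_prompt(base_url, user_id, fields, timeout=120):
    """POST the form to /api/v1/ai/prompt and parse the JSON reply."""
    body, content_type = encode_multipart(fields)
    request = urllib.request.Request(
        f"{base_url}/api/v1/ai/prompt",
        data=body,
        headers={"X-User-Id": user_id, "Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return AiResponse.from_json(response.read().decode("utf-8"))
```

A call would look like send_prompt("http://localhost:9917", "luksarna", {"prompt": "...", "provider": "anthropic", "model": "claude-sonnet-4-6"}), assuming the agent is running locally.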

Architecture

Two architecture entry points, depending on what you're after.

Monorepo architecture. System overview, service interactions, deployment, ADRs.
AscendAgent arc42. Internals, component diagrams, module ADRs.

Module	Stack	Port	Role
AscendAgent	Java 21 / Spring Boot	9917	API gateway, multi-provider AI, RAG, MCP client
AudioScribe	Python / FastMCP	7017	Audio transcription (Whisper / OpenAI / HF)
AscendWebSearch	Python / FastMCP	7021	Web search and scraping via SearXNG
AscendMemory	Python / FastAPI	7020	Semantic memory (Mem0 + Qdrant)
WeatherMCP	Java / Spring Boot	9998	Weather data MCP server
PaddleOCR	Python / FastMCP	7022	OCR service

Request flow

How a single prompt traverses the platform.

sequenceDiagram
    autonumber
    participant U as User
    participant A as AscendAgent
    participant R as Redis<br/>(short-term)
    participant M as AscendMemory<br/>(Mem0)
    participant Q as Qdrant<br/>(RAG)
    participant T as MCP Tools<br/>(Web/Audio/OCR/Weather)
    participant L as LLM Provider<br/>(per-request)
    participant P as PostgreSQL<br/>(long-term)

    U->>A: POST /api/v1/ai/prompt
    A->>R: Load chat window
    A->>M: Search semantic memory
    M->>Q: Vector search (memory collection)
    M-->>A: Top-k memories
    A->>Q: RAG retrieval (doc collection)
    Q-->>A: Top-k chunks (above threshold)
    A->>L: Prompt + memory + RAG + tool defs
    L-->>A: Tool call request (optional)
    A->>T: Invoke MCP tool
    T-->>A: Tool result
    A->>L: Tool result, then final answer
    L-->>A: Response + usage
    A->>R: Append turn
    A->>P: Persist transcript
    A->>M: Extract and store new memories
    A-->>U: AiResponse {content, metadata, toolsUsed}
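
The ordering in the diagram can be condensed into a small orchestration sketch. It is written in Python for brevity and is not the agent's implementation: every callable is a hypothetical stand-in for a real dependency, the three write steps (Redis append, PostgreSQL persist, memory extraction) are collapsed into one persist callback, and looping over repeated tool calls with a round limit is an assumption, since the diagram shows a single optional call.

```python
from typing import Callable


def handle_prompt(
    prompt: str,
    load_window: Callable[[], list],
    search_memory: Callable[[str], list],
    retrieve_chunks: Callable[[str], list],
    call_llm: Callable[[dict], dict],
    invoke_tool: Callable[[dict], str],
    persist: Callable[[str, str], None],
    max_tool_rounds: int = 4,
) -> dict:
    """Run one turn: gather context, call the model, serve tool calls, persist."""
    # Steps 2-7: chat window, semantic memories, and RAG chunks.
    context = {
        "prompt": prompt,
        "history": load_window(),
        "memories": search_memory(prompt),
        "chunks": retrieve_chunks(prompt),
        "tool_results": [],
    }
    tools_used = []

    # Step 8: first model call with the assembled context.
    reply = call_llm(context)

    # Steps 9-12: while the model asks for a tool, run it and call again.
    for _ in range(max_tool_rounds):
        call = reply.get("tool_call")
        if not call:
            break
        context["tool_results"].append(invoke_tool(call))
        tools_used.append(call["name"])
        reply = call_llm(context)

    # Steps 14-16: store the turn before answering the user.
    answer = reply.get("content", "")
    persist(prompt, answer)
    return {"content": answer, "toolsUsed": tools_used}
```

Passing the dependencies in as callables keeps the sketch testable with stubs and mirrors the way the agent sits between independent services.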

Quick Start

Prerequisites

Docker Desktop
Java 21
PostgreSQL on 5432, Redis on 6379, Qdrant on 6333 / 6334, MinIO on 9070 / 9071 (admin / password)
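
Before starting the stack it can help to confirm that the external prerequisites are reachable. The following is a small optional sketch, not something shipped in the repo: the function names are made up for the example, and it only checks that each default port from the list above accepts a TCP connection, not that the service behind it is healthy or correctly configured.

```python
import socket

# Default ports from the prerequisites list above.
PREREQUISITES = {
    "PostgreSQL": [5432],
    "Redis": [6379],
    "Qdrant": [6333, 6334],
    "MinIO": [9070, 9071],
}


def is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def missing(host="localhost", ports_by_service=None):
    """List 'Service:port' entries that did not accept a connection."""
    ports_by_service = PREREQUISITES if ports_by_service is None else ports_by_service
    return [
        f"{name}:{port}"
        for name, ports in ports_by_service.items()
        for port in ports
        if not is_open(host, port)
    ]


if __name__ == "__main__":
    unreachable = missing()
    print("all prerequisites reachable" if not unreachable else f"unreachable: {unreachable}")
```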

Run it

1. Provision secrets.

Copy the example file and fill in the API keys you plan to use. .env is gitignored. Each provider key is optional: leave it blank and simply don't select that provider at request time.

Bash:

cp .env.example .env

PowerShell:

Copy-Item .env.example .env
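
Since a blank key just means that provider is unavailable, a quick way to see which providers a given .env enables is sketched below. The environment variable names in the usage note are hypothetical placeholders; check .env.example for the real ones. The parser handles only simple KEY=VALUE lines, which is an assumption about the file's shape.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and # comments.

    Surrounding single or double quotes on values are stripped.
    """
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def usable_providers(env, key_names):
    """Return the providers whose API key is present and non-empty.

    key_names maps a provider name to the env variable holding its key;
    the mapping is supplied by the caller because the real names live
    in .env.example.
    """
    return sorted(p for p, key in key_names.items() if env.get(key))
```

For example, usable_providers(parse_env(open(".env").read()), {"openai": "OPENAI_API_KEY", "anthropic": "ANTHROPIC_API_KEY"}) would list the providers that can be selected, assuming those variable names match your file.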

2. Bring up the stack.

The main compose file pulls in ascend-scrapper.docker-compose.yaml via include:, so a single docker compose up starts the full stack (AscendAgent, the tool services, and the web-scraping stack).

Bash:

docker compose up -d --build

PowerShell:

docker compose up -d --build

Optional: bring up only the web-scraping stack as its own Docker Desktop group.

Bash:

docker compose -f ascend-scrapper.docker-compose.yaml up -d --build

PowerShell:

docker compose -f ascend-scrapper.docker-compose.yaml up -d --build

3. Ensure PostgreSQL has the ascend_ai database (user postgres, password local).

On first start the agent creates the MinIO knowledge-base bucket and initialises metadata tables. The API is then available at http://localhost:9917. Check the startup banner for live status of every dependency.

4. Optional: run the agent on the host.

For active development with hot reload and an attached debugger, run the agent on the host instead of in the container. Stop the container first (docker compose stop ascend-agent) so port 9917 is free.

Bash:

cd AscendAgent

./gradlew bootRun

PowerShell:

cd AscendAgent

./gradlew bootRun

For advanced compose flags, per-service rebuilds, and production notes see docs/DEPLOYMENT.md. For document ingestion see docs/INGESTION.md.

Supported AI Providers

Per-request selection across the providers below. Models listed are the ones currently wired in application.yaml (chat default, memory extraction, and history compaction). Any model the provider accepts works at request time via the model form field; these are the values that ship with the agent.

OpenAI. gpt-4o (default), gpt-4o-mini (extraction + compaction).
Anthropic. claude-sonnet-4-5 (default), claude-3-5-haiku-20241022 (extraction), claude-haiku-4-5 (compaction).
Gemini. gemini-flash-latest (default), gemini-flash-lite-latest (extraction + compaction).
MiniMax. MiniMax-M2.7 (default + extraction + compaction).
LM Studio. meta-llama-3.1-8b-instruct (default, local).

Override per request with the provider and model form fields, or globally via the *_MODEL env vars listed in .env.example.
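
The override order can be sketched as a small resolver: the request's model field wins, then a global env var, then the shipped default. This is an illustration, not the agent's code. The defaults are copied from the list above; the provider identifiers other than anthropic and lmstudio, and the exact PROVIDER_MODEL variable naming, are assumptions (the real names are in .env.example).

```python
# Chat defaults as listed above.
DEFAULT_MODELS = {
    "openai": "gpt-4o",
    "anthropic": "claude-sonnet-4-5",
    "gemini": "gemini-flash-latest",
    "minimax": "MiniMax-M2.7",
    "lmstudio": "meta-llama-3.1-8b-instruct",
}


def resolve_model(provider, requested=None, env=None):
    """Resolve the chat model for one request.

    Precedence: request field > <PROVIDER>_MODEL env var > shipped default.
    The env var naming is a guess based on the *_MODEL pattern.
    """
    if provider not in DEFAULT_MODELS:
        raise ValueError(f"unknown provider: {provider}")
    if requested:
        return requested
    env = env or {}
    return env.get(f"{provider.upper()}_MODEL") or DEFAULT_MODELS[provider]
```

With no overrides, resolve_model("anthropic") returns the shipped default; the demo request earlier would instead resolve to the model named in its form field.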

Configuration & Ports

AscendAI services

Each tool and memory service ships both REST and MCP surfaces, with two exceptions: WeatherMCP is MCP-only, and AscendAgent itself exposes REST only. The "Used by AscendAgent via" column shows the transport AscendAgent actually uses today. The other surface is available for direct external use.

Service	Port	Surfaces	Used by AscendAgent via	Role
AscendAgent	`9917`	REST	(this is the agent)	API gateway and orchestrator. `POST /api/v1/ai/prompt` is the entry.
AscendMemory	`7020`	REST + MCP	REST	Semantic memory store (Mem0 + Qdrant). Search / insert per user.
AudioScribe	`7017`	REST + MCP	MCP (Streamable HTTP)	Speech-to-text (faster-whisper / OpenAI / HF / Audacity merge).
AscendWebSearch	`7021`	REST + MCP	MCP (Streamable HTTP)	Web search + content extraction (SearXNG, Cloudflare, NoVNC).
PaddleOCR	`7022`	REST + MCP	MCP (Streamable HTTP)	Image OCR.
WeatherMCP	`9998`	MCP only (SSE)	MCP (SSE)	Weather data tool (reference Spring AI MCP server).

Support services (in-stack, deployed via compose)

Service	Port	Default credentials	Role
SearXNG	`9020`	(none)	Privacy-respecting meta-search; backend for AscendWebSearch.
FlareSolverr	`8191`	(none)	Cloudflare bypass proxy used by AscendWebSearch.
ngrok (web-search)	(none)	`NGROK_AUTHTOKEN`	Public tunnel to AscendWebSearch's NoVNC for remote CAPTCHA intervention.
Docling Serve	`5001`	(none)	PDF / DOCX to structured JSON (used by ingestion pipeline).
Unstructured API	`9080`	(none)	Generic document parsing fallback for ingestion.

Observability stack (in-stack, deployed via compose)

Full setup and usage in observability/README.md.

Service	Port	Exposed	Role
Grafana	`7078` → `3000`	Browser UI	Dashboards + Explore. Anonymous `Viewer`; `admin` / `admin` to edit.
Prometheus	`7077` → `9090`	Browser UI	Scrapes metrics from the 6 services, Qdrant, and MinIO.
Loki	`3100`	Internal only	Log store; receives logs from Vector.
Tempo	(none)	Internal only	Trace store; receives traces from the OTel Collector.
Vector	(none)	Internal only	Tails the 6 app containers' Docker logs and ships them to Loki.
OTel Collector	(none)	Internal only	Receives OTLP traces from the services and exports them to Tempo.

External prerequisites (not in compose, managed / cloud in production)

Service	Port	Default credentials	Role
PostgreSQL	`5432`	`postgres` / `local`	Chat-history archive, ingestion metadata, user instructions.
Redis	`6379`	(none)	Short-term chat-history cache, session state.
Qdrant	`6333` / `6334`	(none)	Vector DB for RAG (`ascendai-768/1536`) and Mem0 memory.
MinIO	`9070` / `9071`	`admin` / `password`	S3-compatible object store for ingested documents.

Documentation

Canonical index. Every doc the repo ships, in one place.

File	What's in it
docs/architecture/README.md	Monorepo architecture: system view, ADRs, deployment topology.
docs/architecture/arc42/01-introduction-and-goals.md	Arc42 entry point for the platform.
AscendAgent/docs/architecture/arc42/01-introduction-and-goals.md	Arc42 for the agent internals.
docs/DEPLOYMENT.md	Docker Compose recipes, image publishing, prod notes.
docs/INGESTION.md	Upload flows for the RAG pipeline.
docs/TROUBLESHOOTING.md	Qdrant / MinIO / PostgreSQL / Redis reset recipes.
docs/OBSERVABILITY.md	Metrics, logs, traces: what is collected, dashboards, how to instrument.
observability/README.md	Observability stack services (Grafana / Prometheus / Loki / Tempo / Vector / OTel), pipeline, and how to view logs.
docs/AGENT_TOOLING.md	Agent-standards import, OpenSpec workflow.
docs/AGENTS-UPDATE.md	Per-OS selective refresh of skills, subagents, and shipped docs.
docs/MCP_SETUP.md	How to configure the MCP servers wired into agent sessions.
AscendAgent/e2e/README.md	End-to-end capability tests, fixtures, Bruno collection.
AGENTS.md	Shared instructions for any AI coding agent operating in this repo.
.github/workflows/README.md	CI and Release workflow operator notes: secrets, bump convention, how to cut a release.

License

Released under the MIT License.
