Semantic Router Server

A FastAPI-based server for language query classification using semantic-router. This server provides a flexible and configurable way to classify user queries into predefined routes based on semantic meaning.

Features

Semantic & Hybrid Routing: Switch between pure semantic search (SemanticRouter) and a combination of semantic and keyword search (HybridRouter).
Flexible Encoders: Defaults to using OpenAIEncoder, with the ability to switch to local models like FastEmbedEncoder.
Pluggable Vector Stores: Runs in-memory by default, with optional support for persistent vector stores like Qdrant.
Observability: Integrated with OpenTelemetry for tracing, allowing you to monitor requests and performance.
Configuration-driven: All aspects of the server are controlled via config.yaml and routes.yaml.

Getting Started

Prerequisites

Python 3.12+
An ASGI server like uvicorn.
An OpenAI API key (for the default configuration).

Installation

Clone the repository:

git clone <your-repo-url>
cd semantic-router-server

Install dependencies: This project uses uv for package management.

uv pip install -r requirements.txt  # Or install from pyproject.toml

Configuration

Routes (routes.yaml): Define your classification routes and example utterances in routes.yaml.
Application (config.yaml):
- Copy your OpenAI API key into the api_key field under encoder.
- Review other settings like router_mode and opentelemetry.

Running the Server

Use uvicorn to run the FastAPI application:

uvicorn app.main:app --reload

The server will be available at http://127.0.0.1:8000.

Making a Query

You can send a POST request to the /query endpoint:

curl -X POST http://127.0.0.1:8000/query \
-H "Content-Type: application/json" \
-d '{"text": "what are the latest advancements in AI?"}'

The response will look like this:

{
  "route": "tech",
  "input": "what are the latest advancements in AI?",
  "score": 0.85
}

Advanced Configuration

Using a Local Encoder (`FastEmbed`)

To avoid dependency on the OpenAI API, you can use a local model.

Install the required dependency:
```
uv pip install "fastembed>=0.2.0"
```

Update config.yaml:

encoder:
  type: fastembed
  # api_key is not needed for fastembed

Using a Persistent Vector Store (`Qdrant`)

For production environments, you can use Qdrant to persist your route embeddings.

Install the required dependency:
```
uv pip install "qdrant-client>=1.7.0"
```
Run a Qdrant instance (e.g., using Docker):
```
docker run -p 6333:6333 qdrant/qdrant
```

Update config.yaml:

index:
  type: qdrant
  qdrant:
    # Assumes Qdrant is running locally.
    # For more options, see semantic-router's QdrantIndex documentation.
    host: "localhost"
    port: 6333
    collection_name: "semantic-router-routes"

The server will automatically create the collection and sync the routes from routes.yaml on startup.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
app		app
configs		configs
.gitignore		.gitignore
.python-version		.python-version
README.md		README.md
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Semantic Router Server

Features

Getting Started

Prerequisites

Installation

Configuration

Running the Server

Making a Query

Advanced Configuration

Using a Local Encoder (`FastEmbed`)

Using a Persistent Vector Store (`Qdrant`)

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Semantic Router Server

Features

Getting Started

Prerequisites

Installation

Configuration

Running the Server

Making a Query

Advanced Configuration

Using a Local Encoder (FastEmbed)

Using a Persistent Vector Store (Qdrant)

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Using a Local Encoder (`FastEmbed`)

Using a Persistent Vector Store (`Qdrant`)

Packages