Skip to content

Jonkimi/semantic-router-server

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Semantic Router Server

A FastAPI-based server for language query classification using semantic-router. This server provides a flexible and configurable way to classify user queries into predefined routes based on semantic meaning.

Features

  • Semantic & Hybrid Routing: Switch between pure semantic search (SemanticRouter) and a combination of semantic and keyword search (HybridRouter).
  • Flexible Encoders: Defaults to using OpenAIEncoder, with the ability to switch to local models like FastEmbedEncoder.
  • Pluggable Vector Stores: Runs in-memory by default, with optional support for persistent vector stores like Qdrant.
  • Observability: Integrated with OpenTelemetry for tracing, allowing you to monitor requests and performance.
  • Configuration-driven: All aspects of the server are controlled via config.yaml and routes.yaml.

Getting Started

Prerequisites

  • Python 3.12+
  • An ASGI server like uvicorn.
  • An OpenAI API key (for the default configuration).

Installation

  1. Clone the repository:

    git clone <your-repo-url>
    cd semantic-router-server
  2. Install dependencies: This project uses uv for package management.

    uv pip install -r requirements.txt  # Or install from pyproject.toml

Configuration

  1. Routes (routes.yaml): Define your classification routes and example utterances in routes.yaml.

  2. Application (config.yaml):

    • Copy your OpenAI API key into the api_key field under encoder.
    • Review other settings like router_mode and opentelemetry.

Running the Server

Use uvicorn to run the FastAPI application:

uvicorn app.main:app --reload

The server will be available at http://127.0.0.1:8000.

Making a Query

You can send a POST request to the /query endpoint:

curl -X POST http://127.0.0.1:8000/query \
-H "Content-Type: application/json" \
-d '{"text": "what are the latest advancements in AI?"}'

The response will look like this:

{
  "route": "tech",
  "input": "what are the latest advancements in AI?",
  "score": 0.85
}

Advanced Configuration

Using a Local Encoder (FastEmbed)

To avoid dependency on the OpenAI API, you can use a local model.

  1. Install the required dependency:

    uv pip install "fastembed>=0.2.0"
  2. Update config.yaml:

    encoder:
      type: fastembed
      # api_key is not needed for fastembed

Using a Persistent Vector Store (Qdrant)

For production environments, you can use Qdrant to persist your route embeddings.

  1. Install the required dependency:

    uv pip install "qdrant-client>=1.7.0"
  2. Run a Qdrant instance (e.g., using Docker):

    docker run -p 6333:6333 qdrant/qdrant
  3. Update config.yaml:

    index:
      type: qdrant
      qdrant:
        # Assumes Qdrant is running locally.
        # For more options, see semantic-router's QdrantIndex documentation.
        host: "localhost"
        port: 6333
        collection_name: "semantic-router-routes"

    The server will automatically create the collection and sync the routes from routes.yaml on startup.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages