A FastAPI-based server for language query classification using semantic-router. This server provides a flexible and configurable way to classify user queries into predefined routes based on semantic meaning.
- Semantic & Hybrid Routing: Switch between pure semantic search (
SemanticRouter) and a combination of semantic and keyword search (HybridRouter). - Flexible Encoders: Defaults to using
OpenAIEncoder, with the ability to switch to local models likeFastEmbedEncoder. - Pluggable Vector Stores: Runs in-memory by default, with optional support for persistent vector stores like
Qdrant. - Observability: Integrated with OpenTelemetry for tracing, allowing you to monitor requests and performance.
- Configuration-driven: All aspects of the server are controlled via
config.yamlandroutes.yaml.
- Python 3.12+
- An ASGI server like
uvicorn. - An OpenAI API key (for the default configuration).
-
Clone the repository:
git clone <your-repo-url> cd semantic-router-server
-
Install dependencies: This project uses
uvfor package management.uv pip install -r requirements.txt # Or install from pyproject.toml
-
Routes (
routes.yaml): Define your classification routes and example utterances inroutes.yaml. -
Application (
config.yaml):- Copy your OpenAI API key into the
api_keyfield underencoder. - Review other settings like
router_modeandopentelemetry.
- Copy your OpenAI API key into the
Use uvicorn to run the FastAPI application:
uvicorn app.main:app --reloadThe server will be available at http://127.0.0.1:8000.
You can send a POST request to the /query endpoint:
curl -X POST http://127.0.0.1:8000/query \
-H "Content-Type: application/json" \
-d '{"text": "what are the latest advancements in AI?"}'The response will look like this:
{
"route": "tech",
"input": "what are the latest advancements in AI?",
"score": 0.85
}To avoid dependency on the OpenAI API, you can use a local model.
-
Install the required dependency:
uv pip install "fastembed>=0.2.0" -
Update
config.yaml:encoder: type: fastembed # api_key is not needed for fastembed
For production environments, you can use Qdrant to persist your route embeddings.
-
Install the required dependency:
uv pip install "qdrant-client>=1.7.0" -
Run a Qdrant instance (e.g., using Docker):
docker run -p 6333:6333 qdrant/qdrant
-
Update
config.yaml:index: type: qdrant qdrant: # Assumes Qdrant is running locally. # For more options, see semantic-router's QdrantIndex documentation. host: "localhost" port: 6333 collection_name: "semantic-router-routes"
The server will automatically create the collection and sync the routes from
routes.yamlon startup.