Add Intel OpenVINO Model Server support for local LLM inference#2496

Open
JohnLeFeng wants to merge 5 commits into sipeed:main from JohnLeFeng:ovms_enabling

Conversation

@JohnLeFeng

📝 Description

This PR adds Intel OpenVINO Model Server (OVMS) support for local LLM inference. With OVMS, users can run local LLMs with OpenVINO on Intel CPUs, GPUs, and NPUs.
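Assembled from the README examples added in this PR, a complete `model_list` entry looks like the following (`ovms/your-model` is a placeholder for whichever model the local OVMS instance is serving):

```json
{
  "model_list": [
    {
      "model_name": "local-ovms",
      "model": "ovms/your-model",
      "api_base": "http://localhost:8000/v3"
    }
  ]
}
```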

🗣️ Type of Change

  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 📖 Documentation update
  • ⚡ Code refactoring (no functional changes, no api changes)

🤖 AI Code Generation

  • 🤖 Fully AI-generated (100% AI, 0% Human)
  • 🛠️ Mostly AI-generated (AI draft, Human verified/modified)
  • 👨‍💻 Mostly Human-written (Human lead, AI assisted or none)

🔗 Related Issue

📚 Technical Context (Skip for Docs)

  • Reference URL: QuickStart - LLM models
  • Reasoning: With OpenVINO, users can run LLM inference on Intel CPUs, GPUs, and NPUs to make full use of the platform's capabilities.

🧪 Test Environment

  • Hardware: Intel Core Ultra 5 228v
  • OS: Windows
  • Model/Provider: Qwen3-8B-int4-ov
  • Channels: Telegram

📸 Evidence (Optional)

Click to view Logs/Screenshots

☑️ Checklist

  • My code/docs follow the style of this project.
  • I have performed a self-review of my own changes.
  • I have updated the documentation accordingly.

Copilot AI review requested due to automatic review settings April 13, 2026 03:06
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Contributor

Copilot AI left a comment


Pull request overview

Adds a new local provider protocol (ovms/) to support running LLM inference via Intel OpenVINO Model Server (OVMS), wiring it through the existing OpenAI-compatible provider path and surfacing it in the UI and documentation.

Changes:

  • Add ovms as an OpenAI-compatible provider with a default local API base (http://localhost:8000/v3) and allow empty API keys.
  • Extend local model reachability probing to include ovms.
  • Update UI provider labels/icons/sorting plus docs/READMEs (multiple languages) to document and advertise OVMS usage.

Reviewed changes

Copilot reviewed 32 out of 32 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
web/README.md Troubleshooting checklist updated to mention OVMS as a local model target.
web/frontend/src/components/models/provider-label.ts Adds a human-readable label for the ovms provider key.
web/frontend/src/components/models/provider-icon.tsx Adds a domain mapping for OVMS favicon fallback.
web/frontend/src/components/models/models-page.tsx Adds ovms to provider grouping priority ordering.
web/backend/api/model_status.go Treats ovms like other local OpenAI-compatible servers for runtime probing.
ROADMAP.md Mentions OVMS in the “Local Models” roadmap line.
README.md Documents OVMS in the provider table and adds an OVMS local deployment example config.
README.zh.md Same as above (Chinese).
README.vi.md Same as above (Vietnamese).
README.pt-br.md Same as above (Portuguese - Brazil).
README.my.md Same as above (Malay).
README.ja.md Same as above (Japanese).
README.it.md Same as above (Italian).
README.id.md Same as above (Indonesian).
README.fr.md Same as above (French).
pkg/providers/factory_provider.go Registers ovms protocol metadata (default base + empty-key allowed) and routes it through the OpenAI-compatible HTTP provider creation path.
pkg/providers/factory_provider_test.go Adds coverage to ensure ovms gets a default API base and parses model IDs correctly for local-provider creation.
pkg/config/defaults.go Adds a default local-ovms model entry pointing at http://localhost:8000/v3.
pkg/audio/asr/asr.go Includes ovms in protocol sets that route through the OpenAI-compatible audio transcription path.
docs/providers.md Adds OVMS to provider list and OpenAI-compatible protocol family description (English).
docs/configuration.md Adds OVMS to provider list and routing description (English).
docs/zh/providers.md Same as above (Chinese).
docs/zh/configuration.md Same as above (Chinese).
docs/vi/providers.md Same as above (Vietnamese).
docs/vi/configuration.md Same as above (Vietnamese).
docs/pt-br/providers.md Same as above (Portuguese - Brazil).
docs/pt-br/configuration.md Same as above (Portuguese - Brazil).
docs/ja/providers.md Same as above (Japanese).
docs/ja/configuration.md Same as above (Japanese).
docs/fr/providers.md Same as above (French).
docs/fr/configuration.md Same as above (French).
cmd/picoclaw/internal/model/command.go Updates CLI help text to mention OVMS alongside the local-model guidance.


Comment thread cmd/picoclaw/internal/model/command.go (outdated)
Comment on lines 176 to 180
  switch protocol {
  case "ollama":
      return probeOllamaModelFunc(apiBase, modelID)
- case "vllm", "lmstudio":
+ case "vllm", "ovms", "lmstudio":
      return probeOpenAICompatibleModelFunc(apiBase, modelID, m.APIKey())
@sipeed-bot sipeed-bot bot added type: enhancement New feature or request domain: provider labels Apr 13, 2026
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 14, 2026 04:41
Contributor

Copilot AI left a comment


Pull request overview

This PR adds Intel OpenVINO Model Server (OVMS) as a new OpenAI-compatible “local” provider option across the backend, frontend, default config, and documentation so users can run local LLM inference via OpenVINO on Intel hardware.

Changes:

  • Adds ovms as an OpenAI-compatible protocol with default API base http://localhost:8000/v3 (and allows empty API key) and includes it in local runtime probing.
  • Updates the frontend model UI to recognize OVMS (label/domain) and sort it among providers.
  • Updates docs/READMEs (multiple languages) to list OVMS and show example configuration.

Reviewed changes

Copilot reviewed 32 out of 32 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
web/frontend/src/components/models/provider-label.ts Adds “OVMS (local)” provider label mapping.
web/frontend/src/components/models/provider-icon.tsx Adds OVMS domain for favicon fallback.
web/frontend/src/components/models/models-page.tsx Adds OVMS provider priority to sorting/grouping.
web/backend/api/model_status.go Treats ovms like other local OpenAI-compatible providers for reachability probing.
web/README.md Mentions OVMS among reachable local models.
pkg/providers/factory_provider_test.go Adds test coverage for OVMS protocol handling in provider factory.
pkg/providers/factory_provider.go Registers ovms protocol defaults and routes it through OpenAI-compatible HTTP provider path.
pkg/config/defaults.go Adds a default local-ovms model entry pointing at http://localhost:8000/v3.
pkg/audio/asr/asr.go Includes ovms in the OpenAI-compatible protocol list for transcription routing.
docs/zh/providers.md Documents OVMS provider entry and notes it doesn’t require API keys locally.
docs/zh/configuration.md Adds OVMS to provider tables and routing description.
docs/vi/providers.md Documents OVMS provider entry and routing.
docs/vi/configuration.md Mentions OVMS under OpenAI-compatible routing.
docs/pt-br/providers.md Documents OVMS provider entry and routing.
docs/pt-br/configuration.md Mentions OVMS under OpenAI-compatible routing.
docs/providers.md Documents OVMS provider entry and routing.
docs/ja/providers.md Documents OVMS provider entry and routing.
docs/ja/configuration.md Mentions OVMS under OpenAI-compatible routing.
docs/fr/providers.md Documents OVMS provider entry and routing.
docs/fr/configuration.md Mentions OVMS under OpenAI-compatible routing.
docs/configuration.md Adds OVMS to provider tables and routing description.
cmd/picoclaw/internal/model/command.go Updates CLI help text to describe local-model as a generic local OpenAI-compatible server (vLLM/OVMS/etc.).
ROADMAP.md Adds OVMS to the “Local Models” roadmap line.
README.zh.md Adds OVMS to provider table and includes a local OVMS config snippet.
README.vi.md Adds OVMS to provider table and includes a local OVMS config snippet.
README.pt-br.md Adds OVMS to provider table and includes a local OVMS config snippet.
README.my.md Adds OVMS to provider table and includes a local OVMS config snippet.
README.md Adds OVMS to provider table and includes a local OVMS config snippet.
README.ja.md Adds OVMS to provider table and includes a local OVMS config snippet.
README.it.md Adds OVMS to provider table and includes a local OVMS config snippet.
README.id.md Adds OVMS to provider table and includes a local OVMS config snippet.
README.fr.md Adds OVMS to provider table and includes a local OVMS config snippet.


Comment thread README.md
Comment thread README.pt-br.md
    {
      "model_name": "local-ovms",
      "model": "ovms/your-model",
      "api_base": "http://localhost:8000/v3"
Comment thread README.it.md
    {
      "model_name": "local-ovms",
      "model": "ovms/your-model",
      "api_base": "http://localhost:8000/v3"
Comment thread pkg/config/defaults.go
    {
      ModelName: "local-ovms",
      Model:     "ovms/custom-model",
      APIBase:   "http://localhost:8000/v3",
Comment thread README.zh.md
    {
      "model_list": [
        {
          "model_name": "local-ovms",
Comment thread README.vi.md
    {
      "model_name": "local-ovms",
      "model": "ovms/your-model",
      "api_base": "http://localhost:8000/v3"
Comment thread README.my.md
    {
      "model_name": "local-ovms",
      "model": "ovms/your-model",
      "api_base": "http://localhost:8000/v3"
Comment thread README.ja.md
    {
      "model_name": "local-ovms",
      "model": "ovms/your-model",
      "api_base": "http://localhost:8000/v3"
Comment thread README.id.md
    {
      "model_name": "local-ovms",
      "model": "ovms/your-model",
      "api_base": "http://localhost:8000/v3"
Comment thread README.fr.md
    {
      "model_name": "local-ovms",
      "model": "ovms/your-model",
      "api_base": "http://localhost:8000/v3"
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 14, 2026 05:11
Contributor

Copilot AI left a comment


Pull request overview

This PR introduces Intel OpenVINO Model Server (OVMS) as a supported local OpenAI-compatible provider across the backend, frontend UI, default config templates, and multi-language documentation.

Changes:

  • Add ovms as an OpenAI-compatible protocol with a default local API base (http://localhost:8000/v3) and allow empty API keys.
  • Extend local model reachability/probing and provider lists (UI labels/domains/sorting, runtime probe switch cases, tests).
  • Document OVMS in READMEs and provider/configuration docs (multiple locales).

Reviewed changes

Copilot reviewed 32 out of 32 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
web/frontend/src/components/models/provider-label.ts Adds OVMS label for model grouping/display.
web/frontend/src/components/models/provider-icon.tsx Adds OVMS domain for favicon fallback.
web/frontend/src/components/models/models-page.tsx Adds OVMS to provider sorting priority.
web/backend/api/model_status.go Treats OVMS like other local OpenAI-compatible servers for probing.
web/README.md Mentions OVMS among reachable local model servers.
pkg/providers/factory_provider_test.go Adds test cases for OVMS defaults and local provider parsing.
pkg/providers/factory_provider.go Registers ovms protocol metadata and routes it via HTTP OpenAI-compatible provider.
pkg/config/defaults.go Adds a default local-ovms model_list entry.
pkg/audio/asr/asr.go Routes OVMS through the OpenAI-compatible transcription-capable path.
docs/zh/providers.md Documents OVMS endpoint/defaults and no-key local behavior.
docs/zh/configuration.md Documents OVMS in provider tables and routing family notes.
docs/vi/providers.md Documents OVMS in provider tables and protocol family notes.
docs/vi/configuration.md Documents OVMS in protocol family routing description.
docs/pt-br/providers.md Documents OVMS in provider tables and protocol family notes.
docs/pt-br/configuration.md Documents OVMS in protocol family routing description.
docs/providers.md Adds OVMS to provider list and OpenAI-compatible protocol notes.
docs/ja/providers.md Documents OVMS in provider tables and protocol family notes.
docs/ja/configuration.md Documents OVMS in protocol family routing description.
docs/fr/providers.md Documents OVMS in provider tables and protocol family notes.
docs/fr/configuration.md Documents OVMS in protocol family routing description.
docs/configuration.md Adds OVMS to provider tables and OpenAI-compatible protocol notes.
cmd/picoclaw/internal/model/command.go Updates CLI help text to mention OVMS as a local OpenAI-compatible option.
ROADMAP.md Adds OVMS to the “Local Models” roadmap bullet.
README.zh.md Adds OVMS provider row and a local OVMS config example snippet.
README.vi.md Adds OVMS provider row and a local OVMS config example snippet.
README.pt-br.md Adds OVMS provider row and a local OVMS config example snippet.
README.my.md Adds OVMS provider row and a local OVMS config example snippet.
README.md Adds OVMS provider row and a local OVMS config example snippet.
README.ja.md Adds OVMS provider row and a local OVMS config example snippet.
README.it.md Adds OVMS provider row and a local OVMS config example snippet.
README.id.md Adds OVMS provider row and a local OVMS config example snippet.
README.fr.md Adds OVMS provider row and a local OVMS config example snippet.


Comment thread README.zh.md
      "model_list": [
        {
          "model_name": "local-ovms",
          "model": "ovms/your-model",
Comment thread README.vi.md
    {
      "model_name": "local-ovms",
      "model": "ovms/your-model",
      "api_base": "http://localhost:8000/v3"
Comment thread README.id.md
    {
      "model_name": "local-ovms",
      "model": "ovms/your-model",
      "api_base": "http://localhost:8000/v3"
Comment thread README.fr.md
      "model_list": [
        {
          "model_name": "local-ovms",
          "model": "ovms/your-model",
Comment thread README.pt-br.md
    {
      "model_name": "local-ovms",
      "model": "ovms/your-model",
      "api_base": "http://localhost:8000/v3"
Comment thread README.my.md
    {
      "model_name": "local-ovms",
      "model": "ovms/your-model",
      "api_base": "http://localhost:8000/v3"
Comment thread README.ja.md
    {
      "model_name": "local-ovms",
      "model": "ovms/your-model",
      "api_base": "http://localhost:8000/v3"
Comment thread README.it.md
    {
      "model_name": "local-ovms",
      "model": "ovms/your-model",
      "api_base": "http://localhost:8000/v3"
  picoclaw model gpt-5.2            # Set gpt-5.2 as default
  picoclaw model claude-sonnet-4.6  # Set claude-sonnet-4.6 as default
- picoclaw model local-model        # Set local VLLM server as default
+ picoclaw model local-model        # Set local VLLM/OVMS server as default
