feat(mcp): add configurable cross-encoder for search result reranking #1174
Open
lehcode wants to merge 24 commits into getzep:main from
Conversation
Allow setting log level through LOG_LEVEL env var (DEBUG, INFO, WARNING, ERROR, CRITICAL). Defaults to INFO for backward compatibility.
- Add LOG_LEVEL to .env.example with description
- Add LOG_LEVEL to README.md Environment Variables section
- Add test_log_level_environment_variable() unit test
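A minimal sketch of the pattern this commit describes (the actual wiring in the MCP server may differ); an unset or unrecognized `LOG_LEVEL` falls back to `INFO`:

```python
import logging
import os

# Read the desired level from the environment; default to INFO
# for backward compatibility with existing deployments.
_level_name = os.environ.get('LOG_LEVEL', 'INFO').upper()
_level = getattr(logging, _level_name, logging.INFO)

logging.basicConfig(
    level=_level,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
)
```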
Force-pushed 7acdf6b to b00f2fa
I was trying to run graphiti with Gemini and hit a panic with the cross-encoder.
Add support for `openai_generic` LLM provider in the MCP server factory.
This provider uses `OpenAIGenericClient` which calls `/chat/completions`
with `response_format` for structured output instead of the `/responses`
endpoint. This enables compatibility with:
- LiteLLM proxy
- Ollama
- vLLM
- Any OpenAI-compatible API that doesn't support `/responses`
The `/responses` endpoint is only available on OpenAI's native API, so
this provider is essential for self-hosted LLM deployments.
Usage in config.yaml:
```yaml
llm:
provider: "openai_generic"
model: "your-model"
providers:
openai:
api_key: ${OPENAI_API_KEY}
api_url: ${OPENAI_BASE_URL}
```
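For illustration, a hedged sketch of the request shape `OpenAIGenericClient` relies on: structured output requested via `response_format` on `/chat/completions` rather than the OpenAI-native `/responses` endpoint. The base URL, model name, and schema below are placeholders, not values from this PR.

```python
from openai import AsyncOpenAI

# Point at any OpenAI-compatible endpoint (LiteLLM, Ollama, vLLM, ...).
client = AsyncOpenAI(base_url='http://localhost:4000/v1', api_key='sk-placeholder')

async def extract(text: str) -> str:
    # /chat/completions with response_format is widely supported by
    # OpenAI-compatible proxies; /responses is OpenAI-native only.
    response = await client.chat.completions.create(
        model='your-model',  # placeholder
        messages=[{'role': 'user', 'content': text}],
        response_format={
            'type': 'json_schema',
            'json_schema': {
                'name': 'entities',
                'schema': {
                    'type': 'object',
                    'properties': {'names': {'type': 'array', 'items': {'type': 'string'}}},
                    'required': ['names'],
                },
            },
        },
    )
    return response.choices[0].message.content
```

Proxies that don't support `json_schema` typically still accept `{'type': 'json_object'}`, which is the fallback mode the next commit wires in.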
Detect when providers (e.g., LiteLLM with Gemini) return the schema definition instead of data and automatically switch to json_object mode with the schema embedded in the prompt.
- Add _is_schema_returned_as_data() detection helper
- Add instance-level _use_json_object_mode fallback state
- Modify _generate_response() to support dual modes
- Fallback persists for client lifetime after first trigger
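A hypothetical sketch of what such a detection helper could look like; the real `_is_schema_returned_as_data()` in this PR may use different heuristics, and the `expected_fields` parameter is an illustration:

```python
def _is_schema_returned_as_data(parsed: dict, expected_fields: set[str]) -> bool:
    """Heuristic: the model echoed the JSON Schema back instead of
    filling it in. Schema-shaped payloads carry 'type'/'properties'
    keys rather than the expected data fields."""
    if not isinstance(parsed, dict):
        return False
    looks_like_schema = parsed.get('type') == 'object' and 'properties' in parsed
    has_expected_data = bool(expected_fields & parsed.keys())
    return looks_like_schema and not has_expected_data
```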
- Add _extract_json() method to handle responses with trailing content
- Simplify _is_json_schema() detection logic
- Handle "Extra data" JSON parse errors gracefully
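A minimal sketch of the trailing-content handling, assuming the helper's job matches the bullets above: `json.JSONDecoder.raw_decode` parses the first complete JSON value and reports where it ended, which sidesteps the "Extra data" error that `json.loads` raises on trailing text.

```python
import json

def _extract_json(raw: str) -> dict:
    # raw_decode parses the first complete JSON value and ignores
    # anything after it (e.g. a model's trailing commentary), the
    # exact case that makes json.loads raise "Extra data".
    decoder = json.JSONDecoder()
    start = raw.find('{')
    if start == -1:
        raise ValueError('no JSON object found in response')
    obj, _end = decoder.raw_decode(raw, start)
    return obj
```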
- Document openai_generic provider in README.md with LiteLLM and Ollama examples
- Add provider configuration to .env.example
- Add unit tests for _is_schema_returned_as_data() and _extract_json() methods
Revert "Fix dependabot security vulnerabilities (getzep#1184)" This reverts commit 30cd907.
* Fix dependabot security vulnerabilities in dependencies

Update lock files to address multiple security alerts:
- pyasn1: 0.6.1 → 0.6.2 (CVE-2026-23490)
- langchain-core: 0.3.74 → 0.3.83 (CVE-2025-68664)
- mcp: 1.9.4 → 1.26.0 (DNS rebinding, DoS)
- azure-core: 1.34.0 → 1.38.0 (deserialization)
- starlette: 0.46.2/0.47.1 → 0.50.0/0.52.1 (DoS vulnerabilities)
- python-multipart: 0.0.20 → 0.0.22 (arbitrary file write)
- fastapi: 0.115.14 → 0.128.0 (for starlette compatibility)
- nbconvert: 7.16.6 → 7.17.0
- orjson: 3.11.5 → 3.11.6
- protobuf: 6.33.4 → 6.33.5

* Pin mcp_server to graphiti-core 0.26.3 from PyPI
- Change dependency from >=0.23.1 to ==0.26.3
- Remove editable source override to use published package
- Addresses code review feedback about RC version usage

* Fix remaining security vulnerabilities in mcp_server

Update vulnerable transitive dependencies:
- aiohttp: 3.12.15 → 3.13.3 (High: zip bomb, DoS)
- urllib3: 2.5.0 → 2.6.3 (High: decompression bomb bypass)
- filelock: 3.19.1 → 3.20.3 (Medium: TOCTOU symlink)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Enables Azure Active Directory authentication for Azure OpenAI LLM and Embedder clients. Conditionally configures the `AsyncOpenAI` client to use an Azure AD token provider when specified. Retains API key authentication as an alternative.
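A hedged sketch of the conditional wiring, using the standard `openai` and `azure-identity` packages and the `AsyncAzureOpenAI` subclass; the environment variable names and API version are illustrative, and the PR's actual client construction may differ:

```python
import os

from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AsyncAzureOpenAI

def build_azure_client() -> AsyncAzureOpenAI:
    kwargs = {
        'azure_endpoint': os.environ['AZURE_OPENAI_ENDPOINT'],
        'api_version': '2024-10-21',  # placeholder version
    }
    if os.environ.get('AZURE_OPENAI_USE_MANAGED_IDENTITY'):
        # Azure AD token provider: fetches and refreshes bearer tokens
        # via DefaultAzureCredential instead of a static API key.
        kwargs['azure_ad_token_provider'] = get_bearer_token_provider(
            DefaultAzureCredential(),
            'https://cognitiveservices.azure.com/.default',
        )
    else:
        # API key authentication remains available as an alternative.
        kwargs['api_key'] = os.environ['AZURE_OPENAI_API_KEY']
    return AsyncAzureOpenAI(**kwargs)
```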
feat: add openai_generic provider for LiteLLM/Ollama compatibility
feat: add configurable LOG_LEVEL for MCP server
Add CrossEncoderConfig to MCP server configuration, allowing users to configure the cross-encoder (reranker) used for search result ranking.

Previously, the cross-encoder defaulted to OpenAIRerankerClient with hardcoded model 'gpt-4.1-nano', causing "No deployments available" errors when using LiteLLM or other OpenAI-compatible proxies.

Changes:
- Add CrossEncoderConfig and CrossEncoderProvidersConfig to schema.py
- Add CrossEncoderFactory supporting providers: openai, openai_generic, azure_openai, gemini, bge, and none/disabled
- Pass configured cross_encoder to Graphiti initialization
- Add cross_encoder section to all config YAML files
- Suppress neo4j.notifications logger to reduce log noise

Supported providers:
- openai: Direct OpenAI API
- openai_generic: LiteLLM, Ollama, vLLM compatible
- azure_openai: Azure OpenAI with AD auth support
- gemini: Google Gemini
- bge: Local BGE model (no API required)
- none/disabled: Disable reranking entirely
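A hypothetical sketch of the factory dispatch. The reranker class names appear in graphiti-core, but the import paths, constructor arguments, and the factory's real shape in this PR are assumptions:

```python
# Import paths assumed from graphiti-core's cross_encoder module.
from graphiti_core.cross_encoder.bge_reranker_client import BGERerankerClient
from graphiti_core.cross_encoder.openai_reranker_client import OpenAIRerankerClient

class CrossEncoderFactory:
    @staticmethod
    def create(provider: str, **kwargs):
        # 'none'/'disabled' yields no client, so Graphiti skips reranking.
        if provider in ('none', 'disabled'):
            return None
        if provider == 'bge':
            # Local BGE model; no API key or network access required.
            return BGERerankerClient()
        if provider in ('openai', 'openai_generic', 'azure_openai'):
            return OpenAIRerankerClient(**kwargs)
        raise ValueError(f'unknown cross_encoder provider: {provider}')
```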
Force-pushed 7760624 to 192c3e0
Summary

Add CrossEncoderConfig to MCP server configuration, allowing users to configure the cross-encoder (reranker) used for search result ranking.

Problem: The cross-encoder previously defaulted to OpenAIRerankerClient with hardcoded model gpt-4.1-nano, causing "No deployments available" errors when using LiteLLM or other OpenAI-compatible proxies that don't have this model configured.

Solution: Add configurable cross-encoder with support for multiple providers.

Changes
- Add CrossEncoderConfig and CrossEncoderProvidersConfig to schema.py
- Add CrossEncoderFactory supporting multiple providers
- Pass configured cross_encoder to Graphiti initialization
- Add cross_encoder section to all config YAML files
- Suppress neo4j.notifications logger to reduce log noise

Supported Providers
- openai: Direct OpenAI API
- openai_generic: LiteLLM, Ollama, vLLM compatible
- azure_openai: Azure OpenAI with AD auth support
- gemini: Google Gemini
- bge: Local BGE model (no API required)
- none/disabled: Disable reranking entirely

Example Configuration
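The original example block did not survive extraction; a sketch assuming the cross_encoder section mirrors the llm section shown earlier (field names are assumptions):

```yaml
cross_encoder:
  provider: "openai_generic"
  model: "your-reranker-model"
  providers:
    openai:
      api_key: ${OPENAI_API_KEY}
      api_url: ${OPENAI_BASE_URL}
```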
Test Plan