Summary
`ccc index` fails when `provider: litellm` is used with an OpenAI-compatible embeddings endpoint, because the outgoing embeddings request includes:

```json
{
  "input": ["hello"],
  "model": "Qwen/Qwen3-Embedding-8B",
  "encoding_format": null
}
```
If `encoding_format` is omitted entirely, or changed to `"float"`, the same request succeeds.
This appears to match LiteLLM's previously documented `encoding_format=None` regression for OpenAI-like embeddings.
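For context on where the `null` comes from: Python's standard `json` serializer renders a `None` value as JSON `null` rather than dropping the key, so a params dict that merely carries an unset `encoding_format` still produces the rejected field. A minimal illustration:

```python
import json

# A params dict where encoding_format was never set but still carries None.
params = {
    "input": ["hello"],
    "model": "Qwen/Qwen3-Embedding-8B",
    "encoding_format": None,
}

# json.dumps serializes None as JSON null instead of omitting the key.
body = json.dumps(params)
print(body)
```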
Why I believe this is dependency-related
The current `cocoindex-code` `pyproject.toml` pins `cocoindex[litellm]==1.0.0a38`.
So this issue appears to come from the cocoindex[litellm] / LiteLLM dependency path rather than from provider configuration alone.
Relevant source: `cocoindex-code` dependency declaration: https://github.com/cocoindex-io/cocoindex-code/blob/main/pyproject.toml
LiteLLM also published an official incident report on February 18, 2026 describing a very similar bug:
- a change explicitly sent `encoding_format=None`
- OpenAI-like embedding endpoints that strictly validated the field rejected the request
- the upstream fix was to filter out `None` / empty-string values before sending the request
Relevant source: https://docs.litellm.ai/blog/vllm-embeddings-incident
Environment
- OS: Windows
- Python: 3.12
- Installed via: `uv tool install cocoindex-code`
Embedding config:

```yaml
embedding:
  provider: litellm
  model: openai/Qwen/Qwen3-Embedding-8B
  envs:
    OPENAI_API_KEY: <redacted>
    OPENAI_BASE_URL: https://api.siliconflow.cn/v1
```
Reproduction
- Configure `cocoindex-code` to use `provider: litellm`
- Point it to an OpenAI-compatible embeddings endpoint
- Run `ccc index`
- Observe that indexing fails when the embeddings request is sent
Actual Behavior
The outgoing embeddings request contains:
```json
{
  "input": ["hello"],
  "model": "Qwen/Qwen3-Embedding-8B",
  "encoding_format": null
}
```
The provider rejects this as an invalid parameter.
Expected Behavior
For OpenAI-compatible embedding providers:
- if `encoding_format` is unset, it should be omitted from the request body
- or it should be set to a valid value such as `"float"`

It should not be serialized as `null`.
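A hedged sketch of this expected behavior, assuming the request params are available as a plain dict before serialization (the helper name is illustrative, not part of cocoindex or LiteLLM):

```python
import json

def build_embedding_body(params: dict) -> str:
    # Omit encoding_format entirely when it is unset (None), so strict
    # OpenAI-compatible providers never see "encoding_format": null.
    cleaned = {
        k: v for k, v in params.items()
        if not (k == "encoding_format" and v is None)
    }
    return json.dumps(cleaned)

body = build_embedding_body(
    {"input": ["hello"], "model": "Qwen/Qwen3-Embedding-8B", "encoding_format": None}
)
print(body)  # the encoding_format key is absent from the serialized body
```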
What I found
I captured the outgoing request and confirmed that `encoding_format: null` was being sent.
I then patched my local LiteLLM installation as follows:

```python
if encoding_format is not None:
    optional_params["encoding_format"] = encoding_format
else:
    # Fall back to the OpenAI default instead of serializing None as null.
    optional_params["encoding_format"] = "float"
```
After this local patch, indexing worked.
A cleaner fix may be to omit None values entirely before sending the request to OpenAI-compatible embedding providers.
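That cleaner fix, matching the filtering the upstream incident report describes (drop `None` and empty-string values before the request is serialized), could look roughly like this; the function name is mine, not LiteLLM's actual code:

```python
def drop_null_params(optional_params: dict) -> dict:
    # Strip None and empty-string values so neither null nor "" is ever
    # serialized into the outgoing embeddings request body.
    return {
        k: v for k, v in optional_params.items()
        if v is not None and v != ""
    }

cleaned = drop_null_params(
    {"encoding_format": None, "dimensions": "", "user": "alice"}
)
print(cleaned)  # only keys with real values survive
```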
Why this seems to match the known LiteLLM bug
LiteLLM's official incident report says the regression was caused by explicitly passing encoding_format=None in embedding requests, and that the upstream fix was to filter out None and empty string values before sending requests to OpenAI-like providers.
That seems consistent with what I observed here:
- request fails when `encoding_format: null` is present
- request succeeds when the field is omitted
- request also succeeds when it is forced to `"float"`
Suggested Fix
One of the following would likely resolve the issue:
- Upgrade the `cocoindex[litellm]` dependency path to a version that includes the upstream LiteLLM fix
- Filter out `encoding_format=None` before sending embedding requests
- As a fallback, explicitly send `"float"` instead of `null`
Notes
This does not appear to be model-specific.
The issue is the serialized request field `encoding_format: null`.
Once that field is omitted or changed to `"float"`, the same provider/model combination works correctly.