There seems to be a bug in
python/dify_plugin/interfaces/model/openai_compatible/llm.py
Key code:
def _handle_generate_stream_response(
    ...
    # usage is captured here, even from a chunk whose choices list is empty
    if chunk_json:  # noqa: SIM102
        if u := chunk_json.get("usage"):
            usage = u
    # ...but such a chunk is then skipped by the check below, so the
    # captured usage never reaches the emitted result
    if not chunk_json or len(chunk_json["choices"]) == 0:
        continue
    ...
If the stream ends with chunks like these:
...
data: {"model":"qwen3-max","id":"chatcmpl-2c88ceee-85d2-9a51-8326-53b61ac5bf48","choices":[{"delta":{"content":"","role":null},"index":0,"finish_reason":"stop"}],"created":1770648705,"object":"chat.completion.chunk","usage":null}
data: {"model":"qwen3-max","id":"chatcmpl-2c88ceee-85d2-9a51-8326-53b61ac5bf48","choices":[],"created":1770648705,"object":"chat.completion.chunk","usage":{"total_tokens":95,"completion_tokens":84,"prompt_tokens":11,"prompt_tokens_details":{"cached_tokens":0}}}
data: [DONE]
the usage info is effectively lost because of the continue logic: the finish_reason chunk carries "usage": null, and the trailing chunk that does carry usage has an empty choices list, so it is skipped before the usage can be attached to the final result.
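To make this concrete, below is a minimal, self-contained simulation of the loop above over the two JSON payloads from the stream (abridged; an illustration of the control flow, not the plugin's exact source):

import json

payloads = [
    # the finish_reason chunk arrives first, with usage still null
    '{"choices":[{"delta":{"content":"","role":null},"index":0,'
    '"finish_reason":"stop"}],"usage":null}',
    # the usage arrives afterwards, in a chunk with an empty choices list
    '{"choices":[],"usage":{"total_tokens":95,"completion_tokens":84,'
    '"prompt_tokens":11}}',
]

usage = None
for decoded_chunk in payloads:
    chunk_json = json.loads(decoded_chunk)
    if chunk_json:  # noqa: SIM102
        if u := chunk_json.get("usage"):
            usage = u
    if not chunk_json or len(chunk_json["choices"]) == 0:
        continue  # <-- the usage-only chunk is dropped here

    finish_reason = chunk_json["choices"][0].get("finish_reason")
    if finish_reason is not None:
        # the final result goes out now, while usage is still None
        print("final chunk emitted with usage =", usage)

print("usage captured too late to be emitted:", usage)

Running this prints "final chunk emitted with usage = None" first; the token counts are only recorded afterwards, and nothing consumes them.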
I have fixed this in our local Dify server, but I am not sure whether this counts as a bug.
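For reference, one way to fix it is to defer emitting the final chunk until the stream is exhausted, so that a trailing usage-only chunk is still counted. A minimal sketch of that idea (handle_stream and the plain dict chunks are hypothetical stand-ins for illustration, not the plugin's actual API or our exact patch):

import json
from collections.abc import Iterable, Iterator

def handle_stream(payloads: Iterable[str]) -> Iterator[dict]:
    """Sketch: emit the final chunk only after the stream ends."""
    usage = None
    finish_reason = None
    for decoded_chunk in payloads:
        chunk_json = json.loads(decoded_chunk)
        if u := chunk_json.get("usage"):
            usage = u  # capture usage even from choice-less chunks
        if not chunk_json.get("choices"):
            continue
        choice = chunk_json["choices"][0]
        finish_reason = choice.get("finish_reason") or finish_reason
        if content := (choice.get("delta") or {}).get("content"):
            yield {"type": "delta", "content": content}
    # by the time the stream hits [DONE], any trailing usage-only
    # chunk has already been seen, so usage is populated here
    yield {"type": "final", "finish_reason": finish_reason, "usage": usage}

Fed the stream above, the final chunk now carries the total_tokens=95 counts instead of a null usage.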