
fix(watsonx): guard function_calling and json_mode against vLLM guided-decoding failure #90

@coderabbitai

Description


Background

PR #89 fixed the response_format empty-content bug on ChatWatsonx by switching to a prompt-based PydanticOutputParser fallback. During review it was noted that the function_calling and json_mode branches in BaseAgent.get_chain (src/cuga/backend/cuga_graph/nodes/shared/base_agent.py, ~line 167) still route through llm.with_structured_output(...), which is known to be broken on vLLM versions prior to 0.8.2 (see vllm#15236 and vllm#21148).

Risk

No caller currently passes wx_json_mode="function_calling" or wx_json_mode="json_mode" explicitly for a ChatWatsonx LLM — the default is "response_format" and current overrides use "no_format". However, if a future caller opts in to either of these modes, they will silently hit the same vLLM guided-decoding failure (empty content / no tool_calls returned).

Suggested Fix

Either:

  1. Route "function_calling" and "json_mode" through the same prompt_template | llm | parser fallback with .with_retry(stop_after_attempt=3) (the same approach as the response_format fix in PR #89), or
  2. Raise a clear ValueError / NotImplementedError when wx_json_mode is "function_calling" or "json_mode" on ChatWatsonx, so callers get an explicit error rather than a silent failure.
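A minimal sketch of option 2's guard. The names follow the issue (`wx_json_mode`, the broken mode strings); the standalone helper and its signature are hypothetical, since the real check would live inside `BaseAgent.get_chain`:

```python
# Hypothetical sketch of the option-2 guard: modes known to trip vLLM
# guided decoding on ChatWatsonx fail fast instead of silently
# returning empty content / no tool_calls.

BROKEN_WX_JSON_MODES = {"function_calling", "json_mode"}


def check_wx_json_mode(wx_json_mode: str, is_watsonx: bool) -> str:
    """Validate wx_json_mode before building the chain.

    Raises NotImplementedError for modes that rely on
    llm.with_structured_output(), which is broken on vLLM < 0.8.2
    (vllm#15236, vllm#21148).
    """
    if is_watsonx and wx_json_mode in BROKEN_WX_JSON_MODES:
        raise NotImplementedError(
            f"wx_json_mode={wx_json_mode!r} routes through "
            "llm.with_structured_output(), which fails on vLLM < 0.8.2; "
            "use 'response_format' (prompt-based parsing) instead."
        )
    return wx_json_mode
```

The guard only fires for ChatWatsonx, so non-watsonx callers keep the existing structured-output path untouched.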

Whichever option is chosen, it should be revisited (and potentially reverted) once IBM upgrades the deployed vLLM to a fixed version (0.8.2+).
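Since the workaround is meant to be temporary, the guard could be gated on the deployed vLLM version. A sketch under assumptions (how the version string is obtained, and that a plain dotted version is reported, are not specified by the issue):

```python
# Hypothetical version gate: only apply the guard while the deployed
# vLLM predates the guided-decoding fix in 0.8.2.

def needs_guided_decoding_workaround(vllm_version: str) -> bool:
    """Return True when vllm_version is older than 0.8.2.

    Compares only the first three numeric components, so suffixes
    like "0.8.2.post1" are ignored.
    """
    parts = tuple(int(p) for p in vllm_version.split(".")[:3])
    return parts < (0, 8, 2)
```

A release that removes the guard entirely is simpler than a runtime gate; the gate is only worth it if multiple vLLM deployments with different versions must be supported at once.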


Metadata

Labels

bug (Something isn't working), component: wxo (watsonx Orchestrate integration), priority: critical (Must be addressed immediately)
