gemma4:26b is very good and fast when used directly, but when called through pydantic from the ROS llm node it gets stuck in reasoning mode, which makes it too slow to be useful for front-line Marvin responses.
Can we find a solution to the thinking config?
The alternative is to use a lightweight model for immediate responses and run gemma as a background agent that is triggered for hard tasks.
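A rough sketch of that two-tier idea, using only the Python standard library. Everything here is an assumption for illustration: `respond`, `is_hard`, the `HARD_HINTS` keywords, and the word limit are hypothetical names and values, and the two models are passed in as plain async callables rather than real pydantic or ROS objects. The fast model answers every prompt right away; prompts that look hard are also handed to the slow model in a background task.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Tuple

# A "model" is any async callable mapping a prompt to a reply (hypothetical interface).
ModelFn = Callable[[str], Awaitable[str]]

# Hypothetical markers of a "hard" request; these would need tuning for Marvin.
HARD_HINTS = ("plan", "step by step", "debug", "explain why")
MAX_EASY_WORDS = 40


def is_hard(prompt: str) -> bool:
    """Crude heuristic: long prompts or ones containing a hint keyword count as hard."""
    text = prompt.lower()
    return len(text.split()) > MAX_EASY_WORDS or any(h in text for h in HARD_HINTS)


async def respond(
    prompt: str,
    fast_model: ModelFn,
    slow_model: ModelFn,
    on_result: Callable[[str], None],
) -> Tuple[str, Optional[asyncio.Task]]:
    """Return the fast model's reply; start the slow model in the background if needed."""
    background: Optional[asyncio.Task] = None
    if is_hard(prompt):
        async def run_slow() -> None:
            # Deliver the slow model's answer through the callback when it is ready.
            on_result(await slow_model(prompt))

        background = asyncio.create_task(run_slow())

    # The immediate reply never waits for the slow model.
    reply = await fast_model(prompt)
    return reply, background
```

In the ROS node, `fast_model` would wrap the lightweight model, `slow_model` would wrap gemma, and `on_result` could publish the follow-up answer on a topic. The returned task should be kept referenced (or awaited) so it is not garbage-collected before it finishes.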