Problem Statement
Transient network errors or local LLM hiccups (e.g., temporary disconnection from Ollama) can kill a long-running session. Currently, if an inference call fails once, the entire session crashes or hangs, leading to a poor user experience.
Proposed Solution
- Add a `retry_count: int = Field(default=3)` field to `SessionConfig` in `config.py`.
- Implement a retry loop in `Agent.generate_response()` (or within the `litellm` call wrapper).
- If an inference call fails, the system should catch the error and retry the completion up to the specified limit before giving up.
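The retry loop could look something like the sketch below. This is illustrative only: `retry_completion` and its parameters are hypothetical names, the wrapped callable stands in for the actual `litellm` completion call, and in the real implementation the caught exceptions should be narrowed to transient network/inference errors rather than bare `Exception`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def retry_completion(call: Callable[[], T],
                     retry_count: int = 3,
                     base_delay: float = 0.0) -> T:
    """Invoke `call` (e.g. a litellm completion), retrying on failure.

    Makes one initial attempt plus up to `retry_count` retries, with
    optional exponential backoff between attempts.
    """
    last_error: Exception | None = None
    for attempt in range(retry_count + 1):  # initial try + retries
        try:
            return call()
        except Exception as exc:  # narrow to transient errors in practice
            last_error = exc
            if attempt < retry_count and base_delay:
                time.sleep(base_delay * (2 ** attempt))
    assert last_error is not None
    raise last_error  # exhausted all retries; surface the last failure
```

With `retry_count=3`, a call that fails twice and succeeds on the third attempt completes normally; only after four consecutive failures does the error propagate to the session.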
Alternatives Considered
Manual retries by the user, which is disruptive to the dynamic conversation flow.
Priority
Medium 🟡
Additional Context
This works in tandem with the 'timeout' feature to provide a much more stable and robust experience for users with varying hardware or network quality.