feat(ai_provider): add retry handling, exponential backoff, and timeout resilience by Harshit-Maurya838 · Pull Request #245 · imDarshanGK/AI-dev-assistant

Harshit-Maurya838 · 2026-05-22T08:05:36Z

Which issue does this PR close?

Closes #203

Description

This PR improves the reliability and observability of LLM integrations in ai_provider.py by introducing configurable retry handling, exponential backoff, structured logging, and comprehensive timeout/error handling while maintaining backward compatibility.

Key Improvements

Retry & Backoff Support

Added configurable retry handling for:
- request timeouts
- connection failures
- HTTP 429 rate limits
- HTTP 5xx provider errors
Implemented exponential backoff using:
```
LLM_RETRY_BACKOFF * (2 ** attempt)
```
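To make the formula concrete, here is a minimal sketch of the delays it produces. The helper name `backoff_delay`, the 0-based attempt counter, and the base value of 0.5 seconds are assumptions for illustration, not taken from the PR's code.

```python
def backoff_delay(base: float, attempt: int) -> float:
    """Delay in seconds before the next retry.

    `base` plays the role of LLM_RETRY_BACKOFF; `attempt` is assumed to be
    0-based, so the first retry waits exactly `base` seconds.
    """
    return base * (2 ** attempt)


# Hypothetical base of 0.5 seconds, four consecutive retries.
delays = [backoff_delay(0.5, attempt) for attempt in range(4)]
print(delays)  # [0.5, 1.0, 2.0, 4.0]
```

Each retry waits twice as long as the previous one, which gives a rate-limited or overloaded provider progressively more time to recover.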

Improved Logging & Observability

Replaced print-based logging with structured logging
Added provider-level observability including:
- provider name
- retry attempt count
- latency metrics
- failure type
- final request status

Better Error Handling

Added granular handling for:
- httpx.HTTPStatusError
- httpx.RequestError
- timeout and connection failures
Prevented retries for invalid client-side requests (400/401)
Preserved graceful fallback behavior and offline compatibility
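The retry rules described above can be sketched as a small loop. This is an illustration using only the standard library, not the PR's implementation: the `send` callable, the built-in `TimeoutError`/`ConnectionError` standing in for the `httpx` exceptions, the choice to treat every 4xx other than 429 as non-retryable, and the reading of "max retries" as retries after the first attempt are all assumptions.

```python
import time
from typing import Callable, Optional, Tuple


def is_retryable_status(status_code: int) -> bool:
    """429 and 5xx are retried; other client errors (such as 400/401) are not."""
    return status_code == 429 or 500 <= status_code <= 599


def call_with_retries(send: Callable[[], Tuple[int, str]],
                      max_retries: int,
                      backoff: float,
                      sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Return the response body on success, or None so the caller can fall back."""
    for attempt in range(max_retries + 1):
        try:
            status, body = send()
        except (TimeoutError, ConnectionError):
            status, body = None, ""      # transient network failure: retry
        if status is not None:
            if 200 <= status < 300:
                return body
            if not is_retryable_status(status):
                return None              # invalid request: retrying cannot help
        if attempt < max_retries:
            sleep(backoff * (2 ** attempt))
    return None                          # retries exhausted


# Usage with a fake transport: two transient failures, then success.
responses = iter([(503, ""), (429, ""), (200, "ok")])
slept: list = []
result = call_with_retries(lambda: next(responses), 3, 0.5, sleep=slept.append)
```

Returning `None` rather than raising keeps the sketch compatible with a `str | None` contract, so callers can keep their existing fallback path.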

Configuration Updates

Added new environment variables:

LLM_MAX_RETRIES
LLM_RETRY_BACKOFF

Updated:

backend/app/config.py
.env.example
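As a rough sketch of how the two variables might be read, the snippet below parses them from the environment. The function name `load_retry_settings` and the default values (3 retries, 1.0 second) are assumptions; the actual defaults live in backend/app/config.py and are not shown in this PR description.

```python
import os
from typing import Mapping, Tuple


def load_retry_settings(env: Mapping[str, str] = os.environ) -> Tuple[int, float]:
    """Read the retry settings, clamping negative values to zero.

    The defaults below are hypothetical placeholders.
    """
    max_retries = int(env.get("LLM_MAX_RETRIES", "3"))
    backoff = float(env.get("LLM_RETRY_BACKOFF", "1.0"))
    return max(0, max_retries), max(0.0, backoff)


# Example with explicit values instead of the real process environment.
settings = load_retry_settings({"LLM_MAX_RETRIES": "5", "LLM_RETRY_BACKOFF": "0.25"})
print(settings)  # (5, 0.25)
```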

Documentation

Added an "LLM Provider Reliability" section in README
Documented retry behavior, fallback handling, and observability improvements

Unit Tests

Added comprehensive test coverage in:

backend/tests/test_ai_provider.py

Test scenarios include:

successful responses
retry exhaustion on timeout
retries after HTTP 5xx failures
no retry for HTTP 400 errors
retry handling for HTTP 429
provider-disabled behavior

Type of Change

Bug fix
Enhancement
Tests added/updated
Documentation updated

Verification

pytest backend/tests/test_ai_provider.py -v

Test Results

7 passed in 0.58s

Notes

The existing call_llm() return signature (str | None) was intentionally preserved to avoid breaking changes.
Provider names are inferred internally from LLM_BASE_URL to keep configuration minimal and backward-compatible.
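One way such inference could work is to match the hostname of LLM_BASE_URL against a small table. The sketch below is hypothetical: the `infer_provider` function, the host-to-name mapping, and the "local"/"unknown" labels are illustrative assumptions rather than the PR's actual logic.

```python
from urllib.parse import urlparse

# Hypothetical mapping from domain suffix to provider label.
KNOWN_PROVIDERS = {
    "openai.com": "openai",
    "groq.com": "groq",
}


def infer_provider(base_url: str) -> str:
    """Guess a provider label from the base URL's hostname."""
    host = (urlparse(base_url).hostname or "").lower()
    if host in ("localhost", "127.0.0.1"):
        return "local"
    for suffix, name in KNOWN_PROVIDERS.items():
        if host == suffix or host.endswith("." + suffix):
            return name
    return "unknown"


print(infer_provider("https://api.openai.com/v1"))  # openai
```

Deriving the label from an existing setting avoids adding another required environment variable, at the cost of an "unknown" label for hosts the table does not cover.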

Harshit-Maurya838 · 2026-05-22T08:09:11Z

@imDarshanGK The conflicts have been resolved and all points in the issue have been implemented. Please review.

Harshit-Maurya838 · 2026-05-22T15:22:43Z

@imDarshanGK Please review this PR. I have addressed the issue and resolved the merge conflicts as well.

feat(ai_provider): add retry policy and detailed timeout handling (im…

d8e5b25

…DarshanGK#203)

Harshit-Maurya838 requested a review from imDarshanGK as a code owner May 22, 2026 08:05

Merge remote-tracking branch 'origin/main' into feature/improve-llm-t…

465e738

…imeout-retry-handling-203

Merge branch 'main' into feature/improve-llm-timeout-retry-handling-203

c86c686

Harshit-Maurya838 wants to merge 3 commits into imDarshanGK:main from Harshit-Maurya838:feature/improve-llm-timeout-retry-handling-203
