Skip to content

chore: add hf retries to files#322

Open
mckornfield wants to merge 1 commit into
mainfrom
hf-retry-attempts/mck
Open

chore: add hf retries to files#322
mckornfield wants to merge 1 commit into
mainfrom
hf-retry-attempts/mck

Conversation

@mckornfield

@mckornfield mckornfield commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Summary by CodeRabbit

  • Refactor

    • HuggingFace backend enhanced with improved error handling and automatic retry capabilities for transient failures, including rate limiting scenarios. Implements exponential backoff delays and Retry-After header support for improved resilience.
  • Tests

    • Added and updated test coverage to verify retry behavior on rate limit errors and transient unavailability conditions.

@mckornfield mckornfield requested review from a team as code owners June 12, 2026 21:47
@github-actions github-actions Bot added the chore label Jun 12, 2026
@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Centralizes HTTP error handling and transient-retry logic in the HuggingFace backend. Introduces internal utilities to map errors, parse retry delays, and run HuggingFace API calls with automatic retry. Applies retry infrastructure to metadata and config retrieval, streaming downloads, and validates with test helpers and new test cases.

Changes

HuggingFace Transient Retry Handling

Layer / File(s) Summary
Retry Infrastructure: Error Mapping and Request Runner
services/core/files/src/nmp/core/files/app/backends/huggingface.py
Adds retry configuration constants, utility functions to map HfHubHTTPError to backend exceptions, parse Retry-After headers into backoff delays, and a shared _run_hf_request runner that executes HuggingFace Hub calls in a thread with automatic retry on 429/503. Updates imports to support retry utilities.
Metadata and Config API Integration
services/core/files/src/nmp/core/files/app/backends/huggingface.py
Updates resolve_config, _get_hf_file_metadata, get_file, list_files, and validate_storage to use _run_hf_request for repository and file metadata retrieval. Removes prior ad-hoc exception handling and centralizes error mapping.
Download Streaming with Retry
services/core/files/src/nmp/core/files/app/backends/huggingface.py
Wraps file download streaming in a retry loop that retries only on transient unavailability (when no data has been yielded) using Retry-After or exponential backoff. Preserves behavior for not-found and non-transient errors.
Test Infrastructure and Retry Verification
services/core/files/tests/test_huggingface_backend.py
Adds AsyncMock import and retry constant. Introduces _hf_http_error helper to construct HfHubHTTPError with mocked response. Updates test_get_file_rate_limit_error to assert retry count and sleep invocations. Adds test_resolve_config_retries_rate_limit_then_succeeds to verify retry-then-succeed behavior.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 76.19% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly describes the main change: adding retry handling for HuggingFace API calls in the files service.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch hf-retry-attempts/mck

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
services/core/files/src/nmp/core/files/app/backends/huggingface.py (1)

178-192: 📐 Maintainability & Code Quality | 💤 Low value

Line 192 is unreachable.

The loop always either returns on line 181, raises on line 183, or raises on line 190. This raise can never execute. Harmless defensive code, but could be removed or replaced with an assertion to clarify intent.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@services/core/files/src/nmp/core/files/app/backends/huggingface.py` around
lines 178 - 192, The final raise in _run_hf_request is unreachable because the
loop always returns or raises; remove the trailing "raise
HuggingfaceBackendError(...)" and replace it with an explicit assertion to
document intent (e.g., assert False or raise AssertionError("unreachable in
_run_hf_request")). Update surrounding references to
_HF_TRANSIENT_RETRY_ATTEMPTS, EntryNotFoundError, HfHubHTTPError,
_map_hf_http_error, HuggingfaceUnavailableError, and _sleep_before_retry as
needed to ensure control flow is still clear.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@services/core/files/src/nmp/core/files/app/backends/huggingface.py`:
- Around line 178-192: The final raise in _run_hf_request is unreachable because
the loop always returns or raises; remove the trailing "raise
HuggingfaceBackendError(...)" and replace it with an explicit assertion to
document intent (e.g., assert False or raise AssertionError("unreachable in
_run_hf_request")). Update surrounding references to
_HF_TRANSIENT_RETRY_ATTEMPTS, EntryNotFoundError, HfHubHTTPError,
_map_hf_http_error, HuggingfaceUnavailableError, and _sleep_before_retry as
needed to ensure control flow is still clear.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 78b3df66-4bfd-4a24-ba49-ef418aa09363

📥 Commits

Reviewing files that changed from the base of the PR and between fc132aa and b2e345e.

📒 Files selected for processing (2)
  • services/core/files/src/nmp/core/files/app/backends/huggingface.py
  • services/core/files/tests/test_huggingface_backend.py

@github-actions

Copy link
Copy Markdown
Contributor
Suite Lines Covered Line Rate Branch Rate
Unit Tests 19033/25227 75.4% 61.4%
Integration Tests 11028/23999 46.0% 20.4%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant