[FEATURE] Advanced Error Handling with Retry Configuration #27

@sajeerzeji

Is your feature request related to a problem? Please describe.

Currently, the onError callback in Knowledge.create() only supports two actions: 'skip' or 'abort'. This is insufficient for handling transient failures like:

  • Network timeouts when calling embedding APIs
  • Rate limiting from OpenAI/Ollama
  • Temporary database connection issues
  • Intermittent file system errors

When these transient errors occur, users must either skip the chunk entirely (losing data) or abort the entire sync (failing the operation). There's no way to automatically retry with backoff, which is the standard pattern for handling transient failures.

Describe the solution you'd like

Extend the ErrorHandler type to support a third return value: a retry configuration object with exponential backoff support.

Implementation Details

1. Update Error Handler Type

File: packages/toolpack-knowledge/src/knowledge.ts

Extend the ErrorHandler type to accept a RetryConfig object as a return value. The config should include:

  • maxAttempts - Maximum number of retry attempts
  • delay - Initial delay in milliseconds
  • backoff - Strategy: 'linear' or 'exponential'
  • maxDelay - Optional cap for exponential backoff (default: 30000ms)
  • retryableErrors - Optional array of error codes/messages to retry (default: retry all)

Add attempt number to the error context so handlers can track retry progress.
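One possible shape for these types is sketched below. The field names follow the list above; `ErrorContext`, `ErrorAction`, `BackoffStrategy`, and the `isRetryAction` guard are hypothetical names used only for illustration, and the existing `ErrorHandler` signature in knowledge.ts may differ (for example, it may be async).

```typescript
// Sketch only: ErrorContext, ErrorAction and isRetryAction are assumed names,
// not the package's actual exports.
type BackoffStrategy = 'linear' | 'exponential';

interface RetryConfig {
  maxAttempts: number;         // maximum number of attempts
  delay: number;               // initial delay in milliseconds
  backoff: BackoffStrategy;
  maxDelay?: number;           // cap for exponential backoff (default: 30000 ms)
  retryableErrors?: string[];  // error codes/messages to retry (default: retry all)
}

interface ErrorContext {
  attempt: number;             // 1-based attempt number, as proposed in this issue
  [key: string]: unknown;      // whatever context the handler already receives today
}

type ErrorAction = 'skip' | 'abort' | { retry: RetryConfig };
type ErrorHandler = (error: Error, context: ErrorContext) => ErrorAction;

// Type guard that separates the new retry action from the existing string actions.
function isRetryAction(action: ErrorAction): action is { retry: RetryConfig } {
  return typeof action === 'object' && action !== null && 'retry' in action;
}
```

Because `'skip'` and `'abort'` remain valid members of the union, existing handlers would keep type-checking unchanged.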

2. Implement Retry Logic

File: packages/toolpack-knowledge/src/knowledge.ts

Create a retryWithBackoff<T>() utility function that:

  • Accepts a function to retry, retry config, and error context
  • Attempts the function up to maxAttempts times
  • Checks if errors match retryableErrors filter (if specified)
  • Calculates delay based on backoff strategy:
    • Linear: delay * attempt
    • Exponential: delay * 2^(attempt-1) capped at maxDelay
  • Calls error handler on each attempt for logging/monitoring
  • Respects 'abort' action from error handler
  • Throws the last error if all attempts fail
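The steps above can be sketched roughly as follows. This is a minimal illustration, not the final implementation: the helper names `computeDelay`, `isRetryable`, and `sleep`, the `AttemptHandler` type, and passing the handler as a fourth `onError` argument are all assumptions. The sketch treats `maxAttempts` as the total number of calls, including the first one, and applies `maxDelay` only to exponential backoff.

```typescript
interface RetryConfig {
  maxAttempts: number;
  delay: number;
  backoff: 'linear' | 'exponential';
  maxDelay?: number;
  retryableErrors?: string[];
}

// Assumed handler shape; only an 'abort' return value is acted on here.
type AttemptHandler = (
  error: Error,
  context: Record<string, unknown> & { attempt: number },
) => 'skip' | 'abort' | { retry: RetryConfig } | void;

// Delay after a failure; `attempt` is the 1-based number of the attempt that just failed.
function computeDelay(config: RetryConfig, attempt: number): number {
  if (config.backoff === 'linear') return config.delay * attempt;
  const maxDelay = config.maxDelay ?? 30000;
  return Math.min(config.delay * 2 ** (attempt - 1), maxDelay);
}

// Retryable when no filter is set, or when the error's code or message matches an entry.
function isRetryable(error: Error, config: RetryConfig): boolean {
  if (!config.retryableErrors || config.retryableErrors.length === 0) return true;
  const code = (error as Error & { code?: string }).code;
  return config.retryableErrors.some(
    (pattern) => code === pattern || error.message.includes(pattern),
  );
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  config: RetryConfig,
  context: Record<string, unknown> = {},
  onError?: AttemptHandler,
): Promise<T> {
  let lastError: Error = new Error('retryWithBackoff: maxAttempts must be at least 1');
  for (let attempt = 1; attempt <= config.maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err instanceof Error ? err : new Error(String(err));
      // The handler sees every failure (logging/monitoring); 'abort' stops immediately.
      const action = onError?.(lastError, { ...context, attempt });
      if (action === 'abort' || !isRetryable(lastError, config)) throw lastError;
      // No point sleeping after the final attempt.
      if (attempt < config.maxAttempts) await sleep(computeDelay(config, attempt));
    }
  }
  throw lastError;
}
```

With `delay: 1000` and the default `maxDelay` of 30000, the exponential schedule under this sketch is 1000, 2000, 4000, 8000, 16000, then 30000 ms (32000 capped); the linear schedule is 1000, 2000, 3000 ms, and so on.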

3. Update embedChunks to Use Retry Logic

File: packages/toolpack-knowledge/src/knowledge.ts

Modify the embedChunks() method to:

  • Wrap embedder.embed() and embedder.embedBatch() calls with retry logic
  • Check error handler return value for retry configuration
  • Apply retry logic when { retry: RetryConfig } is returned
  • Maintain backward compatibility with 'skip' and 'abort' actions

Create helper methods:

  • embedWithRetry() - Wraps single embedding with retry logic
  • embedBatchWithRetry() - Wraps batch embedding with retry logic
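As a rough illustration of how `embedWithRetry()` might dispatch on the handler's return value while keeping `'skip'` and `'abort'` working as before: the `Embedder` interface, the `delayFor` helper, and returning `null` for a skipped chunk are assumptions made for this sketch, not the package's actual API. `embedBatchWithRetry()` would presumably follow the same pattern around `embedBatch()`.

```typescript
type RetryConfig = {
  maxAttempts: number;
  delay: number;
  backoff: 'linear' | 'exponential';
  maxDelay?: number;
};
type ErrorAction = 'skip' | 'abort' | { retry: RetryConfig };
type ErrorHandler = (error: Error, context: { attempt: number }) => ErrorAction;

// Assumed minimal embedder shape for the sketch.
interface Embedder {
  embed(text: string): Promise<number[]>;
}

// Backoff delay after the given 1-based failed attempt.
function delayFor(config: RetryConfig, attempt: number): number {
  return config.backoff === 'linear'
    ? config.delay * attempt
    : Math.min(config.delay * 2 ** (attempt - 1), config.maxDelay ?? 30000);
}

async function embedWithRetry(
  embedder: Embedder,
  text: string,
  onError: ErrorHandler,
): Promise<number[] | null> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await embedder.embed(text);
    } catch (err) {
      const error = err instanceof Error ? err : new Error(String(err));
      const action = onError(error, { attempt });
      if (action === 'skip') return null;   // existing behavior: drop this chunk
      if (action === 'abort') throw error;  // existing behavior: fail the sync
      // New behavior: { retry } was returned; give up once maxAttempts is reached.
      if (attempt >= action.retry.maxAttempts) throw error;
      const waitMs = delayFor(action.retry, attempt);
      await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
    }
  }
}
```

In this sketch the handler is consulted after every failure, so it can switch from retrying to `'skip'` or `'abort'` based on `context.attempt`.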

4. Update Embedder Error Handling

Files:

  • packages/toolpack-knowledge/src/embedders/openai.ts
  • packages/toolpack-knowledge/src/embedders/ollama.ts

Enhance error classification to include status codes:

  • Rate limit errors (429) - Should be retried with backoff
  • Server errors (5xx) - Should be retried
  • Network errors (ECONNREFUSED, ETIMEDOUT) - Should be retried
  • Client errors (4xx except 429) - Should not be retried

Update EmbeddingError to include statusCode property for better error handling decisions.
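The classification above could be expressed along these lines. This is a sketch: the constructor signature shown for `EmbeddingError`, the extra `code` field for network errors, and the `isTransient` helper are assumptions, and the real class in the package may be shaped differently.

```typescript
// Hypothetical shape; the existing EmbeddingError may take different arguments.
class EmbeddingError extends Error {
  readonly statusCode: number | undefined;
  readonly code: string | undefined;

  constructor(message: string, options: { statusCode?: number; code?: string } = {}) {
    super(message);
    this.name = 'EmbeddingError';
    this.statusCode = options.statusCode;
    this.code = options.code;
  }
}

const RETRYABLE_NETWORK_CODES = new Set(['ECONNREFUSED', 'ETIMEDOUT']);

// True for the error classes this issue proposes to retry.
function isTransient(error: EmbeddingError): boolean {
  if (error.code !== undefined && RETRYABLE_NETWORK_CODES.has(error.code)) return true;
  const status = error.statusCode;
  if (status === undefined) return false;
  if (status === 429) return true;                  // rate limited: retry with backoff
  if (status >= 500 && status <= 599) return true;  // server error: retry
  return false;                                     // other 4xx: do not retry
}
```

An embedder would then throw, for example, `new EmbeddingError('Rate limited', { statusCode: 429 })`, and the error handler can decide based on `statusCode` rather than parsing messages.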

5. Usage Examples

Example 1: Retry on rate limits
Configure retry with exponential backoff for rate limit errors, starting with 1 second delay and capping at 30 seconds.

Example 2: Retry on network errors
Configure retry with linear backoff for network connectivity issues.

Example 3: Selective retry based on error message
Only retry specific error types using the retryableErrors filter.
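The three examples might look something like the handlers below, each of which would be passed as the `onError` callback. The types are local stand-ins for the proposed ones, the `statusCode` and `code` fields on the error assume the `EmbeddingError` changes described earlier, and the specific `maxAttempts` and `delay` values (other than the 1 second start and 30 second cap from Example 1) are illustrative only.

```typescript
// Local stand-ins for the proposed types; the actual exports may differ.
interface RetryConfig {
  maxAttempts: number;
  delay: number;
  backoff: 'linear' | 'exponential';
  maxDelay?: number;
  retryableErrors?: string[];
}
type ErrorAction = 'skip' | 'abort' | { retry: RetryConfig };
type HandlerError = Error & { statusCode?: number; code?: string };
type ErrorHandler = (error: HandlerError, context: { attempt: number }) => ErrorAction;

// Example 1: exponential backoff on rate limits, starting at 1 s and capped at 30 s.
const retryOnRateLimit: ErrorHandler = (error) =>
  error.statusCode === 429
    ? { retry: { maxAttempts: 5, delay: 1000, backoff: 'exponential', maxDelay: 30000 } }
    : 'skip';

// Example 2: linear backoff on network connectivity errors.
const retryOnNetworkError: ErrorHandler = (error) =>
  error.code === 'ECONNREFUSED' || error.code === 'ETIMEDOUT'
    ? { retry: { maxAttempts: 3, delay: 2000, backoff: 'linear' } }
    : 'abort';

// Example 3: always return a retry config, but let retryableErrors decide what is retried.
const retrySelectively: ErrorHandler = () => ({
  retry: {
    maxAttempts: 4,
    delay: 500,
    backoff: 'exponential',
    retryableErrors: ['ETIMEDOUT', 'rate limit'],
  },
});
```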

6. Testing

File: packages/toolpack-knowledge/src/__tests__/retry-logic.test.ts

Create comprehensive tests covering:

  • Successful retry after failures - Verify retries work and eventually succeed
  • Exponential backoff calculation - Verify delays increase exponentially
  • Linear backoff calculation - Verify delays increase linearly
  • maxDelay cap - Verify exponential backoff respects the cap
  • maxAttempts limit - Verify failure after max attempts
  • retryableErrors filter - Verify only specified errors are retried
  • Non-retryable errors - Verify immediate failure for non-retryable errors
  • Error handler integration - Verify error handler is called on each attempt
  • Abort action - Verify 'abort' stops retries immediately

Acceptance Criteria

  • ErrorHandler type supports { retry: RetryConfig } return value
  • Retry logic implements exponential backoff correctly
  • Retry logic implements linear backoff correctly
  • maxDelay cap is respected for exponential backoff
  • retryableErrors filter works correctly
  • Error context includes attempt number
  • Embedders throw properly classified errors
  • EmbeddingError includes status codes
  • Retry works for both embed() and embedBatch()
  • Comprehensive tests cover all retry scenarios
  • Documentation includes retry examples
  • Backward compatible - existing code still works

Describe alternatives you've considered

  1. Built-in retry in embedders: Add retry logic directly to OllamaEmbedder and OpenAIEmbedder - rejected because it removes user control and doesn't help with other error types
  2. Separate retry utility: Export a standalone retry utility - considered but less ergonomic than integrating with error handler
  3. Promise-based retry library: Use p-retry or similar - rejected to avoid dependencies

Additional context

  • Retry logic is critical for production deployments with API-based embedders
  • OpenAI rate limits are common when processing large knowledge bases
  • Network errors are often transient and are good candidates for automatic retry
  • This feature makes knowledge base ingestion more robust and reliable
  • Exponential backoff helps avoid overwhelming rate-limited services

Dependencies:

  • No new dependencies required

Related Issues:

  • Complements onError callback already in Knowledge.create()
  • Works with all embedders and providers
  • Enhances reliability of sync() operation

    Labels

    enhancement, medium-priority, toolpack-knowledge