Is your feature request related to a problem? Please describe.
Currently, the onError callback in Knowledge.create() only supports two actions: 'skip' or 'abort'. This is insufficient for handling transient failures like:
- Network timeouts when calling embedding APIs
- Rate limiting from OpenAI/Ollama
- Temporary database connection issues
- Intermittent file system errors
When these transient errors occur, users must either skip the chunk entirely (losing data) or abort the entire sync (failing the operation). There's no way to automatically retry with backoff, which is the standard pattern for handling transient failures.
Describe the solution you'd like
Extend the ErrorHandler type to support a third return value: a retry configuration object with linear or exponential backoff.
Implementation Details
1. Update Error Handler Type
File: packages/toolpack-knowledge/src/knowledge.ts
Extend the ErrorHandler type to accept a RetryConfig object as a return value. The config should include:
- maxAttempts - Maximum number of retry attempts
- delay - Initial delay in milliseconds
- backoff - Strategy: 'linear' or 'exponential'
- maxDelay - Optional cap for exponential backoff (default: 30000 ms)
- retryableErrors - Optional array of error codes/messages to retry (default: retry all)
Add attempt number to the error context so handlers can track retry progress.
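A rough sketch of what the extended types could look like. The field names follow the list above; the ErrorContext shape (beyond attempt), the synchronous handler signature, and the isRetryAction helper are assumptions for illustration, not the package's current API.

```typescript
// Hypothetical shapes: field names come from the list above, the rest is assumed.
type BackoffStrategy = 'linear' | 'exponential';

interface RetryConfig {
  maxAttempts: number;         // maximum number of attempts
  delay: number;               // initial delay in milliseconds
  backoff: BackoffStrategy;
  maxDelay?: number;           // cap for exponential backoff, default 30000 ms
  retryableErrors?: string[];  // codes/messages to retry; omitted means retry all
}

// Assumed context shape: only `attempt` is part of this proposal.
interface ErrorContext {
  file?: string;               // illustrative field, not confirmed by the source
  attempt: number;             // assumed to be 1-based
}

type ErrorAction = 'skip' | 'abort' | { retry: RetryConfig };
type ErrorHandler = (error: Error, context: ErrorContext) => ErrorAction;

// Narrowing helper so callers can branch on the new return value.
function isRetryAction(action: ErrorAction): action is { retry: RetryConfig } {
  return typeof action === 'object' && action !== null && 'retry' in action;
}
```

Keeping 'skip' and 'abort' as plain string members of the union means existing handlers would keep type-checking unchanged.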
2. Implement Retry Logic
File: packages/toolpack-knowledge/src/knowledge.ts
Create a retryWithBackoff<T>() utility function that:
- Accepts a function to retry, retry config, and error context
- Attempts the function up to maxAttempts times
- Checks if errors match the retryableErrors filter (if specified)
- Calculates delay based on backoff strategy:
  - Linear: delay * attempt
  - Exponential: delay * 2^(attempt-1) capped at maxDelay
- Calls error handler on each attempt for logging/monitoring
- Respects 'abort' action from error handler
- Throws the last error if all attempts fail
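The steps above could be sketched roughly as follows. This is a minimal illustration, not the final implementation: the AttemptHook callback stands in for the error handler and its context, the injectable sleep parameter is an assumption added to make the function testable without real timers, and matching retryableErrors against error.code or a message substring is one possible interpretation of "error codes/messages".

```typescript
interface RetryConfig {
  maxAttempts: number;
  delay: number;
  backoff: 'linear' | 'exponential';
  maxDelay?: number;
  retryableErrors?: string[];
}

// Hypothetical hook: called on every failed attempt, may return 'abort'.
type AttemptHook = (error: Error, attempt: number) => 'abort' | void;

// Delay before the next try, following the two formulas in the list above.
function computeDelay(config: RetryConfig, attempt: number): number {
  if (config.backoff === 'linear') return config.delay * attempt;
  const cap = config.maxDelay ?? 30000;
  return Math.min(config.delay * 2 ** (attempt - 1), cap);
}

// Assumed matching rule: exact error code or message substring.
function matchesFilter(error: Error, filter?: string[]): boolean {
  if (!filter || filter.length === 0) return true; // default: retry all
  const code = (error as Error & { code?: string }).code;
  return filter.some(f => f === code || error.message.includes(f));
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise(resolve => setTimeout(resolve, ms));

async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  config: RetryConfig,
  onAttemptError?: AttemptHook,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<T> {
  let lastError = new Error('retryWithBackoff: maxAttempts must be at least 1');
  for (let attempt = 1; attempt <= config.maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err instanceof Error ? err : new Error(String(err));
      // Errors outside the filter fail immediately.
      if (!matchesFilter(lastError, config.retryableErrors)) throw lastError;
      // The hook sees every failed attempt and can stop the loop.
      if (onAttemptError?.(lastError, attempt) === 'abort') throw lastError;
      // No point sleeping after the final attempt.
      if (attempt < config.maxAttempts) await sleep(computeDelay(config, attempt));
    }
  }
  throw lastError; // all attempts failed
}
```

With delay set to 1000 and exponential backoff, the waits after attempts 1, 2, and 3 are 1000 ms, 2000 ms, and 4000 ms.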
3. Update embedChunks to Use Retry Logic
File: packages/toolpack-knowledge/src/knowledge.ts
Modify the embedChunks() method to:
- Wrap embedder.embed() and embedder.embedBatch() calls with retry logic
- Check error handler return value for retry configuration
- Apply retry logic when { retry: RetryConfig } is returned
- Maintain backward compatibility with 'skip' and 'abort' actions
Create helper methods:
- embedWithRetry() - Wraps single embedding with retry logic
- embedBatchWithRetry() - Wraps batch embedding with retry logic
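One way embedWithRetry() might dispatch on the handler's return value while keeping 'skip' and 'abort' working as before. The Embedder interface, the null return for skipped chunks, and the injectable sleep parameter are assumptions made for this sketch; the retryableErrors filter is left out for brevity.

```typescript
// Illustrative names; only embed(), 'skip', 'abort', and { retry } come from the proposal.
interface RetryConfig {
  maxAttempts: number;
  delay: number;
  backoff: 'linear' | 'exponential';
  maxDelay?: number;
}
type ErrorAction = 'skip' | 'abort' | { retry: RetryConfig };
type ErrorHandler = (error: Error, context: { attempt: number }) => ErrorAction;
interface Embedder {
  embed(text: string): Promise<number[]>;
}

const wait = (ms: number): Promise<void> =>
  new Promise(resolve => setTimeout(resolve, ms));

function nextDelay(c: RetryConfig, attempt: number): number {
  return c.backoff === 'linear'
    ? c.delay * attempt
    : Math.min(c.delay * 2 ** (attempt - 1), c.maxDelay ?? 30000);
}

// Resolves to the vector, or to null when the handler chose 'skip'.
async function embedWithRetry(
  embedder: Embedder,
  text: string,
  onError: ErrorHandler,
  sleep: (ms: number) => Promise<void> = wait,
): Promise<number[] | null> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await embedder.embed(text);
    } catch (err) {
      const error = err instanceof Error ? err : new Error(String(err));
      const action = onError(error, { attempt });
      if (action === 'skip') return null;   // existing behaviour: drop the chunk
      if (action === 'abort') throw error;  // existing behaviour: fail the sync
      if (attempt >= action.retry.maxAttempts) throw error; // attempts exhausted
      await sleep(nextDelay(action.retry, attempt));
    }
  }
}
```

Because the handler is consulted on every failure, it can return a retry config for the first few attempts and switch to 'skip' or 'abort' later based on context.attempt. embedBatchWithRetry() would presumably follow the same pattern around embedBatch().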
4. Update Embedder Error Handling
Files:
packages/toolpack-knowledge/src/embedders/openai.ts
packages/toolpack-knowledge/src/embedders/ollama.ts
Enhance error classification to include status codes:
- Rate limit errors (429) - Should be retried with backoff
- Server errors (5xx) - Should be retried
- Network errors (ECONNREFUSED, ETIMEDOUT) - Should be retried
- Client errors (4xx except 429) - Should not be retried
Update EmbeddingError to include statusCode property for better error handling decisions.
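The classification rules above might look like this in code. The constructor signature and the code property are hypothetical (the existing EmbeddingError may be shaped differently), and isRetryable is an illustrative helper name.

```typescript
// Hypothetical shape: the real EmbeddingError constructor may differ.
class EmbeddingError extends Error {
  readonly statusCode?: number; // HTTP status, when the failure came from a response
  readonly code?: string;       // system error code such as ECONNREFUSED

  constructor(message: string, options: { statusCode?: number; code?: string } = {}) {
    super(message);
    this.name = 'EmbeddingError';
    this.statusCode = options.statusCode;
    this.code = options.code;
  }
}

const RETRYABLE_NETWORK_CODES = new Set(['ECONNREFUSED', 'ETIMEDOUT']);

// Applies the four rules listed above.
function isRetryable(error: EmbeddingError): boolean {
  if (error.code !== undefined && RETRYABLE_NETWORK_CODES.has(error.code)) {
    return true;                               // network errors
  }
  const status = error.statusCode;
  if (status === undefined) return false;      // unknown failure: do not guess
  if (status === 429) return true;             // rate limit
  if (status >= 500 && status <= 599) return true; // server errors
  return false;                                // other 4xx and everything else
}
```

Treating errors with neither a status code nor a known network code as non-retryable is a conservative choice made for this sketch; the opposite default would also be defensible.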
5. Usage Examples
Example 1: Retry on rate limits
Configure retry with exponential backoff for rate limit errors, starting with a 1-second delay and capping at 30 seconds.
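A possible handler for this case, to be passed as onError to Knowledge.create(). The local type declarations mirror the proposed RetryConfig, the statusCode property is the one proposed in section 4, and maxAttempts: 5 is an arbitrary illustrative value.

```typescript
// Local copies of the proposed types so the snippet stands alone.
type RetryConfig = {
  maxAttempts: number;
  delay: number;
  backoff: 'linear' | 'exponential';
  maxDelay?: number;
  retryableErrors?: string[];
};
type ErrorAction = 'skip' | 'abort' | { retry: RetryConfig };

// Retries only on HTTP 429; any other error skips the chunk as before.
const onRateLimit = (error: Error & { statusCode?: number }): ErrorAction => {
  if (error.statusCode === 429) {
    return {
      retry: { maxAttempts: 5, delay: 1000, backoff: 'exponential', maxDelay: 30000 },
    };
  }
  return 'skip';
};
```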
Example 2: Retry on network errors
Configure retry with linear backoff for network connectivity issues.
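A sketch of such a handler. It assumes the embedder surfaces Node-style error codes on a code property, which is not confirmed by the current implementation; the delay and attempt values are illustrative.

```typescript
// Local copies of the proposed types so the snippet stands alone.
type RetryConfig = {
  maxAttempts: number;
  delay: number;
  backoff: 'linear' | 'exponential';
  maxDelay?: number;
  retryableErrors?: string[];
};
type ErrorAction = 'skip' | 'abort' | { retry: RetryConfig };

const NETWORK_CODES = new Set(['ECONNREFUSED', 'ETIMEDOUT']);

// Linear backoff: with delay 2000 and three attempts, waits 2 s and then 4 s.
const onNetworkError = (error: Error & { code?: string }): ErrorAction => {
  if (error.code !== undefined && NETWORK_CODES.has(error.code)) {
    return { retry: { maxAttempts: 3, delay: 2000, backoff: 'linear' } };
  }
  return 'abort';
};
```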
Example 3: Selective retry based on error message
Only retry specific error types using the retryableErrors filter.
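For instance, a handler could always return a retry config and let the filter decide. The filter strings here are illustrative, and whether they are matched against error codes, message substrings, or both is left open by the proposal.

```typescript
// Local copies of the proposed types so the snippet stands alone.
type RetryConfig = {
  maxAttempts: number;
  delay: number;
  backoff: 'linear' | 'exponential';
  maxDelay?: number;
  retryableErrors?: string[];
};
type ErrorAction = 'skip' | 'abort' | { retry: RetryConfig };

// Errors that do not match the filter would fail immediately instead of retrying.
const onAnyError = (): ErrorAction => ({
  retry: {
    maxAttempts: 4,
    delay: 500,
    backoff: 'exponential',
    retryableErrors: ['ETIMEDOUT', 'rate limit'],
  },
});
```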
6. Testing
File: packages/toolpack-knowledge/src/__tests__/retry-logic.test.ts
Create comprehensive tests covering:
- Successful retry after failures - Verify retries work and eventually succeed
- Exponential backoff calculation - Verify delays increase exponentially
- Linear backoff calculation - Verify delays increase linearly
- maxDelay cap - Verify exponential backoff respects the cap
- maxAttempts limit - Verify failure after max attempts
- retryableErrors filter - Verify only specified errors are retried
- Non-retryable errors - Verify immediate failure for non-retryable errors
- Error handler integration - Verify error handler is called on each attempt
- Abort action - Verify 'abort' stops retries immediately
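Acceptance Criteria
- ErrorHandler type supports { retry: RetryConfig } return value
- maxDelay cap is respected for exponential backoff
- retryableErrors filter works correctly
- Error context includes the attempt number
- EmbeddingError includes status codes
- Retry logic applies to both embed() and embedBatch()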
Describe alternatives you've considered
- Built-in retry in embedders: Add retry logic directly to OllamaEmbedder and OpenAIEmbedder - rejected because it removes user control and doesn't help with other error types
- Separate retry utility: Export a standalone retry utility - considered but less ergonomic than integrating with error handler
- Promise-based retry library: Use p-retry or similar - rejected to avoid dependencies
Additional context
- Retry logic is critical for production deployments with API-based embedders
- OpenAI rate limits are common when processing large knowledge bases
- Network errors are often transient and are good candidates for automatic retry
- This feature makes knowledge base ingestion more robust and reliable
- Exponential backoff prevents overwhelming rate-limited services
Dependencies:
- No new dependencies required
Related Issues:
- Complements onError callback already in Knowledge.create()
- Works with all embedders and providers
- Enhances reliability of sync() operation