|
7 | 7 | In distributed systems and AI workflows, transient failures (such as network glitches, rate limits, or temporary service outages) are common. `RetryConfig` lets developers define a retry policy for individual nodes. When a node execution fails with a configured exception, the ADK automatically retries the node according to the specified delay and backoff strategy before propagating the failure. |
8 | 8 |
|
9 | 9 | Key benefits: |
| 10 | + |
10 | 11 | - **Resilience**: Automatically recovers from transient errors. |
11 | 12 | - **Configurable Backoff**: Supports exponential backoff to avoid overwhelming downstream services. |
12 | 13 | - **Jitter**: Introduces randomness to retry delays to prevent thundering herd problems. |
@@ -38,28 +39,34 @@ async def call_unstable_api(node_input: str): |
38 | 39 |
|
39 | 40 | When a node configured with `RetryConfig` raises an exception during execution: |
40 | 41 |
|
41 | | -1. **Exception Matching**: The `NodeRunner` catches the exception and checks if it matches any of the types specified in `RetryConfig.exceptions`. If `exceptions` is `None` (the default), it matches all exceptions. |
42 | | -2. **Attempt Count Check**: It checks if the current attempt count is less than `max_attempts`. |
43 | | -3. **Delay Calculation**: If a retry is warranted, it calculates the delay: |
| 42 | +1. **Exception Matching**: The `NodeRunner` catches the exception and checks if it matches any of the types specified in `RetryConfig.exceptions`. If `exceptions` is `None` (the default), it matches all exceptions. |
| 43 | +1. **Attempt Count Check**: It checks if the current attempt count is less than `max_attempts`. |
| 44 | +1. **Delay Calculation**: If a retry is warranted, it calculates the delay. The delay is capped at `max_delay`. |
| 45 | + |
| 46 | +```math |
44 | 47 | $$\text{delay} = \text{initial\_delay} \times (\text{backoff\_factor}^{\text{attempt} - 1})$$ |
45 | | - The delay is capped at `max_delay`. |
46 | | -4. **Jitter Application**: If `jitter` is enabled (greater than 0.0), a random offset is added to the delay: |
| 48 | +``` |
| 49 | + |
| 50 | +4. **Jitter Application**: If `jitter` is enabled (greater than 0.0), a random offset is added to the delay. The final delay is guaranteed to be non-negative. |
| 51 | + |
| 52 | +```math |
47 | 53 | $$\text{delay} = \text{delay} + \text{random}(-\text{jitter} \times \text{delay}, \text{jitter} \times \text{delay})$$
48 | | - The final delay is guaranteed to be non-negative. |
49 | | -5. **Execution Pause and Retry**: The runner sleeps for the calculated delay and then re-executes the node's logic. |
| 54 | +``` |
| 55 | + |
| 56 | +5. **Execution Pause and Retry**: The runner sleeps for the calculated delay and then re-executes the node's logic. |
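
The five steps above can be sketched as a small retry loop. This is a minimal illustration under stated assumptions, not the ADK's `NodeRunner` implementation: `compute_delay`, `run_with_retry`, and the injectable `sleep` parameter are hypothetical names, `attempt` is assumed to be the 1-based number of the attempt that just failed, the cap is assumed to be applied before jitter, and only exception classes (not string names) are matched.

```python
import asyncio
import random


def compute_delay(attempt, initial_delay=1.0, backoff_factor=2.0,
                  max_delay=60.0, jitter=1.0):
    """Hypothetical helper for steps 3 and 4 (not an ADK function)."""
    # Step 3: exponential backoff, capped at max_delay.
    delay = min(initial_delay * backoff_factor ** (attempt - 1), max_delay)
    # Step 4: add a symmetric random offset when jitter is enabled.
    if jitter > 0.0:
        delay += random.uniform(-jitter * delay, jitter * delay)
    # Clamp so the final delay is never negative.
    return max(delay, 0.0)


async def run_with_retry(fn, *, max_attempts=5, exceptions=None,
                         sleep=asyncio.sleep, **delay_kwargs):
    """Hypothetical retry loop; `fn` is a zero-argument coroutine function."""
    attempt = 1
    while True:
        try:
            return await fn()
        except Exception as exc:
            # Step 1: None matches every exception; otherwise match by type.
            matches = exceptions is None or isinstance(exc, tuple(exceptions))
            # Step 2: give up once the attempt budget is spent.
            if not matches or attempt >= max_attempts:
                raise
            # Step 5: pause for the computed delay, then try again.
            await sleep(compute_delay(attempt, **delay_kwargs))
            attempt += 1
```

Passing `jitter=0.0` makes the delays deterministic, which is convenient when testing a loop like this; the `sleep` hook exists only so the sketch can be exercised without real waiting.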
50 | 57 |
|
51 | 58 | ## Configuration options |
52 | 59 |
|
53 | 60 | `RetryConfig` is a Pydantic model with the following fields: |
54 | 61 |
|
55 | | -| Field | Type | Default | Description | |
56 | | -| :--- | :--- | :--- | :--- | |
57 | | -| `max_attempts` | `int \| None` | `5` (if omitted) | Maximum number of attempts, including the original request. If `0` or `1`, retries are disabled. | |
58 | | -| `initial_delay` | `float \| None` | `1.0` | Initial delay before the first retry, in seconds. | |
59 | | -| `max_delay` | `float \| None` | `60.0` | Maximum delay between retries, in seconds. | |
60 | | -| `backoff_factor` | `float \| None` | `2.0` | Multiplier by which the delay increases after each attempt. | |
61 | | -| `jitter` | `float \| None` | `1.0` | Randomness factor for the delay. Set to `0.0` to disable jitter (deterministic delays). | |
62 | | -| `exceptions` | `list[str \| type[BaseException]] \| None` | `None` | Exceptions to retry on. Can be exception classes or their string names. `None` means retry on all exceptions. | |
| 62 | +| Field | Type | Default | Description | |
| 63 | +| :--------------- | :----------------------------------------- | :--------------- | :------------------------------------------------------------------------------------------------------------ | |
| 64 | +| `max_attempts` | `int \| None` | `5` (if omitted) | Maximum number of attempts, including the original request. If `0` or `1`, retries are disabled. | |
| 65 | +| `initial_delay` | `float \| None` | `1.0` | Initial delay before the first retry, in seconds. | |
| 66 | +| `max_delay` | `float \| None` | `60.0` | Maximum delay between retries, in seconds. | |
| 67 | +| `backoff_factor` | `float \| None` | `2.0` | Multiplier by which the delay increases after each attempt. | |
| 68 | +| `jitter` | `float \| None` | `1.0` | Randomness factor for the delay. Set to `0.0` to disable jitter (deterministic delays). | |
| 69 | +| `exceptions` | `list[str \| type[BaseException]] \| None` | `None` | Exceptions to retry on. Can be exception classes or their string names. `None` means retry on all exceptions. | |
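
To make the defaults concrete, the following sketch computes the base (pre-jitter) delay schedule they imply. `RetryDefaults` is a hypothetical stand-in dataclass that copies the defaults from the table; it is not the ADK's Pydantic model and ignores `None` values. It also assumes one delay per retry, so `max_attempts` attempts yield at most `max_attempts - 1` delays.

```python
from dataclasses import dataclass


@dataclass
class RetryDefaults:
    """Hypothetical stand-in for the table's defaults (not the ADK model)."""
    max_attempts: int = 5
    initial_delay: float = 1.0
    max_delay: float = 60.0
    backoff_factor: float = 2.0


def base_schedule(cfg):
    # A retry can follow each of attempts 1 .. max_attempts - 1.
    return [
        min(cfg.initial_delay * cfg.backoff_factor ** (attempt - 1), cfg.max_delay)
        for attempt in range(1, cfg.max_attempts)
    ]


print(base_schedule(RetryDefaults()))                # [1.0, 2.0, 4.0, 8.0]
print(base_schedule(RetryDefaults(max_attempts=1)))  # []
```

With the default `jitter` of `1.0`, the jitter formula lets each actual delay fall anywhere between zero and twice the base value shown here.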
63 | 70 |
|
64 | 71 | ## Advanced applications |
65 | 72 |
|
|