fix: honor server-directed FDv1 Fallback Directive in initializer phase#423
Open
fix: honor server-directed FDv1 Fallback Directive in initializer phase#423
Conversation
Previously the X-LD-FD-Fallback response header was honored only in the synchronizer phase, and in the synchronizer phase a payload arriving alongside the directive was discarded because the streaming processor halted before applying it. Per the updated FDv2 Data System spec, the directive must be honored in both initializer and synchronizer phases, must take precedence over the section 1.2 failover algorithm, and must let any accompanying payload be applied before the SDK switches terminally to the FDv1 Fallback Synchronizer. Carry the signal explicitly on the public initializer/synchronizer result types -- Basis.fallback_to_fdv1 and Update.fallback_to_fdv1 -- so callers cannot silently drop it. Update.revert_to_fdv1 is renamed to Update.fallback_to_fdv1 to match the spec terminology.
Declare the fdv1-fallback capability and accept the new top-level dataSystem.fdv1Fallback config object, wiring it directly to the SDK's FDv1 Fallback Synchronizer. Drop the heuristic that previously inferred the FDv1 fallback from the last polling synchronizer entry -- the FDv1 Fallback Synchronizer is a distinct configuration concern from the FDv2 Primary/Fallback chain. Also bump the test harness pin to v3.0.0-alpha.6 so the new fdv1-fallback test suite runs in CI.
Per the FDv1 Fallback Directive (Data System spec 1.6.3), the Primary Synchronizer must be terminated when the directive engages. The streaming source called ``SSEClient.close()`` to do this, but on Python 3.10 that wasn't enough: ``SSEClient.close()`` ultimately invokes ``urllib3.HTTPResponse.shutdown()`` (which only performs a half-close on the local read side) plus ``release_conn()`` (which returns the live connection to the pool). The TCP socket stayed open until the response was garbage-collected, which the contract test harness flagged on the linux/3.10 matrix entry as "SDK did not close the FDv2 streaming connection after the directive — the Primary Synchronizer must be stopped when Directed Fallback engages." Newer Python versions GC the response promptly enough to hide the leak. Track the urllib3 PoolManager alongside the SSEClient and call ``HTTPConnectionPool.close()`` on each pool when the synchronizer stops. ``HTTPConnectionPool.close()`` drains the connection queue and closes each socket, sending the FIN the server is waiting on. ``PoolManager.clear()`` alone won't do this -- it drops the dict of pools without closing the connections inside them. The fix is deterministic (no sleeps) and runs both at the natural end of ``sync()`` and in ``stop()`` so it's robust to whichever thread arrives first.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit f37695b. Configure here.
When a streaming Start action latched the FDv1 Fallback Directive (via X-LD-FD-Fallback: true), a subsequent Fault, JSONDecodeError, or generic exception path could yield an Update with fallback_to_fdv1=False because each error path independently checked the failure's headers for the directive instead of consulting the latched state from the earlier Start. The most visible symptom is an HTTPStatusError 401 mid-stream after the directive was set: the SDK treated it as an ordinary unrecoverable failure (REMOVE) instead of honoring the directive (FDV1). Funnel every error-path Update through a small _with_fallback_signal helper that propagates the latched signal, and break out of the SSE read loop unconditionally once the directive is set so we don't keep retrying the FDv2 endpoint -- the directive is one-way and terminal.
Two follow-up cleanups from review of the FDv1 Fallback Directive work:
- Drop the test-only shim in StreamingDataSource.sync() that accepted
either a bare SSEClient or a (client, pool) tuple from the injected
builder. The SseClientBuilder type alias already declares the tuple
return; annotating the field as SseClientBuilder lets pyright align,
and the lone test fixture (list_sse_client) now returns
(client, None) like the production create_sse_client does.
- Hoist basis.change_set.selector.is_defined() into a local in
FDv2._run_initializers() so it isn't computed twice on the
success-with-fallback path. Also fixes a typo in the adjacent
comment ("if an only if" -> "if and only if").
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Summary
X-LD-FD-Fallback: truenow apply any accompanying payload first, then the data system swaps the synchronizer list to the FDv1 Fallback Synchronizer (or transitions to OFF when none is configured). Synchronizer-phase behavior is unchanged; the directive is one-way.Basis.fallback_to_fdv1and renamesUpdate.revert_to_fdv1toUpdate.fallback_to_fdv1so initializer/synchronizer results carry the signal explicitly. Polling and streaming data sources detect the header on both success and error paths via a shared helper.fdv1-fallbackcapability and accepts a top-leveldataSystem.fdv1Fallbackconfig object, wiring it directly todatasystem.fdv1_compatible_synchronizer(...)instead of inferring it from the last polling synchronizer. Bumps the contract-tests pin fromv3.0.0-alpha.4tov3.0.0-alpha.6.Mirrors the Go reference change in go-server-sdk#365 and contract-test wiring in go-server-sdk#368. Spec rationale lives in sdk-specs#155; the new harness suite is in sdk-test-harness#336.
Test plan
pytest(LD_SKIP_DATABASE_TESTS=1): 975 passed, 194 skipped (database-only).mypy ldclient: clean.isort --check,pycodestyle: clean.Start-then-payload and synchronizer fallback after a Valid update. No real wall clocks or sleeps introduced.v3.0.0-alpha.6runs the new "FDv1 Fallback Directive" suite cleanly in CI.Note
Medium Risk
Changes FDv2 initialization and streaming/polling fallback behavior based on server headers, which can alter data-source lifecycle and shutdown semantics. Risk is mitigated by extensive new unit/contract-test wiring but impacts core update/fallback paths.
Overview
Implements server-directed FDv1 fallback during FDv2 initialization. Initializers can now trigger a terminal switch to the configured FDv1 fallback synchronizer when
X-LD-FD-Fallback: trueis seen (including on initializer error responses), and the system transitions toOFFif no FDv1 fallback is configured.Unifies and hardens fallback signaling across FDv2 components. Adds
Basis.fallback_to_fdv1and renamesUpdate.revert_to_fdv1toUpdate.fallback_to_fdv1, updates polling to forward headers on failures and to honor fallback even for non-HTTP errors, and updates streaming to latch the directive fromStart, propagate it through subsequent updates/errors, and force-close the underlyingurllib3pool on shutdown.Updates contract-test integration. CI contract tests are bumped to
v3.0.0-alpha.6, the contract test service advertisesfdv1-fallback, and the test client config now acceptsdataSystem.fdv1Fallbackas a dedicated FDv1 fallback synchronizer configuration.Reviewed by Cursor Bugbot for commit cafcca1. Bugbot is set up for automated code reviews on this repo. Configure here.