
Fix conversation test isolation across async threads and file roots #1220

Merged
gh-xj merged 4 commits into eastreams:dev from chumyin:fix/conversation-flake-ordering-20260412 on Apr 14, 2026

Conversation

@chumyin
Collaborator

@chumyin chumyin commented Apr 12, 2026

Summary

  • Problem:
    Full-suite Rust tests could fail nondeterministically because conversation-runtime tests relied on thread-local selector overrides and ambient filesystem or audit defaults.
  • Why it matters:
    The failures kept CI red, obscured real regressions, and made conversation-runtime work look unstable even when the product code was unchanged.
  • What changed:
    • replaced test-only thread-local context-engine and turn-middleware overrides with process-wide synchronized overrides so spawned test threads observe the same selection state
    • added regression tests that prove both override paths remain visible across threads
    • introduced a test helper that binds tools.file_root to each harness temp directory for the multi-step file tool tests
    • made the network-egress bootstrap test use an in-memory audit sink instead of ambient HOME-derived defaults
  • What did not change (scope boundary):
    • no production runtime selection semantics changed outside test-only override plumbing
    • no user-facing config defaults or file tool behavior changed in production
    • no crate boundaries or public contracts changed
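
The file-root helper described above can be pictured with a minimal sketch. The `Config` and `ToolsConfig` types, the `Option<PathBuf>` field type, and `resolve_tool_path` are hypothetical stand-ins for illustration only; the real LoongClaw config types are not shown in this PR description and may differ.

```rust
use std::path::{Path, PathBuf};

// Hypothetical stand-ins for the real config types (assumption).
#[derive(Clone, Debug, Default)]
struct ToolsConfig {
    file_root: Option<PathBuf>,
}

#[derive(Clone, Debug, Default)]
struct Config {
    tools: ToolsConfig,
}

/// Baseline test config; leaves `file_root` unset, so path resolution
/// would fall back to ambient state such as the current directory.
fn test_config() -> Config {
    Config::default()
}

/// Same baseline, but with the file tool root pinned to the harness temp dir.
fn test_config_with_file_root(root: &Path) -> Config {
    let mut config = test_config();
    config.tools.file_root = Some(root.to_path_buf());
    config
}

/// Hypothetical resolver: uses the explicit root when present, and only
/// falls back to the process cwd (the non-hermetic case) when it is absent.
fn resolve_tool_path(config: &Config, relative: &str) -> PathBuf {
    let root = config
        .tools
        .file_root
        .clone()
        .unwrap_or_else(|| std::env::current_dir().unwrap_or_default());
    root.join(relative)
}
```

With the root pinned per harness, two tests running in parallel resolve the same relative path into different temp directories, so neither depends on the cwd of the test process.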

Linked Issues

Change Type

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Security hardening
  • CI / workflow / release

Touched Areas

  • Kernel / policy / approvals
  • Contracts / protocol / spec
  • Daemon / CLI / install
  • Providers / routing
  • Tools
  • Browser automation
  • Channels / integrations
  • ACP / conversation / session runtime
  • Memory / context assembly
  • Config / migration / onboarding
  • Docs / contributor workflow
  • CI / release / workflows

Risk Track

  • Track A (routine / low-risk)
  • Track B (higher-risk / policy-impacting)

Validation

  • cargo fmt --all -- --check
  • cargo clippy --workspace --all-targets --all-features -- -D warnings
  • cargo test --workspace --locked
  • cargo test --workspace --all-features --locked
  • Relevant architecture / dep-graph / docs checks for touched areas
  • Additional scenario, benchmark, or manual checks when behavior changed
  • If tests mutate process-global env: document how state is restored or serialized

Commands and evidence:

export CARGO_TARGET_DIR=<redacted-target-dir>
cargo fmt --all -- --check
cargo clippy --workspace --all-targets --all-features -- -D warnings
cargo test -p loongclaw-app --lib context_engine_env_override_is_visible_across_threads -- --nocapture
cargo test -p loongclaw-app --lib turn_middleware_env_override_is_visible_across_threads -- --nocapture
cargo test -p loongclaw-app --lib handle_turn_with_runtime_safe_lane_executes_session_wait_via_default_dispatcher -- --nocapture
cargo test -p loongclaw-app --lib handle_turn_with_runtime_provider_shape_function_calls_multi_step_chain_continues -- --nocapture
cargo test -p loongclaw-app --lib
cargo test --workspace --locked
cargo test --workspace --all-features --locked
./scripts/check_architecture_boundaries.sh

All commands passed locally.
The new process-wide test overrides are serialized with the existing registry/test guard helpers and are explicitly cleared or restored after each scoped use.

User-visible / Operator-visible Changes

  • None in production behavior.
  • Contributors and CI should stop seeing order-dependent conversation-runtime test failures caused by hidden ambient state.

Failure Recovery

  • Fast rollback or disable path:
    Revert dddf25d0e.
  • Observable failure symptoms reviewers should watch for:
    New tests that depend on ambient cwd, HOME-derived defaults, or per-thread selector overrides instead of harness-scoped state.

Reviewer Focus

  • crates/app/src/conversation/context_engine_registry.rs
  • crates/app/src/conversation/turn_middleware_registry.rs
  • crates/app/src/conversation/tests.rs
  • crates/app/src/context.rs

Please focus on whether the new synchronized test overrides stay strictly test-only and whether the harness file-root helper covers the multi-step file-tool chain without leaking production semantics.

Summary by CodeRabbit

  • Tests
    • Enhanced test infrastructure for improved reliability and thread-safety across internal systems.

@coderabbitai

coderabbitai Bot commented Apr 12, 2026

Walkthrough

This PR fixes test infrastructure to eliminate non-hermetic behavior in conversation runtime tests. It replaces thread-local environment overrides with process-wide mutex-protected storage, explicitly configures audit modes and file root paths in tests, and adds concurrency validation to prevent state leakage across worker threads.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Audit Mode Configuration | crates/app/src/context.rs | Updated test to explicitly construct LoongClawConfig with config.audit.mode = AuditMode::InMemory and pass it to bootstrap_kernel_context_with_config instead of using default. |
| Thread-Safe Environment Overrides | crates/app/src/conversation/context_engine_registry.rs, crates/app/src/conversation/turn_middleware_registry.rs | Replaced thread-local RefCell<Option<Option<String>>> with process-wide OnceLock<Mutex<Option<Option<String>>>> for test environment overrides. Updated accessor functions to lock mutex and added concurrency tests validating override visibility across spawned threads. |
| Test Configuration Helpers | crates/app/src/conversation/tests.rs | Introduced test_config_with_file_root() helper that clones test_config() and sets config.tools.file_root explicitly. Updated async test cases to use this helper with harness temp directory instead of relying on ambient filesystem root. |
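
As a rough illustration of the override change summarized above, the following sketch keeps a test override in a process-wide `OnceLock<Mutex<Option<Option<String>>>>`, so a value set on one thread is visible on any other. The function names and the `EXAMPLE_CONTEXT_ENGINE` variable are hypothetical, not the registry's actual API.

```rust
use std::sync::{Mutex, MutexGuard, OnceLock};

// Hypothetical stand-in for the registry's test-only override slot.
// Outer Option: is an override installed at all?
// Inner Option: the overridden value, where None simulates "variable unset".
static ENGINE_OVERRIDE: OnceLock<Mutex<Option<Option<String>>>> = OnceLock::new();

fn override_slot() -> MutexGuard<'static, Option<Option<String>>> {
    ENGINE_OVERRIDE
        .get_or_init(|| Mutex::new(None))
        .lock()
        // A panicking test must not wedge later tests, so recover the data
        // from a poisoned mutex instead of propagating the poison error.
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}

fn set_engine_override(value: Option<&str>) {
    *override_slot() = Some(value.map(str::to_owned));
}

fn clear_engine_override() {
    *override_slot() = None;
}

/// Returns the override when one is installed; otherwise reads the real
/// environment variable (the name here is an assumption for the sketch).
fn engine_id_from_env() -> Option<String> {
    // The guard is a temporary, so the lock is released after this statement.
    let installed = override_slot().clone();
    match installed {
        Some(overridden) => overridden,
        None => std::env::var("EXAMPLE_CONTEXT_ENGINE").ok(),
    }
}
```

A `thread_local!` slot would return the default on any thread other than the one that set it, which is the failure mode a multi-threaded Tokio test runs into. The trade-off is that a process-wide slot has to be serialized across tests, which the PR description says is done with the existing registry/test guard helpers.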

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Possibly related PRs

Suggested labels

conversation, config, size: S

Suggested reviewers

  • gh-xj
  • Ari4ka

Poem

🐰 A thread-safe warren where overrides play,
No ambient ghosts that lead tests astray,
With mutex locks guarding the shared state,
And hermetic paths—now we test with fate! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 68.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely summarizes the main changes: fixing test isolation issues related to async threads and file root configurations in conversation tests. |
| Linked Issues check | ✅ Passed | All coding requirements from issue #1219 are met: thread-local overrides converted to process-wide synchronized mechanisms [context_engine_registry.rs, turn_middleware_registry.rs], file_root binding via test helper [tests.rs], and in-memory audit sink [context.rs]. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to test-only infrastructure and directly address the objectives of fixing test isolation; no production code or public APIs were altered. |

github-actions bot added the conversation (Conversation runtime, session flow, and prompt assembly) and size: S (Small pull request: 51-200 changed lines) labels on Apr 12, 2026

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
crates/app/src/context.rs (1)

236-240: LGTM! Test isolation improvement is correct.

Explicitly setting AuditMode::InMemory ensures this test avoids filesystem I/O and HOME-derived paths, making it hermetic under parallel execution. The change correctly achieves the stated PR objective without affecting the test's purpose (verifying network egress capability grants).

Optional clarity suggestion: The error message on line 240 still says "default config should succeed" but the config is now explicitly modified to use InMemory audit mode. Consider updating it to something like "bootstrap with in-memory audit should succeed" for accuracy.

📝 Proposed message clarification
-        .expect("bootstrap with default config should succeed");
+        .expect("bootstrap with in-memory audit should succeed");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/app/src/context.rs` around lines 236 - 240, Update the assertion
message to reflect that the test now uses an explicit in-memory audit config:
change the error string passed to expect on the
bootstrap_kernel_context_with_config call (after configuring LoongClawConfig and
setting audit.mode = AuditMode::InMemory) from "bootstrap with default config
should succeed" to a message like "bootstrap with in-memory audit should
succeed" so it accurately describes the setup involving LoongClawConfig and
AuditMode::InMemory.
crates/app/src/conversation/context_engine_registry.rs (1)

242-253: Harden override cleanup against panic paths.

clear_context_engine_env_override() is manual here; if the test panics before that line, state can leak to subsequent tests.

Proposed hardening
     let _env_lock = conversation_selector_env_lock()
         .lock()
         .unwrap_or_else(|poisoned| poisoned.into_inner());
+    struct ClearOverrideOnDrop;
+    impl Drop for ClearOverrideOnDrop {
+        fn drop(&mut self) {
+            clear_context_engine_env_override();
+        }
+    }
+    let _clear_override = ClearOverrideOnDrop;
     set_context_engine_env_override(Some("registry-custom"));

     let observed = std::thread::spawn(context_engine_id_from_env)
         .join()
         .expect("join thread");
-
-    clear_context_engine_env_override();

     assert_eq!(observed.as_deref(), Some("registry-custom"));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/app/src/conversation/context_engine_registry.rs` around lines 242 -
253, The test context_engine_env_override_is_visible_across_threads manually
calls set_context_engine_env_override(...) and
clear_context_engine_env_override(), which can leak state if the test panics;
wrap the override in a RAII-style guard or use panic-safe cleanup (e.g., create
a ContextEngineEnvOverrideGuard that calls set_context_engine_env_override in
its constructor and clear_context_engine_env_override in Drop, or use
std::panic::catch_unwind to ensure clear_context_engine_env_override runs), keep
the existing conversation_selector_env_lock() usage and run
context_engine_id_from_env in the spawned thread as before so the override is
always cleared even on panic.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1cf04380-7e6c-4a9b-bde9-65e4420c3cfd

📥 Commits

Reviewing files that changed from the base of the PR and between 4da7a80 and dddf25d.

📒 Files selected for processing (4)
  • crates/app/src/context.rs
  • crates/app/src/conversation/context_engine_registry.rs
  • crates/app/src/conversation/tests.rs
  • crates/app/src/conversation/turn_middleware_registry.rs

@chumyin
Collaborator Author

chumyin commented Apr 12, 2026

Follow-up: the first CI rerun exposed one more default-suite race on Ubuntu.

  • root cause: tool-lease secret creation could briefly expose an empty file between create_new and the first write
  • symptom: feishu_webhook_card_callback_delayed_update_waits_for_response_body_consumption sometimes saw tool_lease_authority_unavailable: secret file ... is empty
  • fix: stage the secret in a temp file and publish it atomically with persist_noclobber, plus a parallel first-use regression test
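
The stage-then-publish idea in the fix can be sketched with the standard library alone. This is not the PR's code: `persist_noclobber` matches the name of a method on the `tempfile` crate's `NamedTempFile`, and the sketch below swaps in `std::fs::hard_link` as a stand-in with the same "fail if the destination exists" property. The function name and staging file naming are assumptions.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;
use std::sync::atomic::{AtomicU64, Ordering};

// Makes staging names unique within this process (hypothetical scheme).
static STAGE_COUNTER: AtomicU64 = AtomicU64::new(0);

/// Publishes `secret` at `final_path` only if no file exists there yet.
/// Returns Ok(true) if this caller published, Ok(false) if a competitor won.
fn publish_secret_noclobber(final_path: &Path, secret: &[u8]) -> io::Result<bool> {
    let dir = final_path.parent().unwrap_or_else(|| Path::new("."));
    let n = STAGE_COUNTER.fetch_add(1, Ordering::Relaxed);
    let staged = dir.join(format!(".secret.{}.{}.tmp", std::process::id(), n));

    // Write and sync the full contents before the final name exists, so
    // there is no window where the final path points at an empty file.
    let mut file = File::create(&staged)?;
    file.write_all(secret)?;
    file.sync_all()?;
    drop(file);

    // hard_link fails with AlreadyExists when final_path is taken, so an
    // existing secret is never overwritten and stays authoritative.
    let outcome = match fs::hard_link(&staged, final_path) {
        Ok(()) => Ok(true),
        Err(err) if err.kind() == io::ErrorKind::AlreadyExists => Ok(false),
        Err(err) => Err(err),
    };
    // The staging name is no longer needed whether we won or lost.
    let _ = fs::remove_file(&staged);
    outcome
}
```

The caller that gets `Ok(false)` is the loser of the race and should read the winner's file rather than generate a new secret.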

Local verification after the follow-up commit (5733e9364):

  • cargo fmt --all -- --check
  • cargo clippy --workspace --all-targets --all-features -- -D warnings
  • cargo test -p loongclaw-app --lib issue_tool_lease_parallel_first_use_keeps_secret_readable -- --nocapture
  • cargo test -p loongclaw-app --lib feishu_webhook_card_callback_delayed_update_waits_for_response_body_consumption -- --nocapture
  • cargo test -p loongclaw-app --lib
  • cargo test --workspace --locked
  • cargo test --workspace --all-features --locked
  • ./scripts/check_architecture_boundaries.sh

Everything passed locally. The new GitHub Actions rerun is pending.

github-actions bot added the tools (Tool runtime, policy adapters, and tool catalog behavior) and size: M (Medium pull request: 201-500 changed lines) labels and removed the size: S (Small pull request: 51-200 changed lines) label on Apr 12, 2026
@chumyin
Collaborator Author

chumyin commented Apr 13, 2026

Second follow-up pushed in 1c2942d02.

This addresses the remaining Windows default-suite failure:

  • tools::tests::tool_search_matches_multilingual_queries_across_languages
  • root cause: after concurrent secret publication, the loser branch could see AlreadyExists before the winning file became readable on Windows-class filesystems
  • fix: keep atomic publish, but retry the post-competition read through a bounded visibility window instead of a single immediate load
  • added regression: read_tool_lease_secret_after_competitor_publish_waits_for_visible_secret
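
A bounded visibility window for the loser branch might look like the sketch below. Which errors count as "not visible yet" is an assumption: here only `NotFound` is retried, and an empty file is rejected immediately to keep the fail-closed behavior. The function name, polling interval, and error messages are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::thread;
use std::time::{Duration, Instant};

/// Reads a secret that a competing publisher has already claimed, retrying
/// for up to `window` while the file is not yet visible to this reader.
fn read_secret_within_window(path: &Path, window: Duration) -> io::Result<Vec<u8>> {
    let deadline = Instant::now() + window;
    loop {
        match fs::read(path) {
            Ok(bytes) if !bytes.is_empty() => return Ok(bytes),
            // An atomically published secret is never empty, so treat an
            // empty file as invalid on-disk state and fail closed.
            Ok(_) => {
                return Err(io::Error::new(
                    io::ErrorKind::InvalidData,
                    "secret file is empty",
                ))
            }
            // Not visible yet: fall through to the deadline check and retry.
            Err(err) if err.kind() == io::ErrorKind::NotFound => {}
            Err(err) => return Err(err),
        }
        if Instant::now() >= deadline {
            // Give up instead of generating a fresh secret, which would
            // desynchronize leases across concurrent publishers.
            return Err(io::Error::new(
                io::ErrorKind::TimedOut,
                "secret did not become readable within the window",
            ));
        }
        thread::sleep(Duration::from_millis(5));
    }
}
```

The loop is bounded by a deadline rather than an unconditional sleep, so the fast path returns on the first successful read and a secret that never appears still produces an error.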

Local verification for this follow-up:

  • cargo fmt --all
  • cargo test -p loongclaw-app --lib read_tool_lease_secret_after_competitor_publish_waits_for_visible_secret -- --nocapture
  • cargo test -p loongclaw-app --lib issue_tool_lease_parallel_first_use_keeps_secret_readable -- --nocapture
  • cargo test -p loongclaw-app --lib tool_search_matches_multilingual_queries_across_languages -- --nocapture
  • cargo test -p loongclaw-app --lib
  • cargo clippy --workspace --all-targets --all-features -- -D warnings
  • ./scripts/check_architecture_boundaries.sh

Note: the previous rust-test-all-features (windows-latest) failure looked like a transient rustup download/network issue (os error 10054 while fetching the toolchain), not a repository code failure. The new push should rerun it anyway.

@chumyin
Copy link
Copy Markdown
Collaborator Author

chumyin commented Apr 13, 2026

Also refreshed docs/releases/architecture-drift-2026-04.md in 2ec3dc5ba.

The rerun surfaced the usual governance freshness gate again (Fresh architecture drift report), so I regenerated the tracked April snapshot and verified it locally with:

  • scripts/generate_architecture_drift_report.sh docs/releases/architecture-drift-2026-04.md
  • bash scripts/check_architecture_drift_freshness.sh docs/releases/architecture-drift-2026-04.md

That change is only the generated timestamp refresh needed to keep the tracked monthly drift report current.

@github-actions github-actions Bot added the documentation Improvements or additions to documentation. label Apr 13, 2026
@gh-xj gh-xj self-assigned this Apr 14, 2026
@gh-xj
Collaborator

gh-xj commented Apr 14, 2026

LoongClaw AI scientist has claimed this case and is running review execution.

@gh-xj
Collaborator

gh-xj commented Apr 14, 2026

LoongClaw QA Review — PR #1220

Reviewed commit: 2ec3dc5b
Risk: high
Agent: ai-scientist

Findings

  • medium: New process-wide context-engine override test still relies on manual cleanup, so a panic can leak global state into later tests — crates/app/src/conversation/context_engine_registry.rs:242

Coverage

  • Rust-specific review: applied (Rust runtime/test and secret-publication code changed)
  • Harness review: applied
  • Adversarial challenge: applied

Open Questions

  • GitHub currently reports this branch as unmergeable (dirty); rebase status should be confirmed after fixing the review finding.

Verdict

Needs a panic-safe cleanup guard in the new context-engine override test, and the branch also needs a rebase before it is merge-ready.
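
One possible shape for such a guard, as a sketch only: the override is installed by a value whose `Drop` restores the previous state, so cleanup also runs while a panic unwinds. The `OVERRIDE` static, `ScopedOverride`, and `current_override` are hypothetical names, and the slot is simplified to `Option<String>`.

```rust
use std::sync::{Mutex, MutexGuard};

// Hypothetical process-wide slot standing in for the registry's override.
static OVERRIDE: Mutex<Option<String>> = Mutex::new(None);

fn slot() -> MutexGuard<'static, Option<String>> {
    // Recover from poisoning so one failed test cannot block the rest.
    OVERRIDE.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Installs an override and restores the previous value when dropped,
/// including when the drop happens during unwinding from a panic.
struct ScopedOverride {
    previous: Option<String>,
}

impl ScopedOverride {
    fn install(value: &str) -> Self {
        // Option::replace stores the new value and hands back the old one.
        let previous = slot().replace(value.to_owned());
        ScopedOverride { previous }
    }
}

impl Drop for ScopedOverride {
    fn drop(&mut self) {
        *slot() = self.previous.take();
    }
}

fn current_override() -> Option<String> {
    slot().clone()
}
```

Restoring the previous value, rather than clearing unconditionally, keeps nested scopes correct: an inner guard puts back whatever the outer guard installed.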

@gh-xj gh-xj force-pushed the fix/conversation-flake-ordering-20260412 branch from 2ec3dc5 to eefbd47 on April 14, 2026 03:11
chumyin and others added 4 commits April 13, 2026 20:13
The flaky conversation tests relied on thread-local env overrides and ambient filesystem defaults. Tokio multi-threaded tests and default audit or file roots allowed state to disappear or leak across unrelated tests. Use process-wide synchronized test overrides, bind file tool roots to harness temp directories, and keep the bootstrap audit sink in-memory for isolated tests.

Constraint: Tests must stay compatible with existing runtime selection helpers and harness setup
Rejected: Serialize the entire suite | hides the isolation bug and slows CI
Rejected: Add retries or sleeps | masks shared-state defects instead of fixing them
Confidence: high
Scope-risk: narrow
Directive: Test-only runtime overrides must remain visible across spawned threads and must not depend on ambient cwd or HOME-derived defaults
Tested: cargo fmt --all -- --check
Tested: cargo clippy --workspace --all-targets --all-features -- -D warnings
Tested: cargo test --workspace --locked
Tested: cargo test --workspace --all-features --locked
Tested: ./scripts/check_architecture_boundaries.sh
Not-tested: GitHub Actions rerun for this branch

The remaining Ubuntu default-suite failure came from tool lease secret initialization exposing an empty file between create and write. Feishu callback tests were the first reliable CI witness, but the defect lived in shared runtime secret creation. Stage the secret in a temporary file, persist it without clobbering an existing secret, and add a parallel-first-use regression test so the lease authority stays readable under concurrent startup.

Constraint: Existing secret files must remain authoritative and invalid on-disk state must still fail closed
Rejected: Serialize callers with a process-wide lock | would not protect multi-process first use and hides the file publication race
Rejected: Regenerate over empty or malformed secret files | weakens fail-closed behavior for potentially tampered state
Confidence: high
Scope-risk: narrow
Directive: Runtime secrets that may be initialized concurrently must be published atomically instead of becoming visible before writes complete
Tested: cargo fmt --all -- --check
Tested: cargo clippy --workspace --all-targets --all-features -- -D warnings
Tested: cargo test -p loongclaw-app --lib issue_tool_lease_parallel_first_use_keeps_secret_readable -- --nocapture
Tested: cargo test -p loongclaw-app --lib feishu_webhook_card_callback_delayed_update_waits_for_response_body_consumption -- --nocapture
Tested: cargo test -p loongclaw-app --lib
Tested: cargo test --workspace --locked
Tested: cargo test --workspace --all-features --locked
Tested: ./scripts/check_architecture_boundaries.sh
Related: eastreams#1219
Not-tested: GitHub Actions rerun after this follow-up commit

The remaining Windows failure came from the loser path after concurrent secret creation. On Windows, a competing publish could report AlreadyExists before the winning file became readable to a follow-up load, which left unrelated tool-search tests failing under another test's temporary LOONG_HOME. Keep the atomic publish path, but treat the loser branch as a bounded publication-wait path instead of a single immediate read, and add a regression that exercises delayed secret visibility.

Constraint: Secret files must still fail closed when contents are empty, malformed, or tampered
Rejected: Fall back to generating a fresh secret on loser reads | would desynchronize leases across concurrent publishers
Rejected: Broaden environment locking around all tool tests | hides the publication race instead of fixing the shared runtime primitive
Confidence: high
Scope-risk: narrow
Directive: Shared runtime secret stores must separate atomic publication from bounded post-publish visibility handling on Windows-class filesystems
Tested: cargo fmt --all
Tested: cargo test -p loongclaw-app --lib read_tool_lease_secret_after_competitor_publish_waits_for_visible_secret -- --nocapture
Tested: cargo test -p loongclaw-app --lib issue_tool_lease_parallel_first_use_keeps_secret_readable -- --nocapture
Tested: cargo test -p loongclaw-app --lib tool_search_matches_multilingual_queries_across_languages -- --nocapture
Tested: cargo test -p loongclaw-app --lib
Tested: cargo clippy --workspace --all-targets --all-features -- -D warnings
Tested: ./scripts/check_architecture_boundaries.sh
Related: eastreams#1219
Not-tested: GitHub Actions rerun after this Windows-focused follow-up

Rebased the PR branch onto current dev, replaced the manual cleanup in the
process-wide context-engine override test with a scoped drop guard, and
refreshed the tracked April architecture drift snapshot on the rebased tree.

Constraint: The override path is process-global under test, so panic cleanup must restore prior state
Rejected: Keep manual clear_context_engine_env_override cleanup | leaks global override state if the test panics before teardown
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Any future process-wide test override must use scoped cleanup rather than trailing manual reset calls
Tested: cargo fmt --all -- --check
Tested: cargo clippy --workspace --all-targets --all-features -- -D warnings
Tested: cargo test -p loongclaw-app --lib scoped_context_engine_env_override_clears_on_panic -- --nocapture
Tested: cargo test -p loongclaw-app --lib context_engine_env_override_is_visible_across_threads -- --nocapture
Tested: cargo test -p loongclaw-app --lib turn_middleware_env_override_is_visible_across_threads -- --nocapture
Tested: cargo test -p loongclaw-app --lib issue_tool_lease_parallel_first_use_keeps_secret_readable -- --nocapture
Tested: cargo test -p loongclaw-app --lib read_tool_lease_secret_after_competitor_publish_waits_for_visible_secret -- --nocapture
Tested: cargo test -p loongclaw-app --lib feishu_webhook_card_callback_delayed_update_waits_for_response_body_consumption -- --nocapture
Tested: cargo test -p loongclaw-app --lib tool_search_matches_multilingual_queries_across_languages -- --nocapture
Tested: cargo test -p loongclaw-app --lib
Tested: cargo test --workspace --locked
Tested: cargo test --workspace --all-features --locked
Tested: scripts/generate_architecture_drift_report.sh docs/releases/architecture-drift-2026-04.md
Tested: bash scripts/check_architecture_drift_freshness.sh docs/releases/architecture-drift-2026-04.md
Tested: ./scripts/check_architecture_boundaries.sh
Not-tested: GitHub Actions rerun after push
@gh-xj gh-xj force-pushed the fix/conversation-flake-ordering-20260412 branch from eefbd47 to 98f6284 on April 14, 2026 03:23
@github-actions github-actions Bot removed the documentation Improvements or additions to documentation. label Apr 14, 2026
Collaborator

@gh-xj gh-xj left a comment

Reviewed the rebased head 98f6284 after conflict resolution and the panic-safe cleanup fix. Local verification is green (cargo fmt --all -- --check, cargo clippy --workspace --all-targets --all-features -- -D warnings, cargo test -p loongclaw-app --lib, cargo test --workspace --locked, cargo test --workspace --all-features --locked, ./scripts/check_architecture_boundaries.sh), and the required GitHub checks are passing.

@gh-xj
Collaborator

gh-xj commented Apr 14, 2026

LoongClaw QA Review — PR #1220

Reviewed commit: 98f62846
Risk: high
Agent: ai-scientist

Findings

  • No remaining code findings after the panic-safe cleanup fix and the rebase onto current dev.

Coverage

  • Rust-specific review: applied (Rust runtime/test and secret-publication code changed)
  • Harness review: applied
  • Adversarial challenge: applied

Open Questions

  • GitHub branch protection on dev sets require_last_push_approval=true, so a reviewer other than the last pusher must approve this rebased head before merge.

Verdict

The branch is conflict-free, required checks are green, and the code is ready to merge once a non-pusher approving review is recorded.

@gh-xj gh-xj merged commit 2ebb266 into eastreams:dev Apr 14, 2026
18 checks passed

Labels

  • conversation: Conversation runtime, session flow, and prompt assembly.
  • size: M: Medium pull request: 201-500 changed lines.
  • tools: Tool runtime, policy adapters, and tool catalog behavior.

Development

Successfully merging this pull request may close these issues.

[Bug]: conversation runtime tests rely on non-hermetic ambient state

2 participants