Honor provider cooldown hints and add gateway OpenAI compat by littlepenguin66 · Pull Request #1233 · eastreams/loong

littlepenguin66 · 2026-04-12T20:12:30Z

Summary

Problem:
The provider cooldown path ignored shorter provider retry hints, and the gateway still lacked a complete OpenAI-compatible /v1/chat/completions + /v1/models surface that inherited shared runtime semantics.
Why it matters:
Ignoring provider rate-limit hints causes unnecessary waits or premature retries, and a provider-only gateway compat path drifts from the shared turn/runtime behavior, usage reporting, and streaming semantics expected by OpenAI-compatible clients.
What changed:
Added provider rate-limit observation/parsing and threaded it through transport/failover/cooldown selection; made cooldown resolution honor provider hints when present; added the gateway OpenAI-compatible surface; routed accepted compat requests through the shared runtime path with usage propagation; disambiguated duplicate model ids; ensured streaming error termination with [DONE]; tightened Feishu/WeCom streaming request assertions.
What did not change (scope boundary):
Tool calling on the OpenAI-compatible gateway surface is still explicitly unsupported; channel entrypoints did not gain a new high-level fallback integration test for streaming-disabled transports.
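The cooldown precedence described above can be sketched roughly as follows. This is an illustration only: it assumes the provider hint arrives as a `Retry-After` value in delta-seconds, and the names `parse_retry_after_secs` and `resolve_cooldown` are hypothetical, not taken from crates/app/src/provider/rate_limit.rs.

```rust
use std::time::Duration;

/// Hypothetical helper: parse a `Retry-After` header value given in delta-seconds.
/// HTTP-date values are not handled in this sketch and yield `None`.
fn parse_retry_after_secs(value: &str) -> Option<Duration> {
    value.trim().parse::<u64>().ok().map(Duration::from_secs)
}

/// Hypothetical resolution rule: a provider hint wins whenever it is present,
/// even if it is shorter than the policy default; otherwise the policy applies.
fn resolve_cooldown(policy_default: Duration, provider_hint: Option<Duration>) -> Duration {
    provider_hint.unwrap_or(policy_default)
}
```

With a 30-second policy default, a 5-second hint resolves to 5 seconds, while a missing or unparseable hint falls back to the 30-second policy value.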

Linked Issues

Change Type

Touched Areas

Risk Track

Track A (routine / low-risk)
Track B (higher-risk / policy-impacting)

If Track B, fill these in:

Risk notes:
This changes provider cooldown behavior, provider request parsing, shared turn/runtime wiring, gateway request validation, and streaming behavior.
Rollout / guardrails:
Gateway compat rejects unsupported tool fields and non-user-ending message sequences; duplicate model ids are surfaced as profile_id:model; package-level tests cover provider, gateway, and channel paths.
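A minimal sketch of the request guardrails listed above, under stated assumptions: the types `ChatMessage`, `ChatRequest`, and `RejectReason` and the function `validate_compat_request` are hypothetical stand-ins, and rejecting an empty message list is an assumption that the PR text does not confirm.

```rust
/// Hypothetical, simplified message: content is omitted because
/// this validation only inspects roles.
struct ChatMessage {
    role: String,
}

/// Hypothetical, simplified view of a /v1/chat/completions request body.
struct ChatRequest {
    messages: Vec<ChatMessage>,
    has_tools: bool,       // true if a `tools` field was present
    has_tool_choice: bool, // true if a `tool_choice` field was present
}

#[derive(Debug, PartialEq)]
enum RejectReason {
    ToolsUnsupported,
    EmptyMessages, // assumption: an empty conversation is also rejected
    LastMessageNotUser,
}

/// Reject tool fields and any message sequence that does not end in a `user` message.
fn validate_compat_request(req: &ChatRequest) -> Result<(), RejectReason> {
    if req.has_tools || req.has_tool_choice {
        return Err(RejectReason::ToolsUnsupported);
    }
    match req.messages.last() {
        None => Err(RejectReason::EmptyMessages),
        Some(last) if last.role != "user" => Err(RejectReason::LastMessageNotUser),
        Some(_) => Ok(()),
    }
}
```

A gateway handler would presumably translate each `RejectReason` into a 400-class response before the request reaches the shared runtime.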
Rollback path:
Revert commit 6b617e3f (and follow-up doc commit if needed) to restore the pre-change provider cooldown and gateway behavior.

Validation

cargo fmt --all -- --check
cargo clippy --workspace --all-targets --all-features -- -D warnings
cargo test --workspace --locked
cargo test --workspace --all-features --locked
Relevant architecture / dep-graph / docs checks for touched areas
Additional scenario, benchmark, or manual checks when behavior changed
If this changes config/env fallback, limits, or defaults: include before/after behavior and regression coverage for explicit path, fallback path, and boundary values
If tests mutate process-global env: document how state is restored or serialized

Commands and evidence:

cargo fmt --all -- --check
cargo test -p loongclaw-app --lib
cargo test -p loongclaw

All three commands passed locally after the final fix set.

Before/after notes:
- Provider cooldowns now prefer provider hints even when shorter than the previous policy floor.
- Gateway compat accepted requests now route through shared runtime semantics instead of direct provider execution.
- Duplicate configured model ids are exposed as stable `profile_id:model` aliases.
- Streaming error responses now terminate with `[DONE]`.
- OpenAI-compatible requests that do not end in a `user` message are rejected instead of silently bypassing the shared runtime path.
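The duplicate-model aliasing noted above might look like the sketch below. The function `public_model_ids` is hypothetical, and the rule that a model configured under only one profile keeps its bare id is an assumption; the PR text only states that duplicates are exposed as `profile_id:model`.

```rust
use std::collections::HashMap;

/// Hypothetical: given (profile_id, model) pairs in configuration order,
/// return the ids a /v1/models listing would expose.
/// Models configured under several profiles become `profile_id:model`;
/// models configured once keep their bare id (assumption).
fn public_model_ids(configured: &[(&str, &str)]) -> Vec<String> {
    // Count how many profiles configure each model id.
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for (_, model) in configured {
        *counts.entry(*model).or_insert(0) += 1;
    }
    configured
        .iter()
        .map(|(profile, model)| {
            if counts[model] > 1 {
                format!("{}:{}", profile, model)
            } else {
                (*model).to_string()
            }
        })
        .collect()
}
```

Because the alias is derived only from configuration, it stays stable across restarts as long as the profile ids do not change.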

User-visible / Operator-visible Changes

OpenAI-compatible clients can call /v1/models and /v1/chat/completions on the gateway and get shared-runtime behavior, usage propagation, stable duplicate-model aliases, and terminated SSE streams.
Provider cooldown behavior now honors provider retry hints more precisely.
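To illustrate what a terminated SSE stream means for clients, here is a rough sketch of a renderer that always emits the `[DONE]` sentinel, including after an error. `StreamEvent` and `render_sse` are hypothetical names, and payloads are assumed to be pre-serialized JSON strings.

```rust
/// Hypothetical event type produced by a streaming turn.
/// Each variant carries an already-serialized JSON payload.
enum StreamEvent {
    Delta(String),
    Error(String),
}

/// Render events as SSE frames. The stream always ends with `data: [DONE]`,
/// and nothing is emitted between an error frame and the terminator.
fn render_sse(events: &[StreamEvent]) -> String {
    let mut out = String::new();
    for event in events {
        match event {
            StreamEvent::Delta(json) => {
                out.push_str(&format!("data: {}\n\n", json));
            }
            StreamEvent::Error(json) => {
                out.push_str(&format!("data: {}\n\n", json));
                break; // stop after the first error frame
            }
        }
    }
    out.push_str("data: [DONE]\n\n");
    out
}
```

Clients that loop until they see `[DONE]` can then exit cleanly on both the success path and the error path.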

Failure Recovery

Fast rollback or disable path:
Revert the branch tip or disable use of the OpenAI-compatible gateway surface until the revert lands.
Observable failure symptoms reviewers should watch for:
Unexpected 400s for malformed/non-user-ending gateway chat payloads, incorrect model alias selection, missing usage in non-streaming compat responses, or channels no longer sending "stream": true in provider requests.

Reviewer Focus

Provider hint parsing and cooldown precedence in crates/app/src/provider/rate_limit.rs and crates/app/src/provider/model_candidate_cooldown_runtime.rs.
Shared-runtime routing and usage propagation in crates/daemon/src/gateway/openai_compat.rs, crates/app/src/agent_runtime.rs, crates/app/src/chat.rs, and crates/app/src/conversation/turn_coordinator.rs.
Channel streaming assertions in crates/app/src/channel/feishu/webhook.rs, crates/app/src/channel/feishu/websocket.rs, and crates/app/src/channel/wecom.rs.

gh-xj · 2026-04-14T03:05:31Z

LoongClaw QA Review — PR #1233

Reviewed commit: 77b73ddd
Risk: high
Agent: ai-scientist

Findings

high: Streamed OpenAI-compatible turn execution can drop streamed tool calls because the streaming parser only reconstructs text deltas and finish markers, not streamed tool-call chunks. crates/app/src/provider/contracts.rs:64
medium: The non-streaming OpenAI-compatible gateway flattens propagated upstream 4xx/429 failures into HTTP 500, which breaks client-visible retry/auth semantics. crates/daemon/src/gateway/openai_compat.rs:229
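For the first finding, the general technique for rebuilding streamed tool calls is to accumulate fragments by their `index`. The sketch below assumes the fragments have already been decoded from each chunk's `delta.tool_calls` entries; `ToolCallDelta`, `ToolCall`, and `reconstruct_tool_calls` are hypothetical names and do not reflect the parser in crates/app/src/provider/contracts.rs.

```rust
use std::collections::BTreeMap;

/// One streamed tool-call fragment (hypothetical decoded form).
/// `id` and `name` usually appear only on the first fragment of a call.
struct ToolCallDelta {
    index: usize,
    id: Option<String>,
    name: Option<String>,
    arguments: String, // partial JSON text, concatenated across fragments
}

#[derive(Debug, Default, PartialEq)]
struct ToolCall {
    id: String,
    name: String,
    arguments: String,
}

/// Merge fragments by `index`, returning complete calls in index order.
fn reconstruct_tool_calls(deltas: &[ToolCallDelta]) -> Vec<ToolCall> {
    let mut calls: BTreeMap<usize, ToolCall> = BTreeMap::new();
    for delta in deltas {
        let call = calls.entry(delta.index).or_default();
        if let Some(id) = &delta.id {
            call.id = id.clone();
        }
        if let Some(name) = &delta.name {
            call.name = name.clone();
        }
        call.arguments.push_str(&delta.arguments);
    }
    calls.into_values().collect()
}
```

Keying on `index` rather than arrival order matters because fragments of different calls can interleave within one stream.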

Coverage

Rust-specific review: applied (manual code-path review plus targeted Rust tests)
Harness review: applied (repo-owned tests green; wrapper-path loongclaw-dev QA still lacks behavioral coverage)
Adversarial challenge: applied (the participant_id QA build blocker did not reproduce under direct cargo build, so it remains a harness inconsistency rather than a confirmed branch defect)

Open Questions

GitHub currently reports this PR as DIRTY; rebase is required before closure.
Wrapper-path E2E proof through the OpenAI-compatible gateway is still missing and should be rerun after rebase from a clean QA lane.

Verdict

Blocking findings on the current head; rebase onto dev, fix the two runtime issues, and rerun boundary-aligned gateway QA before re-dispatch.

Merges dev into the PR branch to clear the current conflict set. The release support drift docs moved on dev, so the stale top-level report stays deleted and the updated support-side artifact wins.

Constraint: PR eastreams#1233 is currently CONFLICTING against dev
Rejected: Keep the deleted top-level drift report | dev already removed that path
Confidence: high
Scope-risk: narrow
Directive: Re-run CI-parity gates after any follow-up code edits on top of this merge
Tested: Merge conflict resolution only
Not-tested: Runtime/code changes after the merge

The compat gateway branch now rebuilds streamed OpenAI tool calls and keeps upstream HTTP status classes visible on the non-streaming gateway surface. This keeps the PR aligned with the shared runtime after the dev merge while preserving client-visible retry and auth behavior.

Constraint: PR eastreams#1233 had to absorb origin/dev before follow-up fixes could land
Constraint: Gateway turns currently surface provider failures as strings, so status preservation had to reuse embedded failover snapshots
Rejected: Keep generic HTTP 500 mapping | breaks OpenAI-compatible retry/auth semantics
Rejected: Duplicate failover snapshot parsing inside the gateway | reuse the existing provider parser instead
Confidence: high
Scope-risk: moderate
Directive: If streamed OpenAI turns gain richer chunk shapes, extend the parser/tests before widening observer-backed streaming further
Tested: cargo fmt --all -- --check; cargo clippy --workspace --all-targets --all-features -- -D warnings; cargo test -p loongclaw-app reconstructs_openai_tool_calls -- --nocapture; cargo test -p loongclaw gateway_openai_chat_completion_preserves_provider_rate_limit_status -- --nocapture; cargo test -p loongclaw-app provider:: -- --nocapture; cargo test -p loongclaw gateway_openai_chat_completion -- --nocapture; cargo test --workspace; cargo test --workspace --all-features
Not-tested: loongclaw-dev wrapper-path QA rerun from a clean slot after these fixes
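The status preservation described in this commit could be approximated as below. This is a sketch under assumptions: `gateway_status` is a hypothetical name, the upstream status is assumed to have been recovered already (for example from a failover snapshot), and passing 5xx codes through unchanged is a guess at what "status classes visible" means.

```rust
/// Hypothetical mapping from a recovered upstream status to the gateway response status.
/// Upstream 4xx and 5xx codes stay visible; anything else (including an
/// unknown status) falls back to a generic 500.
fn gateway_status(upstream_status: Option<u16>) -> u16 {
    match upstream_status {
        // Keep auth (401/403), rate-limit (429), and server errors visible to clients.
        Some(code @ 400..=599) => code,
        // No recoverable upstream status, or a non-error code: generic failure.
        _ => 500,
    }
}
```

Under this mapping an upstream 429 reaches the client as 429, so OpenAI-compatible clients can apply their own retry logic instead of treating the failure as a server fault.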

The governance workflow regenerates the tracked monthly architecture drift report from the merged tree. The compat parser and gateway updates changed those tracked metrics, so the checked-in report needed to be refreshed to match the new merged reality.

Constraint: governance requires docs/releases/support/architecture-drift-2026-04.md to match a fresh generated report on the PR merge commit
Rejected: Leave the stale report in place | keeps governance red even though the code is otherwise mergeable
Confidence: high
Scope-risk: narrow
Directive: Whenever provider/chat/turn-coordinator size shifts on a release-tracked branch, refresh the monthly drift report before expecting governance to pass
Tested: bash scripts/check_architecture_drift_freshness.sh docs/releases/support/architecture-drift-2026-04.md
Not-tested: Full governance workflow rerun before push

littlepenguin66 added 4 commits April 13, 2026 15:01

fix(runtime): honor rate-limit hints and compat turns

f974d42

fix(test): stabilize gateway and announce ci

b1dba60

fix(runtime): stabilize gateway compat regressions

467706d

Update architecture-drift-2026-04.md

3eba385

littlepenguin66 force-pushed the feat/983-996-rate-limit-openai-compat branch from 2ec715e to 3eba385 April 13, 2026 07:05

github-actions Bot added the tools label (Tool runtime, policy adapters, and tool catalog behavior.) Apr 13, 2026

littlepenguin66 added 3 commits April 13, 2026 16:15

Use memory config directly without environment overrides

2c7d54a

Disable implicit default session creation in AgentRuntime

085f382

Fix relative paths in architecture drift release doc

77b73dd

gh-xj self-assigned this Apr 14, 2026

gh-xj added 2 commits April 13, 2026 20:18

github-actions Bot removed the documentation label (Improvements or additions to documentation.) Apr 14, 2026

github-actions Bot added the documentation label (Improvements or additions to documentation.) Apr 14, 2026

gh-xj approved these changes Apr 14, 2026

gh-xj merged commit bae1305 into eastreams:dev Apr 14, 2026
18 checks passed

littlepenguin66 deleted the feat/983-996-rate-limit-openai-compat branch April 14, 2026 09:54

gh-xj merged 10 commits into eastreams:dev from littlepenguin66:feat/983-996-rate-limit-openai-compat