Add ChatGPT/Codex subscription tier (loopback proxy)#7401

Draft
Git-on-my-level wants to merge 2 commits into BasedHardware:main from Git-on-my-level:codex/chatgpt-subscription-support

Conversation

@Git-on-my-level

Summary

  • Adds ChatGPT/Codex subscription enrollment on the backend (/v1/users/me/chatgpt-active) with LLM quota bypass (separate from four-key BYOK; transcription gates unchanged).
  • Ships a localhost OpenAI-compatible Codex proxy (desktop/codex-proxy) and desktop Settings UX to sign in via codex login, enroll, and route proactive LLM workloads through the proxy.
  • Adds local memory wiki + FTS5 search when ChatGPT tier is active (replaces vector embedding search for task/memory assistants).

Scope / upstream notes

This PR is cherry-picked from local hybrid work and does not include the local-daemon / HybridLLMClient stack. Proactive AI (GeminiClient) uses CodexLLMClient when enrolled; main-window chat still uses the existing pi-mono bridge (ChatGPT tier for chat can follow in a later PR).

Test plan

  • backend/tests/unit/test_chatgpt_enrollment.py
  • cd desktop/codex-proxy && cargo build --release
  • Desktop: Settings → Sign in with ChatGPT → complete Terminal codex login → verify proxy health and subscription shows unlimited LLM
  • Proactive capture / task extraction works with ChatGPT tier (no Gemini proxy calls for enrolled users)

Made with Cursor

Route proactive LLM through a localhost Codex proxy using ~/.codex auth,
add Settings enrollment UX, local memory wiki + FTS for search, backend
tier activation, and bundle the proxy in run.sh. Adapted for upstream main
without hybrid local-daemon dependencies.

Co-authored-by: Cursor <cursoragent@cursor.com>

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bcdc7bbed2

Comment thread backend/routers/users.py
Comment on lines +826 to +830
        raise HTTPException(
            status_code=400,
            detail='Invalid fingerprint: expected lowercase hex SHA-256 (64 chars)',
        )
    users_db.set_chatgpt_active(uid, data.fingerprint)

P0: Verify subscription ownership before setting chatgpt.active

This endpoint grants ChatGPT enrollment after only a format check on fingerprint, so any authenticated user can POST an arbitrary 64-hex string and flip chatgpt.active to true. Because downstream gating (is_chatgpt_active) is what unlocks unlimited subscription/quota paths, this creates a direct paid-feature bypass without proving the caller actually has a ChatGPT/Codex subscription.
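
To make the gap concrete, here is a minimal Python sketch. The pattern below is an assumption standing in for `_SHA256_HEX_RE`, whose definition is not shown in this thread; it only encodes the "64 lowercase hex characters" rule quoted in the error message.

```python
import re
import secrets

# Assumed stand-in for _SHA256_HEX_RE: exactly 64 lowercase hex characters.
SHA256_HEX_RE = re.compile(r"[0-9a-f]{64}")


def passes_format_check(fingerprint: str) -> bool:
    """Return True when the string merely looks like a SHA-256 hex digest."""
    return SHA256_HEX_RE.fullmatch(fingerprint) is not None


# 32 random bytes render as 64 lowercase hex characters, so a value with no
# connection to any Codex auth file still satisfies the format check.
forged = secrets.token_hex(32)
```

A format check of this kind says nothing about where the value came from, which is the point the review raises.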

Comment on lines +211 to +215
"model": if requested_model_hint.trim().is_empty() { Value::Null } else { Value::String(requested_model_hint.clone()) },
"choices": [{
"index": 0,
"message": { "role": "assistant", "content": assistant_text },
"logprobs": null,

P1: Preserve tool-call outputs in proxy chat completion responses

The proxy response synthesized here always returns only assistant text content and never includes message.tool_calls, even when the upstream Codex stream represents tool-calling output. CodexLLMClient.performGeminiCompatibleToolRound depends on tool_calls to drive Task/Insight tool loops, so in ChatGPT mode those flows can terminate early or fail on the first required-tool round because no executable tool call is propagated.
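
For reference, a hedged Python sketch of what an OpenAI-compatible chat completion carrying tool calls looks like. The function name `build_chat_completion` and the input shape (a list of dicts with `name` and `arguments`) are hypothetical; the proxy itself is written in Rust and its upstream event format is not shown here.

```python
import json
import time
import uuid
from typing import Any, Optional


def build_chat_completion(
    model: Optional[str],
    text: Optional[str],
    tool_calls: list[dict[str, Any]],
) -> dict[str, Any]:
    """Assemble a chat.completion body, including tool_calls when present."""
    message: dict[str, Any] = {"role": "assistant", "content": text or None}
    if tool_calls:
        message["tool_calls"] = [
            {
                "id": call.get("id") or f"call_{uuid.uuid4().hex}",
                "type": "function",
                "function": {
                    "name": call["name"],
                    # Chat completions carry arguments as a JSON-encoded string.
                    "arguments": json.dumps(call.get("arguments", {})),
                },
            }
            for call in tool_calls
        ]
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {
                "index": 0,
                "message": message,
                "logprobs": None,
                # Clients that drive tool loops key off this finish reason.
                "finish_reason": "tool_calls" if tool_calls else "stop",
            }
        ],
    }
```

A client that loops on `message.tool_calls` needs both the list and the `tool_calls` finish reason to continue past the first round.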

Comment thread backend/database/users.py
Comment on lines +232 to +236
    if isinstance(last_seen, datetime):
        age = (datetime.now(timezone.utc) - last_seen).total_seconds()
    else:
        return False
    return age <= BYOK_HEARTBEAT_TTL_SECONDS

P2: Refresh ChatGPT enrollment before TTL-based deactivation

is_chatgpt_active expires enrollment strictly by a 7-day TTL on last_seen_at, but desktop enrollment currently updates that timestamp only during explicit connect flow (not on normal app startup/use). As a result, active users can silently drop out of ChatGPT-unlimited status after seven days unless they manually reconnect, which causes unexpected quota/paywall regressions.
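
One way to picture the fix is a freshness check paired with a throttled heartbeat. This is a sketch only: the 7-day TTL comes from the comment above, while `HEARTBEAT_MIN_INTERVAL_SECONDS` and both function names are hypothetical and not taken from the PR.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

CHATGPT_HEARTBEAT_TTL_SECONDS = 7 * 24 * 60 * 60  # 7-day TTL described above
HEARTBEAT_MIN_INTERVAL_SECONDS = 60 * 60  # hypothetical throttle: one write per hour


def is_enrollment_fresh(last_seen: Optional[datetime], now: datetime) -> bool:
    """Enrollment counts as active only while last_seen is within the TTL."""
    if not isinstance(last_seen, datetime):
        return False
    return (now - last_seen).total_seconds() <= CHATGPT_HEARTBEAT_TTL_SECONDS


def should_write_heartbeat(last_seen: Optional[datetime], now: datetime) -> bool:
    """Refresh last_seen on normal use, but skip writes that are too frequent."""
    if not isinstance(last_seen, datetime):
        return True
    return (now - last_seen).total_seconds() >= HEARTBEAT_MIN_INTERVAL_SECONDS
```

Calling the heartbeat check on app launch or on regular requests keeps active users inside the TTL without a database write per request.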

@greptile-apps
Contributor

greptile-apps Bot commented May 20, 2026

Greptile Summary

This PR adds a ChatGPT/Codex subscription tier that routes proactive LLM workloads through a local loopback Rust proxy (desktop/codex-proxy), bypassing Omi's LLM quota for users who have a ChatGPT subscription. It also introduces a local FTS5-backed memory wiki as an alternative to vector-embedding search when the ChatGPT tier is active.

  • Backend: Adds POST/DELETE /v1/users/me/chatgpt-active endpoints with fingerprint-format-only validation, integrates chatgpt-active checks into trial paywall bypass and enforce_chat_quota, and returns a synthetic unlimited subscription response for enrolled users.
  • Desktop: New Swift services (CodexAuthService, CodexProxyService, CodexEnrollmentCoordinator, CodexLLMClient) manage the enrollment flow, subprocess lifecycle, and OpenAI-compatible requests to the proxy; MemoryWikiStorage + RewindDatabase migration add a local FTS5 wiki.
  • Rust proxy: desktop/codex-proxy/src/main.rs translates OpenAI chat completion requests to Codex SSE responses with OAuth token refresh support.
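
As an illustration of the SSE-to-JSON translation step, here is a small Python sketch that collects assistant text from a server-sent event stream. The event shape is an assumption: it uses the `response.output_text.delta` event type with a `delta` field, as in the OpenAI Responses streaming format, and the exact events the Codex endpoint emits are not shown in this PR. The function name is hypothetical and is not the Rust proxy's implementation.

```python
import json
from typing import Iterable


def collect_text_from_sse(lines: Iterable[str]) -> str:
    """Concatenate text deltas from 'data:' lines of an SSE stream."""
    parts: list[str] = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and "event:" lines
        payload = line[len("data:"):].strip()
        if not payload or payload == "[DONE]":
            continue
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # ignore malformed frames in this sketch
        # Assumed event shape; other event types are ignored.
        if isinstance(event, dict) and event.get("type") == "response.output_text.delta":
            delta = event.get("delta")
            if isinstance(delta, str):
                parts.append(delta)
    return "".join(parts)
```

The collected string would then become `message.content` in the OpenAI-compatible response returned to the desktop client.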

Confidence Score: 2/5

Not safe to merge as-is: the backend activation endpoint would let any authenticated user claim unlimited LLM access without a real ChatGPT subscription, and the MemorySearchMode default change would silently disable vector-search deduplication for all desktop users.

The backend fingerprint check accepts any syntactically valid SHA-256 hex, allowing any authenticated Omi user to permanently bypass LLM quotas and the trial paywall without owning a ChatGPT subscription. Separately, MemorySearchMode.current falls back to localWiki for users with no stored preference, replacing vector-embedding search globally and breaking deduplication in TaskAssistant and MemoryAssistant for non-ChatGPT users.

backend/routers/users.py (activation endpoint lacks subscription validation), desktop/Desktop/Sources/MemoryWikiStorage.swift (wrong default in MemorySearchMode.current)

Security Review

  • Quota bypass without subscription proof (backend/routers/users.py): The POST /v1/users/me/chatgpt-active endpoint validates only that the supplied fingerprint is a 64-char lowercase hex string. Any authenticated Omi user can activate the ChatGPT tier and bypass the trial paywall and monthly LLM quotas by submitting any arbitrary SHA-256 hex value — no real ChatGPT subscription required. Unlike BYOK, there is no per-request proof-of-possession.
  • No secrets are stored server-side; auth tokens remain on-device and only traverse the localhost loopback.

Important Files Changed

| Filename | Overview |
| --- | --- |
| backend/routers/users.py | Adds ChatGPT enrollment endpoints. Activation only validates fingerprint format, with no server-side proof of a real OpenAI subscription, allowing any authenticated user to bypass LLM quotas and the trial paywall. |
| desktop/Desktop/Sources/MemoryWikiStorage.swift | New local wiki + FTS5 storage for memories. MemorySearchMode.current defaults to localWiki for all users (wrong default), silently replacing vector-search deduplication system-wide. |
| desktop/codex-proxy/src/main.rs | New Rust loopback proxy translating OpenAI chat completions to Codex SSE responses with OAuth token refresh. Contains dead code (codex_body_to_chat_completion) used only in tests. |
| backend/database/users.py | Adds ChatGPT state CRUD following the BYOK Firestore pattern. TTL reuses BYOK_HEARTBEAT_TTL_SECONDS, which is semantically misleading but functionally correct. |
| backend/utils/subscription.py | Integrates ChatGPT tier into trial paywall logic and enforce_chat_quota; chatgpt-active users bypass the 3-day paywall. Mirrors BYOK treatment. |
| desktop/Desktop/Sources/CodexProxyService.swift | Manages the proxy subprocess lifecycle with health monitoring and auto-restart. Stderr is silenced, making startup failures hard to diagnose. |
| desktop/Desktop/Sources/CodexEnrollmentCoordinator.swift | Orchestrates ChatGPT enrollment flow. Rollback on failure is correct (clears enrollment, stops proxy). |
| desktop/Desktop/Sources/Rewind/Core/RewindDatabase.swift | Adds createMemoryWikiPages GRDB migration with FTS5 virtual table and three sync triggers. Pattern is consistent with existing FTS migrations. |
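
The "FTS5 virtual table and three sync triggers" pattern mentioned for RewindDatabase.swift can be sketched with Python's built-in `sqlite3` module. The table, column, and trigger names below are hypothetical and are not the PR's actual schema; the sketch also assumes the local SQLite build includes FTS5.

```python
import sqlite3

# Hypothetical schema: a content table, an external-content FTS5 index over it,
# and insert/delete/update triggers that keep the index in sync.
SCHEMA = """
CREATE TABLE memory_wiki_pages (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    body TEXT NOT NULL
);
CREATE VIRTUAL TABLE memory_wiki_fts USING fts5(
    title, body, content='memory_wiki_pages', content_rowid='id'
);
CREATE TRIGGER memory_wiki_ai AFTER INSERT ON memory_wiki_pages BEGIN
    INSERT INTO memory_wiki_fts(rowid, title, body)
    VALUES (new.id, new.title, new.body);
END;
CREATE TRIGGER memory_wiki_ad AFTER DELETE ON memory_wiki_pages BEGIN
    INSERT INTO memory_wiki_fts(memory_wiki_fts, rowid, title, body)
    VALUES ('delete', old.id, old.title, old.body);
END;
CREATE TRIGGER memory_wiki_au AFTER UPDATE ON memory_wiki_pages BEGIN
    INSERT INTO memory_wiki_fts(memory_wiki_fts, rowid, title, body)
    VALUES ('delete', old.id, old.title, old.body);
    INSERT INTO memory_wiki_fts(rowid, title, body)
    VALUES (new.id, new.title, new.body);
END;
"""


def open_wiki_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema applied."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def search(conn: sqlite3.Connection, query: str) -> list[int]:
    """Return ids of pages matching the full-text query, best match first."""
    rows = conn.execute(
        "SELECT rowid FROM memory_wiki_fts WHERE memory_wiki_fts MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()
    return [row[0] for row in rows]
```

With an external-content index, the update trigger has to remove the old row's terms before adding the new ones, which is why updates need both statements.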

Sequence Diagram

sequenceDiagram
    participant User as Desktop User
    participant Enroll as EnrollmentCoordinator
    participant AuthFile as CodexAuthService
    participant Proxy as CodexProxyService
    participant Backend as Omi Backend
    participant OpenAI as Codex API

    User->>Enroll: connect()
    Enroll->>AuthFile: loadSnapshot()
    alt auth file missing
        Enroll->>Enroll: launch Terminal (codex login)
        Enroll->>AuthFile: poll every 2s
    end
    AuthFile-->>Enroll: AuthSnapshot
    Enroll->>Proxy: ensureRunning()
    Proxy-->>Proxy: spawn omi-codex-proxy on loopback port
    Enroll->>Backend: POST chatgpt-active (fingerprint)
    Backend-->>Backend: format-only validation, write Firestore
    Backend-->>Enroll: "active=true"

    Note over User, OpenAI: Proactive AI inference
    User->>Proxy: POST v1 chat completions
    Proxy->>OpenAI: codex responses (SSE)
    OpenAI-->>Proxy: SSE stream
    Proxy-->>User: OpenAI-compat JSON

Reviews (1): Last reviewed commit: "Add ChatGPT/Codex plan support via local..."

Comment on lines +171 to +172
let raw = UserDefaults.standard.string(forKey: "memory_search_mode") ?? "local_wiki"
return raw == "vector" ? .vectorEmbeddings : .localWiki

P1 The fallback value "local_wiki" causes all desktop users (regardless of ChatGPT tier) to use the local wiki search by default. Any user without a stored memory_search_mode preference gets .localWiki, which disables vector embedding search system-wide. The PR description states local wiki should only activate when ChatGPT tier is active, so non-ChatGPT users would lose vector-search deduplication in both TaskAssistant and MemoryAssistant.

Suggested change
- let raw = UserDefaults.standard.string(forKey: "memory_search_mode") ?? "local_wiki"
- return raw == "vector" ? .vectorEmbeddings : .localWiki
+ let raw = UserDefaults.standard.string(forKey: "memory_search_mode") ?? "vector"
+ return raw == "vector" ? .vectorEmbeddings : .localWiki
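
The effect of the suggested default can be modeled in a few lines of Python. This mirrors the Swift logic above for illustration only; the function name and string constants are hypothetical stand-ins for `MemorySearchMode.current` and its cases.

```python
from typing import Optional

VECTOR = "vector"
LOCAL_WIKI = "local_wiki"


def resolve_memory_search_mode(stored: Optional[str]) -> str:
    """Fall back to vector search when no preference has been stored."""
    raw = stored if stored is not None else VECTOR
    # As in the Swift snippet, any value other than "vector" selects the wiki.
    return VECTOR if raw == VECTOR else LOCAL_WIKI
```

Note that with this comparison an unrecognized stored value still maps to the local wiki, so only the missing-preference case changes.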

Comment thread backend/routers/users.py
Comment on lines +820 to +832
@router.post('/v1/users/me/chatgpt-active', tags=['v1'])
def activate_chatgpt_endpoint(
    data: ChatGPTActivateRequest, uid: str = Depends(auth.get_current_user_uid_no_byok_validation)
):
    """Enroll ChatGPT / Codex subscription tier (LLM workloads only; no provider keys stored)."""
    if not _SHA256_HEX_RE.match(data.fingerprint):
        raise HTTPException(
            status_code=400,
            detail='Invalid fingerprint: expected lowercase hex SHA-256 (64 chars)',
        )
    users_db.set_chatgpt_active(uid, data.fingerprint)
    clear_trial_paywall_cache(uid)
    return {"active": True}

P1 security No server-side validation of ChatGPT subscription

The activation endpoint accepts any 64-char hex string as a valid enrollment fingerprint. The fingerprint is stored in Firestore, but is_chatgpt_active (and therefore enforce_chat_quota) never re-verifies it against OpenAI — it only checks that active=True and the Firestore timestamp is fresh. Any authenticated Omi user can POST an arbitrary SHA-256 hex to permanently bypass the 3-day trial paywall and monthly LLM quota without ever having a ChatGPT subscription. Unlike BYOK, there is no per-request proof-of-possession; once enrolled, the bypass is unconditional.

Comment thread backend/database/users.py
Comment on lines +232 to +239
    if isinstance(last_seen, datetime):
        age = (datetime.now(timezone.utc) - last_seen).total_seconds()
    else:
        return False
    return age <= BYOK_HEARTBEAT_TTL_SECONDS


def set_chatgpt_active(uid: str, fingerprint: str):

P2 The ChatGPT TTL reuses BYOK_HEARTBEAT_TTL_SECONDS, a constant whose name and comment are semantically tied to the BYOK feature. If the two TTLs ever diverge, this shared constant would be the wrong one to change. A dedicated constant makes the intent explicit.

Suggested change
      if isinstance(last_seen, datetime):
          age = (datetime.now(timezone.utc) - last_seen).total_seconds()
      else:
          return False
-     return age <= BYOK_HEARTBEAT_TTL_SECONDS
+     return age <= CHATGPT_HEARTBEAT_TTL_SECONDS


  def set_chatgpt_active(uid: str, fingerprint: str):

Comment on lines +567 to +606
fn codex_body_to_chat_completion(model_fallback: &str, bytes: &[u8]) -> Result<Value, String> {
    let v: Value = serde_json::from_slice(bytes).map_err(|e| format!("upstream json: {e}"))?;

    if v.get("choices").is_some() {
        let mut enriched = v;
        if enriched.get("id").and_then(Value::as_str).is_none()
            || enriched.get("id") == Some(&Value::Null)
        {
            enriched["id"] = Value::String(new_chat_completion_id());
        }
        if enriched.get("object").and_then(Value::as_str).is_none()
            || enriched.get("object") == Some(&Value::Null)
        {
            enriched["object"] = Value::from("chat.completion");
        }
        if enriched.get("created").and_then(Value::as_i64).is_none()
            || enriched.get("created") == Some(&Value::Null)
        {
            enriched["created"] = Value::Number(unix_secs().into());
        }
        Ok(enriched)
    } else {
        let text = extract_assistant_text(&v)
            .ok_or_else(|| serde_json::to_string(&v).unwrap_or_else(|_| "(unprintable)".into()))?;
        let model = chat_model_choice(&v, model_fallback)?;
        Ok(json!({
            "id": new_chat_completion_id(),
            "object": "chat.completion",
            "created": unix_secs(),
            "model": model,
            "choices": [{
                "index": 0,
                "message": { "role": "assistant", "content": text},
                "logprobs": null,
                "finish_reason": infer_finish_reason(&v),
            }],
            "usage": v.get("usage").cloned().unwrap_or(Value::Null),
        }))
    }
}

P2 codex_body_to_chat_completion is dead production code

invoke_codex assembles the response inline using collect_text_from_codex_sse and never calls this function. It is referenced only by the maps_responses_like_output_message unit test. Keeping it creates a diverging code path that can mislead future contributors into thinking the proxy has a non-SSE fallback mode.

Comment on lines +74 to +75
proc.standardOutput = FileHandle.nullDevice
proc.standardError = FileHandle.nullDevice

P2 Proxy stderr silenced — startup failures are undiagnosable

proc.standardError = FileHandle.nullDevice discards all error output from the proxy process. If the proxy crashes on startup (missing auth file, port conflict, bad token format), the only signal the Swift side sees is a health-check timeout with the generic message "Codex proxy failed to start". Routing stderr to a pipe would allow the error message to be surfaced in lastError.
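
The idea of routing stderr to a pipe can be sketched in Python with `subprocess`. This is an illustration of the technique, not the Swift service: `probe_startup` and its grace period are hypothetical, and the sketch stops the child once it has proven it stays alive, where a real launcher would keep it running and keep draining the pipe.

```python
import subprocess
from typing import Optional


def probe_startup(cmd: list[str], grace_seconds: float = 2.0) -> Optional[str]:
    """Return None if the process survives the grace period, else its error text."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,  # keep stderr instead of discarding it
        text=True,
    )
    try:
        _, stderr_text = proc.communicate(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # Still running after the grace period: treat as a successful start.
        proc.kill()
        proc.communicate()
        return None
    # The process exited early; surface what it wrote to stderr.
    return stderr_text.strip() or f"exited with code {proc.returncode}"
```

The returned string is the kind of detail that could populate `lastError` in place of a generic startup failure message.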

Require X-ChatGPT-Fingerprint on requests for quota and subscription bypass,
refresh enrollment heartbeat from desktop launch and throttled server updates,
resolve Codex transport at call time in GeminiClient, label wiki search hits
distinctly from tasks, and include memory id in wiki slugs to avoid collisions.
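
A rough Python sketch of the per-request check this commit message describes. Only the header name `X-ChatGPT-Fingerprint` comes from the message; the function name, the plain-mapping header lookup, and the comparison strategy are assumptions, since the backend change itself is not shown here.

```python
import hmac
from typing import Mapping, Optional


def request_matches_enrollment(
    headers: Mapping[str, str], stored_fingerprint: Optional[str]
) -> bool:
    """Allow the quota bypass only when the request presents the enrolled fingerprint."""
    # Assumes a mapping keyed by the exact header name; real frameworks
    # usually provide case-insensitive header lookup.
    presented = headers.get("X-ChatGPT-Fingerprint", "").strip()
    if not stored_fingerprint or not presented:
        return False
    # Constant-time comparison; bytes avoid errors on non-ASCII input.
    return hmac.compare_digest(
        presented.encode("utf-8"), stored_fingerprint.encode("utf-8")
    )
```

A check like this ties each bypassed request to the enrolled value, though on its own it does not prove that the fingerprint corresponds to a real ChatGPT subscription.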