Description
Question: Multi-Tenant Support with Per-Tenant API Keys and Models
Context
I'm evaluating OpenFang for a SaaS platform serving 10,000+ tenants. Each tenant needs to:
- Provide their own LLM provider API keys (OpenAI, Anthropic, Gemini, etc.)
- Choose their preferred models (gpt-4-turbo, claude-sonnet-4, gemini-pro, etc.)
- Have isolated data, sessions, and cost tracking
I noticed Issue #322 mentions multi-tenant isolation as a feature request, but I also see that OpenFang has enterprise features like RBAC, metering, and session management.
I need clarification on what is currently supported.
My Use Case
```
┌────────────────────────────────────────┐
│ SaaS Platform (Multi-Tenant)           │
├────────────────────────────────────────┤
│ Tenant A                               │
│ ├─ API Key: sk-proj-A...               │
│ ├─ Preferred Model: gpt-4-turbo        │
│ └─ Budget: $100/month                  │
│                                        │
│ Tenant B                               │
│ ├─ API Key: sk-ant-B...                │
│ ├─ Preferred Model: claude-sonnet-4    │
│ └─ Budget: $500/month                  │
│                                        │
│ ... 9,998 more tenants                 │
└────────────────────────────────────────┘
```
Requirements:
- Each tenant sends requests with their own credentials
- OpenFang routes to the correct LLM provider using tenant-specific API keys
- Cost tracking is attributed to the correct tenant
- Data isolation between tenants (memory, sessions, workspace)
- Concurrent requests from multiple tenants
Specific Questions
1. Dynamic Credential Injection
Can OpenFang accept per-request API credentials, or must credentials be configured statically in openfang.toml?
Example Request:
```
curl -X POST http://openfang:4200/v1/chat/completions \
  -H "Authorization: Bearer tenant-a-key" \
  -H "X-Tenant-ID: tenant-a" \
  -H "X-Provider-API-Key: sk-proj-A..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4-turbo",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

Questions:
- Does OpenFang read `X-Provider-API-Key` from the request header?
- Or does it always use the API key from `openfang.toml`?
- If it uses `openfang.toml`, can I programmatically update it per request?
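If per-request provider keys aren't supported natively, I could handle the lookup on my side and attach the credentials to every call. A minimal sketch of that idea, assuming (hypothetically) that OpenFang honored an `X-Provider-API-Key` header — the header name and key store here are illustrative, not a confirmed OpenFang API:

```python
# Hypothetical sketch: assemble a per-tenant request, assuming OpenFang
# honored an X-Provider-API-Key header (not confirmed to exist today).
# In production the key store would be encrypted, not an in-memory dict.
TENANT_CREDENTIALS = {
    "tenant-a": {"provider_key": "sk-proj-A...", "model": "gpt-4-turbo"},
    "tenant-b": {"provider_key": "sk-ant-B...", "model": "claude-sonnet-4"},
}

def build_request(tenant_id: str, messages: list) -> dict:
    """Build headers and body for one tenant's chat completion call."""
    creds = TENANT_CREDENTIALS[tenant_id]
    return {
        "url": "http://openfang:4200/v1/chat/completions",
        "headers": {
            "Authorization": f"Bearer {tenant_id}-key",
            "X-Tenant-ID": tenant_id,
            "X-Provider-API-Key": creds["provider_key"],
            "Content-Type": "application/json",
        },
        "body": {"model": creds["model"], "messages": messages},
    }
```

This only makes sense if the server actually reads those headers per request, which is exactly what I'm asking above.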
2. Model Selection Per Request
In the OpenFang API, the `model` field appears to be an agent type (e.g., "researcher", "coder") rather than an LLM model name (e.g., "gpt-4-turbo", "claude-sonnet-4").
Example from README:
```
curl -X POST localhost:4200/v1/chat/completions \
  -d '{
    "model": "researcher",   # ← Agent type, not LLM model
    "messages": [...]
  }'
```

Questions:
- Can I pass actual LLM model names like `"gpt-4-turbo"` or `"claude-sonnet-4"`?
- Or must I pre-configure agents with specific models in their manifests?
- If the latter, how do I support 10,000 tenants each wanting different models?
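If pre-configured manifests are the only route, the obvious workaround is to generate one manifest per tenant. A sketch of what I mean — note the TOML field names below are my guesses, not OpenFang's actual manifest schema:

```python
# Hypothetical sketch: render a per-tenant agent manifest as TOML.
# The [agent]/[model] table names and fields are illustrative guesses,
# NOT OpenFang's real manifest format.
def tenant_manifest(tenant_id: str, provider: str, model: str) -> str:
    """Render a researcher-agent manifest pinned to one tenant's model."""
    return (
        f'[agent]\n'
        f'name = "researcher-{tenant_id}"\n'
        f'\n'
        f'[model]\n'
        f'provider = "{provider}"\n'
        f'name = "{model}"\n'
    )

# One manifest per tenant → 10,000 agent definitions on disk, which is
# exactly the scaling concern I'm raising.
```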
3. Per-Tenant Cost Tracking
The README mentions "metering and budget tracking".
Questions:
- Does the `MeteringEngine` track usage per tenant?
- Can I query: "Show me Tenant A's usage for March 2026"?
- Or does it only track globally per agent/model?
Expected API:
```
GET /api/v1/metering/tenant/tenant-a?start=2026-03-01&end=2026-03-31
```

Does this endpoint exist?
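If that endpoint doesn't exist, my fallback would be to tag every request with a `tenant_id` on my side and aggregate usage myself. A minimal sketch, assuming I log one record per completed request (the record shape is my own, not an OpenFang structure):

```python
# Fallback sketch: client-side per-tenant usage aggregation, assuming I
# record {tenant_id, date, prompt_tokens, completion_tokens} per request.
from collections import defaultdict

def usage_for_tenant(records: list, tenant_id: str, start: str, end: str) -> dict:
    """Sum token usage for one tenant over an ISO-date window (inclusive)."""
    total = defaultdict(int)
    for r in records:
        # ISO 8601 dates compare correctly as strings.
        if r["tenant_id"] == tenant_id and start <= r["date"] <= end:
            total["prompt_tokens"] += r["prompt_tokens"]
            total["completion_tokens"] += r["completion_tokens"]
    return dict(total)
```

Workable, but it duplicates what a built-in `MeteringEngine` tenant dimension would give for free.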
4. Memory and Session Isolation
Issue #322 mentions that memory currently uses `(agent_id, scope)` without `user_id` scoping.
Question:
- If Tenant A and Tenant B both use the "researcher" agent:
- Are their conversation memories isolated?
- Or do they share the same memory store?
- Can multiple tenants concurrently use the same agent with isolated sessions?
Scenario:
```
10:00 AM - Tenant A sends request → Session 1 (researcher agent)
10:00 AM - Tenant B sends request → Session 2 (researcher agent)
```
Will these run concurrently with isolated state? Or will Session 2 overwrite Session 1?
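To make the isolation I'm hoping for concrete: if memory keys included the tenant, the two sessions above could never collide. A sketch of the keying scheme I have in mind (my own suggestion, not how OpenFang currently keys memory):

```python
# Sketch of tenant-scoped memory keying: (tenant_id, agent_id, scope)
# instead of the current (agent_id, scope) mentioned in Issue #322.
def memory_key(tenant_id: str, agent_id: str, scope: str) -> str:
    """Derive a storage key that is unique per tenant, agent, and scope."""
    return f"{tenant_id}:{agent_id}:{scope}"

# With only (agent_id, scope), Tenant A and Tenant B using the same
# "researcher" agent would map to the SAME key and share memory.
```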
5. Deployment Architecture
For 10,000 tenants, what is the recommended deployment architecture?
Option A: Shared Instance
All 10,000 tenants → Single OpenFang instance
- Does OpenFang support tenant isolation in this setup?
- What's the expected concurrency limit?
Option B: Instance Per Tenant
Tenant A → OpenFang instance on port 4200
Tenant B → OpenFang instance on port 4201
... 10,000 instances total
- Is this the intended design?
- What's the memory footprint per instance?
Option C: Tenant Pools
Tenants 1-100 → OpenFang pool 1
Tenants 101-200 → OpenFang pool 2
...
- Is there built-in support for this?
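For Option C, absent built-in support, I'd probably shard tenants onto pools with a stable hash rather than fixed ranges, so adding tenants doesn't require manual rebalancing. A sketch of that assignment (my own routing idea, external to OpenFang):

```python
# Sketch for Option C: deterministic tenant-to-pool assignment via hashing.
# Assumes pools are homogeneous and fronted by an external load balancer.
import hashlib

def pool_for_tenant(tenant_id: str, num_pools: int) -> int:
    """Map a tenant to a stable pool index in [0, num_pools)."""
    digest = hashlib.sha256(tenant_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_pools
```

The open question remains whether each pool's OpenFang instances can isolate the ~100 tenants sharing them.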
What I've Already Checked
✅ Read the README and architecture documentation
✅ Reviewed Issue #322 about enterprise operations
✅ Checked the API endpoint list (140+ endpoints mentioned)
✅ Examined the security layers (16 layers, Merkle audit trail, etc.)
What I couldn't find:
- Documentation on multi-tenant deployment
- API endpoints with `tenant_id` parameters
- Configuration examples for per-tenant credentials
Clarification Requested
Could you please clarify:

1. Is multi-tenant support with per-tenant credentials currently available?
   - If YES: please point me to documentation/examples
   - If NO: is it on the roadmap? What's the timeline?
2. What is the difference between:
   - "RBAC multi-user auth" (mentioned as existing)
   - "Multi-tenant isolation" (mentioned as a gap in Issue #322, Feature Request: Enterprise Agent Operations Layer)
3. For my 10,000-tenant use case:
   - Is OpenFang a good fit today?
   - Or should I wait for the features in Issue #322 to be implemented?
   - Or should I use OpenFang only for internal automation and use a different solution (LiteLLM, Portkey) for multi-tenant API serving?
Alternative Solutions I'm Considering
If OpenFang doesn't currently support per-tenant credentials:
- LiteLLM Proxy: Built-in multi-tenant with virtual keys
- Portkey.ai: Enterprise multi-tenant gateway
- Custom Gateway: Build our own proxy layer on top of OpenFang
But I'd prefer to use OpenFang natively if it supports this use case.
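To illustrate the "Custom Gateway" option: the core of such a proxy is just resolving tenant → (provider, key, model) before forwarding upstream. A sketch of that resolution step — the tenant config shape is mine; only the provider endpoints and auth header conventions are real:

```python
# Sketch of a custom gateway's routing step: resolve a tenant's config
# into an upstream URL + auth header. The tenant_cfg dict shape is my
# own; the URLs and header conventions match the real provider APIs.
PROVIDER_URLS = {
    "openai": "https://api.openai.com/v1/chat/completions",
    "anthropic": "https://api.anthropic.com/v1/messages",
}

def route(tenant_cfg: dict) -> dict:
    """Pick upstream URL, auth header, and model for one tenant."""
    provider = tenant_cfg["provider"]
    if provider == "anthropic":
        # Anthropic authenticates via the x-api-key header.
        auth = {"x-api-key": tenant_cfg["api_key"]}
    else:
        # OpenAI (and compatible APIs) use a Bearer token.
        auth = {"Authorization": f"Bearer {tenant_cfg['api_key']}"}
    return {"url": PROVIDER_URLS[provider], "headers": auth,
            "model": tenant_cfg["model"]}
```

Building and operating this layer myself is exactly the work I'm hoping OpenFang can absorb natively.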
Proposed API (If Not Currently Supported)
If multi-tenant is not yet supported, here's what would make OpenFang perfect for my use case:
```
# Tenant configuration endpoint
POST /api/v1/tenants
{
  "tenant_id": "tenant-a",
  "provider": "openai",
  "api_key": "sk-proj-A...",     # encrypted in storage
  "model": "gpt-4-turbo",
  "max_tokens": 2000,
  "temperature": 0.7,
  "budget_usd_per_month": 100
}

# Tenant-scoped request
POST /v1/chat/completions
Headers:
  Authorization: Bearer tenant-a-access-token
  X-Tenant-ID: tenant-a
Body:
{
  "messages": [{"role": "user", "content": "Hello"}]
}

# OpenFang internally:
# 1. Authenticates tenant-a-access-token
# 2. Loads tenant-a's config (provider, api_key, model)
# 3. Decrypts tenant-a's API key
# 4. Routes the request to OpenAI with tenant-a's credentials
# 5. Tracks usage under tenant-a for billing
```

Thank You
I love the OpenFang architecture (Rust, WASM sandbox, 16 security layers, Hands for automation). The autonomous agent features are unique and powerful.
I just need to understand if the multi-tenant credential isolation exists today, or if it's a planned feature.
Any guidance would be greatly appreciated! 🙏
Environment:
- OpenFang version: v0.1.0 (latest)
- Deployment: Docker / Kubernetes
- Expected scale: 10,000+ tenants, 100k+ requests/day