mpk-droid · tarun-etikala · Feb 26, 2026 · Feb 26, 2026 · Mar 2, 2026 · Mar 2, 2026
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -43,14 +43,14 @@ All operations go through `agentctl` (add `scripts/` to PATH):
 
 ```bash
 agentctl local-up                  # Start local dev environment
-agentctl local-down [--force]      # Stop local (--force cleans stuck containers)
-agentctl deploy [--namespace ns]   # Deploy full stack to OpenShift
+agentctl local-down [--force]      # Stop local (--force removes volumes too)
+agentctl deploy [--image img] [-n ns]  # Deploy to OpenShift (--image uses a built image; otherwise pulls the default from Docker Hub)
 agentctl destroy [--namespace ns]  # Remove all cluster resources
 agentctl flows save                # Local Langflow -> flows/ dir
 agentctl flows load                # flows/ dir -> Local Langflow
 agentctl flows pull [-n ns]        # Cluster Langflow -> flows/ dir
 agentctl flows push [-n ns]        # flows/ dir -> Cluster Langflow
-agentctl build <flow.json> [reg]   # Build flow into container image
+agentctl build [--prod] <flow.json> [reg] [tag]  # Build flow image (--prod for API-only runtime)
 agentctl list [--all-namespaces]   # List deployed agents
 agentctl status <name> [-n ns]     # Show agent status
 ```
@@ -78,17 +78,27 @@ Both local and cluster use `LANGFUSE_INIT_*` env vars to auto-create org, projec
 - `flows push/load` uploads via `POST /api/v1/flows/upload/` (multipart file)
 - Auth via `POST /api/v1/login` with default credentials `langflow/langflow`
 
+### Build & Deploy
+- `agentctl build <flow.json> [registry] [tag]` — builds full Langflow image (UI + flow baked in)
+- `agentctl build --prod` — builds **Langflow Runtime** image (backend-only, no UI)
+- `-n <namespace>` on build rewrites model endpoints (api_base, model_name) in the flow JSON to point to the cluster's KServe URL
+- Images are built for `linux/amd64` via `podman build --platform linux/amd64`
+- `agentctl deploy --image <img>` — deploys with the built image
+- `agentctl deploy` (no `--image`) — no image is built or pushed; OpenShift pulls the default Langflow image from Docker Hub. Use `agentctl flows push` to upload flows after deploy.
+- The Helm template conditionally sets `LANGFLOW_BACKEND_ONLY=true` and `LANGFLOW_SKIP_AUTH_AUTO_LOGIN=true`
+- `LANGFLOW_LOAD_FLOWS_PATH` must point to a **directory** (not a file) — Langflow calls `iterdir()` on it
+- `LANGFLOW_SKIP_AUTH_AUTO_LOGIN=true` is required for Langflow >= 1.5 to allow unauthenticated API access with auto-login
+- Registry images must be public or the cluster needs a pull secret
+
 ### Custom vLLM Component
 Flows using the cluster's model serving have a custom `VLLMModel` component with hardcoded `base_url`. When moving flows between environments, the model endpoint URL must be changed:
-- Local: `http://ollama:11434/v1` with model `llama3.2`
-- Cluster: `http://llama-31-8b-instruct-predictor.langflow-agent.svc.cluster.local:8080/v1` with model `llama-31-8b-instruct`
+- Local: `http://ollama:11434/v1` with model `qwen2.5:7b`
+- Cluster: `http://qwen25-7b-instruct-predictor.<namespace>.svc.cluster.local:8080/v1` with model `qwen25-7b-instruct`
 
 ## Known Issues
 
-- `flows save`/`flows pull` downloads ALL flows including Langflow's built-in starter templates (34+), not just user-created flows
 - `flows push`/`flows load` creates duplicates if the flow already exists on the target
 - Langfuse INIT vars only run on first database creation — if the database already exists from a prior run, wipe volumes and restart
-- The README.md is outdated (still references MLflow which was removed)
 
 ## Images Used
 

diff --git a/README.md b/README.md
@@ -78,6 +78,22 @@ To check its status:
 podman machine list
 ```
 
+**Memory**: The Podman VM needs at least 8 GB of memory. If it is set lower, increase it (the `--memory` value is in MiB):
+
+```bash
+podman machine stop
+podman machine set --memory 8192
+podman machine start
+```
+
+**Platform**: The `platform` field in `local/podman-compose.yml` must match your machine's architecture. Update it if needed:
+
+| Machine | Platform |
+|---------|----------|
+| Mac (Apple Silicon) | `linux/arm64` |
+| Mac (Intel) | `linux/amd64` |
+| Linux (x86_64) | `linux/amd64` |
+
 ### Cluster Login
 
 To get the `oc login` command for your cluster:
@@ -163,21 +179,68 @@ agentctl destroy              # Remove cluster resources
 agentctl local-down           # Stop local environment
 ```
 
+### Build & Deploy (Production)
+
+Develop locally, then package and deploy as a container image.
+
+```bash
+# ── Inner Loop (Development) ──────────────────────────
+
+# 1. Start local environment
+agentctl local-up
+
+# 2. Build and test your agent flow in the Langflow UI
+open http://localhost:7860
+
+# 3. Save flows to the flows/ directory
+agentctl flows save
+
+# 4. Commit flows to Git
+git add flows/ && git commit -m "Add agent flow"
+
+# ── Outer Loop (Production) ──────────────────────────
+
+# 5. Build a container image with the flow baked in
+podman login quay.io
+agentctl build flows/my-flow.json quay.io/myorg v1.0             # full UI
+agentctl build --prod flows/my-flow.json quay.io/myorg v1.0      # headless API only
+
+# 6. Deploy to OpenShift with the built image
+oc login https://your-cluster:6443
+agentctl deploy --image quay.io/myorg/langflow-my-flow:v1.0
+
+# 7. Test the agent via API (URL printed by deploy)
+curl -X POST https://<route>/api/v1/run/<flow-id> \
+  -H "Content-Type: application/json" \
+  -d '{"input_value": "Hello", "output_type": "chat", "input_type": "chat"}'
+
+# Cleanup
+agentctl destroy
+agentctl local-down
+```
+
+- `agentctl build`: builds a full Langflow image with UI + flow baked in
+- `agentctl build --prod`: builds a **Langflow Runtime** image — headless API server, no UI
+- `agentctl deploy --image`: deploys with your built image
+- `agentctl deploy` (no `--image`): no image is built or pushed — OpenShift pulls the default Langflow image directly from Docker Hub. Use `agentctl flows push` to upload your flows after deploy.
+
+> **Note:** `agentctl build` will create a new repository in Quay.io when pushing. The repository defaults to **private**. You need to make it **public** in the Quay.io UI so that OpenShift can pull the image without a pull secret.
+
 ## CLI Reference
 
 All operations go through `agentctl`:
 
 | Command | Description |
 |---------|-------------|
 | `agentctl local-up` | Start local dev environment |
-| `agentctl local-down [--force]` | Stop local environment (`--force` cleans stuck containers) |
-| `agentctl deploy [--namespace ns]` | Deploy full stack to OpenShift |
+| `agentctl local-down [--force]` | Stop local environment (`--force` removes volumes too) |
+| `agentctl deploy [--image img] [--namespace ns]` | Deploy full stack to OpenShift |
 | `agentctl destroy [--namespace ns]` | Remove all cluster resources |
 | `agentctl flows save` | Local Langflow &rarr; `flows/` directory |
 | `agentctl flows load` | `flows/` directory &rarr; Local Langflow |
 | `agentctl flows pull [-n ns]` | Cluster Langflow &rarr; `flows/` directory |
 | `agentctl flows push [-n ns]` | `flows/` directory &rarr; Cluster Langflow |
-| `agentctl build <flow.json> [registry] [tag]` | Build flow into a container image and push to registry |
+| `agentctl build [--prod] <flow.json> [registry] [tag]` | Build flow into a container image (`--prod` for API-only runtime) |
 | `agentctl list [--all-namespaces]` | List deployed agents |
 | `agentctl status <name> [-n ns]` | Show agent status and metadata |
 
@@ -191,9 +254,10 @@ Key values in `helm/langflow-agent/values.yaml`:
 |-----------|-------------|---------|
 | `langflow.image` | Langflow container image | `langflowai/langflow:1.7.1` |
 | `langflow.replicas` | Number of Langflow replicas | `1` |
+| `langflow.backendOnly` | Run as Langflow Runtime (API-only, no UI) | `false` |
 | `langfuse.enabled` | Deploy Langfuse for tracing | `true` |
 | `modelServing.enabled` | Deploy vLLM + KServe | `false` |
-| `modelServing.modelName` | Model to serve | `meta-llama/Llama-3.1-8B-Instruct` |
+| `modelServing.modelName` | Model to serve | `Qwen/Qwen2.5-7B-Instruct` |
 | `modelServing.gpu.count` | GPUs for model serving | `1` |
 
 ### Deploy with Model Serving
@@ -221,8 +285,26 @@ To enable model serving, set `modelServing.enabled: true` in `values.yaml`. Requ
 | Langfuse | http://localhost:3000 |
 | Ollama API | http://localhost:11434 |
 
+## Using a Third-Party Model (OpenAI, Anthropic, etc.)
+
+You don't need Ollama or vLLM if you want to use a third-party model provider. Just configure the model component directly in the Langflow UI:
+
+1. Open your flow in Langflow
+2. Use a built-in **OpenAI** / **Anthropic** / **OpenAI-compatible** component
+3. Set `api_base` to the provider's URL and `api_key` to your key
+
+This works the same locally and on cluster — no infrastructure changes needed.
+
+Alternatively, set the model endpoint in `local/.env`:
+
+```bash
+OPENAI_API_BASE=https://api.openai.com/v1
+OPENAI_API_KEY=sk-...
+```
+
 ## Notes
 
 - Flows pulled from the cluster may contain model components pointing to cluster-internal URLs. When loading these locally, update the model endpoint in the Langflow UI to point to Ollama (`http://ollama:11434/v1`).
-- `flows save` and `flows pull` download all flows including Langflow's built-in starter templates. Only user-created flows are relevant for version control.
-- Langfuse auto-provisioning (org, project, API keys) only runs on first database creation. If you need to reset, remove the PostgreSQL volume and restart.
+- Langfuse auto-provisioning (org, project, API keys) only runs on first database creation. If you need to reset, remove the PostgreSQL volume and restart.
+- When pushing images to Quay.io or other registries, ensure the repository is **public** or create a pull secret on the cluster (`oc create secret docker-registry ...`).
+- Images are built for `linux/amd64` by default to match typical OpenShift cluster architecture.
diff --git a/custom_components/kserve_vllm.py b/custom_components/kserve_vllm.py
@@ -0,0 +1,55 @@
+from langchain_openai import ChatOpenAI
+
+from langflow.base.models.model import LCModelComponent
+from langflow.field_typing import LanguageModel
+from langflow.io import FloatInput, IntInput, SecretStrInput, StrInput
+
+
+class KServeVLLMComponent(LCModelComponent):
+    display_name = "KServe vLLM"
+    description = "Language model served via KServe + vLLM (OpenAI-compatible API)."
+    icon = "server"
+    name = "KServeVLLM"
+
+    inputs = LCModelComponent._base_inputs + [
+        StrInput(
+            name="api_base",
+            display_name="API Base URL",
+            info="KServe vLLM OpenAI-compatible endpoint.",
+            value="",
+            required=True,
+        ),
+        StrInput(
+            name="model_name",
+            display_name="Model Name",
+            value="",
+            required=True,
+        ),
+        SecretStrInput(
+            name="api_key",
+            display_name="API Key",
+            info="API key (use 'EMPTY' if not required).",
+            value="EMPTY",
+        ),
+        FloatInput(
+            name="temperature",
+            display_name="Temperature",
+            value=0.1,
+        ),
+        IntInput(
+            name="max_tokens",
+            display_name="Max Tokens",
+            value=4096,
+            advanced=True,
+        ),
+    ]
+
+    def build_model(self) -> LanguageModel:
+        return ChatOpenAI(
+            api_key=self.api_key or "EMPTY",
+            model=self.model_name,
+            base_url=self.api_base,
+            temperature=self.temperature,
+            max_tokens=self.max_tokens if self.max_tokens and self.max_tokens > 0 else None,
+            timeout=120,
+        )