Initial setup #312
Replies: 7 comments 6 replies
-
|
@johny-mnemonic apologies for the confusion: the docs were out of date. Provider configuration lives in the Providers app (top-level app in the dock, not a section inside Settings). The line in
Adding your Ollama endpoint via the UI:
About your manual YAML edit: The
About a setup wizard: You're right, there isn't one yet. The first-launch wizard is on the roadmap but not built. A model has to be configured through the Providers app before Agents can deploy. Tracking that gap.
External LiteLLM: taOS bundles its own LiteLLM proxy that's automatically configured from your Providers entries, and that's the path of least friction. Pointing taOS at a separate LiteLLM you already run elsewhere isn't supported as a first-class option today; it'd require pointing the agents directly at your LiteLLM URL and skipping the Providers layer entirely. Possible but undocumented. If your existing LiteLLM has providers configured already, the simpler path is to add those same providers to taOS's Providers app: taOS's bundled LiteLLM will route through them and you keep one source of truth in the UI.
Thanks for the detailed report. It's the kind of feedback that closes onboarding gaps fast. |
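Before adding an Ollama endpoint in the Providers app, it can help to confirm that the endpoint answers at all. A minimal sketch, assuming only Ollama's own `GET /api/tags` route (which lists the models installed on that server); the host address in the comment is hypothetical, the helper names are made up for illustration, and nothing here is a taOS API:

```python
import json
import urllib.request


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from the JSON body returned by Ollama's GET /api/tags."""
    return [model["name"] for model in tags_payload.get("models", [])]


def list_ollama_models(base_url: str, timeout: float = 5.0) -> list[str]:
    """Fetch the installed-model list from the Ollama server at base_url."""
    url = base_url.rstrip("/") + "/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return model_names(json.load(resp))


# Example call (hypothetical address, replace with your own Ollama host):
# print(list_ollama_models("http://192.168.1.50:11434"))
```

If this call fails or returns an empty list, the problem is on the Ollama side (bind address, firewall, no models pulled) rather than in the provider configuration.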
-
|
@jaylfc Thanks for the help. I discovered the Providers app at basically the same time you replied here 😃 In the end I had to destroy my Ubuntu 25.04 VM and start fresh, as I was unable to get the iGPU passthrough working. Unfortunately, it seems I will need help with a few other things.
Also, I noticed you have support for EXO, but that does not work with AMD ROCm, and 2 out of 3 of my GPUs are AMD, so I am planning to go the llama.cpp RPC route instead, where it should work. That should work with TAOS natively like any other llama.cpp endpoint, right? It will just lack the knowledge that it is talking to a distributed LLM provider. |
-
|
@johny-mnemonic Glad 25.10 worked. Tackling these in order.
llama.cpp not in the Store
It is in the catalog (
If neither helps, I need:
That'll show which tier the resolver's placed you in and whether it matches what llama-cpp expects.
Providers app
You're right on both, both are real holes. Filed:
Workaround until both ship: put auth-required endpoints behind your existing LiteLLM and point taOS at LiteLLM as a single Ollama-shaped provider (LiteLLM exposes an Ollama-compatible surface). Not great but unblocks you.
llama.cpp RPC
Should work as a regular llama.cpp endpoint. RPC distributes layers across GPUs but exposes the same HTTP server (
Caveats: the API key thing from above will still block you if your RPC server has auth, taOS won't see the underlying GPU topology (one provider, one model list as far as it's concerned), and yes, EXO doesn't work with ROCm, so RPC is the right path.
Want a more direct loop?
Most of taOS so far has been built around the hardware I run: Orange Pi 5 Plus with the RK3588 NPU, single node, no auth in front of inference. Your setup is basically the opposite end of the spectrum and it surfaces gaps I'd otherwise hit way later when other people with similar hardware showed up.
I want to spin up a dedicated discussion thread for your stack. Not a debug ticket, just an ongoing space where you can post whenever, raise things, share ideas, ask questions, complain about something half-broken, whatever. I'd treat you as the canonical voice for the multi-vendor and distributed-inference path. First person testing this stuff for real outside my own bubble. How that'd actually work:
Say no and I'll keep replying here piecemeal, totally fine. But the offer's there if you want it. Either way the multi-vendor story is exactly where taOS needs to grow and you're closer to it than anyone else who's tried it so far. If you're up for the thread, I'll spin it up and link from here. |
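From the client side, an RPC-backed llama.cpp server should look like any other llama.cpp endpoint, and one way to confirm that is to query its OpenAI-compatible model list. A minimal sketch, assuming `llama-server` is serving `GET /v1/models` and, if it was started with an API key, expects it as a Bearer token; the host name `rpc-head` and the helper names are hypothetical:

```python
import json
import urllib.request
from typing import Optional


def build_models_request(base_url: str, api_key: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request for the OpenAI-compatible /v1/models route."""
    req = urllib.request.Request(base_url.rstrip("/") + "/v1/models")
    if api_key:
        # Only sent when the server was started with an API key.
        req.add_header("Authorization", f"Bearer {api_key}")
    return req


def model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style {"data": [{"id": ...}]} body."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(base_url: str, api_key: Optional[str] = None, timeout: float = 5.0) -> list[str]:
    """Query the server and return the ids of the models it reports."""
    req = build_models_request(base_url, api_key)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return model_ids(json.load(resp))


# Example call (hypothetical host):
# print(list_models("http://rpc-head:8080"))
```

The caller sees one server and one model list, which matches the caveat above: the GPU topology behind the RPC backend is not visible at this level.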
-
|
@johny-mnemonic Spotted, fair cop 🙂. Quick honesty note: I read every comment myself and the technical calls (what to fix, what to file, what to push back on) are mine. I delegate the typing on longer replies because I'm one person and there's a lot going on. The thinking and direction are still me; if anything in a reply ever feels off or like it's not actually engaging with what you said, call it out and I'll re-do it directly. Onto the new findings, all genuinely useful.
Vulkan detection bug
This is a real bug, not a config thing on your end. The bash install-script correctly detected your iGPU (
Workaround right now:
Real fix tracked in #354: install-server.sh should pull
One thing to flag: with 3.4GB RAM you'll still be tight for most modern LLMs; most variants in the catalog have 4GB+ RAM floors. Once Vulkan's detected you'll see more options, but the Q4-quantised small models (Qwen3 1.7B, SmolLM2, etc.) are realistically what you'll be running on this box. Bigger stuff will need to live on your Ollama Windows machine or eventually the AMD RPC cluster.
Store filter UX
The "compatible with my device" filter exists but the UI surface is the
Ollama models invisible in agent picker
Real bug, filed as #356. The Providers app has three categories:
Workaround: if you can run Ollama directly on the same machine as taOS (instead of a remote Windows box), the models should show under
Dedicated thread
Spun up: #357. I've seeded it with a snapshot of your hardware/setup and links to every issue from this conversation, so you've got a single home that's already populated with what we're working through. From here forward, that's your thread. Drop into it freely. |
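To check from the host side whether Vulkan is usable at all, independent of what taOS reports, one option is to probe `vulkaninfo` directly. A minimal sketch, assuming the `vulkaninfo` binary from the vulkan-tools package and its `--summary` flag; this illustrates a host-side probe and is not a description of how taOS's own detection is implemented:

```python
import shutil
import subprocess
from typing import Callable, Optional


def vulkan_available(
    which: Callable[[str], Optional[str]] = shutil.which,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Return True if vulkaninfo is on PATH and exits successfully."""
    exe = which("vulkaninfo")
    if exe is None:
        # vulkan-tools is not installed, so there is nothing to probe with.
        return False
    try:
        result = run([exe, "--summary"], capture_output=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


# Example call on the taOS host:
# print(vulkan_available())
```

The `which` and `run` parameters exist only so the probe can be exercised without a GPU; in normal use the defaults are enough.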
-
|
Haha, fair play mate. I'm on the road today, so I'm delegating a lot to Claude and just double-checking and testing what I can; I'll have it look into your issues in a mo. Perfect use case for taOS, having your gaming PCs as workers, exactly what I do. I'm actually running full sweeps of benchmarks right now for taOSmd using different size models, so I'll let you know how it goes! Thanks again! |
-
|
Fair shot, that closing was painted on a bit thick 😄
Vulkan still false after installing vulkan-tools
That's a separate cache bug. taOS writes
The proper fix is either always re-probe on startup or add a "Refresh hardware" button in Settings. Filed as #366.
Embedding model: let taOSmd decide
Worth zooming out here, because the way you're thinking about it ("pick the right embedding model for my hardware") is exactly what we're trying to take off the user's hands. The memory layer behind taOS is taOSmd. It's not a separate thing you opt into; every agent you deploy is preconfigured with it. Embedding, retrieval, ranking, query expansion are all handled inside that layer with model defaults that have been benchmarked end-to-end (currently 97% Judge accuracy on LongMemEval-S with the default stack). You don't pick the embedding model per agent; agents inherit whatever taOSmd is configured to use, and that picks based on what hardware is available.
For your N100 right now: once Vulkan flips on and you're in an
Where this is going, and the part that's relevant to your multi-machine setup: the memory system is designed to be moved around the cluster (embeddings on one node, ranking on another, archive on a third) so you can put the heavier stuff on your gaming PCs (with their real GPUs) while the controller-side store stays light. That's not all wired up yet, but it's the direction. Your stack is exactly the cluster shape that work needs.
Bottom line: install bge-m3 from the Store if you want a manual pick, but the proper play is letting taOSmd's defaults handle it once your hardware tier is right.
Store filter only in "Models" section
Real UX gap. The compatibility filter should apply across Models, LLM Runtime, MCP, Services: anywhere the catalog lists installable things. Today the filter being absent in LLM Runtime is exactly why you ended up trying to install llama-cpp despite the resolver thinking it was incompatible. Filed as #367.
"device has 0 MB free" + disk quota monitor init failure
Same bug. The disk quota subsystem failed to initialise (per your log: "No container backend configured. Call set_backend() first." Probably the underlying reason your Fedora-worker showed up with empty hardware too. The container backend (Incus or Docker) isn't installed or detected on your Ubuntu host, so anything that tries to spin up a worker container raises this. Two questions: do you have Incus or Docker installed on the Ubuntu box, and any preference between them? taOS supports both; auto-detect should pick whichever's present. If neither, the install script should be installing Incus by default (lighter than Docker on a home server). Filed as #369.
TZ off by 2 hours
Almost certainly not a taOS bug: taOS reads system time. Check
Bigger picture
Almost everything you're hitting falls into the same bucket: install-time gaps where taOS detects something but doesn't fully wire it up, or relies on a host-side dependency we should be bundling. Vulkan-tools missing on Ubuntu, container backend not auto-detected, hardware cache not refreshing, etc. We're going to do an install-hardening pass: the bar is "install script finishes, taOS works fully without you
Thread move
Sounds good. Once Vulkan's unblocked and the disk quota / container backend stuff is filed (already done above), I'll seed #357 with the open threads and we can keep using it from there. |
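To answer the "Incus or Docker installed?" question quickly, and to show roughly what "pick whichever's present" could mean, here is a minimal sketch. It assumes detection by looking for the `incus` and `docker` executables on `PATH`, with Incus preferred to mirror the install default mentioned above; the function name and the preference order are illustrative assumptions, not taOS's actual auto-detect logic:

```python
import shutil
from typing import Callable, Optional

# Preference order is an assumption: Incus first, then Docker.
PREFERRED_BACKENDS = ("incus", "docker")


def detect_backend(which: Callable[[str], Optional[str]] = shutil.which) -> Optional[str]:
    """Return the first container backend whose CLI is found on PATH, else None."""
    for name in PREFERRED_BACKENDS:
        if which(name) is not None:
            return name
    return None


# Example call on the Ubuntu host:
# print(detect_backend())
```

A result of `None` corresponds to the situation described above, where nothing is installed or detected and any attempt to start a worker container fails.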
-
|
Vulkan tier looks right.
Embedder
You don't install anything yourself. taOSmd self-configures when you deploy an agent: pick a framework (OpenClaw or Hermes), the wizard's memory step asks which device runs the memory layer and what tier (Lite/Standard/Heavy), and the embedder + reranker + runtime get installed and started for you. After deploy, the Activity app (left sidebar) shows the loaded models with their host/port; that's how you confirm it's actually serving. If you haven't deployed an agent yet, nothing memory-side will be running yet; that's by design. Standalone memory-without-an-agent isn't a first-class flow today; it's coming as part of the resource-scheduling work where the memory pipeline can move around the cluster independently of agents.
Container backend
Nothing got installed because that part of install-server.sh landed after your install ran. Re-running the script today installs Incus and inits the storage pool automatically. Without re-running,
Since the last comment
Postgres is now part of the install so per-agent virtual keys actually work (no more
One thing in flight that's relevant for you: when you wire up the gaming PCs as workers, the install-targets endpoint now matches incus remote ↔ worker by URL host, so a name mismatch between the remote name and the worker registration name doesn't show up as "unknown hardware" anymore.
Will move ongoing chat to #357 once an agent's deployed and the memory layer's serving. |
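The idea of matching an incus remote to a worker by URL host rather than by name can be sketched as follows. This is an illustration under assumptions: the two input mappings (name to URL), the function names, and the example addresses are all hypothetical and do not reflect the real install-targets endpoint's data model:

```python
from typing import Optional
from urllib.parse import urlparse


def url_host(url: str) -> Optional[str]:
    """Return the lowercased host part of a URL, without scheme or port."""
    return urlparse(url).hostname


def match_remotes_to_workers(remotes: dict[str, str], workers: dict[str, str]) -> dict[str, str]:
    """Map each remote name to the worker name that shares its URL host."""
    worker_by_host = {
        url_host(url): name for name, url in workers.items() if url_host(url)
    }
    return {
        remote: worker_by_host[url_host(url)]
        for remote, url in remotes.items()
        if url_host(url) in worker_by_host
    }


# Names differ, hosts agree, so the pair is still matched.
remotes = {"gaming-pc": "https://192.168.1.20:8443"}
workers = {
    "fedora-worker": "http://192.168.1.20:9000",
    "other-box": "http://192.168.1.30:9000",
}
print(match_remotes_to_workers(remotes, workers))
```

Keying on the host means the remote and the worker can be registered under unrelated names and on different ports and still resolve to the same machine.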
-
I have successfully installed TAOS and logged into Web Desktop, but there is no model to select and I see no way to add model providers. There was no "setup wizard" on first launch. I guess it is not implemented yet...
There is a reference to "Settings → Providers → Add Provider" but I don't see such a section in the Settings.
There is an "Advanced" section in the Settings, but when I try to add, for example, one of my Ollama endpoints, it produces a "Save failed (404)" error and the "Validate" button seems to do nothing. In the log it says "PUT /api/settings/config HTTP/1.1" 404 Not Found, so I guess this is also not implemented yet and I will have to find where the config file is and modify it on the CLI, right?
I have modified the defaults there to look like this:
Hope this is correct.
I already have LiteLLM running in my network. Not sure if that can be used instead of the internal LiteLLM referenced in the docs.