Merged
22 changes: 18 additions & 4 deletions docs/STATUS.md
@@ -1,9 +1,21 @@
SINGLE SOURCE OF TRUTH for cross-agent handoff.
Last updated: 2026-06-14 ~12:10 BST, @taOS (active).
Last updated: 2026-06-14 ~13:30 BST, @taOS (PAUSED for a fresh session).

▶▶ SESSION PAUSED 2026-06-14 ~13:30 BST (Jay asked to pause + update handoff). NEW SESSION START HERE:
- master=51837bed, dev=118409a5. Working tree clean. NO uncommitted work anywhere.
- TWO PRs IN FLIGHT, both now CLEAN + FULLY GREEN as of ~13:35 BST (all checks + Gitar/Kilo/CodeRabbit SUCCESS) -- READY TO MERGE, left for the fresh session per the pause:
• PR #884 feat(agent) agent-controlled image generation. Branch feat/agent-image-gen, tip ddeb1bec. Commits this session: 443e70ff canvas wiring + e94de444 describe_image_capabilities + 165e0b83/10f4732c/a578a870 bot-review hardening + ddeb1bec image-prompting manual. WHEN GREEN: merge to dev, then DEPLOY Pi and drive the storybook flow.
• PR #886 fix(store) rkllama install entry (#844). Branch fix/rkllama-store-install, tip d6960af0 (cleanly off origin/dev, 3 code files + tests + manual). WHEN GREEN: merge to dev, then dev->master (Jay wanted #844 fixed for the target audience).
- REMAINING BOT NITS on both PRs are MINOR + non-blocking (judged, not yet actioned, left for your call): #884 kilo wants _image_backends_from_worker hardened per-entry (worker-level guard already contains it; symmetric 1-line isinstance guard would fully satisfy). #886 kilo flags the install-rkllama.sh `"models"` short-circuit on `{"models":[]}` (that is CORRECT: an empty-but-running rkllama IS installed; models are a separate concern) and non-string model names in verify (can't false-match a string app_id, safe). Decide per-nit; none block merge.
- MERGE GATE (handoff 0f): green CI + Kilo + Qodo + Gitar + author. CodeRabbit is legacy/rate-limited, do not block on it. Check INLINE bot comments, not just the check summary.
- Tasks #30 (rkllama/#844, in_progress -> close when #886 merges) and #35 (NEW: ~19 other catalog manifests reference missing install scripts; separate follow-up) capture the store-install debt.
- 3060 SD BACKEND UNBLOCKED (task #34): @taOSmd relayed Jay's GO 2026-06-14 ~12:30 -- the Fedora RTX 3060 window is OURS to install the SD backend ourselves (stable-diffusion.cpp or ComfyUI, our pick); @taOSmd manages nothing outside taOSmd, so we own the SD backend + its model + pointing the controller's image_backend_url at it. Do this AFTER #884 merges so the storybook image step has a real GPU backend. Box access = resolve the Fedora node via our own tailscale (NEVER commit the IP / put it on the bus).
- Re-arm on arrival: freshness cron (:08/:38), A2A SSE monitor, repo-watch (:23). Resume pair for the 15:40Z window is armed (primary 16:42, retry 17:01 local).


▶ RELEASED TO MASTER 2026-06-14 (#883, master=c9c5b0c9, Jay asked "merge dev to main so all users get updates"): the whole overnight body of work is now on master — agent OS control framework (#877-882), macOS-dark theme + purple purge (#879), App Store/real-desktop/Agents/chat redesigns, mobile chat #880 + chat-pwa theme #881. Merge-commit (history preserved), dev NOT deleted. master strict-mode + behind required an admin merge.
▶ IN FLIGHT: PR #884 agent-controlled image generation (the storybook demo's image step): generate_image now returns image_ref (fixed a broken b64-of-JSON bug) + canvas_add_image copies the workspace PNG into the project canvas files so art renders on the board; + describe_image_capabilities (read-only cluster tier/tool awareness, agent picks model by intent, system owns load/unload/queue). 26 tests. Baking -> merge to dev -> deploy Pi -> drive the FULL storybook flow to verify. DEMO BLOCKER TO CHECK: an image backend (sd-cpp on 3060 / rkllama on Pi NPU) must be installed+reachable or generate_image fails. Cross-worker image routing (Pi->3060) is a SEPARATE greenlight (TaskRouter exists, not auto-invoked).
-- 2026-06-14 ~12:50: pushed bot-review fixes to #884 (commit 10f4732c): skills.py now REFRESHES builtin skill rows after INSERT OR IGNORE (so the Pi, seeded by #882 with the old file_id canvas schema, converges on image_ref); strict filename type-check in image_tool + honest fallback docstring; project_tools canvas-path slug guard (reject non-slugify slug + assert inside projects_root); cluster_tools includes ram_mb. 67 touched-suite tests green. Re-baking for bot re-review. CAUTION LEARNED: local `dev` had 3 unpushed #884 commits -> a branch cut from it (the rkllama branch) accidentally bundled them; fixed by rebasing #886 --onto origin/dev and resetting local dev to origin/dev. Always branch from origin/dev.
▶ X POST (Jay, premium): drafted, honest framing (NOT first-ever; Goose/Open-Interpreter/Self-Operating-Computer exist). Angle = agent-native OS so a 4B local model drives the whole thing offline; post WITH the demo video. Add the win to README+website too (draft for approval). Private reasoning only.

▶▶ MORNING MUST-DO (Jay overnight ask, asleep): features tested+working by morning; agent OS control DONE simple; **offline agent RESULTS by morning**.
@@ -36,23 +48,25 @@ Last updated: 2026-06-14 ~12:10 BST, @taOS (active).
4. PROMO HERO PROGRAM (memory [[promo-hero-initiative]]): only the agent CHAT + a demo PROJECT stay mock; build everything else REAL. Hero = multi-window (chat + project canvas + store), 5:2 X-cut on all promo. Needs store (#871), project canvas/mind-map (#16, net-new), demo seed (#17), agent window-mgmt API (#18). Mock data PRIVATE on local `marketing` branch (never push/merge; MARKETING.md).
5. Also queued: store popularity LIVE stars backend (#13), per-app install telemetry -> the now-secured stats page (#15), widget redesign (#19, NOT in the shot), mobile audit, wallpaper picker #864, island v2 #854, GitHub #858 ph2, live-wallpaper package brainstorm.

Branch tips: master=6394a3ed. dev=67dceb64 (#877-#882 agent OS control + mobile + theme MERGED). Merged overall this session: #867 #868 #869 #870 (theme/wallpaper), #871 (store redesign), #873 (real desktop: dock right-click + inline New Folder + FS-backed icons + rename API), #874 (window.taosDesktop control API + docs/desktop-control.md); taos-website #5 (stats Basic Auth -> main, set STATS_USER/STATS_PASS in Coolify). Local-only `marketing` branch (private, no upstream; NEVER push/merge).
Branch tips: master=51837bed (#887 released #885 to master), dev=d5c089e9 (#885 mobile branch-dropdown fix merged). Merged overall this session: #867 #868 #869 #870 (theme/wallpaper), #871 (store redesign), #873 (real desktop: dock right-click + inline New Folder + FS-backed icons + rename API), #874 (window.taosDesktop control API + docs/desktop-control.md); taos-website #5 (stats Basic Auth -> main, set STATS_USER/STATS_PASS in Coolify). Local-only `marketing` branch (private, no upstream; NEVER push/merge).

Session state: ACTIVE (autonomous overnight). ALL baking PRs MERGED to dev (tip=4ecc7961): #872 (tsParticles wallpaper + sliders), #873 (real desktop), #874 (agent OS controls). Open-PR queue drained (only draft #476 remains; #846 already CLOSED). #872 SWAPS the animated wallpaper renderer from the hand-rolled canvas NeuralWallpaper (component "neural") to tsParticles ParticlesWallpaper (component "particles"); theme-store registers id "neural-live" w/ component "particles" -- VERIFY the tsParticles look LIVE on Pi (headless can't rasterize it). #25 (tiled double-header) CLOSED: not a bug, was the 32px top-bar chrome. SECURITY: dependabot alert #5 (esbuild RCE < 0.28.1) is STALE -- desktop already pins esbuild 0.28.1 via overrides (lockfile + installed both 0.28.1); leave for dependabot to auto-close, no code change. #19 widget redesign HELD for Jay (taste + depends on the desktop/widget/dash mode-switcher brainstorm [[project_desktop_modes]]). FEDORA MODEL TESTS (Jay 2026-06-14 ~02:00): eval harness + runbook built PRIVATE (~/tinyagentos-private/specs/storybook-demo/storybook_toolcall_eval.py) -- scores local models on the storybook tool-call flow incl ID-threading; A2A sent to @taOSmd (msg 431) to coordinate Fedora box (it's mid E-009 sweep, do NOT interrupt); awaiting its ping + local-model list. tsParticles look + Safari dark<->light + live-wallpaper animation + desktop icons/thumbnails are all best checked LIVE on the Pi (preview has no backend; tsParticles canvas does not rasterize headless).
Session state: ACTIVE (autonomous overnight). OPEN PRs in flight: #884 (agent image-gen, review fixes pushed, baking), #886 (rkllama store fix #844, off origin/dev, baking), #876 (dependabot SPA deps), draft #476. #885 merged dev->master via #887. #872 SWAPS the animated wallpaper renderer from the hand-rolled canvas NeuralWallpaper (component "neural") to tsParticles ParticlesWallpaper (component "particles"); theme-store registers id "neural-live" w/ component "particles" -- VERIFY the tsParticles look LIVE on Pi (headless can't rasterize it). #25 (tiled double-header) CLOSED: not a bug, was the 32px top-bar chrome. SECURITY: dependabot alert #5 (esbuild RCE < 0.28.1) is STALE -- desktop already pins esbuild 0.28.1 via overrides (lockfile + installed both 0.28.1); leave for dependabot to auto-close, no code change. #19 widget redesign HELD for Jay (taste + depends on the desktop/widget/dash mode-switcher brainstorm [[project_desktop_modes]]). FEDORA MODEL TESTS (Jay 2026-06-14 ~02:00): eval harness + runbook built PRIVATE (~/tinyagentos-private/specs/storybook-demo/storybook_toolcall_eval.py) -- scores local models on the storybook tool-call flow incl ID-threading; A2A sent to @taOSmd (msg 431) to coordinate Fedora box (it's mid E-009 sweep, do NOT interrupt); awaiting its ping + local-model list. tsParticles look + Safari dark<->light + live-wallpaper animation + desktop icons/thumbnails are all best checked LIVE on the Pi (preview has no backend; tsParticles canvas does not rasterize headless).

WEBSITE: taos.my live. All 4 taos-website PRs merged (stats/changelog/nav/accessibility).

CI: test suite parallelized via #839 (xdist -n auto). CodeRabbit may be out of credits -- do not merge on a fake rate-limit pass. Use @coderabbitai full review to retrigger; manual review OK for tiny already-reviewed PRs.

OPEN PRs:
- #886 fix(store): rkllama service install entry (#844) -- 3 files off origin/dev, baking; merge dev->master when green
- #884 feat(agent): agent-controlled image generation -- bot-review fixes pushed (10f4732c), baking; merge to dev then deploy Pi
- #876 chore(deps): dependabot SPA deps group bump (32 updates) -- review and merge when CI green


- #476 DRAFT feat(userspace): App Runtime v1 -- stays DRAFT, not ready to merge
(#872/#871 MERGED to dev; #846 SUPERSEDED by #849 on dev; taos-website #5 merged to main.)

Notable open issues (bugs first):
- #844 rkllama store-UI install chain broken (wrong script + non-interactive false-success) -- unresolved
- #844 rkllama store-UI install chain broken (wrong script + non-interactive false-success) -- FIX IN PR #886 (off origin/dev): adds scripts/install-rkllama.sh (idempotent headless wrapper -> delegates to install-rknpu.sh with TAOS_RKNPU_SETUP=1 so it can't take the false-success exit-0; short-circuits when rkllama already answers 7833/8080) + hardens RkllamaInstaller /api/tags verify (retry then fail, no more swallowed false success) + regression guard test. NOTE found while auditing: ~19 OTHER catalog manifests (stable-diffusion-cpp, wan2gp, dify, agents...) reference install scripts that also don't exist at repo root -- separate follow-up, NOT in #886.
- #841 update check shows no updates when local branch diverged from origin -- unresolved
- #825 taOS agent model swap breaks routing (stale per-agent key preferred over master key)
- #840 chat: per-agent framework slash commands (Telegram-style) in DMs and via @agent /
14 changes: 10 additions & 4 deletions docs/agent-manual/09-os-control.md
@@ -20,10 +20,16 @@ update the open Projects app in real time):
Returns a `project_id` to use in the next calls.
- **add_task** — add a to-do task to a project's board. Args: `project_id`, `title`.
- **canvas_add_image** — place a generated image on a project's ideas board. Args:
`project_id`, `file_id` (from `generate_image`), optional `alt`.

A typical flow: open the Projects app, create_project, add a few tasks, generate
an image, then canvas_add_image it onto the board.
`project_id`, `image_ref` (the `image_ref` returned by `generate_image`), optional `alt`.
- **describe_image_capabilities** — see the hardware tiers (this host + any cluster
workers, e.g. an NVIDIA box) and which image tools/models each has loaded. Use it
to pick the right model before `generate_image`: an NPU model for a fast draft, a
GPU model for a quality cover. The system loads/unloads and queues for you — you
just choose the model.

A typical flow: open the Projects app, create_project, add a few tasks, call
generate_image and keep its `image_ref`, then canvas_add_image(project_id, image_ref)
to drop it on the board.
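
The typical flow above can be sketched in Python as follows. This is an illustration only: `call_tool` is a hypothetical dispatcher standing in for however the agent runtime invokes tools, and the argument names for `open_app` and `create_project` are assumptions. Only `project_id`, `title`, `image_ref`, and `alt` come from the tool descriptions above.

```python
from typing import Any, Callable, Dict, List

# Hypothetical dispatcher type: (tool_name, args) -> result dict.
ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def storybook_flow(call_tool: ToolCall, title: str, tasks: List[str], prompt: str) -> Dict[str, Any]:
    """Open Projects, create a project, add tasks, generate art, pin it to the board."""
    call_tool("open_app", {"app": "projects"})              # arg name is an assumption
    project = call_tool("create_project", {"name": title})  # arg name is an assumption
    project_id = project["project_id"]

    for task_title in tasks:
        call_tool("add_task", {"project_id": project_id, "title": task_title})

    image = call_tool("generate_image", {"prompt": prompt})
    image_ref = image["image_ref"]  # keep this: canvas_add_image needs it

    call_tool(
        "canvas_add_image",
        {"project_id": project_id, "image_ref": image_ref, "alt": prompt},
    )
    return {"project_id": project_id, "image_ref": image_ref}
```

The point of the sketch is the ID threading: `project_id` flows from `create_project` into every later call, and `image_ref` flows from `generate_image` into `canvas_add_image`.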

These drive the user's own desktop in their session. Use them to make your work
visible: open the relevant app so the user can watch, then carry out the task with
89 changes: 89 additions & 0 deletions docs/agent-manual/10-image-prompting.md
@@ -0,0 +1,89 @@
<!-- How to write good prompts for the generate_image tool. -->

# Generating good images

When you call `generate_image`, the quality of the result depends mostly on the
prompt. A vague prompt gives a generic image; a specific, well-ordered one gives
what the user actually asked for. Spend a sentence getting it right rather than
regenerating five times.

## Structure a prompt

Lead with the subject, then layer detail. A reliable order:

1. **Subject** — what it is. "a small red sailboat", "a friendly cartoon fox".
2. **Descriptors** — appearance, colour, material, mood. "weathered wooden hull,
bright red sail".
3. **Setting / background** — where it is. "on a calm blue lake at sunrise".
4. **Composition** — framing and viewpoint. "wide shot, centred, low angle".
5. **Style** — the look. "watercolour children's book illustration", "flat vector
art", "photorealistic", "oil painting". Naming a concrete style matters more
than any other single word.
6. **Lighting / quality** — "soft warm light, gentle shadows, highly detailed".

Example: `a friendly cartoon fox reading a book under a tree, autumn leaves,
warm soft light, watercolour children's book illustration, centred, highly detailed`.
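
As a minimal sketch of the ordering above, a small helper can assemble the six parts in a fixed order and skip any that are empty. The helper name `build_prompt` and its parameters are illustrative, not part of the tool.

```python
def build_prompt(
    subject: str,
    descriptors: str = "",
    setting: str = "",
    composition: str = "",
    style: str = "",
    lighting: str = "",
) -> str:
    """Join prompt parts in the order subject -> descriptors -> setting ->
    composition -> style -> lighting, dropping empty parts."""
    if not subject.strip():
        raise ValueError("a prompt needs a subject")
    parts = [subject, descriptors, setting, composition, style, lighting]
    return ", ".join(p.strip() for p in parts if p.strip())


prompt = build_prompt(
    subject="a small red sailboat",
    descriptors="weathered wooden hull, bright red sail",
    setting="on a calm blue lake at sunrise",
    composition="wide shot, centred, low angle",
    style="watercolour children's book illustration",
    lighting="soft warm light, gentle shadows, highly detailed",
)
```

Keeping the order fixed means the subject always leads, which matches the advice to front-load what matters.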

## Principles

- **Be specific, not long.** Concrete nouns and adjectives beat a wall of vague
words. "golden retriever puppy on grass" beats "a nice cute lovely beautiful
amazing dog".
- **Front-load what matters.** Earlier words carry more weight. Put the subject
and the must-have details first.
- **One clear scene.** Don't pack several unrelated ideas into one prompt; the
model blends them into mush. Generate separate images instead.
- **Name the style explicitly.** If the user wants a storybook look, say
"children's book illustration" or "storybook watercolour". If they want a logo,
say "flat minimalist vector logo".
- **Match the user's intent.** Ask yourself what they pictured and describe that,
not a generic version of it. For a book cover, say "book cover, title space at
the top, central character".

## Use negative_prompt to remove faults

`negative_prompt` lists what to avoid (comma-separated). It is the fix for common
defects:

- General cleanup: `blurry, low quality, jpeg artifacts, watermark, text, signature`.
- People/animals: add `deformed hands, extra fingers, extra limbs, mutated`.
- Keep a clean style: add `cluttered, busy background` if you want simplicity.

Reach for it when a first result has a recurring flaw rather than rewriting the
whole prompt.
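
One way to apply the lists above is to build the `negative_prompt` string from reusable groups. The sketch below is an assumption about how an agent might organise this; the function and constant names are hypothetical, and only the terms themselves come from this page.

```python
from typing import Iterable

GENERAL_CLEANUP = ["blurry", "low quality", "jpeg artifacts", "watermark", "text", "signature"]
FIGURE_FAULTS = ["deformed hands", "extra fingers", "extra limbs", "mutated"]
CLUTTER = ["cluttered", "busy background"]


def build_negative_prompt(
    has_figures: bool = False,
    keep_simple: bool = False,
    extra: Iterable[str] = (),
) -> str:
    """Return a comma-separated negative prompt without duplicate terms."""
    terms = list(GENERAL_CLEANUP)
    if has_figures:
        terms.extend(FIGURE_FAULTS)
    if keep_simple:
        terms.extend(CLUTTER)
    terms.extend(extra)

    seen = set()
    unique = []
    for term in terms:
        key = term.strip().lower()
        if key and key not in seen:  # skip blanks and case-insensitive repeats
            seen.add(key)
            unique.append(term.strip())
    return ", ".join(unique)
```

For a recurring flaw, pass just that defect through `extra` and leave the positive prompt unchanged.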

## Parameters (what the tool exposes)

- **size** — `256x256`, `384x384`, or `512x512`. Use 512x512 for the final
artwork; a smaller size is only worth it for a quick rough draft.
- **steps** — 1 to 8 (default 4). These backends are tuned for few-step
generation; 4 is a good balance, 6 to 8 for a bit more detail. More is not
always better here.
- **guidance_scale** — 1 to 20 (default 7.5). How strictly the image follows the
prompt. Lower (2 to 5) is looser and more artistic; higher (8 to 12) sticks to
the prompt harder. Raise it when the model ignores a detail you asked for;
lower it if results look over-baked or harsh.
- **seed** — omit for a fresh random image. To make small edits to an image the
user liked, reuse its returned `seed` and tweak the prompt so the composition
stays close.
- **model** — call `describe_image_capabilities` first and pick a model that fits
the task: a fast NPU draft model for iterating, a GPU model for the final cover.
Omit it to let the scheduler choose.
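
The ranges above can be checked before calling the tool. The helper below is a sketch under stated assumptions: whether `generate_image` itself rejects or clamps out-of-range values is not specified here, so this version clamps `steps` and `guidance_scale` client-side and rejects unknown sizes. The helper name and its default `size` of `512x512` are choices made for illustration.

```python
from typing import Any, Dict, Optional

ALLOWED_SIZES = ("256x256", "384x384", "512x512")


def image_params(
    size: str = "512x512",        # helper default (assumption), suited to final artwork
    steps: int = 4,               # documented default
    guidance_scale: float = 7.5,  # documented default
    seed: Optional[int] = None,
    model: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a generate_image argument dict that stays inside the documented ranges."""
    if size not in ALLOWED_SIZES:
        raise ValueError(f"size must be one of {ALLOWED_SIZES}, got {size!r}")

    params: Dict[str, Any] = {
        "size": size,
        "steps": max(1, min(8, int(steps))),                             # clamp to 1..8
        "guidance_scale": max(1.0, min(20.0, float(guidance_scale))),    # clamp to 1..20
    }
    if seed is not None:
        params["seed"] = int(seed)  # omitted entirely for a fresh random image
    if model is not None:
        params["model"] = model     # omitted to let the scheduler choose
    return params
```

Leaving `seed` and `model` out of the dict when unset mirrors the advice to omit them rather than pass a placeholder.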

## Picking a model by intent

Different model families respond to prompts differently:

- **FLUX-style models** follow natural-language sentences well and render text
reasonably. Write a full descriptive sentence.
- **SDXL-style models** respond well to comma-separated descriptive phrases and
strong style keywords.
- **Text in the image** (a title, a sign, a label) is unreliable on most models;
prefer a model noted for text if one is loaded, keep the text very short, and
put it in quotes, e.g. `a poster with the title "Brave Little Fox"`.
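
To illustrate the difference in phrasing, the sketch below formats the same ingredients for each family. The family labels `"flux"` and `"sdxl"` are hypothetical shorthand used only in this example; real model names come from `describe_image_capabilities`, and the exact sentence template is an assumption.

```python
def format_for_family(family: str, subject: str, setting: str, style: str) -> str:
    """Phrase a prompt as a sentence (FLUX-style) or as keyword phrases (SDXL-style)."""
    if family == "flux":
        # Natural-language sentence.
        return f"{subject} {setting}, in {style} style."
    if family == "sdxl":
        # Comma-separated descriptive phrases with the style keyword last.
        return ", ".join([subject, setting, style])
    raise ValueError(f"unknown model family: {family!r}")


sentence = format_for_family("flux", "a friendly cartoon fox", "under a tree", "storybook watercolour")
keywords = format_for_family("sdxl", "a friendly cartoon fox", "under a tree", "storybook watercolour")
```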

## Iterate deliberately

If the first image is close but not right, change one thing at a time: adjust the
style word, add a missing detail, or add a negative term for the defect, keeping
the same seed. Tell the user what you changed so they can steer.
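
A minimal sketch of that loop: take the previous arguments, pin the seed the tool returned, and apply exactly one change. The helper name `next_attempt` is hypothetical; the argument keys `prompt`, `negative_prompt`, and `seed` are taken from the parameter names used on this page.

```python
from typing import Any, Dict


def next_attempt(previous_args: Dict[str, Any], returned_seed: int, **change: Any) -> Dict[str, Any]:
    """Return new generate_image args that differ from the last call in one field only."""
    if len(change) != 1:
        raise ValueError("change exactly one thing per iteration")
    args = dict(previous_args)    # copy so the earlier attempt is left untouched
    args.update(change)
    args["seed"] = returned_seed  # same seed keeps the composition close
    return args


first = {"prompt": "a friendly cartoon fox reading a book", "steps": 4}
second = next_attempt(first, 1234, negative_prompt="blurry, extra limbs")
```

Because each attempt differs by a single field, it is easy to tell the user exactly what changed between two images.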
1 change: 1 addition & 0 deletions docs/agent-manual/index.md
@@ -18,3 +18,4 @@ Run `python3 scripts/build-agent-manual.py` to compile these into `docs/taos-age
| `07-after-update.md` | Breakage-log-first troubleshooting for post-update reports |
| `08-answer-templates.md` | Canned answer shapes for common questions |
| `09-os-control.md` | Driving the desktop: open_app / arrange_windows tools |
| `10-image-prompting.md` | Writing good prompts for the generate_image tool |