
Release: agent image generation (#884) + rkllama store install fix (#844/#886) #889

Merged
jaylfc merged 13 commits into master from dev on Jun 14, 2026

@jaylfc jaylfc commented Jun 14, 2026

Promotes the two needed fixes from dev to master so all users get them.

Both green on dev (CI + Gitar + Kilo + CodeRabbit).

Summary by CodeRabbit

New Features

  • Added describe_image_capabilities tool to inspect available image generation models and hardware tiers across local and cluster resources.
  • Added comprehensive image prompting guide with best practices for prompt structure, parameter tuning, and iterative refinement.

Documentation

  • Updated OS control manual with revised image workflow guidance and new tool documentation.

Bug Fixes

  • Enhanced image generation installation verification with retry logic and improved error detection.
  • Improved generated image handling in project canvas integration.

jaylfc added 13 commits June 14, 2026 12:38
The demo's image step was broken two ways:
- generate_image b64-encoded the route's JSON response as if it were PNG bytes,
  so 'image_b64' was garbage. It now parses the JSON and returns 'image_ref'
  (the saved filename) + 'url' (the served path). No consumer used image_b64.
- canvas_add_image took a 'file_id' that had to already be a project canvas file,
  but generate_image saves to the workspace -> the image never rendered. It now
  takes 'image_ref', copies the workspace PNG into the project's canvas files
  (projects_root/<slug>/files/canvas/<uuid>.png, where ImageShape renders it),
  then creates the element. Still ownership-checked; '.name' strips path parts.

So the agent flow works end to end: generate_image -> canvas_add_image(image_ref)
-> the cover art appears live on the ideas board. Manual + skill schema updated;
project_tools/image_tool tests green (19).
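
The copy step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: `copy_image_to_canvas` and its parameters are hypothetical names, and returning the `<uuid>.png` file name as the file id is a guess. Only the target layout (`projects_root/<slug>/files/canvas/<uuid>.png`) and the use of `.name` to strip path parts come from the commit message.

```python
import shutil
import tempfile
import uuid
from pathlib import Path


def copy_image_to_canvas(workspace_dir: Path, projects_root: Path,
                         slug: str, image_ref: str) -> str:
    """Copy a generated workspace PNG into a project's canvas files."""
    # Path(...).name keeps only the final component, so an image_ref such as
    # "../../secrets.png" is reduced to "secrets.png" inside the workspace.
    source = workspace_dir / Path(image_ref).name
    if not source.is_file():
        raise FileNotFoundError(f"no generated image named {source.name!r}")

    canvas_dir = projects_root / slug / "files" / "canvas"
    canvas_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical choice: the new file name doubles as the canvas file id.
    file_id = f"{uuid.uuid4()}.png"
    shutil.copyfile(source, canvas_dir / file_id)
    return file_id


# Usage with throwaway directories (all paths here are illustrative).
root = Path(tempfile.mkdtemp())
workspace = root / "workspace"
workspace.mkdir()
(workspace / "cover.png").write_bytes(b"not-a-real-png")

file_id = copy_image_to_canvas(workspace, root / "projects", "demo", "subdir/cover.png")
copied = root / "projects" / "demo" / "files" / "canvas" / file_id
```

The ownership check mentioned in the commit is left out here because the message does not say how it is performed.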

Read-only tool so the agent knows what hardware tiers exist (this host's
NPU/GPU/CPU + cluster workers like an NVIDIA box) and which image tools/models
each has loaded, before calling generate_image. Maps backend type -> tier
(rkllama=npu, sd-cpp=cpu/gpu, comfyui=gpu) and reports loaded state from the
backend catalog + cluster manager. Defensive when cluster/catalog absent.

The agent picks the model by intent (npu draft vs gpu cover) and generate_image
routes there; the scheduler + lifecycle manager own load/unload/queue. Seeded as
a skill, wired into skill_exec, manual updated; 4 tool tests + 26 related green.
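
As a rough sketch of the backend-to-tier mapping named above: the three mappings are from the commit message, while the function names, the `has_gpu` flag used to split `sd-cpp` between cpu and gpu, the `"unknown"` fallback, and the entry shape are all assumptions made for illustration.

```python
# Fixed mappings taken from the commit message; "sd-cpp" can be either tier.
FIXED_TIERS = {"rkllama": "npu", "comfyui": "gpu"}


def tier_for_backend(backend_type: str, has_gpu: bool = False) -> str:
    """Map a backend type to a hardware tier (has_gpu is a hypothetical input)."""
    if backend_type == "sd-cpp":
        return "gpu" if has_gpu else "cpu"
    return FIXED_TIERS.get(backend_type, "unknown")


def describe_tiers(backends, has_gpu: bool = False) -> list:
    """Return one entry per backend; tolerate a missing catalog (None)."""
    return [
        {
            "backend": b["type"],
            "tier": tier_for_backend(b["type"], has_gpu),
            "loaded": bool(b.get("loaded")),
        }
        for b in (backends or [])
    ]


menu = describe_tiers([{"type": "rkllama", "loaded": True}, {"type": "sd-cpp"}])
empty_menu = describe_tiers(None)
```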

- describe_image_capabilities returned dataclass fields from a real
  hardware_profile, which cannot be JSON-serialised, so the call would return
  a 500. Coerce hardware values to JSON primitives (_json_safe) + test with an
  object profile (Gitar).
- generate_image now fails instead of false-succeeding when the scheduler
  response has no filename (CodeRabbit/Kilo).
- canvas_add_image: drop the misleading file_id fallback (a real canvas file_id
  isn't in the workspace); require image_ref (Kilo).
- worker image_backends now report a 'loaded' field for parity with local
  backends (Gitar). 24 tests green.
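
One way a `_json_safe`-style coercion could look is sketched below. The real helper's behaviour is not shown in this PR page, so treat this as an assumption: `json_safe` and the `accelerators` field are hypothetical, and falling back to `str()` for unknown objects is a choice made for this sketch. The `ram_mb` field name is taken from a later commit in this PR.

```python
import dataclasses
import json
from dataclasses import dataclass


@dataclass
class HardwareProfile:  # hypothetical stand-in for the real profile class
    ram_mb: int
    accelerators: list


def json_safe(value):
    """Recursively coerce a value into JSON primitives."""
    if value is None or isinstance(value, (str, int, float, bool)):
        return value
    if dataclasses.is_dataclass(value) and not isinstance(value, type):
        return {k: json_safe(v) for k, v in dataclasses.asdict(value).items()}
    if isinstance(value, dict):
        return {str(k): json_safe(v) for k, v in value.items()}
    if isinstance(value, (list, tuple, set)):
        return [json_safe(v) for v in value]
    return str(value)  # last resort so serialisation never raises


profile = HardwareProfile(ram_mb=8192, accelerators=["npu"])
encoded = json.dumps({"hardware": json_safe(profile)})
```

Passing `profile` to `json.dumps` directly raises `TypeError`, which is the kind of failure that surfaces as a 500 in a route handler.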

- skills.py: refresh code-owned (builtin) skill rows after INSERT OR IGNORE
  so installs seeded by an earlier release (e.g. the Pi, with the pre-image_ref
  canvas_add_image schema) converge on the current tool_schema. Scoped to
  install_method='builtin' so user-installed skills are never overwritten.
- image_tool.py: type-validate the scheduler 'filename' (not just truthiness)
  before claiming success; document that the controller-unreachable fallback
  returns image_b64 (no controller workspace to save into), so the image_ref
  contract only applies to the in-process scheduler path.
- project_tools.py: reject a project slug that isn't its own slugify and verify
  the resolved canvas dir stays within projects_root (defense-in-depth against
  a legacy/odd slug escaping the tree).
- cluster_tools.py: include HardwareProfile's real 'ram_mb' field in the
  hardware summary so total RAM (a tier signal) isn't dropped.
- tests: scheduler mocks now return real filename/path metadata.

Refs #884
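
The builtin-row refresh in the first bullet can be illustrated with the standard `sqlite3` module. The table layout, the schema strings, and the `seed_defaults` signature below are assumptions for the sketch; only the `INSERT OR IGNORE` followed by an update scoped to `install_method='builtin'`, and the `tool_schema` column name, come from the commit message.

```python
import sqlite3

# Code-owned definitions; the schema strings are illustrative only.
BUILTIN_SKILLS = {
    "canvas_add_image": '{"required": ["project_id", "image_ref"]}',
    "describe_image_capabilities": '{"required": []}',
}


def seed_defaults(conn: sqlite3.Connection) -> None:
    for name, schema in BUILTIN_SKILLS.items():
        # Creates the row on a fresh install; a no-op when the name exists.
        conn.execute(
            "INSERT OR IGNORE INTO skills (name, tool_schema, install_method) "
            "VALUES (?, ?, 'builtin')",
            (name, schema),
        )
        # Converges stale builtin rows; other install methods are untouched.
        conn.execute(
            "UPDATE skills SET tool_schema = ? "
            "WHERE name = ? AND install_method = 'builtin'",
            (schema, name),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE skills (name TEXT PRIMARY KEY, tool_schema TEXT, install_method TEXT)"
)
# An earlier release seeded the old file_id schema as a builtin row.
conn.execute(
    "INSERT INTO skills VALUES (?, ?, ?)",
    ("canvas_add_image", '{"required": ["file_id"]}', "builtin"),
)
# A user-installed row that happens to share a builtin name.
conn.execute(
    "INSERT INTO skills VALUES (?, ?, ?)",
    ("describe_image_capabilities", "user-schema", "user"),
)
seed_defaults(conn)
rows = dict(conn.execute("SELECT name, tool_schema FROM skills"))
```

After `seed_defaults`, the stale builtin row carries the current schema while the user-installed row keeps its own.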

The rkllama service manifest declared install.method: script pointing at
scripts/install-rkllama.sh, but that file never existed -- only
scripts/install-rknpu.sh did. ScriptInstaller resolves install.script
relative to the repo root, so clicking Install in the store failed with
'script not found'. rkllama is the NPU LLM backend our RK3588 target
audience needs from the first boot, so this broke their primary path.

- Add scripts/install-rkllama.sh: an idempotent, headless wrapper the
  store's ScriptInstaller can run non-interactively. It short-circuits to
  exit 0 when a live rkllama already answers (7833 or legacy 8080), and
  otherwise delegates to the verified install-rknpu.sh with
  TAOS_RKNPU_SETUP=1 set explicitly. That env var is required: without it
  install-rknpu.sh takes its 'non-interactive shell, nothing to confirm ->
  exit 0' path and would report success while installing nothing. Keeping
  the heavy NPU runtime install behind a store-triggered script (cloned at
  install time, not bundled) keeps the arms-length, source-available
  posture intact.

- Harden RkllamaInstaller's /api/tags verification. A 200 from /api/pull
  is necessary but not sufficient; only /api/tags confirms the weight is
  loadable. Previously an unreachable /api/tags was swallowed and the
  install returned a false success. Now it retries a few times and, if the
  check never succeeds, returns success: False with an actionable error --
  a model the agent can't load is worse than a clear failure.

- Tests: install() verification (confirmed / absent / unreachable) plus a
  regression guard asserting the rkllama manifest's install.script exists.
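
The "retry, then fail clearly" verification can be sketched like this. It is not the RkllamaInstaller code: `verify_model`, the injected `fetch_names` callable, and the error wording are hypothetical, and retrying on both an unreachable endpoint and a not-yet-listed model is a choice made for this sketch. The default of three attempts follows the CodeRabbit walkthrough in this PR.

```python
import time
from typing import Callable, List


def verify_model(fetch_names: Callable[[], List[str]], app_id: str,
                 attempts: int = 3, delay_s: float = 0.0) -> dict:
    """Poll a model listing until app_id appears or attempts run out."""
    last_error = "not checked"
    for attempt in range(attempts):
        try:
            if app_id in fetch_names():
                return {"success": True, "app_id": app_id}
            last_error = f"{app_id} is not listed by /api/tags"
        except OSError as exc:  # covers connection failures and timeouts
            last_error = f"/api/tags unreachable: {exc}"
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return {"success": False, "error": f"could not confirm {app_id}: {last_error}"}


# Fake endpoint: unreachable twice, then lists the model.
calls = {"n": 0}


def flaky_fetch() -> List[str]:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("connection refused")
    return ["example-model"]


def always_down() -> List[str]:
    raise ConnectionError("connection refused")


recovered = verify_model(flaky_fetch, "example-model")
never_up = verify_model(always_down, "example-model")
```

Injecting the fetch function keeps the loop testable without a live rkllama service.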

- branch tips: master=51837bed (#887 released #885), dev=d5c089e9
- open PRs: #884 (agent image-gen, review fixes baking), #886 (rkllama
  store fix #844, off origin/dev), #876 (deps); #885 merged dev->master
- record the #844/#884 fix details and the catalog-manifest debt note

- RkllamaInstaller verify loop: catch ValueError so a 200 with a non-JSON
  /api/tags body fails the check instead of raising JSONDecodeError out of
  install(); validate the response shape (dict with a list of dict models);
  retry on an absent model too, not just on HTTPError, so registration lag
  after the pull's 200 no longer reports a false failure.
- install-rkllama.sh: the idempotent short-circuit now requires an
  rkllama/Ollama-shaped /api/tags body (a "models" key) rather than any
  HTTP 200, so an unrelated service on 7833/8080 can't be mistaken for an
  installed rkllama. Mirrors _port_responds_with_rkllama().
- tests: non-JSON /api/tags and late-appearing model.

Refs #844 #886
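
A hedged sketch of the response-shape check described in the first bullet: `model_listed` is a hypothetical name, and matching on a `"name"` key follows the Ollama-style `/api/tags` body, which is an assumption about what rkllama returns. The point shown is that `json.JSONDecodeError` is a subclass of `ValueError`, so one `except` turns a non-JSON 200 into a failed check instead of an exception.

```python
import json


def model_listed(body: str, app_id: str) -> bool:
    """Return True only for a well-formed tags body that lists app_id."""
    try:
        payload = json.loads(body)
    except ValueError:  # json.JSONDecodeError subclasses ValueError
        return False
    if not isinstance(payload, dict):
        return False
    models = payload.get("models")
    if not isinstance(models, list):
        return False
    # Ignore entries that are not dicts rather than failing on them.
    return any(isinstance(m, dict) and m.get("name") == app_id for m in models)


html_page = model_listed("<html>proxy error</html>", "example-model")
wrong_shape = model_listed('{"models": "none"}', "example-model")
present = model_listed('{"models": [{"name": "example-model"}, 7]}', "example-model")
```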

One malformed backend or worker entry no longer drops the whole capability
menu (CodeRabbit). Guard each backend/worker independently instead of
wrapping the entire loop in a single try/except, and extract a shared
_model_id helper for dict-or-str model entries.

Refs #884
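
The difference between one guard around the loop and one guard per entry can be shown in a few lines. The names `model_id` and `summarise_backends`, and the `"id"`, `"name"`, and `"models"` keys, are hypothetical; the actual helper in cluster_tools.py may differ.

```python
def model_id(entry):
    """Accept either a plain string or a dict describing a model."""
    if isinstance(entry, str):
        return entry
    if isinstance(entry, dict):
        return entry.get("id") or entry.get("name")
    return None


def summarise_backends(backends) -> list:
    out = []
    for backend in backends or []:
        # Guard each entry on its own so one bad entry cannot hide the rest.
        try:
            ids = [model_id(e) for e in (backend.get("models") or [])]
            out.append({"name": backend["name"], "models": [i for i in ids if i]})
        except Exception:
            continue
    return out


catalog = [
    {"name": "comfyui", "models": [{"id": "flux"}, "sdxl"]},
    42,  # malformed entry: skipped, not fatal
    {"name": "sd-cpp", "models": None},
]
summary = summarise_backends(catalog)
```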

The agent drives generate_image; result quality is mostly the prompt. Add
agent-manual/10-image-prompting.md (compiled into taos-agent-manual.md):
prompt structure (subject -> descriptors -> setting -> composition -> style
-> lighting), be-specific/front-load/one-scene principles, negative_prompt
for common defects, the tool's actual parameters (size/steps/guidance_scale/
seed/model with sensible ranges), model-family differences (FLUX sentences
vs SDXL phrases, text-in-image caveat), and deliberate iteration. Also
enrich the generate_image 'prompt' field description with a compact inline
hint so guidance is present at call time.

Refs #884
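
To make the prompt ordering above concrete, here is a small sketch that assembles a prompt in the subject, descriptors, setting, composition, style, lighting order. `build_prompt` is a hypothetical helper, not part of the guide, and every value in the request dict (including the size string format) is an illustrative placeholder rather than a recommended setting; only the parameter names come from the commit message.

```python
def build_prompt(subject, descriptors=(), setting="", composition="",
                 style="", lighting="") -> str:
    """Join the parts in guide order, front-loading the subject."""
    parts = [subject, *descriptors, setting, composition, style, lighting]
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    "a lighthouse on a cliff",
    descriptors=["weathered stone", "red lantern room"],
    setting="stormy coast at dusk",
    composition="wide shot",
    style="oil painting",
    lighting="warm rim light",
)

# Placeholder arguments for a generate_image call; values are not tuned.
request = {
    "prompt": prompt,
    "negative_prompt": "blurry, extra fingers",
    "size": "1024x1024",
    "steps": 30,
    "guidance_scale": 7.0,
    "seed": 42,
    "model": "example-model",
}
```

Keeping the seed fixed while changing one part of the prompt at a time is one way to follow the guide's advice on deliberate iteration.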

- add a SESSION PAUSED 'NEW SESSION START HERE' block: branch tips, the two
  in-flight PRs (#884 image-gen tip ddeb1be, #886 rkllama #844 tip d6960af,
  both baking on tests only), the remaining minor non-blocking bot nits with
  my assessment, the merge gate, and re-arm reminders
- record the 3060 SD-backend GO from @taOSmd (task #34 unblocked; do after #884)

…r tier awareness)

feat(agent): agent-controlled image generation (canvas wiring + cluster tier awareness)
fix(store): make the rkllama service install entry actually work (#844)

@github-actions

👋 Thanks for the PR! This one targets master, which is our
stable branch (it's what live installs track). Please retarget it to
dev — click Edit next to the PR title and change the base
branch dropdown from master to dev. Your commits and any review
carry over, nothing is lost.

See CONTRIBUTING.md for the branch model.

@coderabbitai

coderabbitai Bot commented Jun 14, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: b9def797-76c9-473b-baaa-430ec369fc48

📥 Commits

Reviewing files that changed from the base of the PR and between 51837be and 409ec6a.

📒 Files selected for processing (16)
  • docs/STATUS.md
  • docs/agent-manual/09-os-control.md
  • docs/agent-manual/10-image-prompting.md
  • docs/agent-manual/index.md
  • docs/taos-agent-manual.md
  • scripts/install-rkllama.sh
  • tests/test_cluster_tools.py
  • tests/test_image_tool.py
  • tests/test_project_tools.py
  • tests/test_rkllama_installer.py
  • tinyagentos/installers/rkllama_installer.py
  • tinyagentos/routes/skill_exec.py
  • tinyagentos/skills.py
  • tinyagentos/tools/cluster_tools.py
  • tinyagentos/tools/image_tool.py
  • tinyagentos/tools/project_tools.py

📝 Walkthrough

Walkthrough

This PR introduces an image_ref-based contract for the image generation workflow: generate_image now returns a reference instead of raw bytes, and canvas_add_image copies the workspace image into the project canvas using that reference. A new describe_image_capabilities cluster tool and builtin skill are added. In parallel, the rkllama installer gains retry-based post-pull verification and a new headless install wrapper script.

Changes

Image Generation Workflow (image_ref + describe_image_capabilities)

| Layer | File(s) | Summary |
|---|---|---|
| generate_image: JSON response and image_ref contract | tinyagentos/tools/image_tool.py, tests/test_image_tool.py | execute_image_generation parses the scheduler's /api/images/generate response as JSON to extract filename, returning image_ref and url instead of image_b64. Three tests are updated to mock the scheduler JSON response accordingly. |
| canvas_add_image: image_ref copy-to-canvas implementation | tinyagentos/tools/project_tools.py, tests/test_project_tools.py | execute_canvas_add_image is reworked to accept image_ref, resolve the workspace path via _data_dir, validate the project slug and canvas path for path-escape safety, copy image bytes, and return the new file_id. Tests are updated with a filesystem-backed _req(), a _seed_generated_image() helper, missing-file and auth test cases. |
| describe_image_capabilities: cluster tool | tinyagentos/tools/cluster_tools.py, tests/test_cluster_tools.py | New cluster_tools.py with tier-mapping, JSON-safe hardware helpers, and the async execute_describe_image_capabilities entrypoint that assembles local and worker image-backend tiers with per-entry fault tolerance. Full test suite covering local tiers, workers, offline exclusion, empty state, malformed backends, and JSON-safe serialization. |
| Skill store: registration and builtin update logic | tinyagentos/skills.py, tinyagentos/routes/skill_exec.py | canvas_add_image skill schema switches to image_ref. describe_image_capabilities is added as a new builtin skill. _seed_defaults now UPDATEs existing builtin rows on startup to refresh code-owned fields. A new _skill_describe_image_capabilities route handler is registered in SKILL_IMPLEMENTATIONS. |
| Agent manual and image-prompting guide | docs/agent-manual/09-os-control.md, docs/agent-manual/10-image-prompting.md, docs/agent-manual/index.md, docs/taos-agent-manual.md | 09-os-control.md and the compiled manual are updated for the new describe_image_capabilities / canvas_add_image(project_id, image_ref) flow. New 10-image-prompting.md covers prompt structure, parameters, model selection, and iteration; registered in index.md. |

rkllama Installer Hardening

| Layer | File(s) | Summary |
|---|---|---|
| RkllamaInstaller: retry-based /api/tags verification | tinyagentos/installers/rkllama_installer.py, tests/test_rkllama_installer.py | install() polls /api/tags up to 3 times after /api/pull, handles non-JSON and connection errors as failures, and returns success: False if app_id is unconfirmed. New TestInstallVerification and TestRkllamaServiceManifest suites cover all verification paths and the manifest regression guard. |
| scripts/install-rkllama.sh: idempotent headless wrapper | scripts/install-rkllama.sh | New Bash script probes /api/tags on primary and legacy ports to short-circuit if already running, then delegates to install-rknpu.sh via exec with TAOS_RKNPU_SETUP=1. |
| STATUS.md: rkllama fix status | docs/STATUS.md | Marks #844 rkllama as fixed via #886, refreshes session branch tips, open-PR queue, and CI notes. |

Sequence Diagram(s)

sequenceDiagram
  rect rgba(173, 216, 230, 0.5)
    Note over Agent,canvas_add_image: Image generation and placement flow
    Agent->>execute_describe_image_capabilities: call (no args)
    execute_describe_image_capabilities->>backend_catalog: backends_with_capability("image-generation")
    execute_describe_image_capabilities->>cluster_manager: get_workers()
    execute_describe_image_capabilities-->>Agent: {tiers: [{node, hw, image_backends}…], hint}
  end
  rect rgba(144, 238, 144, 0.5)
    Note over Agent,canvas_add_image: Agent selects model, generates image
    Agent->>execute_image_generation: call(prompt, model, size, …)
    execute_image_generation->>SchedulerController: POST /api/images/generate
    SchedulerController-->>execute_image_generation: JSON {filename, path}
    execute_image_generation-->>Agent: {image_ref, url, seed, model, size}
  end
  rect rgba(255, 222, 173, 0.5)
    Note over Agent,canvas_add_image: Agent places image on project board
    Agent->>canvas_add_image: call(project_id, image_ref)
    canvas_add_image->>project_tools: _data_dir → locate workspace image
    canvas_add_image->>project_tools: validate slug + path, copy bytes
    canvas_add_image-->>Agent: {element_id, file_id}
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • jaylfc/taOS#884: Implements the same image_ref-based image workflow and describe_image_capabilities skill, directly overlapping with this PR's changes to canvas_add_image, image_tool, skill registration, and tests.
  • jaylfc/taOS#882: Introduced canvas_add_image in project_tools.py; this PR reworks that function to use image_ref instead of file_id.
  • jaylfc/taOS#886: The rkllama installer retry verification in rkllama_installer.py, the new install-rkllama.sh script, and the expanded test_rkllama_installer.py coverage directly correspond to the fix tracked in this PR.

Poem

🐇 Hoppity-hop through the image queue,
no more raw bytes — just a neat image_ref clue.
The cluster tiers line up, NPU to CPU,
while rkllama retries till the tags come true.
A new prompting guide and a script so precise —
this rabbit thinks clean contracts are rather nice! 🎨

@jaylfc jaylfc merged commit 2122412 into master Jun 14, 2026
13 of 14 checks passed
@github-project-automation github-project-automation Bot moved this from Todo to Done in TinyAgentOS Roadmap Jun 14, 2026
def _image_backends_from_worker(worker) -> list[dict]:
    out = []
    for b in (getattr(worker, "backends", None) or []):
        caps = b.get("capabilities") or []

WARNING: Malformed worker backend entries can still drop the entire worker

The local catalog path guards each backend independently, but this worker path only guards at worker level. If worker.backends contains a non-dict entry, b.get(...) raises and the outer try skips the whole worker, so one bad backend can hide otherwise usable worker backends.
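
One possible shape for the per-backend guard this comment asks for is sketched below. It is an illustration, not the fix that landed: the function is renamed to make clear it is not the project's code, the `"name"` key is hypothetical, and the `"image-generation"` capability string is taken from the sequence diagram earlier in this PR.

```python
from types import SimpleNamespace


def image_backends_from_worker_sketch(worker) -> list:
    out = []
    for b in (getattr(worker, "backends", None) or []):
        if not isinstance(b, dict):
            continue  # skip the malformed entry, keep the worker's other backends
        caps = b.get("capabilities") or []
        if "image-generation" in caps:
            out.append({"name": b.get("name"), "loaded": bool(b.get("loaded"))})
    return out


worker = SimpleNamespace(backends=[
    "oops",  # non-dict entry that previously hid the whole worker
    {"name": "comfyui", "capabilities": ["image-generation"], "loaded": True},
    {"name": "llm-only", "capabilities": ["chat"]},
])
usable = image_backends_from_worker_sketch(worker)
no_backends = image_backends_from_worker_sketch(SimpleNamespace())
```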



@kilo-code-bot

kilo-code-bot Bot commented Jun 14, 2026

Code Review Summary

Status: 3 Issues Found | Recommendation: Address before merge

Overview

| Severity | Count |
|---|---|
| CRITICAL | 0 |
| WARNING | 2 |
| SUGGESTION | 1 |

Issue Details

WARNING

| File | Line | Issue |
|---|---|---|
| tinyagentos/tools/cluster_tools.py | 86 | Malformed worker backend entries can still drop the entire worker because _image_backends_from_worker only guards at worker level; a non-dict backend entry raises on b.get(...) and causes the outer worker guard to skip usable backends. |

Other Observations (not in diff)

Issues found in unchanged code that cannot receive inline comments:

| File | Line | Issue |
|---|---|---|
| tinyagentos/tools/image_tool.py | 7 | Tool schema still says the tool returns a base64-encoded PNG, but scheduler-routed calls now return image_ref/url and only the direct backend fallback returns image_b64. |
| tinyagentos/tools/image_tool.py | 43 | Model selection description points agents at list_image_models, while the PR introduces describe_image_capabilities as the intended tier/model discovery tool. |

Files Reviewed (16 files)
  • docs/STATUS.md - 0 issues
  • docs/agent-manual/09-os-control.md - 0 issues
  • docs/agent-manual/10-image-prompting.md - 0 issues
  • docs/agent-manual/index.md - 0 issues
  • docs/taos-agent-manual.md - 0 issues
  • scripts/install-rkllama.sh - 0 issues
  • tests/test_cluster_tools.py - 0 issues
  • tests/test_image_tool.py - 0 issues
  • tests/test_project_tools.py - 0 issues
  • tests/test_rkllama_installer.py - 0 issues
  • tinyagentos/installers/rkllama_installer.py - 0 issues
  • tinyagentos/routes/skill_exec.py - 0 issues
  • tinyagentos/skills.py - 0 issues
  • tinyagentos/tools/cluster_tools.py - 1 issue
  • tinyagentos/tools/image_tool.py - 2 observations
  • tinyagentos/tools/project_tools.py - 0 issues

Reviewed by nex-n2-pro:free · 2,438,374 tokens
