
Commit 65c8207

make hugging face model downloader more robust.
1 parent 09edcc3 commit 65c8207

14 files changed

Lines changed: 1376 additions & 62 deletions

agents/progress.json

Lines changed: 24 additions & 2 deletions
@@ -261,9 +261,9 @@
     {
       "id": 54,
       "name": "Unified model refs in [tool.cozy.models] (cozy/hf)",
-      "description": "Standardize how tenant projects declare models in `pyproject.toml` so the worker can reliably download/cache weights from Cozy Hub and Hugging Face.\n\nGoal (phase 1): allow a single string value in `[tool.cozy.models]` to represent one of:\n- Cozy Hub snapshot (default): `org/repo:tag` or `org/repo@sha256:<digest>` (or explicitly `cozy:org/repo:tag`)\n- Hugging Face repo: `hf:org/repo` (downloaded via `huggingface_hub`)\n\nPhase 2 (deferred): civitai + arbitrary URL model refs.\n\nNotes:\n- This issue is about the tenant-facing ref contract + resolver plumbing.\n- Cozy Hub snapshot/object downloading details are tracked separately (see worker issue id=55).\n\nDeferred schemes (issue id=56) must enforce strict size caps and SSRF protections for direct URL fetches.",
+      "description": "Standardize how tenant projects declare models in `pyproject.toml` so the worker can reliably download/cache weights from Cozy Hub and Hugging Face.\n\nGoal (phase 1): allow a single string value in `[tool.cozy.models]` to represent one of:\n- Cozy Hub model ref (default): `org/repo:tag` or `org/repo@blake3:<digest>` (or explicitly `cozy:org/repo:tag`)\n- Hugging Face repo: `hf:org/repo` (downloaded via `huggingface_hub`)\n\nNotes:\n- `:latest` is supported and resolved by cozy-hub (it means highest revision).\n- Workers select the best artifact variant via cozy-hub `resolve_artifact`.\n\nPhase 2 (deferred): civitai + arbitrary URL model refs.\n\nDeferred schemes (issue id=56) must enforce strict size caps and SSRF protections for direct URL fetches.",
       "tasks": [
-        "[x] Define the model ref grammar and validation rules (phase 1)\n - Supported schemes: cozy (default), hf\n - Cozy refs support `org/repo:tag` and `org/repo@sha256:<digest>` (optionally prefixed with `cozy:`)\n - Add clear error messages for unknown schemes (civitai/url deferred)",
+        "[x] Define the model ref grammar and validation rules (phase 1)\n - Supported schemes: cozy (default), hf\n - Cozy refs support `org/repo:tag` and `org/repo@blake3:<digest>` (optionally prefixed with `cozy:`)\n - Add clear error messages for unknown schemes (civitai/url deferred)",
         "[x] Update manifest contract and docs for `[tool.cozy.models]`\n - Document cozy + hf examples\n - Clarify that manifest stores model refs (not necessarily Cozy Hub numeric IDs)\n - Define precedence/compatibility for older manifests",
         "[x] Implement the resolver plumbing (cozy + hf)\n - Route `cozy:` refs to the Cozy Hub snapshot downloader (see id=55)\n - Route `hf:` refs to Hugging Face Hub via `huggingface_hub` (no custom downloader)\n - Ensure concurrency limits and atomic writes where applicable",
         "[x] Implement Hugging Face downloads via `huggingface_hub`\n - Add optional dependency group for `huggingface_hub`\n - Use `snapshot_download` (or equivalent) so HF’s cache/resume logic is reused\n - Respect `HF_HOME` / `TRANSFORMERS_CACHE` / `DIFFUSERS_CACHE` and prefer shared volume paths",
@@ -330,6 +330,11 @@
         "[x] Worker implementation (Python)\n - Expand `gen_worker.hf_downloader` selection to cover variants + sharded weights\n - Keep using `snapshot_download(allow_patterns=...)`\n - Add unit tests (no network)",
         "[x] Local completeness check (no network)\n - If the repo snapshot is already present in the HF cache, validate required files locally\n - If complete, skip HF API calls + skip snapshot_download\n - If partial, fall back to normal download path\n - Treat `*.incomplete` (HF cache) as incomplete",
         "[x] Fallback for repos without `model_index.json`\n - If `model_index.json` is missing, infer a diffusers-like component set from repo structure (known component dirs)\n - Prefer sharded safetensors when available (index + shards)\n - Stay strict on weight formats (no silent `.bin`/`.ckpt` fallback)",
+        "[x] Generalize component discovery (no hardcoded names)\n - Derive component folder names from `model_index.json` keys (ignore `_` keys)\n - Remove reliance on a fixed allowlist like `unet/text_encoder/vae` for planning\n - Keep default skip list: `safety_checker`, `feature_extractor`",
+        "[x] Generalize “small tree” components using model_index types\n - Detect tokenizer-like components via `model_index.json` entries (library==transformers and class contains Tokenizer)\n - Detect scheduler-like components via `model_index.json` entries (library==diffusers and class contains Scheduler)\n - For these, include the entire folder (still excluding `.bin/.ckpt`)",
+        "[x] Robust safetensors precision selection via header inspection\n - For candidate weight files, fetch the safetensors header via HTTP Range and parse tensor dtypes\n - Prefer float dtypes in order: fp16 then bf16 (no silent fp32 unless explicitly allowed)\n - Use this to choose between `*.fp16.safetensors` and non-variant `*.safetensors` when both exist",
+        "[x] Robust sharded safetensors selection per component\n - If multiple `*.safetensors.index.json` exist in a component folder, choose the best candidate by probed dtype\n - Expand selected index → shards via `weight_map` and include those files only\n - Avoid accidentally selecting tiny LoRA-like safetensors by using a size-based tie-breaker (prefer the larger candidate when dtype matches)",
+        "[x] Add fixtures/tests for non-standard component names\n - Add unit tests using a `model_index.json` like Z-Image (e.g. `transformer` instead of `unet`)\n - Assert selection still finds the correct folders and only required files are downloaded",
         "[ ] Mirror implementation (Go tool)\n - Implement the same selection logic in the HF→CozyHub mirroring tool (Go)\n - Use Hugging Face Hub HTTP APIs to list files + fetch only required ones\n - Add tests that verify behavior against the golden fixtures",
         "[ ] Repo-size / safety checks\n - Add a conservative, non-configurable safety limit (or remove this task if not desired)\n - Avoid adding new env/config knobs",
         "[ ] End-to-end validation\n - E2E should cover both:\n - worker HF-direct (`hf:...`)\n - mirror HF→CozyHub then worker CozyHub download\n - Verify both paths downloaded only the minimal file set",
@@ -363,6 +368,23 @@
         "[x] Add a basic integration test\n - Added a dev-only pytest that spawns a worker process and submits one task (skipped by default)"
       ],
       "completed": true
+    },
+    {
+      "id": 61,
+      "name": "Cozy Hub v2 model flow: resolve_artifact + snapshots/blobs cache + cozy.pipeline.lock.yaml",
+      "description": "Implement the full Cozy Hub model flow in python-gen-worker so tenant functions only declare dependencies and do inference.\n\nWorker responsibilities:\n- Resolve `cozy:` model refs via Cozy Hub `resolve_artifact`.\n- Download the returned snapshot manifest (single manifest listing all files).\n- Cache content-addressed blobs locally (HF-style blobs store).\n- Materialize a snapshot checkout that looks like a normal diffusers repo.\n- Prefer `cozy.pipeline.lock.yaml` when present; fall back to `cozy.pipeline.yaml`.\n\nThis must be implemented in worker/library code, not in tenant functions.",
+      "tasks": [
+        "[x] Implement Cozy Hub `resolve_artifact` client\n - Request: tag + ordered preferences + worker capabilities\n - Response: `repo_revision_seq` number, `artifact.snapshot_digest`, `snapshot_manifest.files[]` (paths + blake3 + urls)\n - Handle 409 'no compatible artifact' errors with a clear message",
+        "[x] Define worker-side selection policy (hardcoded)\n - Prefer file_type: flashpack → safetensors\n - Prefer quantization: fp8 → bf16 (fp16 fallback)\n - Prefer file_layout: diffusers\n - Include capability detection (cuda version, gpu sm, installed libs) in requests",
+        "[x] Implement local cache layout: snapshots + blobs\n - Cache root: `${WORKER_MODEL_CACHE_DIR}/cozy/`\n - Blobs: `${WORKER_MODEL_CACHE_DIR}/cozy/blobs/blake3/<aa>/<bb>/<digest>`\n - Snapshots: `${WORKER_MODEL_CACHE_DIR}/cozy/snapshots/<snapshot_digest>/...`",
+        "[x] Download blobs with verification + resume\n - Download to `<digest>.part`, Range resume when supported, then atomic rename\n - Verify `size_bytes` and BLAKE3 digest\n - Singleflight: concurrent downloads for the same digest share one in-flight transfer",
+        "[x] Materialize snapshot from snapshot manifest\n - Create the exact file tree under `snapshots/<snapshot_digest>/` using `files[].path`\n - Materialize each file from the blob store (prefer hardlink; fallback symlink/copy)\n - Atomic activation: build in `.building` then rename",
+        "[ ] Lockfile-aware pipeline loading (future)\n - Prefer `cozy.pipeline.lock.yaml` when present\n - Fall back to `cozy.pipeline.yaml` when lockfile missing",
+        "[x] Integrate with the existing injection/model loading flow\n - Model refs in `pyproject.toml` resolve to a local Cozy snapshot root directory\n - Worker diffusers injection calls `from_pretrained(<snapshot_root>)`",
+        "[x] Tests\n - Mock Cozy Hub resolve_artifact + manifest responses (no network)\n - Validate blobs cache, snapshot materialization, and concurrent download singleflight behavior",
+        "[x] E2E Docker smoke test (manual)\n - Start Cozy Hub + MinIO\n - Run `sd15-worker` + `mock_orchestrator` and verify it downloads and generates `runs/<run_id>/outputs/image.png`"
+      ],
+      "completed": false
     }
   ]
 }
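The model ref grammar described in issue 54 can be illustrated with a small parser. This is a minimal sketch, not the worker's actual resolver: `ModelRef` and `parse_model_ref` are hypothetical names, and defaulting a bare `org/repo` to the `latest` tag is an assumption based on the note that `:latest` is supported.

```python
from dataclasses import dataclass
from typing import Optional

KNOWN_SCHEMES = ("cozy", "hf")


@dataclass(frozen=True)
class ModelRef:  # hypothetical container, not part of gen_worker
    scheme: str                   # "cozy" or "hf"
    org: str
    repo: str
    tag: Optional[str] = None     # cozy refs only
    digest: Optional[str] = None  # cozy refs only (BLAKE3 hex)


def parse_model_ref(raw: str) -> ModelRef:
    s = raw.strip()
    scheme = "cozy"  # Cozy Hub is the default scheme
    # A ":" before the first "/" is treated as a scheme prefix.
    head = s.split("/", 1)[0]
    if ":" in head:
        prefix, rest = s.split(":", 1)
        if prefix not in KNOWN_SCHEMES:
            raise ValueError(
                f"unsupported model ref scheme {prefix!r} (supported: cozy, hf)"
            )
        scheme, s = prefix, rest

    tag = digest = None
    if scheme == "cozy":
        if "@" in s:
            # Pinned form: org/repo@blake3:<digest>
            s, pinned = s.split("@", 1)
            algo, _, digest = pinned.partition(":")
            if algo != "blake3" or not digest:
                raise ValueError(f"expected @blake3:<digest>, got {pinned!r}")
            digest = digest.lower()
        else:
            # Tagged form: org/repo:tag
            s, _, tag = s.partition(":")
            tag = tag or "latest"  # assumption: bare org/repo means :latest

    org, _, repo = s.partition("/")
    if not org or not repo or "/" in repo:
        raise ValueError(f"expected org/repo, got {raw!r}")
    return ModelRef(scheme, org, repo, tag, digest)
```

Deferred schemes such as `civitai:` or a direct URL fall through to the "unsupported scheme" error, which matches the task's requirement for a clear message.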

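The "precision selection via header inspection" task relies on the safetensors layout: an 8-byte little-endian header length, followed by a JSON header whose entries carry a `dtype` string. The sketch below parses that header from bytes already in hand; fetching those bytes with an HTTP Range request is left out. The function names are hypothetical, and choosing the dominant dtype by tensor count is an assumption rather than the worker's confirmed rule.

```python
import json
import struct
from collections import Counter
from typing import Dict, Optional

# Preference order from the task list: fp16 first, then bf16, never silent fp32.
DTYPE_PREFERENCE = ["F16", "BF16"]


def header_size(first8: bytes) -> int:
    """Length of the JSON header, stored as a little-endian u64."""
    (n,) = struct.unpack("<Q", first8[:8])
    return n


def dominant_dtype(header_json: bytes) -> str:
    """Most common tensor dtype in a safetensors JSON header."""
    header = json.loads(header_json)
    counts = Counter(
        entry["dtype"] for name, entry in header.items() if name != "__metadata__"
    )
    if not counts:
        raise ValueError("header lists no tensors")
    return counts.most_common(1)[0][0]


def choose_weight_file(dtypes_by_file: Dict[str, str]) -> Optional[str]:
    """Pick the first file whose probed dtype matches the preference order."""
    for wanted in DTYPE_PREFERENCE:
        for filename, dtype in dtypes_by_file.items():
            if dtype == wanted:
                return filename
    return None  # nothing acceptable; the caller decides whether fp32 is allowed
```

A caller would request bytes `0-7` first, read the header length, then request the next `n` bytes, so only a few kilobytes are transferred per candidate file.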
examples/sd15/Dockerfile

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+ARG BASE_IMAGE=cozycreator/python-worker:cuda12.8-torch2.9
+FROM ${BASE_IMAGE}
+
+WORKDIR /app
+COPY . /app
+
+WORKDIR /app/examples/sd15
+
+RUN pip install --no-cache-dir uv
+RUN uv sync --no-dev
+
+RUN mkdir -p /app/.cozy && .venv/bin/python -m gen_worker.discover > /app/.cozy/manifest.json
+
+ENTRYPOINT ["/app/examples/sd15/.venv/bin/python", "-m", "gen_worker.entrypoint"]

src/gen_worker/cozy_hub_policy.py

Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
+from __future__ import annotations
+
+import importlib.util
+from dataclasses import dataclass
+from typing import Any, Dict, List, Optional
+
+
+@dataclass(frozen=True)
+class CozyHubWorkerCapabilities:
+    cuda_version: str
+    gpu_sm: int
+    torch_version: str
+    installed_libs: List[str]
+
+    def to_dict(self) -> Dict[str, Any]:
+        return {
+            "cuda_version": self.cuda_version,
+            "gpu_sm": self.gpu_sm,
+            "torch_version": self.torch_version,
+            "installed_libs": list(self.installed_libs),
+        }
+
+
+def _is_importable(module_name: str) -> bool:
+    try:
+        return importlib.util.find_spec(module_name) is not None
+    except Exception:
+        return False
+
+
+def detect_worker_capabilities(*, extra_libs: Optional[List[str]] = None) -> CozyHubWorkerCapabilities:
+    """
+    Detect worker capabilities for Cozy Hub artifact selection.
+
+    This is intentionally conservative: if torch/cuda isn't available, we report
+    empty/zero values so Cozy Hub can avoid selecting capability-gated artifacts
+    (e.g. fp8 or flashpack) unless explicitly supported.
+    """
+    installed: List[str] = []
+
+    # Known optional libs that affect artifact compatibility.
+    # Keep this hardcoded (no env config), per Cozy design.
+    known = ["flashpack", "bitsandbytes", "torchao", "transformer_engine"]
+    if extra_libs:
+        known.extend(extra_libs)
+    for name in known:
+        mod = name
+        if name == "transformer_engine":
+            mod = "transformer_engine"
+        if _is_importable(mod):
+            installed.append(name)
+
+    cuda_version = ""
+    gpu_sm = 0
+    torch_version = ""
+    try:
+        import torch  # type: ignore
+
+        torch_version = str(getattr(torch, "__version__", "") or "")
+        cuda_version = str(getattr(getattr(torch, "version", None), "cuda", "") or "")
+        if getattr(torch, "cuda", None) is not None and torch.cuda.is_available():
+            major, minor = torch.cuda.get_device_capability()
+            gpu_sm = int(major) * 10 + int(minor)
+    except Exception:
+        pass
+
+    installed.sort()
+    return CozyHubWorkerCapabilities(
+        cuda_version=cuda_version,
+        gpu_sm=gpu_sm,
+        torch_version=torch_version,
+        installed_libs=installed,
+    )
+
+
+def default_resolve_preferences() -> Dict[str, List[str]]:
+    """
+    Hardcoded worker-side preference order for Cozy Hub selection.
+
+    Policy:
+    - Prefer file_type: flashpack -> safetensors
+    - Prefer quantization: fp8 -> bf16 -> fp16 (fp16 is a pragmatic fallback)
+    - Prefer file_layout: diffusers
+    """
+    return {
+        "file_type_preference": ["flashpack", "safetensors"],
+        "quantization_preference": ["fp8", "bf16", "fp16"],
+        "file_layout_preference": ["diffusers"],
+    }
+
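To make the policy module's output concrete, the sketch below shows how the detected capabilities and the hardcoded preferences might be combined into a `resolve_artifact` request body. It mirrors the payload shape used by the client in this commit, but `sm_from_capability` and `build_resolve_payload` are hypothetical helpers, and the capability values in the usage note are made up.

```python
from typing import Any, Dict, Mapping


def sm_from_capability(major: int, minor: int) -> int:
    """Encode a CUDA compute capability such as (8, 9) as the integer 89."""
    return int(major) * 10 + int(minor)


def build_resolve_payload(
    tag: str,
    preferences: Mapping[str, Any],
    capabilities: Mapping[str, Any],
    include_urls: bool = True,
) -> Dict[str, Any]:
    """Assemble the JSON body for POST .../resolve_artifact."""
    return {
        "tag": (tag or "").strip() or "latest",  # empty tag falls back to "latest"
        "include_urls": bool(include_urls),
        "preferences": dict(preferences),
        "capabilities": dict(capabilities),
    }


# Illustrative values only; a real worker would take these from
# default_resolve_preferences() and detect_worker_capabilities().to_dict().
example_payload = build_resolve_payload(
    tag="",
    preferences={
        "file_type_preference": ["flashpack", "safetensors"],
        "quantization_preference": ["fp8", "bf16", "fp16"],
        "file_layout_preference": ["diffusers"],
    },
    capabilities={
        "cuda_version": "12.8",
        "gpu_sm": sm_from_capability(8, 9),
        "torch_version": "2.9.0",
        "installed_libs": [],
    },
)
```

On a machine without torch or CUDA, the capabilities would instead carry empty strings and `gpu_sm` of 0, which is what lets the hub avoid capability-gated artifacts.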

src/gen_worker/cozy_hub_v2.py

Lines changed: 203 additions & 0 deletions
@@ -0,0 +1,203 @@
+from __future__ import annotations
+
+from dataclasses import dataclass
+from typing import Any, Dict, List, Mapping, Optional
+
+import aiohttp
+
+
+class CozyHubError(RuntimeError):
+    pass
+
+
+class CozyHubNoCompatibleArtifactError(CozyHubError):
+    def __init__(self, message: str, *, debug: Optional[object] = None) -> None:
+        super().__init__(message)
+        self.debug = debug
+
+
+@dataclass(frozen=True)
+class CozyHubArtifact:
+    label: str
+    file_layout: str
+    file_type: str
+    quantization: str
+
+
+@dataclass(frozen=True)
+class CozyHubSnapshotFile:
+    path: str
+    size_bytes: int
+    blake3: str
+    url: Optional[str]
+
+
+@dataclass(frozen=True)
+class CozyHubResolveArtifactResult:
+    repo_revision_seq: int
+    snapshot_digest: str
+    artifact: Optional[CozyHubArtifact]
+    files: List[CozyHubSnapshotFile]
+
+
+class CozyHubV2Client:
+    """
+    Cozy Hub v2 model APIs (resolve_artifact).
+
+    Endpoint:
+    - POST /api/v1/repos/<org>/<repo>/resolve_artifact
+
+    Response (v1):
+    - repo_revision_seq: number
+    - snapshot_digest: hex
+    - artifact: {label, file_layout, file_type, quantization}
+    - snapshot_manifest: {version, files:[{path,size_bytes,blake3,url?}]}
+    """
+
+    def __init__(self, base_url: str, token: Optional[str] = None, timeout_s: int = 30) -> None:
+        self.base_url = base_url.rstrip("/")
+        self.token = (token or "").strip() or None
+        self.timeout_s = timeout_s
+
+    def _headers(self) -> Dict[str, str]:
+        h: Dict[str, str] = {"Content-Type": "application/json"}
+        if self.token:
+            h["Authorization"] = f"Bearer {self.token}"
+        return h
+
+    async def resolve_artifact(
+        self,
+        *,
+        org: str,
+        repo: str,
+        tag: str,
+        include_urls: bool,
+        preferences: Mapping[str, Any],
+        capabilities: Mapping[str, Any],
+    ) -> CozyHubResolveArtifactResult:
+        if not org or not repo:
+            raise ValueError("org/repo required")
+        tag = (tag or "").strip() or "latest"
+
+        url = f"{self.base_url}/api/v1/repos/{org}/{repo}/resolve_artifact"
+        payload = {
+            "tag": tag,
+            "include_urls": bool(include_urls),
+            "preferences": dict(preferences),
+            "capabilities": dict(capabilities),
+        }
+
+        timeout = aiohttp.ClientTimeout(total=self.timeout_s)
+        async with aiohttp.ClientSession(timeout=timeout, headers=self._headers()) as session:
+            async with session.post(url, json=payload) as resp:
+                if resp.status == 409:
+                    try:
+                        data = await resp.json()
+                    except Exception:
+                        data = {}
+                    raise CozyHubNoCompatibleArtifactError(
+                        "no compatible artifact for worker",
+                        debug=data.get("debug") if isinstance(data, dict) else None,
+                    )
+                resp.raise_for_status()
+                data = await resp.json()
+                if not isinstance(data, dict):
+                    raise ValueError("unexpected response shape")
+
+        return _parse_resolve_artifact_response(data, include_urls=include_urls)
+
+    async def get_snapshot_manifest(self, *, org: str, repo: str, digest: str) -> List[CozyHubSnapshotFile]:
+        """
+        Fetch a snapshot manifest by digest (already pinned).
+
+        Endpoint:
+        - GET /api/v1/repos/<org>/<repo>/snapshots/<digest>/manifest
+        """
+        if not org or not repo or not digest:
+            raise ValueError("org/repo/digest required")
+        url = f"{self.base_url}/api/v1/repos/{org}/{repo}/snapshots/{digest}/manifest"
+
+        timeout = aiohttp.ClientTimeout(total=self.timeout_s)
+        async with aiohttp.ClientSession(timeout=timeout, headers=self._headers()) as session:
+            async with session.get(url) as resp:
+                resp.raise_for_status()
+                data = await resp.json()
+                if not isinstance(data, dict):
+                    raise ValueError("unexpected response shape")
+
+        manifest = data.get("files")
+        if not isinstance(manifest, list):
+            manifest = data.get("root_files")
+        if not isinstance(manifest, list):
+            raise ValueError("missing files list")
+        out: List[CozyHubSnapshotFile] = []
+        for ent in manifest:
+            if not isinstance(ent, dict):
+                continue
+            path = str(ent.get("path") or "").strip()
+            if not path:
+                continue
+            out.append(
+                CozyHubSnapshotFile(
+                    path=path,
+                    size_bytes=int(ent.get("size_bytes") or 0),
+                    blake3=str(ent.get("blake3") or "").strip().lower(),
+                    url=str(ent.get("url") or "").strip() or None,
+                )
+            )
+        if not out:
+            raise ValueError("empty files list")
+        return out
+
+
+def _parse_resolve_artifact_response(data: Mapping[str, Any], *, include_urls: bool) -> CozyHubResolveArtifactResult:
+    repo_revision_seq = int(data.get("repo_revision_seq") or 0)
+    snapshot_digest = str(data.get("snapshot_digest") or "").strip()
+    art = data.get("artifact")
+    if not isinstance(art, dict):
+        raise ValueError("missing artifact")
+    artifact = CozyHubArtifact(
+        label=str(art.get("label") or "").strip(),
+        file_layout=str(art.get("file_layout") or "").strip(),
+        file_type=str(art.get("file_type") or "").strip(),
+        quantization=str(art.get("quantization") or "").strip(),
+    )
+    if repo_revision_seq <= 0 or not snapshot_digest:
+        raise ValueError("missing snapshot_digest/repo_revision_seq")
+    if not artifact.label:
+        raise ValueError("missing artifact.label")
+
+    manifest = data.get("snapshot_manifest")
+    if not isinstance(manifest, dict):
+        raise ValueError("missing snapshot_manifest")
+    files_raw = manifest.get("files")
+    if not isinstance(files_raw, list):
+        raise ValueError("missing snapshot_manifest.files")
+
+    files: List[CozyHubSnapshotFile] = []
+    for ent in files_raw:
+        if not isinstance(ent, dict):
+            continue
+        path = str(ent.get("path") or "").strip()
+        if not path:
+            continue
+        size_bytes = int(ent.get("size_bytes") or 0)
+        blake3_hex = str(ent.get("blake3") or "").strip().lower()
+        url = str(ent.get("url") or "").strip() if include_urls else ""
+        files.append(
+            CozyHubSnapshotFile(
+                path=path,
+                size_bytes=size_bytes,
+                blake3=blake3_hex,
+                url=url or None,
+            )
+        )
+    if not files:
+        raise ValueError("empty snapshot file list")
+
+    return CozyHubResolveArtifactResult(
+        repo_revision_seq=repo_revision_seq,
+        snapshot_digest=snapshot_digest,
+        artifact=artifact,
+        files=files,
+    )
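Once the client returns a file list, issue 61 describes storing each file as a content-addressed blob and then materializing a snapshot tree from it. The following is a rough sketch of that flow using only the standard library. All function names are hypothetical, reading `<aa>/<bb>` as the first two hex byte pairs of the digest is an assumption, and size and BLAKE3 verification are omitted because BLAKE3 is not in the standard library.

```python
import os
import shutil
from pathlib import Path
from typing import Mapping


def blob_path(cache_root: Path, digest: str) -> Path:
    d = digest.lower()
    # Assumption: <aa>/<bb> are the first and second hex byte of the digest.
    return cache_root / "cozy" / "blobs" / "blake3" / d[:2] / d[2:4] / d


def store_blob(cache_root: Path, digest: str, data: bytes) -> Path:
    dst = blob_path(cache_root, digest)
    if dst.exists():
        return dst  # content-addressed: same digest means same bytes
    dst.parent.mkdir(parents=True, exist_ok=True)
    part = dst.with_name(dst.name + ".part")
    part.write_bytes(data)  # a real download would stream, resume and verify here
    os.replace(part, dst)   # atomic rename on the same filesystem
    return dst


def materialize_snapshot(
    cache_root: Path, snapshot_digest: str, files: Mapping[str, str]
) -> Path:
    """Build snapshots/<snapshot_digest>/ from {relative path: blob digest}."""
    final = cache_root / "cozy" / "snapshots" / snapshot_digest
    if final.exists():
        return final
    building = final.with_name(final.name + ".building")
    shutil.rmtree(building, ignore_errors=True)  # discard any stale partial build
    building.mkdir(parents=True)
    for rel_path, digest in files.items():
        rel = Path(rel_path)
        if rel.is_absolute() or ".." in rel.parts:
            raise ValueError(f"unsafe manifest path: {rel_path!r}")
        out = building / rel
        out.parent.mkdir(parents=True, exist_ok=True)
        src = blob_path(cache_root, digest)
        try:
            os.link(src, out)        # prefer a hardlink into the blob store
        except OSError:
            shutil.copy2(src, out)   # fall back to a copy (e.g. across filesystems)
    os.replace(building, final)      # atomic activation
    return final
```

The returned directory is what a loader such as `from_pretrained(<snapshot_root>)` would be pointed at. Building in a sibling `.building` directory means a crash mid-way never leaves a half-populated snapshot at the final path.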
