Public-facing project name for this repository: `local-image-models`.
This repository is a local image model extension bundle: a source bundle for 3 self-contained model extensions plus one shared runtime.
It is not a weights mirror and it does not redistribute the upstream model weights.
- `extensions/sd15/`
- `extensions/sdxl-base/`
- `extensions/flux-schnell/`
- `shared/runtime/local_image_runtime/`
- `tools/sync_extension_runtime.py`
Each extension is the installable unit and contains its own:
- `manifest.json`
- `generator.py`
- `setup.py`
- `src/local_image_runtime/` vendored runtime copy
The canonical shared code lives in shared/runtime/local_image_runtime/ and is synced into each extension root.
| Extension ID | Visible name | Hugging Face model | Capabilities | Practical VRAM guidance |
|---|---|---|---|---|
| `sd15` | Stable Diffusion 1.5 | `runwayml/stable-diffusion-v1-5` | text-to-image, image-to-image | ~6GB+ recommended |
| `sdxl-base` | SDXL Base 1.0 | `stabilityai/stable-diffusion-xl-base-1.0` | text-to-image, image-to-image | ~12GB+ recommended |
| `flux-schnell` | FLUX.1-schnell | `black-forest-labs/FLUX.1-schnell` | text-to-image | ~16GB+ recommended; 24GB is more comfortable |
These VRAM values are practical recommendations, not hard guarantees. Actual usage depends on resolution, precision, CPU/GPU offload, driver/runtime behavior, and memory optimizations available in the local environment.
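The guidance in the table can be encoded as data for a quick pre-flight check. The following is a minimal sketch only: the helper name `extensions_within_budget` and the constant `RECOMMENDED_VRAM_GB` are hypothetical and not part of this repository, and the thresholds are the soft recommendations from the table, not hard limits.

```python
# Recommended VRAM (in GB) per extension ID, copied from the table above.
# These are soft recommendations, not hard guarantees.
RECOMMENDED_VRAM_GB = {
    "sd15": 6,
    "sdxl-base": 12,
    "flux-schnell": 16,
}


def extensions_within_budget(available_gb: float) -> list[str]:
    """Return the extension IDs whose recommended VRAM fits the budget."""
    return [
        ext_id
        for ext_id, recommended in RECOMMENDED_VRAM_GB.items()
        if available_gb >= recommended
    ]


if __name__ == "__main__":
    # A 12 GB card meets the recommendation for sd15 and sdxl-base only.
    print(extensions_within_budget(12))
```

A result that excludes an extension does not mean it cannot run; offload and reduced precision may still make it workable.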
Current baseline bundle flows are verified on the active Linux ARM64 path and are intended to support the existing Windows candidate path. On Linux ARM64 with NVIDIA GB10 and `torch==2.11.0+cu130`, both the SDXL Base image-to-image Style reference (`sdxl_ip_adapter_style`) and the SD1.5 image-to-image Style reference (`sd15_ip_adapter_style`) have passed the installed, local-only smoke test. SD1.5 promotion is scoped to image-to-image only; SD1.5 text-to-image is unchanged. This result does not promote ControlNet, Windows compatibility, or public release readiness. Windows remains prepared and intended, but it stays candidate/unverified until GitHub-installed Install/Repair, readiness, and generation evidence is collected on Windows.
For FLUX.1-schnell weights, open https://huggingface.co/black-forest-labs/FLUX.1-schnell, log in, accept the model conditions, and share contact information if Hugging Face requests it. Use the same Hugging Face account/token that Modly uses for the download; otherwise the extension cannot download the weights.
- This is a bundled local image model extension pack for Modly-compatible local model workflows.
- Host-side bundle/model support is still ahead of upstream Modly in a few areas and is being discussed in Modly issue #114, so use a compatible Modly branch/fork until that support lands upstream.
- Current preview flows use the primary generated image (`output_path`). The runtime can preserve additional output metadata internally, but richer multi-output display is a future Modly integration task.
- SD1.5 and SDXL expose text-to-image and image-to-image controls such as prompt, negative prompt, size, steps, guidance scale, seed, and output format.
- SDXL and SD1.5 image-to-image support the optional named `Style reference` input backed by IP-Adapter on the verified Linux ARM64/GB10/cu130 local path. SD1.5 text-to-image is unchanged. ControlNet remains future explicit-node work.
- FLUX.1-schnell exposes text-to-image controls tuned for its pipeline, including prompt, size, steps, guidance scale, maximum sequence length, seed, and output format.
- PNG is the default output format. JPEG output is available with a configurable JPEG quality value.
- FLUX.1-schnell recommends low step counts (`1-4`) and guidance `0.0`, but higher values are allowed for experimentation and may not improve quality.
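The FLUX.1-schnell recommendation can be expressed as an advisory check that warns without rejecting the request. This is a sketch under assumptions: the function name `flux_schnell_advisories` and the parameter names `steps` and `guidance_scale` are hypothetical and may not match the names used in the extension's manifest.

```python
def flux_schnell_advisories(steps: int, guidance_scale: float) -> list[str]:
    """Return advisory notes for settings outside the recommended range.

    Values outside the recommendation are still allowed; the notes are
    informational only.
    """
    notes = []
    if not 1 <= steps <= 4:
        notes.append(f"steps={steps} is outside the recommended range 1-4")
    if guidance_scale != 0.0:
        notes.append(f"guidance_scale={guidance_scale} differs from the recommended 0.0")
    return notes
```

Returning notes instead of raising an error keeps higher values available for experimentation, as described above.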
The existing capability IDs remain unchanged by design:
- `sd15`
- `sdxl-base`
- `flux-schnell`
This repository distributes:
- repository code
- manifests
- setup/integration scaffolding
- runtime glue code
- documentation
This repository does not distribute:
- model weights
- checkpoints
- safetensors bundles
- upstream model artifacts
The repository code is licensed under MIT. See LICENSE.
The referenced models remain subject to their original upstream licenses and access conditions. See MODEL_LICENSES.md.
If you use any referenced model, YOU are responsible for obtaining and using its files in compliance with the applicable upstream terms.
- The architecture is model-first.
- Extension identity is the family identity.
- `params.model_id` is legacy compatibility only and must match the fixed extension.
- Capabilities are declared in each manifest, not by central branching.
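The `params.model_id` rule can be illustrated with a small validator. This is a minimal sketch, assuming that "match" means equality with the extension ID; the function name `resolve_model_id` and the use of `ValueError` are hypothetical and are not taken from the shared runtime.

```python
def resolve_model_id(extension_id: str, params: dict) -> str:
    """Return the effective model ID for a fixed extension.

    `params["model_id"]` is treated as optional legacy input. When it is
    present, it must equal the extension's own ID (an assumption about
    what "match" means here).
    """
    legacy_id = params.get("model_id")
    if legacy_id is not None and legacy_id != extension_id:
        raise ValueError(
            f"model_id {legacy_id!r} does not match extension {extension_id!r}"
        )
    # The extension identity always wins; model_id never selects a family.
    return extension_id
```

Under this reading, a request to the `sd15` extension with `model_id` set to `sdxl-base` is rejected instead of being rerouted.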
Shared runtime responsibilities:
- bootstrap of `.local-image-runtime/`
- state normalization and migration
- Modly `Install GitHub/Repair` contract handling
- legacy local install validation
- payload/request validation helpers
- backend dispatch boundary
Per-family responsibilities:
- manifest identity
- exposed nodes
- node defaults and help text
- minimum local source requirements
Current persisted state version: v2.
- `extensions`: ownership by family/extension ID
- `legacy_models`: retained legacy residue for fallback, audit, and later cleanup
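A v1 to v2 migration that produces these two keys might look like the sketch below. It is illustrative only: the `version` key, the flat v1 `models` mapping, and the function name `migrate_state` are all assumptions, since the persisted v1 layout is not described here.

```python
import copy

KNOWN_EXTENSION_IDS = ("sd15", "sdxl-base", "flux-schnell")


def migrate_state(state: dict) -> dict:
    """Upgrade a persisted state dict to a v2-like shape (illustrative)."""
    if state.get("version") == 2:
        return state  # already current, nothing to do
    # ASSUMPTION: v1 kept a flat "models" mapping; the real layout may differ.
    v1_models = copy.deepcopy(state.get("models", {}))
    return {
        "version": 2,
        # Ownership records keyed by family/extension ID.
        "extensions": {ext_id: {} for ext_id in KNOWN_EXTENSION_IDS},
        # v1 data is retained, not discarded, for fallback and audit.
        "legacy_models": v1_models,
    }
```

The point of the sketch is that older data moves under `legacy_models` and stays available for later cleanup, while ownership is tracked per extension.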
Each child keeps its own .local-image-runtime/, but the canonical weight location for Modly is external to this repository:
modelsDir/<ext.id>/<node.id>/...
Examples:
- `modelsDir/sd15/text-to-image/model_index.json`
- `modelsDir/sdxl-base/image-to-image/model_index.json`
- `modelsDir/flux-schnell/text-to-image/model_index.json`
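The path convention above can be sketched with `pathlib`. The helper names `node_weights_dir` and `model_index_path` are hypothetical and only illustrate how the `modelsDir/<ext.id>/<node.id>/` pattern composes.

```python
from pathlib import Path


def node_weights_dir(models_dir: str, ext_id: str, node_id: str) -> Path:
    """Build modelsDir/<ext.id>/<node.id> for one extension node."""
    return Path(models_dir) / ext_id / node_id


def model_index_path(models_dir: str, ext_id: str, node_id: str) -> Path:
    """Return the model_index.json path inside a node's weight directory."""
    return node_weights_dir(models_dir, ext_id, node_id) / "model_index.json"


if __name__ == "__main__":
    print(model_index_path("modelsDir", "sd15", "text-to-image").as_posix())
```

Because the node ID is part of the path, text-to-image and image-to-image nodes of the same extension resolve to separate directories.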
When shared/runtime/local_image_runtime/ changes, resync the vendored copies:
python3 tools/sync_extension_runtime.py
python3 tools/sync_extension_runtime.py --check
The operational flow has two separate steps:
- Install GitHub / Repair: run each extension's `setup.py` with a Modly JSON payload to create `venv`, install dependencies, and persist readiness.
- Install Weights: download model files outside this repo into `modelsDir/<ext.id>/<node.id>/...`.
Generation is local-only/no-download after Install/Repair and weight installation have acquired the required baseline and optional assets. Install/Repair may acquire supported optional IP-Adapter assets for SDXL and SD1.5 style-reference readiness; generation must use local files only. Windows installation support is prepared as a windows-amd64 candidate path for later validation and must not be described as verified until real Windows evidence exists. ControlNet is intentionally separate future work with explicit nodes.
Example setup invocation:
python3 extensions/sd15/setup.py '{"python_exe":"/usr/bin/python3","ext_dir":"/tmp/modly-sd15","gpu_sm":"87","cuda_version":"128"}'
Legacy local CLI commands still exist for scaffold/manual use, but they are not the main Modly contract.
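When the setup invocation is scripted, building the JSON argument with `json.dumps` avoids hand-written shell quoting. The sketch below reuses the payload keys from the example invocation; the helper names `build_setup_command` and `run_setup` are hypothetical.

```python
import json
import shlex
import subprocess


def build_setup_command(setup_py: str, payload: dict) -> list[str]:
    """Return an argv list that passes the payload as one JSON argument."""
    return ["python3", setup_py, json.dumps(payload)]


def run_setup(setup_py: str, payload: dict) -> None:
    """Run setup.py without a shell, so the JSON needs no extra quoting."""
    subprocess.run(build_setup_command(setup_py, payload), check=True)


# Payload keys and values copied from the example invocation above.
payload = {
    "python_exe": "/usr/bin/python3",
    "ext_dir": "/tmp/modly-sd15",
    "gpu_sm": "87",
    "cuda_version": "128",
}
command = build_setup_command("extensions/sd15/setup.py", payload)

if __name__ == "__main__":
    # Print a shell-safe rendering of the command instead of executing it.
    print(shlex.join(command))
```

Calling `run_setup` executes the command; the demo only prints it so the sketch has no side effects.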
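Minimum local source layout per family:
- `sd15`: `model_index.json`, `scheduler/`, `text_encoder/`, `tokenizer/`, `unet/`, `vae/`
- `sdxl-base`: `model_index.json`, `scheduler/`, `text_encoder/`, `text_encoder_2/`, `tokenizer/`, `tokenizer_2/`, `unet/`, `vae/`
- `flux-schnell`: `model_index.json`, `scheduler/`, `text_encoder/`, `text_encoder_2/`, `tokenizer/`, `tokenizer_2/`, `transformer/`, `vae/`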
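A readiness check against these layouts can be sketched as follows. It assumes the three layouts above correspond, in order, to `sd15`, `sdxl-base`, and `flux-schnell`; the names `REQUIRED_ENTRIES` and `missing_entries` are hypothetical and are not the runtime's actual validation code.

```python
from pathlib import Path

# Required top-level entries per extension ID, taken from the layouts above.
REQUIRED_ENTRIES = {
    "sd15": [
        "model_index.json", "scheduler", "text_encoder",
        "tokenizer", "unet", "vae",
    ],
    "sdxl-base": [
        "model_index.json", "scheduler", "text_encoder", "text_encoder_2",
        "tokenizer", "tokenizer_2", "unet", "vae",
    ],
    "flux-schnell": [
        "model_index.json", "scheduler", "text_encoder", "text_encoder_2",
        "tokenizer", "tokenizer_2", "transformer", "vae",
    ],
}


def missing_entries(weights_dir: str, ext_id: str) -> list[str]:
    """List required entries that are absent from a node's weight directory."""
    root = Path(weights_dir)
    return [name for name in REQUIRED_ENTRIES[ext_id] if not (root / name).exists()]
```

An empty result means every required entry is present; it does not verify that the files inside each directory are complete or valid.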
Included here:
- source bundle for 3 local image model extensions
- syncable shared runtime
- persisted v1 -> v2 state migration
- manifests and CLIs per family
Out of scope:
- Modly core changes
- model weight hosting or redistribution
- build/release automation
- Sync the shared runtime and verify vendored copies.
- Verify the `Install GitHub/Repair` JSON contract for each child root.
- Verify generator protocol behavior with valid JSON over `stdin`.
- Verify node-scoped weight readiness under `modelsDir`.
- Recheck the same baseline for `sd15`, `sdxl-base`, and `flux-schnell`.
This repository remains source-first: it provides the extension bundle and runtime integration while leaving model weight hosting and redistribution to the upstream model providers.