10 changes: 10 additions & 0 deletions crates/deny.toml
@@ -18,6 +18,16 @@ ignore = [
"RUSTSEC-2025-0090", # unic-emoji-char
"RUSTSEC-2025-0098", # unic-ucd-version
"RUSTSEC-2025-0100", # unic-ucd-ident

# pyo3 vulnerabilities fixed in 0.29.0, but we are pinned to 0.28 because our
# transitive deps numpy (<=0.28) and pyo3-async-runtimes (<=0.28) only support
# pyo3 ^0.28 and have no pyo3 0.29-compatible release yet. Neither vulnerable
# code path is reachable in coglet: we never call nth/nth_back on PyList/PyTuple
# iterators, never use PyCFunction::new_closure, and run GIL-bound (abi3-py310,
# no free-threading). Remove these once numpy/pyo3-async-runtimes ship pyo3 0.29
# support and we can upgrade.
"RUSTSEC-2026-0176", # Out-of-bounds read in nth/nth_back for PyList/PyTuple iterators
"RUSTSEC-2026-0177", # Missing Sync bound on PyCFunction::new_closure closures
]

[licenses]
15 changes: 15 additions & 0 deletions examples/blur/README.md
@@ -0,0 +1,15 @@
# Blur

This model applies box blur to an input image.

## Usage

First, make sure you've got the [latest version of Cog](https://github.com/replicate/cog#install) installed.

Run predictions on the model:

```sh
cog predict -i image=@examples/kodim24.png -i blur=4

cog predict -i image=@examples/kodim24.png -i blur=6
```
4 changes: 4 additions & 0 deletions examples/blur/cog.yaml
@@ -0,0 +1,4 @@
build:
  python_version: "3.12"
  python_requirements: requirements.txt
run: "run.py:Runner"
Binary file added examples/blur/examples/kodim24-blur.png
Binary file added examples/blur/examples/kodim24.png
1 change: 1 addition & 0 deletions examples/blur/requirements.txt
@@ -0,0 +1 @@
pillow==12.1.1
20 changes: 20 additions & 0 deletions examples/blur/run.py
@@ -0,0 +1,20 @@
import tempfile

from PIL import Image, ImageFilter

from cog import BaseRunner, Input, Path


class Runner(BaseRunner):
    def run(
        self,
        image: Path = Input(description="Input image"),
        blur: float = Input(description="Blur radius", default=5),
    ) -> Path:
        if blur == 0:
            return image
        im = Image.open(str(image))
        im = im.filter(ImageFilter.BoxBlur(blur))
        out_path = Path(tempfile.mkdtemp()) / "out.png"
        im.save(str(out_path))
        return out_path
15 changes: 15 additions & 0 deletions examples/canary/README.md
@@ -0,0 +1,15 @@
# Canary

This simple model takes a string as input and returns a streaming string output.
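
To make "streaming string output" concrete, here is a minimal sketch of how a client might reassemble the chunks. It uses a plain generator in place of the Cog runner, and the `consume` helper is hypothetical, not part of the Cog API.

```python
from typing import Iterator


def run(text: str) -> Iterator[str]:
    """Yield the greeting in chunks, mirroring the runner in this example."""
    yield "hello "
    yield "there, "
    yield text


def consume(chunks: Iterator[str]) -> str:
    """Join streamed chunks in arrival order, as a client might do."""
    return "".join(chunks)


full_output = consume(run("Athena"))
```

Each chunk can be shown to the user as soon as it arrives; joining them gives the complete string.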

## Usage

First, make sure you've got the [latest version of Cog](https://github.com/replicate/cog#install) installed.

Run predictions on the model:

```sh
cog predict -i text=Athena

cog predict -i text=Zeus
```
3 changes: 3 additions & 0 deletions examples/canary/cog.yaml
@@ -0,0 +1,3 @@
build:
  python_version: "3.12"
run: "run.py:Runner"
10 changes: 10 additions & 0 deletions examples/canary/run.py
@@ -0,0 +1,10 @@
from cog import BaseRunner, ConcatenateIterator, Input


class Runner(BaseRunner):
    def run(
        self, text: str = Input(description="Text to prefix with 'hello there, '")
    ) -> ConcatenateIterator[str]:
        yield "hello "
        yield "there, "
        yield text
144 changes: 144 additions & 0 deletions examples/experimental/resnet-managed-weights/README.md
@@ -0,0 +1,144 @@
# examples/experimental/resnet-managed-weights

ResNet50 image classifier (microsoft/resnet-50 from HuggingFace) packaged
with v1 managed weights. Takes an image, returns top-3 ImageNet classes.

Use this as a starting point for packaging a real model with managed weights.

## What are managed weights?

Managed weights separate your model weights from your model image. Instead of
baking multi-GB weight files into the Docker image (slow builds, huge layers),
cog packs them into dedicated OCI layers that get mounted at runtime.

The key idea: your `run.py` reads weights from a path like
`/src/weights/resnet50`, but those files don't live inside the Docker image --
they arrive separately and get overlaid at that path when the container starts.

## File layout

```
examples/experimental/resnet-managed-weights/
├── cog.yaml # model config -- declares weights, build settings
├── run.py # runner -- loads weights from target path
├── requirements.txt # python deps
├── weights.lock # generated by `cog weights import` -- don't hand-edit
├── .dockerignore # keeps local weight dirs out of the Docker build context
├── .gitignore # keeps local weight dirs and .cog/ out of git
├── hotdog.png # test image
└── cat.png # test image
```

Weight files themselves don't live in the project directory. `cog weights import`
downloads them into a content-addressed store at `~/.cache/cog/weights/` (override
with `$COG_CACHE_DIR`). When you run `cog run`, cog assembles a temporary
directory under `.cog/mounts/` using hardlinks from the store and bind-mounts it
into the container at the `target` path. The mount dir is cleaned up when the
container stops.
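
The store-and-hardlink idea can be sketched in a few lines of Python. This is an illustration only, not cog's implementation: the function names, the flat store layout keyed by SHA-256 digest, and the demo file contents are all assumptions.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def store_file(store: Path, src: Path) -> Path:
    """Copy src into the store under its SHA-256 digest and return the blob path."""
    # Reads the whole file for brevity; real weight files would be streamed.
    data = src.read_bytes()
    blob = store / hashlib.sha256(data).hexdigest()
    if not blob.exists():
        blob.write_bytes(data)
    return blob


def assemble_mount(mount_dir: Path, files: dict[str, Path]) -> None:
    """Hardlink stored blobs into mount_dir under their original relative names."""
    for rel_name, blob in files.items():
        target = mount_dir / rel_name
        target.parent.mkdir(parents=True, exist_ok=True)
        os.link(blob, target)  # hardlink: no file data is copied


def demo() -> bool:
    """Return True if the mounted file and the stored blob are the same file."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        store = root / "store"
        store.mkdir()
        src = root / "model.safetensors"
        src.write_bytes(b"fake weights")
        blob = store_file(store, src)
        mount = root / "mounts" / "resnet50"
        assemble_mount(mount, {"model.safetensors": blob})
        return os.path.samefile(blob, mount / "model.safetensors")
```

Because hardlinks share the underlying data, assembling the mount directory is cheap even for multi-GB files, as long as the store and the mount directory sit on the same filesystem.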

## How `cog.yaml` works

```yaml
weights:
  - name: resnet50
    source:
      uri: hf://microsoft/resnet-50 # where to fetch from
      exclude: # files to skip
        - "pytorch_model.bin"
        - "flax_model.msgpack"
        - "tf_model.h5"
        - "README.md"
        - ".gitattributes"
    target: /src/weights/resnet50 # where files appear in the container
```

**`name`** -- an identifier for this weight set. Used in lockfile entries and
OCI tags. Pick something short and descriptive.

**`source.uri`** -- where the weights come from. Two formats:

- `hf://<org>/<repo>` -- pulls from HuggingFace Hub
- A local directory path (e.g. `weights/`) -- uses files already on disk

**`source.exclude`** -- glob patterns for files to skip. Most HF repos ship
weights in multiple formats (PyTorch, TF, Flax, ONNX). Exclude the ones you
don't need -- it'll save gigabytes.

**`target`** -- the absolute path where weight files land inside the container.
Your `run.py` loads from this path. Must start with `/`.
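
As a rough illustration of how `exclude` patterns narrow down a file list, the sketch below filters names with Python's `fnmatch`. The matching rules cog actually applies are not documented here, so treat the glob semantics, the `apply_exclude` helper, and the sample file list as assumptions.

```python
from fnmatch import fnmatchcase


def apply_exclude(paths: list[str], exclude: list[str]) -> list[str]:
    """Return the paths that match none of the exclude patterns."""
    return [p for p in paths if not any(fnmatchcase(p, pat) for pat in exclude)]


# Hypothetical repository listing, for illustration only.
repo_files = [
    ".gitattributes",
    "README.md",
    "config.json",
    "flax_model.msgpack",
    "model.safetensors",
    "preprocessor_config.json",
    "pytorch_model.bin",
    "tf_model.h5",
]
exclude = [
    "pytorch_model.bin",
    "flax_model.msgpack",
    "tf_model.h5",
    "README.md",
    ".gitattributes",
]
kept = apply_exclude(repo_files, exclude)
```

With these inputs only the config files and `model.safetensors` remain, which is the set the runner needs.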

## Getting started

### 1. Import weights

This downloads weight files from HuggingFace into the local cache and
generates `weights.lock`:

```sh
cd examples/experimental/resnet-managed-weights
cog weights import
```

The lockfile records digests and sizes for every file. It's how cog knows
whether weights have changed on subsequent imports. Commit `weights.lock`
to version control.
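
The change detection described above can be pictured as comparing per-file digests and sizes between two imports. The sketch below is an assumption-laden illustration: `describe` and `changed_files` are hypothetical helpers, and the entry shape is simplified rather than the real `weights.lock` schema.

```python
import hashlib


def describe(files: dict[str, bytes]) -> dict[str, dict]:
    """Map each file name to its SHA-256 digest and size in bytes."""
    return {
        name: {
            "digest": "sha256:" + hashlib.sha256(data).hexdigest(),
            "size": len(data),
        }
        for name, data in files.items()
    }


def changed_files(old: dict[str, dict], new: dict[str, dict]) -> list[str]:
    """List names that were added, removed, or whose recorded entry differs."""
    names = set(old) | set(new)
    return sorted(n for n in names if old.get(n) != new.get(n))


before = describe({"config.json": b"{}", "model.safetensors": b"v1"})
after = describe({"config.json": b"{}", "model.safetensors": b"v2"})
changed = changed_files(before, after)
```

Only files whose digest or size differs are reported, so re-importing unchanged weights yields an empty list.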

### 2. Run a prediction locally

```sh
cog run -i image=@hotdog.png
```

Locally, cog assembles the weight files from the cache and bind-mounts them
into the container at the `target` path. You don't need to push anything to test.

### 3. Build and push

```sh
cog push
```

This builds the model image and pushes it as the model named by `model:`
in `cog.yaml`, alongside the weight layers as an OCI image index. The weights
and model image are separate artifacts in the registry -- the image index ties
them together.

## Important: `.dockerignore`

The `.dockerignore` excludes `weights/` and `.cog/weights-cache/` from the
Docker build context. This matters if you're using local directory weight
sources -- without it, Docker would send the full weight directory to the
build daemon on every `cog build`.

## Adapting this for your own model

1. Copy this directory as a starting point
2. Edit `cog.yaml`:
   - Change `source.uri` to your HuggingFace repo (or a local path)
   - Adjust `exclude` patterns for the formats you don't need
   - Set `target` to wherever your code expects to find the weights
   - Set `model` to your model name (required for `cog push`)
3. Edit `run.py` to load your model from `WEIGHTS_DIR`
4. Update `requirements.txt` with your dependencies
5. Run `cog weights import` to fetch weights and generate the lockfile
6. Test with `cog run`
7. Push with `cog push`

### Using local weights instead of HuggingFace

If you already have weights on disk (downloaded separately, trained locally,
etc.), point the source at a local directory:

```yaml
weights:
  - name: my-model
    source:
      uri: my-weights-dir/
      include:
        - "*.safetensors"
        - "*.json"
    target: /src/weights/my-model
```

Then run `cog weights import` as usual -- it'll hash the local files and
generate the lockfile.
28 changes: 28 additions & 0 deletions examples/experimental/resnet-managed-weights/cog.yaml
@@ -0,0 +1,28 @@
# ResNet50 image classifier using v1 managed weights.
#
# Weights are pulled from HuggingFace at import time:
# cog weights import
#
# Build and push:
# cog push

model: resnet-managed-weights

build:
  gpu: true
  python_version: "3.13"
  python_requirements: requirements.txt

run: "run.py:Runner"

weights:
  - name: resnet50
    source:
      uri: hf://microsoft/resnet-50
      exclude:
        - "pytorch_model.bin" # legacy format, redundant with model.safetensors
        - "flax_model.msgpack" # Flax/JAX weights
        - "tf_model.h5" # TensorFlow weights
        - "README.md"
        - ".gitattributes"
    target: /src/weights/resnet50
@@ -0,0 +1,3 @@
pillow==12.1.1
torch==2.8.0
transformers==4.52.3
27 changes: 27 additions & 0 deletions examples/experimental/resnet-managed-weights/run.py
@@ -0,0 +1,27 @@
import torch
from PIL import Image
from transformers import AutoImageProcessor, ResNetForImageClassification

from cog import BaseRunner, Input, Path

WEIGHTS_DIR = "/src/weights/resnet50"


class Runner(BaseRunner):
    def setup(self) -> None:
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.processor = AutoImageProcessor.from_pretrained(WEIGHTS_DIR)
        self.model = ResNetForImageClassification.from_pretrained(WEIGHTS_DIR)
        self.model = self.model.to(self.device)
        self.model.eval()

    def run(self, image: Path = Input(description="Image to classify")) -> dict:
        img = Image.open(image).convert("RGB")
        inputs = self.processor(img, return_tensors="pt").to(self.device)

        with torch.no_grad():
            logits = self.model(**inputs).logits

        top3 = logits[0].softmax(0).topk(3)
        labels = self.model.config.id2label
        return {labels[i.item()]: p.item() for p, i in zip(*top3, strict=True)}
@@ -5,19 +5,21 @@
{
"name": "resnet50",
"target": "/src/weights/resnet50",
"source": {
"uri": "hf://microsoft/resnet-50",
"fingerprint": "commit:34c2154c194f829b11125337b98c8f5f9965ff19",
"include": [],
"exclude": [
".gitattributes",
"README.md",
"flax_model.msgpack",
"pytorch_model.bin",
"tf_model.h5"
],
"importedAt": "2026-04-30T21:49:23.515142Z"
},
"sources": [
{
"uri": "hf://microsoft/resnet-50",
"fingerprint": "commit:34c2154c194f829b11125337b98c8f5f9965ff19",
"include": [],
"exclude": [
".gitattributes",
"README.md",
"flax_model.msgpack",
"pytorch_model.bin",
"tf_model.h5"
]
}
],
"importedAt": "2026-06-11T20:45:55.187706Z",
"digest": "sha256:d2daafad96409df82d69df3c92192d2e651344f579a12683a59e4a6140a5abf5",
"setDigest": "sha256:52924993c7eff45d5d1deaecf1f375d774c30faa1b4ce61379f5d552fd376744",
"size": 102552676,
18 changes: 18 additions & 0 deletions examples/hello-concurrency/.dockerignore
@@ -0,0 +1,18 @@
# The .dockerignore file excludes files from the container build process.
#
# https://docs.docker.com/engine/reference/builder/#dockerignore-file

# Exclude Git files
.git
.github
.gitignore

# Exclude Python cache files
__pycache__
.mypy_cache
.pytest_cache
.ruff_cache

# Exclude Python virtual environment
.venv
venv
2 changes: 2 additions & 0 deletions examples/hello-concurrency/.gitignore
@@ -0,0 +1,2 @@
.venv
honeycomb_token.key
26 changes: 26 additions & 0 deletions examples/hello-concurrency/README.md
@@ -0,0 +1,26 @@
# hello-concurrency

This is an example Cog project that demonstrates the newly added concurrency support within
cog >= 0.14.0.

The key piece is the new `concurrency` field in the cog.yaml.

```yaml
concurrency:
  max: 4
```

This, combined with the async `setup` and `run` methods in `run.py`, allows Cog to run up to
4 concurrent predictions. If Cog reaches the max concurrency threshold, it rejects subsequent
predictions with a `409 Conflict` response.
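
The admit-or-reject behaviour can be sketched with plain `asyncio`. This is not Cog's server code: the `ConcurrencyLimiter` class, the `fake_prediction` coroutine, and the `(status, result)` return shape are assumptions made for illustration.

```python
import asyncio


class ConcurrencyLimiter:
    """Admit at most max_concurrency predictions at once; reject the rest."""

    def __init__(self, max_concurrency: int) -> None:
        self.max_concurrency = max_concurrency
        self.in_flight = 0

    async def handle(self, work) -> tuple[int, object]:
        """Return (409, None) when full, otherwise (200, result of work())."""
        if self.in_flight >= self.max_concurrency:
            return 409, None
        self.in_flight += 1
        try:
            return 200, await work()
        finally:
            self.in_flight -= 1


async def fake_prediction() -> str:
    """Stand-in for an async run method that takes a little while."""
    await asyncio.sleep(0.05)
    return "done"


async def main() -> list[int]:
    limiter = ConcurrencyLimiter(max_concurrency=4)
    # Six requests arrive at once; only four slots are available.
    results = await asyncio.gather(
        *(limiter.handle(fake_prediction) for _ in range(6))
    )
    return [status for status, _ in results]


statuses = asyncio.run(main())
```

The first four requests occupy the available slots and the remaining two are rejected immediately rather than queued, so a client is expected to retry later.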

### Telemetry

It also uses the open-telemetry package to demonstrate how to collect telemetry for your model.

This requires a file named `honeycomb_token.key` to be included in the image build.

It will then start sending events to the `cog-model` data source. You can configure this by
editing the `OTEL_SERVICE_NAME`. If you use a custom endpoint this can be configured via `OTEL_EXPORTER_OTLP_ENDPOINT`.

Lastly, there is a section in `run.py` that can be uncommented to run telemetry locally and print events to the console for debugging.
9 changes: 9 additions & 0 deletions examples/hello-concurrency/cog.yaml
@@ -0,0 +1,9 @@
# Configuration for Cog ⚙️
# Reference: https://github.com/replicate/cog/blob/main/docs/yaml.md
build:
  gpu: false
  python_version: "3.12"
  python_requirements: requirements.txt
run: "run.py:Runner"
concurrency:
  max: 4
3 changes: 3 additions & 0 deletions examples/hello-concurrency/requirements.txt
@@ -0,0 +1,3 @@
opentelemetry-api
opentelemetry-sdk
opentelemetry-exporter-otlp-proto-http