110 changes: 110 additions & 0 deletions .claude/skills/nimbus-interface/SKILL.md
@@ -0,0 +1,110 @@
---
name: nimbus-interface
description: Reference for the NimbusImage/Girder API used by all workers in this repository. Use when building, debugging, or testing NimbusImage workers — including image loading, annotation CRUD, property computation, multi-channel merging, coordinate conversions, local test environments, and infrastructure troubleshooting (e.g. HTTP 500 errors). Also use when writing test scripts that interact with the Nimbus API.
---

# NimbusImage Worker Development

## Quick Start

Determine the task type:
- **Building/modifying a worker** → See [references/api.md](references/api.md) for full API patterns
- **Debugging HTTP 500 errors** → Check prerequisites below
- **Writing local test scripts** → See local testing section below
- **Coordinate confusion** → See critical pitfalls below

## Infrastructure Prerequisites

The Girder server requires **MongoDB**. Without it, all endpoints return HTTP 500 (except `/system/version`). Debug with:
```bash
docker ps | grep mongo # Must be running
curl -s http://localhost:8080/api/v1/system/version # Works without MongoDB
```

Full stack: `girder`, `worker` (celery), `rabbitmq`, `memcached`, `mongodb`.
Compose file: `/home/arjun/UPennContrast/docker-compose.yaml`.

## Critical Pitfalls

### Coordinate swap (numpy vs annotations)
NumPy indexing is `[row, col]`, i.e. `[y, x]`. Annotations use `{'x': pixel_x, 'y': pixel_y}`.
```python
# skimage contour (row, col) → annotation:
coords = [{'x': float(col), 'y': float(row)} for row, col in contour]

# Use annotation_tools helpers to avoid manual swaps:
from annotation_utilities.annotation_tools import polygons_to_annotations, annotations_to_polygons
```
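A minimal, runnable illustration of the swap (pure NumPy, no Nimbus dependencies; the contour values are made up):

```python
import numpy as np

# A fake skimage-style contour: rows of (row, col) = (y, x)
contour = np.array([[10.0, 20.0], [10.0, 30.0], [15.0, 30.0], [15.0, 20.0]])

# Convert to annotation coordinates: note col -> 'x', row -> 'y'
coords = [{'x': float(col), 'y': float(row)} for row, col in contour]

# Round-trip back to (row, col) order for numpy indexing
back = np.array([[c['y'], c['x']] for c in coords])
assert np.array_equal(back, contour)
```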

### The 0.5 pixel offset
scikit-image uses pixel centers; Girder uses top-left corner:
```python
polygon = np.array([[c['y'] - 0.5, c['x'] - 0.5] for c in annotation['coordinates']])
rr, cc = draw.polygon(polygon[:, 0], polygon[:, 1], shape=image.shape)
```

### Tags interface returns a list, not a dict
```python
# CORRECT:
tags = params['workerInterface'].get('Training Tag', [])
# WRONG (crashes with AttributeError):
tags = params['workerInterface'].get('Training Tag', {}).get('tags', [])
```

### Multi-channel merge output dtype
`process_and_merge_channels` returns `float64` with values 0-255 (not 0-1). Convert for ML:
```python
rgb_uint8 = np.clip(merged, 0, 255).astype(np.uint8)
```

Typical shapes:
- `getRegion().squeeze()`: `(H, W)` uint16
- `get_images_for_all_channels`: each `(H, W, 1)` uint16
- `process_and_merge_channels`: `(H, W, 3)` float64, values 0-255

## Local Testing

### Avoid importing entrypoint.py
Worker entrypoints import heavy ML libraries (torch, sam2) at module level. Copy helper functions locally instead of importing the entrypoint.
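One way to keep test scripts importable is to defer heavy imports into the functions that need them — a sketch of the pattern (function names and bodies are illustrative, not from the workers):

```python
def run_inference(image):
    # Heavy ML imports stay inside the function, so merely importing
    # this module from a test script costs nothing.
    import torch  # only paid when inference actually runs
    raise NotImplementedError

def polygon_area(coords):
    # Lightweight helpers at module level are cheap to import anywhere.
    # Shoelace formula over [{'x': ..., 'y': ...}] coordinates.
    n = len(coords)
    s = 0.0
    for i in range(n):
        j = (i + 1) % n
        s += coords[i]['x'] * coords[j]['y'] - coords[j]['x'] * coords[i]['y']
    return abs(s) / 2.0
```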

### Local venv dependencies
```bash
pip install girder-client tifffile
pip install -e /home/arjun/UPennContrast/devops/girder/annotation_client
pip install -e /home/arjun/ImageAnalysisProject/annotation_utilities
pip install -e /home/arjun/ImageAnalysisProject/worker_client
pip install numpy scipy scikit-image shapely matplotlib pillow numba
# ML deps (torch, sam2, etc.) only needed for inference, not API testing
```

### Authentication for test scripts
```python
import girder_client
gc = girder_client.GirderClient(apiUrl='http://localhost:8080/api/v1')
gc.authenticate('username', 'password')
token = gc.token # Use this token with annotation_client classes
```
Env vars: `NIMBUS_API_URL` (default `http://localhost:8080/api/v1`), `NIMBUS_TOKEN`.
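A small helper for resolving these in a test script (a sketch; the function name is not part of any API):

```python
import os

def nimbus_config(env=None):
    """Resolve API URL and token from the environment, with defaults."""
    env = os.environ if env is None else env
    url = env.get('NIMBUS_API_URL', 'http://localhost:8080/api/v1')
    token = env.get('NIMBUS_TOKEN')  # None -> fall back to gc.authenticate()
    return url, token
```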

### Test dataset
Dataset `69988c84b48d8121b565aba4`: 2 channels (Brightfield, YFP), 7 Z-slices, 4 timepoints, 6 XY positions, 1024x1022 uint16. 544 polygons tagged "YFP blob" at XY=0, Z=3, Time=0.

## Key Packages

| Package | Location |
|---------|----------|
| annotation_client | `/home/arjun/UPennContrast/devops/girder/annotation_client/` |
| annotation_utilities | `/home/arjun/ImageAnalysisProject/annotation_utilities/` |
| worker_client | `/home/arjun/ImageAnalysisProject/worker_client/` |
| Workers | `/home/arjun/ImageAnalysisProject/workers/` |

Key source files: `annotation_client/{annotations,tiles,workers}.py`, `annotation_utilities/{annotation_tools,batch_argument_parser}.py`

## Detailed API Reference

See [references/api.md](references/api.md) for complete API patterns including:
- Image access (single frame, subregion, multi-channel merge)
- Annotation CRUD (fetch, filter, create, delete)
- Property value computation and submission
- Writing images back to Girder
- Worker interface type table
204 changes: 204 additions & 0 deletions .claude/skills/nimbus-interface/references/api.md
@@ -0,0 +1,204 @@
# NimbusImage API Reference

## Table of Contents
- [Image Access](#image-access)
- [Annotations](#annotations)
- [Property Values](#property-values)
- [Writing Images to Girder](#writing-images-to-girder)
- [Worker Interface Types](#worker-interface-types)

---

## Image Access

### Setup
```python
import annotation_client.tiles as tiles
tileClient = tiles.UPennContrastDataset(apiUrl=apiUrl, token=token, datasetId=datasetId)
```

### Metadata
```python
idx = tileClient.tiles['IndexRange']
num_channels = idx.get('IndexC', 1)
num_z = idx.get('IndexZ', 1)
num_time = idx.get('IndexT', 1)
num_xy = idx.get('IndexXY', 1)
size_x = tileClient.tiles['sizeX']
size_y = tileClient.tiles['sizeY']
channel_names = tileClient.tiles.get('channels', [])
pixel_scale = tileClient.tiles.get('mm_x') # mm per pixel
```

### Single frame
```python
frame = tileClient.coordinatesToFrameIndex(XY, Z=z, T=time, channel=channel)
image = tileClient.getRegion(datasetId, frame=frame).squeeze()
# Returns (H, W) uint16
```

### Subregion
```python
image = tileClient.getRegion(datasetId, frame=frame,
left=x_min, top=y_min, right=x_max, bottom=y_max,
units="base_pixels").squeeze()
```
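When cropping around annotations it is easy to run past the image edge; a small clamp helper (illustrative, not part of the API) keeps the region request valid:

```python
def clamp_region(x_min, y_min, x_max, y_max, size_x, size_y, pad=0):
    """Expand a bounding box by `pad` pixels and clamp it to image bounds."""
    left = max(0, int(x_min) - pad)
    top = max(0, int(y_min) - pad)
    right = min(size_x, int(x_max) + pad)
    bottom = min(size_y, int(y_max) + pad)
    return left, top, right, bottom

# A 10 px padded box near the right edge of a 1024x1022 image:
region = clamp_region(1000, 500, 1030, 540, 1024, 1022, pad=10)
# region == (990, 490, 1024, 550)
```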

### Multi-channel merged RGB
```python
import annotation_utilities.annotation_tools as annotation_tools

images = annotation_tools.get_images_for_all_channels(tileClient, datasetId, XY, Z, Time)
# Each: (H, W, 1) uint16
layers = annotation_tools.get_layers(tileClient.client, datasetId)
merged = annotation_tools.process_and_merge_channels(images, layers)
# Returns: (H, W, 3) float64, values 0-255
```
Merge modes: `'lighten'` (max, default), `'add'` (sum), `'screen'`.
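Conceptually the modes combine per-layer RGB contributions like this (a pure-NumPy sketch of the blend math, not the actual implementation; whether `'add'` clips to 255 is an assumption):

```python
import numpy as np

def blend(stack, mode='lighten'):
    """stack: (N, H, W, 3) float arrays in 0-255, one per channel layer."""
    if mode == 'lighten':
        return np.max(stack, axis=0)                      # per-pixel max
    if mode == 'add':
        return np.clip(np.sum(stack, axis=0), 0, 255)     # clipping assumed
    if mode == 'screen':
        # screen blend: 255 * (1 - prod(1 - c/255))
        return 255.0 * (1.0 - np.prod(1.0 - stack / 255.0, axis=0))
    raise ValueError(mode)

a = np.full((1, 2, 2, 3), 100.0)
b = np.full((1, 2, 2, 3), 200.0)
stack = np.concatenate([a, b])
```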

---

## Annotations

### Client setup
```python
import annotation_client.annotations as annotations_client
annotationClient = annotations_client.UPennContrastAnnotationClient(apiUrl=apiUrl, token=token)
```

### Data structure
```python
{
'shape': 'polygon', # or 'point', 'line'
'coordinates': [{'x': float, 'y': float}, ...],
'location': {'XY': int, 'Z': int, 'Time': int},
'channel': int,
'datasetId': str,
'tags': ['tag1', 'tag2'],
}
```

### Fetch
```python
polygons = annotationClient.getAnnotationsByDatasetId(datasetId, shape='polygon')

# Filter by tags server-side (must JSON-serialize)
import json
polygons = annotationClient.getAnnotationsByDatasetId(
datasetId, shape='polygon', tags=json.dumps(['my_tag']))

ann = annotationClient.getAnnotationById(annotationId)
```

### Client-side filtering
```python
import annotation_utilities.annotation_tools as annotation_tools

filtered = annotation_tools.get_annotations_with_tags(annotations, tags, exclusive=False)
# exclusive=False: any matching tag; exclusive=True: exact tag set match

filtered = annotation_tools.filter_elements_T_XY_Z(annotations, time, xy, z)
```

### Create
```python
annotationClient.createAnnotation(annotation_dict)
annotationClient.createMultipleAnnotations(annotation_list) # preferred

# Using helpers (handles coordinate swap):
from annotation_utilities.annotation_tools import polygons_to_annotations
annotations = polygons_to_annotations(
shapely_polygons, datasetId, XY=0, Time=0, Z=0, tags=['my_tag'], channel=0)
```

### Delete
```python
annotationClient.deleteAnnotation(annotationId)
annotationClient.deleteMultipleAnnotations([id1, id2, ...])
```

---

## Property Values

### Setup
```python
import annotation_client.workers as workers
workerClient = workers.UPennContrastWorkerClient(datasetId, apiUrl, token, params)
```

### Get annotations for computation
```python
annotationList = workerClient.get_annotation_list_by_shape('polygon', limit=0)
annotationList = annotation_tools.get_annotations_with_tags(
annotationList,
params.get('tags', {}).get('tags', []),
params.get('tags', {}).get('exclusive', False))
```

### Submit values
```python
property_values = {}
for ann in annotationList:
property_values[ann['_id']] = {
'Area': float(area),
'MeanIntensity': float(mean),
}
workerClient.add_multiple_annotation_property_values({datasetId: property_values})
```
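A self-contained sketch of computing such values from a labeled mask with plain NumPy (no Nimbus dependencies; the tiny arrays are made up, and `Area`/`MeanIntensity` are in pixels and raw intensity units):

```python
import numpy as np

image = np.array([[10, 10, 0],
                  [10, 10, 0],
                  [0,  0, 50]], dtype=np.uint16)
labels = np.array([[1, 1, 0],
                   [1, 1, 0],
                   [0, 0, 2]])

values = {}
for lbl in np.unique(labels):
    if lbl == 0:          # 0 is background
        continue
    mask = labels == lbl
    values[lbl] = {
        'Area': float(mask.sum()),
        'MeanIntensity': float(image[mask].mean()),
    }
```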

### Nested properties (per-Z, per-channel)
```python
property_values[ann['_id']] = {
'MeanIntensity': {'z001': 42.0, 'z002': 84.0},
}
```

### Pixel scale
```python
pixel_size = params['scales']['pixelSize'] # {'unit': 'mm', 'value': 0.000219}
z_step = params['scales']['zStep']
t_step = params['scales']['tStep']
```
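Converting pixel measurements to physical units then follows directly (the pixel size matches the example above; the area value is invented):

```python
pixel_size_mm = 0.000219        # params['scales']['pixelSize']['value'], unit 'mm'
area_px = 120.0                 # hypothetical polygon area in pixels

pixel_size_um = pixel_size_mm * 1000.0     # 0.219 um per pixel
area_um2 = area_px * pixel_size_um ** 2    # area scales with the square
```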

---

## Writing Images to Girder

```python
import large_image as li

sink = li.new()
for i, frame in enumerate(tileClient.tiles['frames']):
    # Map frame index keys like 'IndexZ' -> large_image kwargs like 'z'
    large_image_params = {k.lower()[5:]: v for k, v in frame.items()
                          if k.startswith('Index') and len(k) > 5}
    image = tileClient.getRegion(datasetId, frame=i).squeeze()
    processed = your_function(image)
    sink.addTile(processed, 0, 0, **large_image_params)

if 'channels' in tileClient.tiles:
    sink.channelNames = tileClient.tiles['channels']
sink.mm_x = tileClient.tiles['mm_x']
sink.mm_y = tileClient.tiles['mm_y']
sink.magnification = tileClient.tiles['magnification']

sink.write('/tmp/output.tiff')
gc = tileClient.client
item = gc.uploadFileToFolder(datasetId, '/tmp/output.tiff')
gc.addMetadataToItem(item['itemId'], {'tool': 'YourWorker'})
```

---

## Worker Interface Types

| Type | Returns | Example |
|------|---------|---------|
| `number` | `int`/`float` | `32`, `0.5` |
| `text` | `str` | `"1-3, 5-8"` |
| `select` | `str` | `"model_name.pt"` |
| `checkbox` | `bool` | `True` |
| `channel` | `int` | `0` |
| `channelCheckboxes` | `dict[str, bool]` | `{"0": True, "1": False}` |
| `tags` | `list[str]` | `["DAPI blob"]` |
| `layer` | `str` | `"layer_id"` |
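Reading a mixed interface in a worker then looks roughly like this (the field names and defaults are hypothetical):

```python
def parse_interface(params):
    iface = params['workerInterface']
    return {
        'diameter': float(iface.get('Diameter', 30)),       # number
        'model': iface.get('Model', 'default.pt'),          # select
        'channels': sorted(int(k) for k, v                  # channelCheckboxes
                           in iface.get('Channels', {}).items() if v),
        'tags': iface.get('Output Tag', []),                # tags: plain list
    }

cfg = parse_interface({'workerInterface': {
    'Diameter': 17,
    'Channels': {'0': True, '1': False, '2': True},
    'Output Tag': ['YFP blob'],
}})
```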
47 changes: 47 additions & 0 deletions CLAUDE.md
@@ -72,6 +72,53 @@ def compute(datasetId, apiUrl, token, params):

Interface types: `number`, `text`, `select`, `checkbox`, `channel`, `channelCheckboxes`, `tags`, `layer`, `notes`

### Interface Parameter Data Types (What `params['workerInterface']` Returns)

Each interface type returns a specific data type in `params['workerInterface']['FieldName']`:

| Interface Type | Returns | Example Value |
|----------------|---------|---------------|
| `number` | `int` or `float` | `32`, `0.5` |
| `text` | `str` | `"1-3, 5-8"`, `""` |
| `select` | `str` | `"sam2.1_hiera_small.pt"` |
| `checkbox` | `bool` | `True`, `False` |
| `channel` | `int` | `0` |
| `channelCheckboxes` | `dict` of `str` → `bool` | `{"0": True, "1": False, "2": True}` |
| `tags` | **`list` of `str`** | `["DAPI blob"]`, `["cell", "nucleus"]` |
| `layer` | `str` | `"layer_id"` |
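The `text` type is often used for range strings like `"1-3, 5-8"`; `batch_argument_parser` in `annotation_utilities` handles this in the real workers, but the parsing idea is roughly (an illustrative sketch, not the actual implementation):

```python
def parse_ranges(spec):
    """'1-3, 5-8' -> [1, 2, 3, 5, 6, 7, 8]; '' -> []"""
    out = []
    for part in spec.split(','):
        part = part.strip()
        if not part:
            continue
        if '-' in part:
            lo, hi = part.split('-', 1)
            out.extend(range(int(lo), int(hi) + 1))
        else:
            out.append(int(part))
    return out
```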

**Common pitfall with `tags`**: The `tags` type returns a **plain list of strings**, NOT a dict. Do not call `.get('tags')` on the result.

```python
# CORRECT - tags returns a list directly:
training_tags = params['workerInterface'].get('Training Tag', [])
# training_tags = ["DAPI blob"]

# WRONG - will crash with AttributeError: 'list' object has no attribute 'get':
training_tags = params['workerInterface'].get('Training Tag', {}).get('tags', [])
```

**Note**: two different structures share the name `tags`. In annotation workers, the top-level `params['tags']` (the worker's output tags, NOT a workerInterface field) is a plain list of strings, e.g. `["DAPI blob"]`. In property workers, however, `params['tags']` is a dict of the form `{'tags': [...], 'exclusive': bool}`, used when filtering the list returned by `workerClient.get_annotation_list_by_shape()`.

**Validating tags** (recommended pattern from cellpose_train, piscis):
```python
tags = workerInterface.get('My Tag Field', [])
if not tags:
sendError("No tag selected", "Please select at least one tag.")
return
```

**Using tags to filter annotations**:
```python
# Pass the list directly to annotation_tools
filtered = annotation_tools.get_annotations_with_tags(
annotation_list, tags, exclusive=False)

# Or with Girder API (must JSON-serialize)
annotations = annotationClient.getAnnotationsByDatasetId(
datasetId, shape='polygon', tags=json.dumps(tags))
```

### Key APIs

**annotation_client** (installed from NimbusImage repo):
5 changes: 5 additions & 0 deletions build_machine_learning_workers.sh
@@ -39,6 +39,11 @@ docker build . -f ./workers/annotations/sam2_automatic_mask_generator/Dockerfile
# Command for M1:
# docker build . -f ./workers/annotations/sam2_automatic_mask_generator/Dockerfile_M1 -t annotations/sam2_automatic_mask_generator:latest $NO_CACHE

echo "Building SAM2 few-shot segmentation worker"
docker build . -f ./workers/annotations/sam2_fewshot_segmentation/Dockerfile -t annotations/sam2_fewshot_segmentation:latest $NO_CACHE
# Command for M1:
# docker build . -f ./workers/annotations/sam2_fewshot_segmentation/Dockerfile_M1 -t annotations/sam2_fewshot_segmentation:latest $NO_CACHE

echo "Building SAM2 propagate worker"
docker build . -f ./workers/annotations/sam2_propagate/$DOCKERFILE -t annotations/sam2_propagate_worker:latest $NO_CACHE
