Merged
3 changes: 3 additions & 0 deletions .github/workflows/ci.yml
@@ -5,6 +5,9 @@ on:
branches: [main]
pull_request:

permissions:
contents: read

concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.ref_name }}
cancel-in-progress: true
3 changes: 3 additions & 0 deletions .github/workflows/integration-nightly.yml
@@ -9,6 +9,9 @@ on:
description: "Specific model ID to test (blank = all non-gated models that fit in memory)"
required: false

permissions:
contents: read

concurrency:
group: nightly-integration
cancel-in-progress: true
3 changes: 3 additions & 0 deletions .github/workflows/integration-prerelease.yml
@@ -5,6 +5,9 @@ on:
types: [created]
workflow_dispatch:

permissions:
contents: read

concurrency:
group: prerelease-integration
cancel-in-progress: true
59 changes: 43 additions & 16 deletions README.md
@@ -14,8 +14,7 @@ Most local LLM tools serve **one model at a time** and leave you to figure out w

```bash
uv tool install mlx-stack
mlx-stack init --accept-defaults # detects hardware, picks models, generates configs
mlx-stack up # 3 model servers + API gateway, one command
mlx-stack setup # detects hardware, picks models, pulls, starts — one command
# → OpenAI-compatible API at http://localhost:4000/v1
```

@@ -141,6 +140,33 @@ uvx mlx-stack profile

## Quick Start

The fastest way to get running is the interactive setup command:

```bash
mlx-stack setup
```

This walks you through hardware detection, model selection, downloading, and starting all services in one guided flow. For CI or scripting, pass `--accept-defaults` to skip all prompts:

```bash
mlx-stack setup --accept-defaults
```

The OpenAI-compatible API is now available at `http://localhost:4000/v1`.

```bash
# Check service health
mlx-stack status

# Stop everything when done
mlx-stack down
```
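Once the gateway is up, any OpenAI-compatible client can talk to it. A minimal sketch using only the Python standard library; the model ID is a placeholder, substitute whatever `mlx-stack status` reports for your install:

```python
import json
from urllib import request

API_URL = "http://localhost:4000/v1/chat/completions"  # gateway from the quick start

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(model: str, prompt: str) -> str:
    """POST the payload to the local gateway and return the reply text."""
    payload = build_chat_request(model, prompt)
    req = request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # "my-model" is hypothetical; use a model ID served by your stack.
    print(chat("my-model", "Say hello in five words."))
```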

<details>
<summary>Manual step-by-step setup</summary>

If you prefer full control over each step:

```bash
# 1. Detect your hardware
mlx-stack profile
@@ -158,17 +184,20 @@ mlx-stack up
mlx-stack status
```

The OpenAI-compatible API is now available at `http://localhost:4000/v1`.

```bash
# Stop everything when done
mlx-stack down
```
</details>

## CLI Reference

### Setup & Configuration

**`mlx-stack setup`** — Interactive guided setup: detects hardware, selects models, pulls weights, and starts the stack in one command.

| Option | Description |
|--------|-------------|
| `--accept-defaults` | Skip all prompts and use recommended defaults |
| `--intent <balanced\|agent-fleet>` | Use case intent (prompted if not provided) |
| `--budget-pct <10-90>` | Memory budget as percentage of unified memory (default: 40) |
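
For fully scripted installs, the flags above compose into a single non-interactive invocation. A hypothetical Python wrapper, shown for command construction only; the flag values are illustrative:

```python
import subprocess

def scripted_setup(budget_pct: int = 40, intent: str = "balanced") -> list[str]:
    """Build the non-interactive setup command; flags mirror the option table."""
    return [
        "mlx-stack", "setup",
        "--accept-defaults",
        "--intent", intent,
        "--budget-pct", str(budget_pct),
    ]

if __name__ == "__main__":
    # Runs the guided setup with no prompts.
    subprocess.run(scripted_setup(), check=True)
```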

| Command | Description |
|---------|-------------|
| `mlx-stack profile` | Detect Apple Silicon hardware and save profile to `~/.mlx-stack/profile.json` |
@@ -294,7 +323,7 @@ mlx-stack is designed to run unattended on always-on hardware like a Mac Mini.
### Quick setup

```bash
mlx-stack init --accept-defaults
mlx-stack setup --accept-defaults
mlx-stack install
```

@@ -407,14 +436,12 @@ See [DEVELOPING.md](DEVELOPING.md) for the full developer guide, including proje
# Install dev dependencies
uv sync

# Run tests
uv run pytest

# Type checking
uv run python -m pyright
# Run all checks (lint + typecheck + tests) — same as CI
make check

# Linting
uv run ruff check src/ tests/
# Or individually
make lint # ruff + pyright
make test # pytest with coverage
```

## Contributing
16 changes: 12 additions & 4 deletions tests/unit/test_ops_cross_area.py
@@ -410,6 +410,15 @@ def follow_thread() -> None:
output_callback=lambda text: captured.append(text),
)

def wait_for_content(marker: str, timeout: float = 5.0) -> bool:
"""Wait until marker appears in captured output."""
end = time.monotonic() + timeout
while time.monotonic() < end:
if any(marker in c for c in captured):
return True
time.sleep(0.05)
return False

thread = threading.Thread(target=follow_thread, daemon=True)
thread.start()

@@ -423,7 +432,9 @@ def follow_thread() -> None:
with open(log, "a") as f:
f.write("after-truncation\n")

time.sleep(1.0)
assert wait_for_content("after-truncation"), (
f"Timed out waiting for 'after-truncation' in captured output: {captured}"
)

import ctypes

)
thread.join(timeout=3)

combined = "\n".join(captured)
assert "after-truncation" in combined

def test_follow_continues_after_multiple_rotations(
self, mlx_stack_home: Path, logs_dir: Path
) -> None:
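The `wait_for_content` helper in this diff replaces a fixed `time.sleep(1.0)` with deadline-based polling, which keeps the test fast on quick machines and reliable on slow ones. The pattern generalizes; a standalone sketch:

```python
import time
from typing import Callable

def wait_for(predicate: Callable[[], bool],
             timeout: float = 5.0, interval: float = 0.05) -> bool:
    """Poll predicate until it returns True or the deadline passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return False

# Example: wait until a list (normally filled by another thread) holds a marker.
captured: list[str] = []
captured.append("after-truncation")
assert wait_for(lambda: any("after-truncation" in c for c in captured))
```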