2 changes: 2 additions & 0 deletions .github/workflows/ci.yml
@@ -32,6 +32,8 @@ jobs:
steps:
- name: Checkout
uses: actions/checkout@v4
with:
submodules: recursive

- name: Setup Bun
uses: oven-sh/setup-bun@v2
2 changes: 2 additions & 0 deletions .github/workflows/release.yml
@@ -37,6 +37,8 @@ jobs:
steps:
- name: Checkout
uses: actions/checkout@v4
with:
submodules: recursive

- name: Setup Bun
uses: oven-sh/setup-bun@v2
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "ACE-Step-DAW"]
path = ACE-Step-DAW
url = https://github.com/ace-step/ACE-Step-DAW.git
1 change: 1 addition & 0 deletions ACE-Step-DAW
Submodule ACE-Step-DAW added at da9c7e
112 changes: 95 additions & 17 deletions README.md
@@ -4,17 +4,73 @@

→ **Full API reference**: [`docs/API.md`](docs/API.md)

## ACE-Step-DAW (submodule + same-origin UI) for acestep.cpp

![acestep-daw-demo-ezgif com-video-to-gif-converter](https://github.com/user-attachments/assets/1ef3a031-8a84-4bee-8a29-567f620ffa59)

This repo includes **[ACE-Step-DAW](https://github.com/ace-step/ACE-Step-DAW)** as a **git submodule** at `ACE-Step-DAW/`.

Clone with submodules:

```bash
git clone --recurse-submodules <repo-url>
# or after clone:
git submodule update --init --recursive
```

**Demo (DAW UI + API):** set **`ACESTEP_MODELS_DIR`** to a flat folder containing the usual Hugging Face / acestep.cpp **`.gguf`** files. The server **auto-detects** the LM, embedding, VAE, DiT **base**, DiT **turbo**, and turbo+**shift** models by filename (see [Models directory](#models-directory-always-via-env)). You can still override any path with **`ACESTEP_LM_MODEL`**, etc. Then bundle the binaries, build the DAW, and start the server:

```bash
export ACESTEP_MODELS_DIR="$HOME/models/acestep"
bun run bundle:acestep # once per machine: fetch ace-lm / ace-synth
bun run daw:build
bun run start
# Startup logs list scanned roles + effective paths.
```

### How to open the DAW UI

The API and the built DAW share **one HTTP server**. There is no separate “DAW port.”

1. **`bun run daw:build`** must have run successfully so **`ACE-Step-DAW/dist/index.html`** exists (the log line `ACE-Step-DAW static (if built): …` should point at that folder).
2. Start the server (**`bun run start`** or **`bun run src/index.ts`**).
3. In a browser open the **root URL** of that server — by default:

**http://127.0.0.1:8001/**

If you set **`ACESTEP_API_HOST`** / **`ACESTEP_API_PORT`**, use those instead (e.g. `http://127.0.0.1:9000/`).

The server serves the Vite **`dist/`** for ordinary **`GET`** requests (e.g. `/`, `/assets/…`). Deep links to client routes still work because unknown paths fall back to **`index.html`**.

If **`GET /`** returns JSON **`Not Found`**, `dist/` is missing or empty — run **`bun run daw:build`** again or set **`ACESTEP_DAW_DIST`** to a folder that contains a production **`index.html`**.

Static files are served from `ACE-Step-DAW/dist` unless you override **`ACESTEP_DAW_DIST`**.
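The static-serving behavior described above (exact file if present, otherwise `index.html`, otherwise a JSON `Not Found`) can be sketched as a pure decision function. This is a minimal illustration, not the server's actual code: `resolveStaticRequest`, `StaticDecision`, and the in-memory `distFiles` set are hypothetical names standing in for a real lookup under `ACESTEP_DAW_DIST`.

```typescript
// Hypothetical sketch: decide which file under dist/ answers a GET request.
type StaticDecision =
  | { kind: "file"; relativePath: string }
  | { kind: "not_found" };

function resolveStaticRequest(
  pathname: string,
  distFiles: ReadonlySet<string>, // assumption: relative paths that exist in dist/
): StaticDecision {
  const trimmed = pathname.replace(/^\/+/, "");
  const rel = trimmed === "" ? "index.html" : trimmed;

  // Refuse anything that tries to climb out of dist/.
  if (rel.split("/").includes("..")) return { kind: "not_found" };

  // Exact asset hit, e.g. /assets/app.js.
  if (distFiles.has(rel)) return { kind: "file", relativePath: rel };

  // Unknown path: fall back to index.html so client-side routes still load.
  if (distFiles.has("index.html")) return { kind: "file", relativePath: "index.html" };

  // dist/ is missing or empty: the caller would answer with JSON "Not Found".
  return { kind: "not_found" };
}
```

With an empty `distFiles` set every request ends in `not_found`, which corresponds to the case where `bun run daw:build` has not produced an `index.html` yet.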

The DAW’s production client calls **`/api/...`**. This server accepts the **same routes with or without the `/api` prefix** (e.g. `/api/health` and `/health` both work), so you can use the built UI on the **same origin** without Vite’s dev proxy.
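One way to accept the same routes with or without the `/api` prefix is to normalize the path before routing. The helper below is a minimal sketch under that assumption; `stripApiPrefix` is a hypothetical name and the real server may handle this differently.

```typescript
// Hypothetical sketch: map /api/<route> onto /<route> before route matching.
function stripApiPrefix(pathname: string): string {
  if (pathname === "/api") return "/";
  // Only strip a whole "/api" segment, so "/apiary" is left alone.
  if (pathname.startsWith("/api/")) return pathname.slice("/api".length);
  return pathname;
}
```

Routing on the normalized path means `/api/health` and `/health` reach the same handler without registering each route twice.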

Optional: set the backend URL in the DAW to **`http://127.0.0.1:<port>`** (no `/api`) in Settings or via `localStorage['ace-step-daw-backend-url']`; requests then go directly to `/release_task`, `/health`, etc.

| Env | Purpose |
|-----|---------|
| **`ACESTEP_DAW_DIST`** | Absolute path to a Vite `dist/` folder (defaults to `<app-root>/ACE-Step-DAW/dist`) |

**Not supported here:** `task_type: stem_separation` (returns **501** — needs the full Python ACE-Step stack). **`/format_input`** / **`/create_random_sample`** remain stubs for API shape compatibility.

**Building the DAW (no submodule edits):** **`bun run daw:build`** runs **`vite build`** inside **`ACE-Step-DAW/`** only. We intentionally do **not** run the submodule’s **`tsc -b`** step here, so vendored **ACE-Step-DAW** stays a pristine upstream checkout while still producing a usable **`dist/`** for this API server. For the full upstream pipeline (typecheck + Vite), run **`npm run build`** inside the submodule yourself when you need it.

CLI usage matches the upstream [acestep.cpp README](https://github.com/audiohacking/acestep.cpp/blob/master/README.md): **MP3 by default** (128 kbps, overridable), **`--wav`** for stereo 48 kHz WAV, plus optional **`--lora`**, **`--lora-scale`**, **`--vae-chunk`**, **`--vae-overlap`**, **`--mp3-bitrate`**.

## Bundled acestep.cpp (v0.0.3)

`bun run build` downloads the correct asset from **[acestep.cpp releases v0.0.3](https://github.com/audiohacking/acestep.cpp/releases/tag/v0.0.3)** for the **current** OS/arch, installs the **full archive contents** under `acestep-runtime/bin/`, compiles `dist/acestep-api`, then copies `acestep-runtime` next to the executable.
`bun run build` downloads the correct asset from **[acestep.cpp releases v0.0.3](https://github.com/audiohacking/acestep.cpp/releases/tag/v0.0.3)** for the **current** OS/arch, **flattens the full archive** into **`acestep-runtime/bin/`** (every file by basename in one directory — no nested `lib/` tree), compiles `dist/acestep-api`, then copies **`acestep-runtime/`** next to the executable.

The prebuilt archives include executables and all shared libraries needed to run them:
The prebuilt archives include executables and all shared libraries needed to run them.

```text
dist/
acestep-api # or acestep-api.exe
acestep-runtime/
bin/
bin/ # flat: entire prebuild payload
ace-lm # 5Hz LM (text + lyrics → audio codes)
ace-synth # DiT + VAE (audio codes → audio)
ace-server # standalone HTTP server
@@ -24,6 +80,7 @@ dist/
quantize # GGUF requantizer
libggml*.so / *.dylib # GGML shared libraries (Linux / macOS)
*.dll # GGML DLLs (Windows)
(any other files from the release archive)
```

Run the API **from `dist/`** (or anywhere) — the binary resolves siblings via `dirname(execPath)`:
@@ -39,25 +96,36 @@ Override layout with **`ACESTEP_APP_ROOT`** (directory that should contain `aces

## Models directory (always via env)

GGUF paths can be **absolute**, **relative to the app root** (`./models/...`), or **bare filenames** resolved under a models directory:
Set **`ACESTEP_MODELS_DIR`** (or **`ACESTEP_MODEL_PATH`** / **`MODELS_DIR`**) to a directory containing **`.gguf`** files. The API **scans that directory** (non-recursive) and assigns:

| Variable | Purpose |
|----------|---------|
| **`ACESTEP_MODELS_DIR`** | Base directory for default LM / embedding / DiT / VAE **filenames** |
| **`ACESTEP_MODEL_PATH`** | Alias (same as above) |
| **`MODELS_DIR`** | Extra alias |
| Detected role | Typical filename hints |
|---------------|-------------------------|
| **LM (5Hz)** | `*5Hz*lm*` / acestep LM gguf |
| **Embedding** | `*Embedding*` (e.g. Qwen3-Embedding) |
| **VAE** | `*vae*` (excluding embedding) |
| **DiT base** | `*v15-base*` — **required for [lego mode](https://github.com/audiohacking/acestep.cpp/blob/master/examples/lego.sh)** (turbo does not support lego) |
| **DiT turbo** | `*v15-turbo*` without `shift` |
| **DiT turbo + shift** | `*v15-turbo*` with `shift` |

**Overrides (optional):** if set, **`ACESTEP_LM_MODEL`**, **`ACESTEP_EMBEDDING_MODEL`**, **`ACESTEP_DIT_MODEL`**, and **`ACESTEP_VAE_MODEL`** take precedence over the scan. Paths can be **absolute**, **relative to app root**, or **basenames** under the models directory.
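The filename hints in the table can be read as a small classifier. The sketch below is illustrative only: `detectRole`, `scanRoles`, and the `ModelRole` labels are hypothetical names, matching is assumed to be case-insensitive, and the first file found for a role is assumed to win. The actual scanner may apply different rules or tie-breaks.

```typescript
// Hypothetical sketch of filename-based role detection for a flat models directory.
type ModelRole = "lm" | "embedding" | "vae" | "dit_base" | "dit_turbo" | "dit_turbo_shift";

function detectRole(filename: string): ModelRole | null {
  const name = filename.toLowerCase();
  if (!name.endsWith(".gguf")) return null;
  // Embedding is checked first so an embedding file never counts as a VAE.
  if (name.includes("embedding")) return "embedding";
  if (/5hz.*lm/.test(name)) return "lm";
  if (name.includes("v15-base")) return "dit_base";
  if (name.includes("v15-turbo")) {
    return name.includes("shift") ? "dit_turbo_shift" : "dit_turbo";
  }
  if (name.includes("vae")) return "vae";
  return null;
}

// Assumption: the first matching file per role wins; later matches are ignored.
function scanRoles(filenames: string[]): Partial<Record<ModelRole, string>> {
  const found: Partial<Record<ModelRole, string>> = {};
  for (const file of filenames) {
    const role = detectRole(file);
    if (role !== null && found[role] === undefined) found[role] = file;
  }
  return found;
}
```

An explicit `ACESTEP_*_MODEL` value would then replace the corresponding scanned entry.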

Example (paths from [Hugging Face ACE-Step-1.5-GGUF](https://huggingface.co/Serveurperso/ACE-Step-1.5-GGUF)):
**Logical DiT names** (for `model` / DAW picker): `acestep-v15-base`, `acestep-v15-turbo`, `acestep-v15-turbo-shift3`. The scan fills these into **`ACESTEP_MODEL_MAP`** automatically unless you supply your own JSON in that variable.

- **Default logical model:** **`acestep-v15-base`** (lego-safe). Override with **`ACESTEP_DEFAULT_MODEL`**.
- **Default `model` when none selected:** resolves to **base** DiT if present, else turbo.
- **`task_type: lego`:** always uses **base** DiT, matching [examples/lego.sh](https://github.com/audiohacking/acestep.cpp/blob/master/examples/lego.sh) (phase 2). Request JSON defaults for lego follow [examples/lego.json](https://github.com/audiohacking/acestep.cpp/blob/master/examples/lego.json): **inference_steps 50**, **guidance_scale 1.0**, **shift 1.0** when the client omits them.
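The lego defaults listed above only apply to fields the client omits. A minimal sketch of that rule follows; `GenerationRequest` and `withLegoDefaults` are hypothetical names, and only the three fields named in the bullet are covered.

```typescript
// Hypothetical request shape: just the fields this sketch touches.
interface GenerationRequest {
  task_type?: string;
  inference_steps?: number;
  guidance_scale?: number;
  shift?: number;
}

// Fill lego defaults (50 steps, guidance 1.0, shift 1.0) only where the client sent nothing.
function withLegoDefaults(req: GenerationRequest): GenerationRequest {
  if (req.task_type !== "lego") return req;
  return {
    ...req,
    inference_steps: req.inference_steps ?? 50,
    guidance_scale: req.guidance_scale ?? 1.0,
    shift: req.shift ?? 1.0,
  };
}
```

Using `??` instead of `||` keeps an explicit `0` from the client rather than silently replacing it with the default.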

Explicit example (same files as [Hugging Face ACE-Step-1.5-GGUF](https://huggingface.co/Serveurperso/ACE-Step-1.5-GGUF)) — optional if autodetect already finds them:

```bash
export ACESTEP_MODELS_DIR="$HOME/models/acestep"
export ACESTEP_LM_MODEL=acestep-5Hz-lm-4B-Q8_0.gguf
export ACESTEP_EMBEDDING_MODEL=Qwen3-Embedding-0.6B-Q8_0.gguf
export ACESTEP_DIT_MODEL=acestep-v15-turbo-Q8_0.gguf
export ACESTEP_DIT_MODEL=acestep-v15-base-Q8_0.gguf # optional; scan prefers base as default DiT
export ACESTEP_VAE_MODEL=vae-BF16.gguf
```

Per-request `lm_model_path` and **`ACESTEP_MODEL_MAP`** values use the same resolution rules.
Per-request **`lm_model_path`** and **`ACESTEP_MODEL_MAP`** still use the same path resolution rules.

## Multi-model support (GET /v1/models + per-request `model`)

@@ -102,14 +170,12 @@ Generation parameters (`inference_steps`, `guidance_scale`, `bpm`, etc.) are **a
```bash
bun install
bun run bundle:acestep # once: fetch v0.0.3 binaries for this machine
export ACESTEP_MODELS_DIR=...
export ACESTEP_LM_MODEL=...
export ACESTEP_EMBEDDING_MODEL=...
export ACESTEP_DIT_MODEL=...
export ACESTEP_VAE_MODEL=...
export ACESTEP_MODELS_DIR="$HOME/models/acestep" # drop-in GGUFs; roles autodetected
bun run start
```

Add **`ACESTEP_*_MODEL`** overrides only if a file is not detected. For lego, ensure a **`*v15-base*.gguf`** is in that folder (or map it — see [Models directory](#models-directory-always-via-env)).

## Build

```bash
@@ -128,6 +194,16 @@ bun run build:binary-only # compile only (reuse existing acestep-runtime/)

API `audio_format: "wav"` adds **`--wav`** (no `--mp3-bitrate`).

## Generation / subprocess logs

While a task runs, **`ace-lm`** and **`ace-synth`** **stdout/stderr** are forwarded to the **same terminal** as the API server (each line is interleaved with Bun logs). The server also logs one line per task with parsed flags: `thinking`, `use_format`, `sample_mode`, `needLm`, `lmConfigured`.

| Variable | Purpose |
|----------|---------|
| **`ACESTEP_QUIET_SUBPROCESS`** | Set to **`1`** to stop inheriting child output (logs are captured only on failure; use for CI or noisy runs). |

**DAW + multipart note:** form fields like `thinking=false` arrive as the string **`"false"`**. The API parses those explicitly so **`"false"` does not enable** the LM path (unlike `Boolean("false")` in JavaScript).
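The pitfall is that any non-empty string is truthy in JavaScript, so `Boolean("false")` is `true`. A minimal sketch of explicit parsing is shown below; `parseBooleanField` is a hypothetical name, and the accepted spellings (`yes`/`no`, `on`/`off`, `1`/`0`) are assumptions, not a list taken from the API.

```typescript
// Hypothetical sketch: interpret multipart/JSON values as booleans explicitly.
function parseBooleanField(value: unknown, fallback = false): boolean {
  if (typeof value === "boolean") return value;
  if (typeof value === "number") return value !== 0;
  if (typeof value === "string") {
    const v = value.trim().toLowerCase();
    if (["true", "1", "yes", "on"].includes(v)) return true;
    if (["false", "0", "no", "off", ""].includes(v)) return false;
  }
  // Unknown shapes (undefined, objects, unrecognized strings) use the fallback.
  return fallback;
}
```

With this helper, a form field `thinking=false` yields `false`, whereas a plain truthiness check on the same string would enable the LM path.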

## Reference / source audio (cover, repaint, lego)

Modes that need a reference or source track (**cover**, **repaint**, **lego**) require one of:
@@ -144,6 +220,8 @@ If **`task_type`** is `cover`, `repaint`, or `lego` and neither a path nor an up

Worker uses **`src_audio_path`** when set, otherwise **`reference_audio_path`**; a single `--src-audio` is passed to ace-synth. Request JSON already supports **`audio_cover_strength`**, **`repainting_start`** / **`repainting_end`**, and **`lego`** (track name) per [acestep.cpp README](https://github.com/audiohacking/acestep.cpp/blob/master/README.md).

**Repaint bounds:** (1) If both are set and **`repainting_end` ≤ `repainting_start`**, they are cleared to **`-1`** before enqueue. (2) When **`--src-audio`** is a **WAV**, the worker measures its duration and reclamps repaint to **seconds on that file**; values larger than the file length are treated as **beats** using **`bpm`**, then clamped. If the window still collapses (**`end` ≤ `start`**), both are set to **`-1`** so ace-synth does not error on short context clips.
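The two repaint rules can be sketched as follows. This is an illustration under stated assumptions rather than the worker's code: `RepaintWindow`, `clearIfInverted`, `toSecondsOnFile`, and `clampRepaintToFile` are hypothetical names, `-1` is taken to mean "unset", the beats interpretation is applied per value (`seconds = beats * 60 / bpm`), and clamping is to the range `[0, duration]`.

```typescript
// Hypothetical sketch of repaint window normalization.
interface RepaintWindow {
  start: number; // seconds, or -1 when unset
  end: number;   // seconds, or -1 when unset
}

const UNSET = -1;

// Rule 1: if both bounds are set and end <= start, clear both.
function clearIfInverted(w: RepaintWindow): RepaintWindow {
  const bothSet = w.start >= 0 && w.end >= 0;
  return bothSet && w.end <= w.start ? { start: UNSET, end: UNSET } : w;
}

// Rule 2 (per value): a value beyond the file length is read as beats, then clamped.
function toSecondsOnFile(value: number, durationSec: number, bpm?: number): number {
  if (value < 0) return UNSET;
  let seconds = value;
  if (value > durationSec && bpm !== undefined && bpm > 0) {
    seconds = (value * 60) / bpm;
  }
  return Math.min(Math.max(seconds, 0), durationSec);
}

function clampRepaintToFile(w: RepaintWindow, durationSec: number, bpm?: number): RepaintWindow {
  const checked = clearIfInverted(w);
  const clamped: RepaintWindow = {
    start: toSecondsOnFile(checked.start, durationSec, bpm),
    end: toSecondsOnFile(checked.end, durationSec, bpm),
  };
  // If clamping collapsed the window, clear it so no empty range is passed on.
  return clearIfInverted(clamped);
}
```

For example, on a 20-second clip at 120 bpm, a window of `start: 8, end: 64` keeps `start` at 8 seconds, reads 64 as beats (32 seconds), and clamps `end` to 20.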

## API emulation notes

See [`docs/API.md`](docs/API.md) for the full endpoint reference. **`/format_input`** and **`/create_random_sample`** are shape-compatible stubs (no separate LM HTTP service required).
4 changes: 3 additions & 1 deletion package.json
@@ -12,7 +12,9 @@
"build:windows": "bun run bundle:acestep && bun build ./src/index.ts --compile --minify --sourcemap=external --outfile ./dist/acestep-api.exe && bun run sync:runtime",
"build:binary-only": "bun build ./src/index.ts --compile --minify --sourcemap=external --outfile ./dist/acestep-api && bun run sync:runtime",
"build:all": "bun run build && bun run build:windows",
"test": "bun test"
"test": "bun test ./test",
"daw:install": "(cd ACE-Step-DAW && npm install)",
"daw:build": "(cd ACE-Step-DAW && npm install && npx vite build)"
},
"devDependencies": {
"@types/bun": "latest"
91 changes: 64 additions & 27 deletions scripts/bundle-acestep.ts
@@ -1,15 +1,16 @@
#!/usr/bin/env bun
/**
* Downloads acestep.cpp v0.0.3 release binaries for the current OS/arch and
* installs the full archive contents under <repo>/acestep-runtime/bin
* (ace-lm, ace-synth, ace-server, ace-understand, neural-codec, mp3-codec,
* quantize, and all shared libraries).
* installs the **full archive** into a **single flat directory**:
* `<repo>/acestep-runtime/bin/` (every file by basename — ace-lm, ace-synth,
* ace-server, ace-understand, neural-codec, mp3-codec, quantize, and all
* shared libraries; no nested lib/ tree).
*
* @see https://github.com/audiohacking/acestep.cpp/releases/tag/v0.0.3
* @see https://github.com/audiohacking/acestep.cpp/blob/master/README.md
*/
import { mkdir, readdir, copyFile, chmod, rm } from "fs/promises";
import { join, dirname } from "path";
import { mkdir, readdir, chmod, rm, copyFile, stat } from "fs/promises";
import { join, dirname, basename } from "path";
import { existsSync } from "fs";

const TAG = "v0.0.3";
@@ -63,6 +64,41 @@ async function extractArchive(archivePath: string, destDir: string): Promise<voi
}
}

/** Release archives usually contain a single top-level directory. */
async function resolvePackageRoot(extractRoot: string): Promise<string> {
const entries = await readdir(extractRoot, { withFileTypes: true });
const dirs = entries.filter((e) => e.isDirectory());
const files = entries.filter((e) => e.isFile());
if (dirs.length === 1 && files.length === 0) {
return join(extractRoot, dirs[0]!.name);
}
return extractRoot;
}

/**
* Copy every file from the extracted tree into `outBin` using **basename only**
* (flat layout so loaders find libs next to ace-lm / ace-synth).
*/
async function flattenIntoBin(packageRoot: string, outBinDir: string): Promise<number> {
await mkdir(outBinDir, { recursive: true });
const sources = await walkFiles(packageRoot);
const seen = new Map<string, string>();

for (const src of sources) {
const name = basename(src);
const prev = seen.get(name);
if (prev && prev !== src) {
throw new Error(
`[bundle-acestep] Duplicate basename "${name}" in archive:\n ${prev}\n ${src}\n` +
"Cannot flatten to a single directory; report this layout."
);
}
seen.set(name, src);
await copyFile(src, join(outBinDir, name));
}
return sources.length;
}

async function main() {
const asset = pickAsset();
if (!asset) return;
@@ -88,38 +124,39 @@ async function main() {
console.log(`[bundle-acestep] Extracting to ${extractRoot}`);
await extractArchive(archivePath, extractRoot);

const all = await walkFiles(extractRoot);
const packageRoot = await resolvePackageRoot(extractRoot);
const all = await walkFiles(packageRoot);
const wantLm = process.platform === "win32" ? "ace-lm.exe" : "ace-lm";
const wantSynth = process.platform === "win32" ? "ace-synth.exe" : "ace-synth";
const lm = all.find((p) => (p.split(/[/\\]/).pop() ?? "") === wantLm);
const synth = all.find((p) => (p.split(/[/\\]/).pop() ?? "") === wantSynth);
const lm = all.find((p) => basename(p) === wantLm);
const synth = all.find((p) => basename(p) === wantSynth);
if (!lm || !synth) {
throw new Error(`Could not find ${wantLm} / ${wantSynth} under ${extractRoot}`);
throw new Error(`Could not find ${wantLm} / ${wantSynth} under ${packageRoot}`);
}

await rm(outBin, { recursive: true, force: true });
await mkdir(outBin, { recursive: true });

// Copy every file from the archive root so that shared libraries
// (libggml*.so / *.dylib / *.dll) and helper binaries are all present.
const installed: string[] = [];
for (const srcPath of all) {
const name = srcPath.split(/[/\\]/).pop() ?? "";
const destPath = join(outBin, name);
await copyFile(srcPath, destPath);
installed.push(destPath);
}
const runtimeDir = join(root, "acestep-runtime");
await rm(runtimeDir, { recursive: true, force: true });
console.log(`[bundle-acestep] Flattening ${packageRoot} → ${outBin}`);
const n = await flattenIntoBin(packageRoot, outBin);

if (process.platform !== "win32") {
// Make all regular files (not static libs) executable so every binary works.
for (const destPath of installed) {
if (!destPath.endsWith(".a")) {
await chmod(destPath, 0o755);
}
for (const name of await readdir(outBin)) {
if (name.endsWith(".a")) continue;
const p = join(outBin, name);
const st = await stat(p).catch(() => null);
if (st?.isFile()) await chmod(p, 0o755);
}
}

console.log(`[bundle-acestep] Installed ${installed.length} file(s) to ${outBin}:\n ${installed.map((p) => p.split(/[/\\]/).pop()).join("\n ")}`);
if (!existsSync(join(outBin, wantLm)) || !existsSync(join(outBin, wantSynth))) {
throw new Error(`After install, missing ${wantLm} or ${wantSynth} under ${outBin}`);
}

console.log(
`[bundle-acestep] Installed ${n} files into ${outBin}\n` +
` ${join(outBin, wantLm)}\n` +
` ${join(outBin, wantSynth)}`
);
}

main().catch((e) => {
40 changes: 40 additions & 0 deletions src/audioDuration.ts
@@ -0,0 +1,40 @@
import { readFile } from "fs/promises";

/** Parse a PCM WAV file and return duration in seconds, or null if not a readable WAV. */
export async function readWavDurationSeconds(filePath: string): Promise<number | null> {
try {
const buf = await readFile(filePath);
if (buf.length < 44) return null;
if (buf.toString("ascii", 0, 4) !== "RIFF" || buf.toString("ascii", 8, 12) !== "WAVE") return null;

let off = 12;
let sampleRate = 44100;
let dataSize = 0;
let bitsPerSample = 16;
let numChannels = 1;

while (off + 8 <= buf.length) {
const id = buf.toString("ascii", off, off + 4);
const size = buf.readUInt32LE(off + 4);
const chunkStart = off + 8;
if (chunkStart + size > buf.length) break;

if (id === "fmt ") {
numChannels = buf.readUInt16LE(chunkStart + 2);
sampleRate = buf.readUInt32LE(chunkStart + 4);
bitsPerSample = buf.readUInt16LE(chunkStart + 14);
} else if (id === "data") {
dataSize = size;
break;
}
off = chunkStart + size + (size % 2);
}

const bytesPerFrame = numChannels * (bitsPerSample / 8);
if (bytesPerFrame <= 0 || sampleRate <= 0 || dataSize <= 0) return null;
const numSamples = dataSize / bytesPerFrame;
return numSamples / sampleRate;
} catch {
return null;
}
}