Deployment Automation, UI, Audio, and Security Improvements for MiniCPM-o 4.5#1076

Open
LujiaJin wants to merge 13 commits into OpenBMB:main from LujiaJin:main

Conversation

@LujiaJin

Summary

This PR introduces improvements across deployment automation, cross-device access, audio quality, storage performance, and security hygiene for MiniCPM-o 4.5.


Changes

🚀 Remote Deployment Workflow (deploy/)

Added English and Chinese deployment guides (DEPLOY_WSL2_TO_H100_EN.md, DEPLOY_WSL2_TO_H100_ZH.md) covering the full workflow:

  • Build Docker images locally on Windows + WSL2
  • Download and package model weights for offline environments
  • Upload images and models to an air-gapped H100 server via SCP
  • Start services with Docker Compose or standalone docker run
  • Configure SSH port forwarding and Cloudflare tunnels for PC and mobile access
  • Generate and install self-signed SSL certificates for mobile HTTPS

Users can now deploy MiniCPM-o 4.5 in offline or enterprise environments reproducibly, with both PC and mobile browser access supported.


🖼️ UI: Icon Version Patch

Replaced the remaining v2.6 icons with v4.5 assets across all frontend pages, adding the new asset web_demos/minicpm-o_2.6/miniCPM4.5.svg and updating icon references to point to it.


🔊 Audio & 💾 Storage Optimization

Refactored the audio pipeline's PCM_16 encoding path and improved buffer management and I/O handling in the backend storage layer, primarily in web_demos/minicpm-o_2.6/model_server.py. This reduces artifacts and latency in real-time AI voice calls and lowers stutter during high-concurrency or long-duration sessions.
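As a hedged illustration of the in-memory PCM_16 WAV encoding this section describes (the function name and float-sample convention are assumptions, not the PR's actual code), the stdlib `wave` module can produce the bytes without touching disk:

```python
import io
import wave

import numpy as np  # assumed available in the demo environment


def encode_pcm16_wav(samples: np.ndarray, sample_rate: int = 16000) -> bytes:
    """Encode mono float samples in [-1, 1] as an in-memory 16-bit PCM WAV."""
    # Clip and convert float audio to little-endian int16 PCM.
    pcm = (np.clip(samples, -1.0, 1.0) * 32767).astype("<i2")
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)        # mono
        w.setsampwidth(2)        # 2 bytes per sample = PCM_16
        w.setframerate(sample_rate)
        w.writeframes(pcm.tobytes())
    return buf.getvalue()
```

Encoding into a `BytesIO` buffer avoids temporary files entirely, which is what makes this pattern attractive for low-latency streaming.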


🛠️ Misc

  • Improved inline comments and documentation in scripts and config files
  • Standardized variable naming and path conventions
  • Fixed minor bugs and inconsistencies found during deployment testing

Affected Files

deploy/
├── DEPLOY_WSL2_TO_H100_EN.md
├── DEPLOY_WSL2_TO_H100_ZH.md
├── Dockerfile.backend
├── Dockerfile.frontend
├── docker-compose.yml
├── nginx.docker.conf
├── gen_ssl_cert.sh
└── requirements.backend.txt
web_demos/minicpm-o_2.6/
├── miniCPM4.5.svg
└── model_server.py        # icon patch, audio codec, storage optimization

- Dockerfile.backend: GPU inference container (CUDA 12.8.1)
- Dockerfile.frontend: Vue.js + Nginx multi-stage build
- docker-compose.yml: orchestration with GPU passthrough
- nginx.docker.conf: reverse proxy with SSL support
- gen_ssl_cert.sh: self-signed certificate generation
- DEPLOY_WSL2_TO_H100_ZH.md: comprehensive deployment guide
- Update .gitignore to exclude models/ and build artifacts
Copilot AI review requested due to automatic review settings March 2, 2026 11:00

Copilot AI left a comment


Pull request overview

This PR updates the MiniCPM-o web demo and adds an offline/air-gapped deployment bundle aimed at reproducible remote deployment (including mobile HTTPS access), while also upgrading the demo backend to MiniCPM-o 4.5 streaming TTS semantics.

Changes:

  • Add Docker-based deployment assets under deploy/ (backend/frontend Dockerfiles, compose file, Nginx reverse proxy template, SSL cert script, and EN/ZH deployment guides).
  • Update the demo backend (model_server.py) for MiniCPM-o 4.5 APIs and improve the streaming audio path (in-memory WAV encoding, PCM_16).
  • Add/replace UI assets (new miniCPM4.5.svg) and ignore models/ in git.

Reviewed changes

Copilot reviewed 10 out of 13 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| web_demos/minicpm-o_2.6/model_server.py | Updates streaming prompt/prefill/generation logic for MiniCPM-o 4.5 and adjusts the audio streaming pipeline. |
| web_demos/minicpm-o_2.6/miniCPM4.5.svg | Adds the 4.5 SVG asset for the frontend UI. |
| web_demos/minicpm-o_2.6/miniCPM2.6-CxDaeLI9.svg.bak | Adds a backup SVG artifact (appears unintended). |
| deploy/requirements.backend.txt | Documents backend Python dependencies for offline install workflows. |
| deploy/nginx.docker.conf | Adds an Nginx reverse proxy config intended for SSE/WebSocket + HTTPS mobile access. |
| deploy/gen_ssl_cert.sh | Adds a helper script to generate self-signed certs for HTTPS. |
| deploy/docker-compose.yml | Adds compose-based orchestration for the backend + frontend containers. |
| deploy/Dockerfile.frontend | Adds a multi-stage frontend build (pnpm + nginx). |
| deploy/Dockerfile.backend | Adds a CUDA-based backend image build for the inference service. |
| deploy/DEPLOY_WSL2_TO_H100_ZH.md | Adds the Chinese offline deployment guide for the WSL2 → H100 workflow. |
| deploy/DEPLOY_WSL2_TO_H100_EN.md | Adds the English offline deployment guide for the WSL2 → H100 workflow. |
| README.md | Adds a documentation link tip near the header. |
| .gitignore | Ignores the models/ directory. |
Comments suppressed due to low confidence (10)

web_demos/minicpm-o_2.6/model_server.py:576

  • librosa.resample is called for every generated audio chunk. This is relatively expensive and can add noticeable latency/CPU load during streaming. Consider either emitting 24kHz to the frontend (if supported) or switching to a faster resampler (e.g., torchaudio.functional.resample / scipy.signal.resample_poly) and reusing any state if possible.
                            # Resample from model's 24kHz to frontend's expected 16kHz
                            audio_np = librosa.resample(audio_np, orig_sr=sr, target_sr=16000)
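As a hedged sketch of the "faster resampler" alternative the comment suggests (the function name and the assumption of mono float chunks are mine, not the PR's), a numpy-only linear interpolator trades some quality for far less per-chunk CPU:

```python
import numpy as np


def resample_linear(audio: np.ndarray, orig_sr: int, target_sr: int) -> np.ndarray:
    """Cheap linear-interpolation resampler. Lower fidelity than librosa's
    filtered resampling, but much less CPU per streaming chunk."""
    if orig_sr == target_sr:
        return audio
    n_out = int(round(len(audio) * target_sr / orig_sr))
    # Map the output sample positions onto the input sample axis.
    x_out = np.linspace(0.0, len(audio) - 1, num=n_out)
    x_in = np.arange(len(audio))
    return np.interp(x_out, x_in, audio).astype(audio.dtype)
```

For 24 kHz → 16 kHz speech this is usually acceptable; `scipy.signal.resample_poly` (suggested in the comment above) is the better-quality middle ground when SciPy is available.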

deploy/gen_ssl_cert.sh:17

  • This script uses placeholders (<YOUR_CN>, <YOUR_IP1>, etc.) inside the openssl command. If users run it without editing the script, certificate generation will fail (SAN IP: fields must be real IPs). Consider accepting CN/SAN as args/env vars and providing safe defaults (e.g., CN=localhost, SAN=DNS:localhost,IP:127.0.0.1).
OUT_DIR="${1:-<YOUR_CERTS_OUTPUT_DIR>}"
mkdir -p "$OUT_DIR"

echo ">>> Generating self-signed SSL certificate to $OUT_DIR ..."
openssl req -x509 -nodes -days 3650 \
    -newkey rsa:2048 \
    -keyout "$OUT_DIR/server.key" \
    -out "$OUT_DIR/server.crt" \
    -subj "/C=CN/ST=Local/L=Local/O=MiniCPMo/OU=Dev/CN=<YOUR_CN>" \
    -addext "subjectAltName=IP:<YOUR_IP1>,IP:<YOUR_IP2>,DNS:<YOUR_DNS>"

deploy/docker-compose.yml:27

  • ${MODEL_PATH:-<YOUR_MODEL_PATH>} uses an angle-bracket placeholder as the default. If MODEL_PATH is not set, Docker will try to mount a host path literally named <YOUR_MODEL_PATH>, which is confusing and likely wrong. Prefer a real default path (e.g., ./models/...) or make the variable required.
      - ${MODEL_PATH:-<YOUR_MODEL_PATH>}:/models/MiniCPM-o-4_5:ro

deploy/docker-compose.yml:53

  • ${CERTS_PATH:-<YOUR_CERTS_PATH>} uses an angle-bracket placeholder as the default. If CERTS_PATH is not set, Docker will mount a host path literally named <YOUR_CERTS_PATH>, which can lead to hard-to-debug TLS failures. Prefer a real default (e.g., ./certs) or require the env var.
      - ${CERTS_PATH:-<YOUR_CERTS_PATH>}:/etc/nginx/certs:ro
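One way to address both placeholder-default comments is Compose's required-variable interpolation, `${VAR:?message}`, which aborts with a clear error instead of mounting a literal `<YOUR_...>` path. A minimal sketch (mount targets taken from the comments above; the error messages are illustrative):

```yaml
services:
  backend:
    volumes:
      # Fail fast if MODEL_PATH is unset rather than mounting "<YOUR_MODEL_PATH>".
      - ${MODEL_PATH:?MODEL_PATH must point to the MiniCPM-o-4_5 weights}:/models/MiniCPM-o-4_5:ro
  frontend:
    volumes:
      # Same pattern for the TLS certificates.
      - ${CERTS_PATH:?CERTS_PATH must point to the server.crt/server.key directory}:/etc/nginx/certs:ro
```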

deploy/DEPLOY_WSL2_TO_H100_EN.md:12

  • This guide claims CUDA 12.4 matches the Dockerfile base image cuda:12.4.1, but deploy/Dockerfile.backend currently uses nvidia/cuda:12.8.1-.... Update the guide (or the Dockerfile) so the stated CUDA/base image version is accurate.
| GPU | NVIDIA H100 (driver 550.90.12) |
| CUDA | 12.4 (fully matches the Dockerfile base image `cuda:12.4.1`) |
| Local | Win10 + WSL2 Ubuntu |

deploy/DEPLOY_WSL2_TO_H100_ZH.md:12

  • This guide states CUDA 12.4 matches the Dockerfile base image cuda:12.4.1, but deploy/Dockerfile.backend currently uses nvidia/cuda:12.8.1-.... Please update the guide (or the Dockerfile) to keep the deployment instructions accurate.
| GPU | NVIDIA H100 (driver 550.90.12) |
| CUDA | 12.4 (fully matches the Dockerfile base image `cuda:12.4.1`) |
| Local | Win10 + WSL2 Ubuntu |

deploy/Dockerfile.backend:5

  • The deployment docs mention a CUDA 12.4 base image (cuda:12.4.1), but this Dockerfile uses nvidia/cuda:12.8.1-.... Please align the Dockerfile and the docs on the intended CUDA base image version.
# Base image: NVIDIA CUDA 12.8 + Ubuntu 22.04
# ============================================
FROM nvidia/cuda:12.8.1-devel-ubuntu22.04

deploy/Dockerfile.backend:33

  • This image is based on CUDA 12.8, but PyTorch is installed from the cu124 index URL. While it may work, the mixed CUDA targeting is confusing and can lead to unexpected library/runtime issues; consider using a base image matching the wheel CUDA version (or document why the mismatch is intentional).
# ---- PyTorch (CUDA 12.4) ----
RUN pip install --no-cache-dir \
    "torch>=2.3.0,<=2.8.0" \
    "torchaudio<=2.8.0" \
    --index-url https://download.pytorch.org/whl/cu124

deploy/Dockerfile.backend:52

  • The backend Dockerfile installs several third-party Python packages from public registries using floating versions (e.g., accelerate, librosa, soundfile, onnxruntime, fastapi, uvicorn, aiofiles, pydantic) and a version range for torch, so each build may pull different artifacts and execute them inside the GPU-enabled backend. This creates a reproducible supply-chain attack surface: if any of these packages or the registry is compromised in the future, rebuilding the image could silently introduce malicious code into internal deployments. Please pin all third-party packages here to exact versions (and ideally hashes or an internal mirror) so backend images are deterministic and auditable, and updates happen only via explicit review.
RUN pip install --no-cache-dir \
    "torch>=2.3.0,<=2.8.0" \
    "torchaudio<=2.8.0" \
    --index-url https://download.pytorch.org/whl/cu124

# ---- MiniCPM-o core dependencies ----
RUN pip install --no-cache-dir \
    "transformers==4.51.0" \
    accelerate \
    "minicpmo-utils[all]>=1.0.5" \
    librosa \
    soundfile \
    onnxruntime \
    sentencepiece \
    Pillow \
    numpy

# ---- Web service dependencies ----
RUN pip install --no-cache-dir \
    fastapi \
    uvicorn \
    aiofiles \
    pydantic

deploy/requirements.backend.txt:29

  • The backend requirements file is intended for offline pip installs but leaves many critical dependencies (e.g., accelerate, librosa, soundfile, onnxruntime, fastapi, uvicorn[standard], aiofiles, pydantic, httpx) unpinned, so future installs may pull newer, unvetted versions from public registries. This lack of version pinning undermines reproducibility and exposes deployments to supply-chain risk: a compromised or malicious new release of any of these packages could be introduced into internal environments without any code change. Please pin these packages to specific versions (and ideally hashes that match the tested container image) so offline installs are deterministic and dependency updates go through explicit review.
accelerate
minicpmo-utils[all]>=1.0.5
sentencepiece

# == Audio/Video Processing ==
librosa
soundfile
onnxruntime
Pillow
numpy

# == Web Service ==
fastapi
uvicorn[standard]
aiofiles
pydantic
httpx
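The pinning the comment asks for is typically generated rather than hand-written; pip-tools' `pip-compile --generate-hashes` emits exact versions plus hashes from a loose input file. A sketch of the resulting format (the version number and hash are placeholders, not a tested set):

```text
# Generated with: pip-compile --generate-hashes requirements.backend.in
fastapi==0.115.0 \
    --hash=sha256:<hash-emitted-by-pip-compile>
# ...one pinned, hashed entry per dependency, including transitive ones...
```

`pip install --require-hashes -r requirements.backend.txt` then refuses any artifact that does not match, which is what makes offline installs deterministic and auditable.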


Comment on lines 524 to 531
audio_stream = None
try:
    with open(input_audio_path, 'rb') as wav_file:
        audio_stream = wav_file.read()
except FileNotFoundError:
    print(f"File {input_audio_path} not found.")
    logger.warning(f"File {input_audio_path} not found.")
yield base64.b64encode(audio_stream).decode('utf-8'), "assistant:\n"


Copilot AI Mar 2, 2026


audio_stream is initialized to None and then base64-encoded even when the input WAV file can't be read. If open(input_audio_path) fails, base64.b64encode(audio_stream) will raise a TypeError and abort the stream. Handle the missing-file case explicitly (e.g., return an error event / skip the initial yield / set audio_stream to b'' and log).
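A hedged sketch of the fix the comment describes, extracted into a helper for clarity (the helper name is mine; the original code inlines this logic): fall back to `b''` on a missing file so `b64encode` never receives `None`.

```python
import base64
import logging

logger = logging.getLogger(__name__)


def read_audio_b64(input_audio_path: str) -> str:
    """Return the file's contents base64-encoded, or an empty string with a
    warning when the file is missing, so the stream is never aborted by
    base64.b64encode(None) raising TypeError."""
    try:
        with open(input_audio_path, "rb") as wav_file:
            audio_stream = wav_file.read()
    except FileNotFoundError:
        logger.warning("File %s not found; sending empty audio.", input_audio_path)
        audio_stream = b""  # explicit empty payload instead of None
    return base64.b64encode(audio_stream).decode("utf-8")
```

Returning an explicit error event to the client would also work; the key point is that the missing-file branch must leave `audio_stream` as bytes.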

Comment on lines +24 to +58
```powershell
$env:SSH_HOST = "127.0.0.1"
$env:SSH_HOST = "<YOUR_HOST>"
$env:SSH_PORT = "<YOUR_PORT>"
$env:SSH_USER = "<YOUR_USER>"

## PowerShell Daily Three-Command Quick Reference (Recommended)

```powershell
Set-MiniCPMSSH -Port "<YOUR_PORT>" -User "<YOUR_USER>"
# 1) Update SSH parameters when port changes
Set-MiniCPMSSH -Port "54062" -User "your_user"

# 2) Start mobile mode (open tunnel + print accessible URL)
Set-MiniCPMSSH -Port "<YOUR_PORT>" -User "<YOUR_USER>"
Start-MiniCPMMobile

# 3) Stop tunnel
Stop-MiniCPMMobile
scp -P $env:SSH_PORT .\file.tar.gz "$env:SSH_USER@$env:SSH_HOST:<YOUR_PATH>/deploy_pkg/"

Quick recovery after port change:

[string]$Host = "<YOUR_HOST>",
[string]$User = "<YOUR_USER>"
Restart-MiniCPMMobile
```

$env:SSH_HOST = $Host
$env:SSH_PORT = $Port
$env:SSH_USER = $User
ssh -p $env:SSH_PORT "$env:SSH_USER@$env:SSH_HOST"
scp -P $env:SSH_PORT .\file.tar.gz "$env:SSH_USER@$env:SSH_HOST:/data/minicpmo/deploy_pkg/"
```
Write-Host "[MiniCPM SSH] HOST=$env:SSH_HOST PORT=$env:SSH_PORT USER=$env:SSH_USER"

Copilot AI Mar 2, 2026


The top section has broken/overlapping fenced code blocks and interleaves shell commands into PowerShell blocks (e.g., a powershell block opened at line 24 is never closed before another powershell starts at line 32). This makes the guide hard to follow and renders incorrectly in Markdown viewers; please fix the fencing and separate PowerShell vs Bash snippets cleanly.

Author

@copilot open a new pull request to apply changes based on this feedback

@LujiaJin
Author

LujiaJin commented Mar 3, 2026

@copilot open a new pull request to apply changes based on the comments in this thread

LujiaJin and others added 3 commits March 3, 2026 10:42
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>