Shipping MiniCPM-V on-device in a consumer iOS app — launching on Product Hunt Tuesday #1107
Hi @bb-coder, thanks for sharing this. We really appreciate you taking the time to post here. Running MiniCPM-V entirely on-device with Metal acceleration and zero network calls is exactly the kind of use case we love to see. The fact that you got it working end-to-end as a consumer iOS app with streaming and grammar constraints is genuinely impressive.

Best of luck with the Product Hunt launch this Tuesday. We'd be happy to help amplify it through our community channels as well, and it would be great to sync up on that. What's the best way to reach you? Or feel free to drop me an email at caitianchi@modelbest.cn.

Also, if you run into any technical issues with the model or llama.cpp integration down the road, feel free to open an issue or reach out here. A "Powered by MiniCPM-V" mention somewhere in the app would be most welcome. No pressure, but we'd appreciate it.
Hi OpenBMB team,
I'm an indie iOS developer and I just shipped Bloomind — a thought-structuring app that runs MiniCPM-V (1.3B, Q4_K_M quantized) entirely on-device via llama.cpp with Metal GPU acceleration.
Key technical details:
- Model: MiniCPM-V 4.6 text core (Qwen3.5-0.8B), ~500 MB on disk
- Inference: llama.cpp + Apple Metal, streaming output with GBNF grammar constraints
- Privacy: Zero network calls. All data stays on iPhone. No cloud, no telemetry.
The app helps users turn voice memos and quick thoughts into structured insights — powered entirely by your model running locally.
It's launching on Product Hunt this Tuesday (Pacific time). I'd be grateful if you could share or retweet the launch — it would mean a lot coming from the team behind the model.
App Store: https://apps.apple.com/app/id6772862305
Product Hunt: https://www.producthunt.com/products/bloomind?launch=bloomind
Happy to add a "Powered by MiniCPM-V" credit anywhere you'd like, or provide screenshots/video of the on-device inference for your showcase.
Thank you for making this possible — running a capable LLM on an iPhone with no internet is genuinely magical.