Hi MiniCPM-V team,
I wanted to share a downstream integration / community use case of MiniCPM-V 4.6 in embodied AI.
We recently integrated MiniCPM-V 4.6 as a new vision-language backbone for StarVLA, a Vision-Language-Action framework for robot manipulation.
StarVLA PR: starVLA/starVLA#354
The integration adds:
- A MiniCPM-V 4.6 VLM wrapper for StarVLA
- PI / GR00T-style VLA framework support
- LIBERO training and evaluation scripts
- Example configs for 8-GPU training
Using MiniCPM-V 4.6, we trained and evaluated on the LIBERO benchmark and got the following results:
| Benchmark | Success Rate |
| --- | --- |
| LIBERO-Spatial | 94.0% |
| LIBERO-Object | 98.0% |
| LIBERO-Goal | 98.0% |
| LIBERO-10 | 92.4% |
| Overall | 95.6% |
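For anyone cross-checking the numbers, here is a minimal sketch of how the overall figure can be reproduced. It assumes the overall score is the unweighted mean of the four suite success rates; the helper name `unweighted_mean` is only illustrative and is not part of StarVLA.

```python
# Per-suite success rates (%) copied from the results table.
suite_success = {
    "LIBERO-Spatial": 94.0,
    "LIBERO-Object": 98.0,
    "LIBERO-Goal": 98.0,
    "LIBERO-10": 92.4,
}


def unweighted_mean(rates):
    """Average the suite success rates, giving each suite equal weight."""
    values = list(rates.values())
    return sum(values) / len(values)


# Assumption: "Overall" is the plain average across the four suites.
overall = unweighted_mean(suite_success)
print(f"Overall: {overall:.1f}%")
```

Under that assumption the result matches the 95.6% reported in the table.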
Training setup:
- Backbone: `openbmb/MiniCPM-V-4.6`
- Effective batch size: 128
- Max training steps: 80k
- Attention implementation: `flash_attention_2`
- No modules are frozen by default (`FREEZE_MODULES=""`), so MiniCPM-V is trainable unless overridden.
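As a rough illustration of how the effective batch size of 128 could map onto the 8-GPU example configs, the sketch below splits it into a per-GPU batch size. The gradient accumulation value and the helper `per_gpu_batch_size` are assumptions for illustration only; the actual per-device settings live in the PR's configs and may differ.

```python
def per_gpu_batch_size(effective_batch, num_gpus, grad_accum_steps=1):
    """Return the per-GPU micro-batch size implied by an effective batch size.

    effective_batch = per_gpu_batch * num_gpus * grad_accum_steps
    """
    denom = num_gpus * grad_accum_steps
    if effective_batch % denom != 0:
        raise ValueError(
            f"effective batch {effective_batch} is not divisible by "
            f"{num_gpus} GPUs x {grad_accum_steps} accumulation steps"
        )
    return effective_batch // denom


# 8 GPUs, no gradient accumulation (assumed): 16 samples per GPU per step.
print(per_gpu_batch_size(128, num_gpus=8))

# Same effective batch with 4 accumulation steps (assumed): 4 samples per GPU.
print(per_gpu_batch_size(128, num_gpus=8, grad_accum_steps=4))
```

This is handy when reproducing the run on fewer GPUs: keeping the product constant preserves the effective batch size of 128.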
MiniCPM-V 4.6 looks like a promising lightweight VLM backbone for embodied AI / VLA-style robot learning. The initial LIBERO results are encouraging, especially given the compact model size compared with many larger VLM backbones.
As a next step, we are also considering testing this MiniCPM-V-based StarVLA model on a real SO-101 robot setup, to further evaluate its sim-to-real potential beyond LIBERO simulation benchmarks.
Thanks for releasing MiniCPM-V!