[Community] MiniCPM-V 4.6 integrated into StarVLA for robot manipulation #1108
shaohua-pan started this conversation in General
Replies: 1 comment
Hi @shaohua-pan, this is really exciting. Thanks for taking the time to share the integration and benchmark results in such detail. The LIBERO numbers are impressive, especially for a model of this size. Seeing MiniCPM-V 4.6 perform as a lightweight VLM backbone for robot manipulation is exactly the kind of downstream use case we'd hoped to see. The sim-to-real test on the SO-101 would be a great next step, so definitely keep us posted on how that goes. Happy to help if you run into any model-side questions along the way. Feel free to open an issue or reach out here anytime.
Hi MiniCPM-V team,
I recently integrated MiniCPM-V 4.6 as a new vision-language backbone for StarVLA, a Vision-Language-Action framework for robot manipulation.
PR: starVLA/starVLA#354
The integration adds:
Using MiniCPM-V 4.6, we trained and evaluated on the LIBERO benchmark and got the following results:
Training setup:
- `openbmb/MiniCPM-V-4.6`
- `flash_attention_2`
- `FREEZE_MODULES=""` (nothing frozen), so MiniCPM-V is trainable unless overridden.

MiniCPM-V 4.6 looks like a strong lightweight VLM backbone for embodied AI / VLA-style robot learning, especially when compared with larger VLM backbones.
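To illustrate the trainable-by-default behavior of an empty `FREEZE_MODULES`, here is a minimal sketch. It is not StarVLA's actual code: the comma-separated prefix format, the helper names `parse_freeze_modules` and `is_trainable`, and the parameter names are all assumptions made for illustration.

```python
def parse_freeze_modules(value):
    """Split a FREEZE_MODULES-style string into module-name prefixes.

    Assumption: the value is a comma-separated list of prefixes.
    An empty string yields an empty list, i.e. nothing is frozen.
    """
    return [part.strip() for part in value.split(",") if part.strip()]


def is_trainable(param_name, freeze_prefixes):
    """Return True unless the parameter sits under a frozen prefix."""
    return not any(
        param_name == prefix or param_name.startswith(prefix + ".")
        for prefix in freeze_prefixes
    )


# Hypothetical parameter names, for illustration only.
PARAM_NAMES = [
    "vlm.vision.patch_embed.weight",
    "vlm.llm.layers.0.weight",
    "action_head.fc.weight",
]

# Default case: FREEZE_MODULES="" leaves every parameter trainable.
default_prefixes = parse_freeze_modules("")
default_trainable = [n for n in PARAM_NAMES if is_trainable(n, default_prefixes)]

# Override case: freezing a (hypothetical) vision tower prefix.
override_prefixes = parse_freeze_modules("vlm.vision")
override_trainable = [n for n in PARAM_NAMES if is_trainable(n, override_prefixes)]
```

With the empty default, all three example parameters stay trainable. With the override, only the parameter under `vlm.vision` is excluded.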
As a next step, we are also considering testing this MiniCPM-V-based StarVLA model on a real SO-101 robot setup, to further evaluate its sim-to-real potential beyond LIBERO simulation benchmarks.
Thanks for releasing MiniCPM-V! Feedback or suggestions from the MiniCPM-V team would be very welcome.