cixtech/cix-vla-app

CIX VLA Robot Console

cix-vla-app is a browser-based robot control console built on top of lerobot, designed as a control platform for CIX VLA deployments based on the CIX P1 SoC. It packages runtime control, model serving, voice interaction, automation, telemetry, and operator-facing documentation into one application for target-device operation, validation, and deployment.

Model Assets

Runtime model files are not fully included in this repository. Download them from ModelScope: cix/cix-vla-app, then merge the corresponding directories into this project (for example, under model_zoo/ and omni/llm/).
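The merge step can be sketched as a small script. This is only an illustration, assuming the ModelScope download has already been unpacked into a local directory that mirrors the project layout; the function name `merge_assets` and the default subdirectory list are hypothetical and not part of this repository.

```python
import shutil
from pathlib import Path
from typing import List, Sequence


def merge_assets(
    download_root: Path,
    project_root: Path,
    subdirs: Sequence[str] = ("model_zoo", "omni/llm"),  # assumed layout
) -> List[str]:
    """Merge downloaded asset directories into the project tree.

    Files already in the project are kept unless the download contains
    a file with the same relative path, which then overwrites it.
    """
    merged = []
    for rel in subdirs:
        src = download_root / rel
        if not src.is_dir():
            continue  # the download does not contain this directory
        # dirs_exist_ok=True (Python 3.8+) copies into an existing tree
        # instead of raising FileExistsError.
        shutil.copytree(src, project_root / rel, dirs_exist_ok=True)
        merged.append(rel)
    return merged
```

Calling `merge_assets(Path("downloads/cix-vla-app"), Path("."))` from the project root would copy whichever of the listed directories exist in the download; both paths here are placeholders.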

CIX P1 Platform

cix-vla-app is primarily developed for the CIX P1 platform, which serves as the target compute platform for browser-based runtime control, model loading, voice interaction, automation orchestration, and operator-side task execution.

Based on publicly available platform information for CIX P1-based boards, the CIX P1 highlights include:

  • 6nm process technology
  • 12-core Arm CPU with a tri-cluster design
  • up to 45 TOPS heterogeneous AI computing capability
  • Arm Immortalis-class GPU graphics on publicly available boards
  • high-speed peripheral support such as PCIe Gen4 and modern memory/storage options on target boards

From an application perspective, the project uses the CIX P1 as a local VLA deployment platform where heterogeneous compute, media, and I/O capabilities can be combined in one target system for:

  • edge-side runtime control and model serving
  • local audio, camera, and device access
  • task execution and automation orchestration on the target system
  • operator-facing debugging and validation workflows for deployed VLA capabilities

This makes the platform especially practical for scenarios that need local perception, task execution, model inference, and operator interaction to run close to the device.

Note: exact exposed interfaces and final specifications may vary by carrier board or system product built around the CIX P1.

What This Project Provides

  • Runtime control for powering the system on or off, loading or unloading models, and projecting runtime state to the UI
  • Manual task execution for the supported tabletop tasks such as tape and pen operations
  • Voice-command interaction with browser-side audio capture and Omni-based task classification
  • Detection and workflow orchestration controls for semi-automatic and automatic operation flows
  • Bilingual operator documentation, help pages, blogs, and debug views
  • Deployment helper scripts for virtual cameras, udev aliases, and performance collection
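The runtime-control item above (power on or off, load or unload models, project state to the UI) can be pictured as a small state machine. The sketch below is illustrative only: the state names, command names, and the rule that a model can be loaded only while the system is powered on are assumptions, not the actual logic in src/runtime/.

```python
from enum import Enum


class RuntimeState(Enum):
    # Hypothetical states; the real runtime may track more detail.
    OFF = "off"
    ON = "on"
    MODEL_LOADED = "model_loaded"


# Allowed (state, command) pairs and the state each one leads to.
TRANSITIONS = {
    (RuntimeState.OFF, "power_on"): RuntimeState.ON,
    (RuntimeState.ON, "power_off"): RuntimeState.OFF,
    (RuntimeState.ON, "load_model"): RuntimeState.MODEL_LOADED,
    (RuntimeState.MODEL_LOADED, "unload_model"): RuntimeState.ON,
}


def apply_command(state: RuntimeState, command: str) -> RuntimeState:
    """Return the next state, or raise ValueError for a disallowed command."""
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        raise ValueError(f"{command!r} is not allowed in state {state.value!r}")


def ui_label(state: RuntimeState) -> str:
    """Project the runtime state to a short string a UI could display."""
    return state.value.replace("_", " ").title()
```

Keeping the transitions in one table makes it easy for a UI layer to disable buttons whose command is not valid in the current state.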

Demos

The following demos show representative operator flows supported by the current release.

Load Model Demo

The animation below shows the current model-loading dialog flow on the master branch.

Load model demo

Manual Task Demo

The animation below shows the manual-task interaction flow. Operators can click a supported task button and the VLA runtime will execute the corresponding task on the target device.

Manual task demo

Automation Demo

The animation below shows the automation flow. Operators can configure the task order in advance, and the VLA runtime will execute the arranged workflow automatically.

Automation demo
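The idea of arranging a task order in advance and running it automatically can be sketched as follows. This is a minimal illustration, not the project's orchestration service: `run_workflow`, the `execute` callback, the task names in the usage note, and the stop-on-first-failure policy are all assumptions.

```python
from typing import Callable, Iterable, List, Tuple


def run_workflow(
    task_order: Iterable[str],
    execute: Callable[[str], bool],
) -> List[Tuple[str, bool]]:
    """Run tasks in the configured order and record each outcome.

    `execute` stands in for whatever hands a task to the VLA runtime and
    reports success. The workflow stops at the first failed task
    (an assumed policy).
    """
    results = []
    for name in task_order:
        ok = execute(name)
        results.append((name, ok))
        if not ok:
            break  # later tasks are skipped after a failure
    return results
```

For example, `run_workflow(["take_tape", "return_tape"], execute=my_runner)` would run the two placeholder tasks in order, where `my_runner` is whatever callable submits a task and returns `True` on success.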

Voice Command Demo 01

The animation below shows a voice-command demo where the recognized instruction is to retrieve the tape from the box, and the VLA runtime executes the corresponding task.

Voice command demo 01

Voice Command Demo 02

The animation below shows a voice-command demo where the recognized instruction is to place the tape back into the black box, and the VLA runtime executes the corresponding task.

Voice command demo 02

The current voice-command demos are configured for Chinese instructions covering the supported task set. To enable this capability in your own environment, use the bundled Omni model assets under omni/llm/ and follow the upstream Qwen Research License terms (see THIRD_PARTY_LICENSES.md). If your task wording or language differs, customize the Omni prompt configuration under omni_prompt_match in config/audio.yaml to adapt the recognition behavior.
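To make the prompt-matching idea concrete, the sketch below builds a classification prompt from a table of task phrasings and maps a model reply back to a task id. It is a hypothetical illustration: the actual schema of omni_prompt_match in config/audio.yaml is not shown here, and `TASK_PHRASES`, `build_prompt`, `parse_reply`, and the task ids are invented for the example.

```python
from typing import Dict, List, Optional

# Hypothetical task table; adapt ids and phrasings to your own task set.
TASK_PHRASES: Dict[str, List[str]] = {
    "take_tape": ["retrieve the tape from the box"],
    "return_tape": ["place the tape back into the black box"],
}


def build_prompt(task_phrases: Dict[str, List[str]]) -> str:
    """Compose an instruction asking the model to answer with one task id."""
    lines = ["Classify the spoken instruction as exactly one task id, or 'none'."]
    for task_id, phrases in task_phrases.items():
        lines.append(f"- {task_id}: " + "; ".join(phrases))
    return "\n".join(lines)


def parse_reply(reply: str, task_phrases: Dict[str, List[str]]) -> Optional[str]:
    """Normalize the model reply and return a known task id, else None."""
    candidate = reply.strip().lower()
    return candidate if candidate in task_phrases else None
```

Restricting the model to a closed set of ids and rejecting anything else keeps an unrecognized utterance from triggering a robot task.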

Main Components

  • app.py Application entry point for the Gradio-based operator console
  • src/ui/ Pages, components, event wiring, runtime polling, and bilingual UI behavior
  • src/runtime/ Runtime configuration loading, runner logic, patch management, and process coordination
  • src/services/ Voice, orchestration, Redis state, and UI-facing status services
  • src/telemetry/ Camera, action, and telemetry collection helpers
  • model_zoo/ Model assets, overlay payloads, and compatibility support
  • omni/ Omni runtime resources used by the voice command flow
  • scripts/ Operational helper scripts such as virtual-camera setup and performance capture

Typical Usage

This project is intended for:

  • operator-side control of a CIX VLA robot arm
  • validation and debugging of runtime, voice, and automation behavior
  • deployment on CIX P1-based target devices with local cameras, audio input, and model assets prepared

Documentation

License Notice

The first-party code in this project is licensed under Apache-2.0. Third-party components keep their own licenses. See LICENSE and THIRD_PARTY_LICENSES.md in this directory for the final licensing boundary.
