[Question] Does the Android deployment support audio understanding (audio + text prompt)? #1090

@starrydreamawa

I'm trying to deploy MiniCPM-o on the Android platform. I would like to know whether MiniCPM-o on Android supports audio understanding, specifically:

Input: an audio file (e.g., WAV/MP3) plus a natural-language prompt (e.g., "What is the sound in this recording?")
Output: a textual description of the audio content

I deployed MiniCPM-o with llama.cpp, built with the Android NDK and deployed via adb (building directly in Termux failed). I then checked the llama-minicpmv-cli help menu and found no --audio or similar parameter for audio input.
