Local AI music generator with smart lyrics: Gradio web UI for HeartMuLa + Ollama/OpenAI, tags, history, and high-fidelity audio.

strnad/HeartMuse


🎵 HeartMuse - AI Music Generator with Smart Lyrics

HeartMuse is an intuitive web-based interface for creating high-quality AI-generated music entirely on your own machine. It combines HeartMuLa, a state-of-the-art open-source music generation model built for local inference, with intelligent lyrics generation from local LLMs, giving you complete creative control without relying on cloud services.

✨ What Makes HeartMuse Special?

While HeartMuLa provides state-of-the-art music generation capabilities, HeartMuse extends it with:

  • 🎨 User-Friendly Web Interface - No command-line expertise needed
  • πŸ“ Smart Lyrics Generation - Leverages local Ollama models or OpenAI API to automatically generate coherent, themed lyrics from simple descriptions
  • 🏷️ Intelligent Tagging - Automatically generates appropriate music style tags
  • πŸ’Ύ Complete Privacy - Run 100% locally with Ollama (no data leaves your machine)
  • πŸ“š Generation History - Browse, replay, and manage all your previous creations
  • βš™οΈ Flexible Configuration - Easy-to-use controls for fine-tuning generation parameters

🎯 Features

Smart Text Generation

  • Describe Your Vision: Simply write what kind of song you want (e.g., "upbeat pop song about summer adventures")
  • Automatic Lyrics: AI generates full lyrics matching your description and chosen theme
  • Song Titles: Creative, relevant titles generated automatically
  • Style Tags: Intelligent tagging system for music genre, mood, and instrumentation

Powerful Music Generation

  • HeartMuLa 3B Model: State-of-the-art open-source model for local music generation (3 billion parameters, RL-trained)
  • High-Fidelity Audio: Uses HeartCodec for superior audio quality
  • Customizable Parameters: Control temperature, CFG scale, Top-K sampling, and duration
  • GPU Acceleration: CUDA support with efficient memory management
  • Memory Efficient: Lazy loading reduces VRAM usage, enabling generation on GPUs with limited memory

Dual LLM Backend Support

  • Ollama (Recommended): Run completely locally with models like glm-4.7-flash, llama3, mistral, etc.
  • OpenAI API: Use GPT-4o, GPT-4o-mini, or other OpenAI models for lyrics generation

Seamless Workflow

  1. Enter a song description
  2. Let AI generate lyrics, title, and tags (or write your own)
  3. Click "Generate Music" and get professional-quality audio
  4. Browse your creation history anytime

🚀 Quick Start

Prerequisites

  • Git - For cloning repositories and submodules
  • Python 3.10 - 3.12 (3.10 is recommended by the HeartMuLa authors; versions newer than 3.12 may not work)
  • NVIDIA GPU with CUDA 12.4+ (12 GB VRAM recommended, 8 GB minimum for the HeartMuLa-3B model)
  • Ollama (optional, for local lyrics generation) - Download Ollama

Installation

Linux / macOS:

git clone https://github.com/yourusername/heartmuse.git
cd heartmuse
./install.sh

Windows:

git clone https://github.com/yourusername/heartmuse.git
cd heartmuse
install.bat

The installer will:

  • Create a Python virtual environment
  • Clone the HeartMuLa library
  • Install all dependencies
  • Prepare your system for music generation

Running HeartMuse

Linux / macOS:

./run.sh

Windows:

run.bat

Open your browser to http://localhost:7860 and start creating!

βš™οΈ Configuration

Copy .env.example to .env and customize. See the file for available options and their descriptions.
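A minimal `.env` sketch using only the variables mentioned in this README; the values shown here are illustrative, not the project's actual defaults, so check `.env.example` for the authoritative list:

```env
# LLM backend for lyrics: "Ollama" (local) or "OpenAI" (cloud)
LLM_BACKEND=Ollama

# Where your local Ollama server listens (Ollama's standard port is 11434)
OLLAMA_URL=http://localhost:11434

# Only needed when LLM_BACKEND=OpenAI
OPENAI_API_KEY=sk-...

# Cap generated clip length in seconds; lower this on low-VRAM GPUs
MUSIC_MAX_LENGTH_SEC=120
```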

Using Ollama (100% Local)

  1. Install Ollama from ollama.ai
  2. Download a model: ollama pull glm-4.7-flash (or llama3, mistral, etc.)
  3. Make sure Ollama is running: ollama serve
  4. Set LLM_BACKEND=Ollama in your .env

Using OpenAI API

  1. Get your API key from platform.openai.com
  2. Set OPENAI_API_KEY in your .env
  3. Set LLM_BACKEND=OpenAI
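The backend switch can be pictured with a small sketch. This is not HeartMuse's actual code: the helper name and model choices are ours, but the payload shapes match Ollama's public `/api/generate` endpoint and OpenAI's chat completions API:

```python
import json
import os

def build_lyrics_request(description: str) -> dict:
    """Hypothetical helper: build the HTTP request for whichever
    lyrics backend LLM_BACKEND selects."""
    backend = os.environ.get("LLM_BACKEND", "Ollama")
    prompt = f"Write song lyrics for: {description}"
    if backend == "Ollama":
        # Ollama's /api/generate endpoint expects "model" and "prompt"
        base = os.environ.get("OLLAMA_URL", "http://localhost:11434")
        return {
            "url": base.rstrip("/") + "/api/generate",
            "body": {"model": "llama3", "prompt": prompt, "stream": False},
        }
    # OpenAI's chat completions endpoint expects a "messages" list
    return {
        "url": "https://api.openai.com/v1/chat/completions",
        "body": {"model": "gpt-4o-mini",
                 "messages": [{"role": "user", "content": prompt}]},
    }

if __name__ == "__main__":
    os.environ["LLM_BACKEND"] = "Ollama"
    req = build_lyrics_request("upbeat pop song about summer adventures")
    print(json.dumps(req, indent=2))
```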

📖 How It Works

HeartMuse orchestrates a two-stage generation pipeline:

Stage 1: Text Generation (LLM)

  • Takes your song description
  • Generates contextually appropriate lyrics
  • Creates a catchy title
  • Suggests music style tags (genre, mood, instruments)

Stage 2: Music Generation (HeartMuLa)

  • Processes lyrics and tags through HeartMuLa's 3B parameter model
  • Generates high-fidelity audio using HeartCodec
  • Saves output with complete metadata

All generations are saved to the output/ directory with JSON metadata, making it easy to track your creative journey.
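The two-stage flow and the metadata saving can be sketched as follows. Every name here is hypothetical (the stages are stubbed, and the JSON schema is an assumption, not HeartMuse's actual one); it only illustrates the orchestration described above:

```python
import json
import tempfile
from pathlib import Path

def generate_text(description: str) -> dict:
    # Stage 1 (LLM): lyrics, title, and style tags. Stubbed here.
    return {"title": "Dreams in Motion",
            "lyrics": "...",
            "tags": ["pop", "upbeat"]}

def generate_music(lyrics: str, tags: list) -> bytes:
    # Stage 2 (HeartMuLa + HeartCodec): returns audio bytes. Stubbed here.
    return b"RIFF...WAVE"

def run_pipeline(description: str, output_dir: Path) -> Path:
    meta = generate_text(description)
    audio = generate_music(meta["lyrics"], meta["tags"])
    output_dir.mkdir(parents=True, exist_ok=True)
    (output_dir / "song.wav").write_bytes(audio)
    # JSON metadata is saved alongside the audio, as described above
    meta_path = output_dir / "song.json"
    meta_path.write_text(json.dumps({"description": description, **meta}, indent=2))
    return meta_path

if __name__ == "__main__":
    out = run_pipeline("energetic pop song", Path(tempfile.mkdtemp()) / "output")
    print(out.read_text())
```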

🎓 Examples

Example 1: Upbeat Pop Song

Description: "Energetic pop song about chasing dreams"

Generated Output:

  • Title: "Dreams in Motion"
  • Lyrics: Full verses and chorus about ambition and perseverance
  • Tags: pop, upbeat, energetic, electronic, synthesizer
  • Audio: 2-3 minute high-quality music track

Example 2: Melancholic Ballad

Description: "Slow, emotional ballad about lost love"

Generated Output:

  • Title: "Fading Echoes"
  • Lyrics: Heartfelt verses about memories and longing
  • Tags: ballad, slow, melancholic, piano, emotional
  • Audio: Emotive instrumental with appropriate pacing

πŸ™ Credits & Acknowledgments

HeartMuse is built on top of the incredible work by the HeartMuLa team:

Huge thanks to the HeartMuLa authors for open-sourcing their state-of-the-art music generation technology and making professional-quality, locally run AI music generation accessible to everyone!

πŸ› οΈ Technology Stack

  • HeartMuLa - 3B parameter music generation model
  • Gradio - Web interface framework
  • Ollama - Local LLM inference
  • OpenAI API - Cloud LLM option
  • PyTorch - Deep learning backend
  • Python 3.10 - 3.12 - Core runtime

📋 System Requirements

Required:

  • Git
  • Python 3.10 - 3.12 (3.10 recommended)
  • NVIDIA GPU with 8GB+ VRAM (e.g., RTX 3070, RTX 4060, or better)
  • CUDA 12.4+
  • 16GB system RAM
  • 20GB disk space (for models and generated audio)

Memory Optimization:

  • Lazy loading is enabled by default (reduces VRAM footprint)
  • Manual "Unload Model" button frees GPU memory between generations
  • For GPUs with less VRAM, reduce MUSIC_MAX_LENGTH_SEC to generate shorter clips
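The lazy-loading and "Unload Model" behavior boils down to a generic pattern, sketched here in plain Python (this is an illustration of the technique, not HeartMuse's actual implementation):

```python
class LazyModel:
    """Lazy-loading wrapper: the heavy model is only loaded on first
    use and can be unloaded to free (GPU) memory between generations."""

    def __init__(self, loader):
        self._loader = loader  # callable that actually loads the weights
        self._model = None

    @property
    def loaded(self) -> bool:
        return self._model is not None

    def generate(self, prompt: str):
        if self._model is None:  # lazy load on first call
            self._model = self._loader()
        return self._model(prompt)

    def unload(self):
        # Drop the reference; a real PyTorch app would also call
        # gc.collect() and torch.cuda.empty_cache() afterwards.
        self._model = None


if __name__ == "__main__":
    model = LazyModel(lambda: (lambda p: f"audio for {p!r}"))
    print(model.loaded)            # False: nothing loaded yet
    print(model.generate("pop song"))
    print(model.loaded)            # True: loaded on first use
    model.unload()
    print(model.loaded)            # False again: memory freed
```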

πŸ› Troubleshooting

Models Not Downloading

The first run automatically downloads ~3GB of model weights from Hugging Face. Ensure you have:

  • Stable internet connection
  • Sufficient disk space in the ckpt/ directory

Out of Memory Errors

  • Use the "Unload Model" button between generations
  • Reduce MUSIC_MAX_LENGTH_SEC in GUI or .env

Installation Problems

  • Make sure you are using Python 3.10 - 3.12 (other versions are not supported)
  • Update your NVIDIA drivers to the latest version

Ollama Connection Issues

  • Ensure Ollama is running: ollama serve
  • Check OLLAMA_URL matches your Ollama installation
  • Verify the model is downloaded: ollama list

💖 Support the Project

If HeartMuse saves you time or helps you create something cool, consider supporting development 🙏

Sponsor via GitHub

Donate with Bitcoin

bc1qgsn45g02wran4lph5gsyqtk0k7t98zsg6qur0y

πŸ“ License

This project is released under the MIT License. See LICENSE for details.

The HeartMuLa library is licensed separately; please refer to the HeartMuLa repository for its licensing terms.

🤝 Contributing

Contributions are welcome! Feel free to:

  • Report bugs via GitHub Issues
  • Suggest new features
  • Submit pull requests

📧 Support

For questions and support:


Made with ❀️ using HeartMuLa | Developed with assistance from Claude Code

Create music with AI, own your creativity
