VACE-WAN is a powerful, fully open-source AI video creation tool that brings together cutting-edge generative AI models to enable one-click cinematic creation. With just a single theme input, users can generate scripts, narration, scene images, and videos, all seamlessly integrated into a cohesive short film.
💡 Multilingual support included! Create in English or Chinese. 🌍 Powered by open-source models. No subscription. No hidden costs.
- Text-to-Script: Auto-generate creative screenplays with [deepseek-r1-distill-llama-70b]
- Text-to-Image: Generate vivid scene visuals with [Stable Diffusion 2.1]
- Text-to-Speech: Narrate your scenes using [gTTS] (Google Text-to-Speech)
- Image-to-Video: Animate scenes using [Stable Video Diffusion (Img2Vid-XT)]
- Gradio Interface: Intuitive UI for customization or one-click generation
- Full Pipeline: From concept to video: script → image → voice → video
Ensure your environment includes at least 20 GiB of GPU memory.
%pip install -U diffusers
%pip install -q gradio torch torchvision groq ffmpeg-python gTTS av
%pip install -U transformers accelerate sentencepiece protobuf opencv-python
%pip install imageio-ffmpeg

Run the notebook directly in Jupyter or Colab:
jupyter notebook "VACE-WAN - AI Video Generator.ipynb"

Or deploy it via Gradio!
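The last stage of the pipeline stitches the generated clips together with the narration track. A minimal sketch of that muxing step using the ffmpeg CLI, assuming a scene clip and a narration file already exist (the filenames `scene.mp4` and `narration.mp3` are illustrative, not the notebook's actual outputs):

```python
import subprocess

def build_mux_args(video_path: str, audio_path: str, out_path: str) -> list:
    """Build an ffmpeg command that copies the video stream, encodes the
    narration to AAC, and trims to the shorter of the two inputs."""
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-i", audio_path,
        "-c:v", "copy",   # keep the generated video frames untouched
        "-c:a", "aac",    # re-encode narration for MP4 compatibility
        "-shortest",      # stop when the shorter stream ends
        out_path,
    ]

if __name__ == "__main__":
    # Requires ffmpeg on PATH and the input files on disk.
    subprocess.run(build_mux_args("scene.mp4", "narration.mp3", "final.mp4"),
                   check=True)
```

The same command can also be expressed through the `ffmpeg-python` wrapper installed above; the raw argument list is shown here because it is easier to inspect and debug.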
Enter a theme → e.g., "A robot discovers a beach paradise"
Generate Script → Automatically written screenplay
Generate Voice → Narration from the script
Generate Scene Images → AI-generated cinematic visuals
Render Video → Combine everything into a stunning short film
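The image step generates one visual per scene, which means the screenplay first has to be split into scenes. A hedged sketch of such a splitter, assuming the model labels scenes as `Scene 1:`, `Scene 2:`, … (that label format is an assumption about the prompt output, not guaranteed by the notebook):

```python
import re

def split_scenes(script: str) -> list:
    """Split a generated screenplay into per-scene chunks on 'Scene N:' headers."""
    parts = re.split(r"(?im)^scene\s+\d+\s*:", script)
    # Text before the first header (title, logline) is dropped; each remaining
    # chunk becomes one image/video prompt.
    return [p.strip() for p in parts[1:] if p.strip()]
```

Each returned chunk can then be fed directly to the Stable Diffusion prompt for its scene.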
| Task | Model |
|---|---|
| Script Generation | deepseek-r1-distill-llama-70b via Groq API |
| Image Generation | stabilityai/stable-diffusion-2-1 |
| Speech Generation | gTTS (Google Text-to-Speech) |
| Video Generation | stabilityai/stable-video-diffusion-img2vid-xt |
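The script-generation call goes through the Groq Python SDK's OpenAI-style chat-completions API. A minimal sketch, assuming `GROQ_API_KEY` is set in the environment; the system-prompt wording here is illustrative, not the notebook's exact prompt:

```python
import os

MODEL = "deepseek-r1-distill-llama-70b"

def build_script_messages(theme: str, language: str = "English") -> list:
    """Build the chat messages asking the model for a short screenplay."""
    return [
        {"role": "system",
         "content": f"You are a screenwriter. Reply in {language} with a short, "
                    "numbered screenplay (Scene 1:, Scene 2:, ...)."},
        {"role": "user", "content": f"Write a short film script about: {theme}"},
    ]

if __name__ == "__main__":
    from groq import Groq  # requires the `groq` package and an API key
    client = Groq(api_key=os.environ["GROQ_API_KEY"])
    resp = client.chat.completions.create(
        model=MODEL,
        messages=build_script_messages("A robot discovers a beach paradise"),
    )
    print(resp.choices[0].message.content)
```

Passing `language="Chinese"` in the system message is all the multilingual support needs on the script side; gTTS handles the matching narration language downstream.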
Inspired by Google I/O 2025 and the revolutionary demo of Veo3 + Flow, this project was born to democratize video generation for everyone. While commercial tools remain powerful but costly and restricted, VACE-WAN aims to be:
💸 Free and open-source
🌍 Language-friendly (supports Chinese & English)
🛠️ Accessible for developers and creators
🧠 Educational for learning generative AI pipelines
Recommended: 20 GiB+ of GPU memory for img2vid
For better stability, consider running locally or with Colab Pro
VACE-WAN/
├── VACE-WAN - AI Video Generator.ipynb   # Main notebook
├── README.md                             # You're reading it!
└── assets/                               # (Optional) Screenshots or outputs

Feel free to fork, submit issues, or contribute improvements! Let's build the future of open-source AI filmmaking together.

