feat: Prepare hyvideo for RunPod serverless deployment #317
Open
ggmarts04 wants to merge 3 commits into deepbeepmeep:main from
Conversation
This commit introduces the necessary files and configurations to deploy the
`hyvideo` component on RunPod's serverless GPU instances.
Key changes include:
- **Dockerfile**: Added a Dockerfile that sets up the environment,
installs dependencies from a refined `requirements.txt`, and downloads
the required Hunyuan video models and associated text encoders/VAEs
from Hugging Face. It copies only the `hyvideo` directory and necessary
root files.
- **handler.py**: Created a RunPod request handler. This script loads the
`HunyuanVideoSampler` model and defines a `handler(job)` function
to process inference requests, generating videos based on input prompts
and parameters. Output videos are saved within the container.
- **runpod.toml**: Added a configuration file specifying GPU requirements,
the Docker image to be used (with a placeholder for you to fill),
the handler path, and basic scaling settings.
- **requirements.txt**: Refined the main `requirements.txt` to include
only dependencies essential for the `hyvideo` model's inference,
removing UI-specific (Gradio) and other non-essential packages to
create a leaner deployment. The Dockerfile has been updated to use
this refined list directly.
These changes provide a foundational setup for running the `hyvideo`
model in a serverless environment on RunPod.
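The `handler(job)` entry point described above can be sketched roughly as follows. This is an illustrative skeleton only: the parameter names, defaults, and the `parse_job_input` helper are assumptions for the sketch, not the exact code in the PR, which additionally loads `HunyuanVideoSampler` and registers the handler with RunPod's serverless runtime.

```python
# Hypothetical sketch of the handler's input handling. In the actual
# handler.py, the parsed parameters would be passed to HunyuanVideoSampler
# and the function registered via runpod.serverless.start({"handler": handler}).

def parse_job_input(job):
    """Extract generation parameters from a RunPod job payload,
    applying defaults for anything the caller omitted."""
    inp = job.get("input", {})
    params = {
        "prompt": inp.get("prompt", ""),
        "num_frames": int(inp.get("num_frames", 97)),
        "width": int(inp.get("width", 720)),
        "height": int(inp.get("height", 480)),
        "seed": int(inp.get("seed", -1)),  # -1: pick a random seed
    }
    if not params["prompt"]:
        raise ValueError("'prompt' is required")
    return params
```

Keeping payload parsing in a small pure function like this makes the handler easy to unit-test locally before pushing a new container image.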
This commit updates the RunPod serverless deployment setup for `hyvideo`
to support lip-sync video generation using image and audio URLs.
Key changes include:
- **handler.py**:
- Modified to accept `image_ref_url` and `audio_guide_url` as inputs.
- Added logic to download content from these URLs using the `requests`
library and save them to temporary files.
- Updated to primarily use the Hunyuan Video Avatar model by default
(`hunyuan_video_avatar_720_bf16.safetensors`).
- Adjusted default parameters (fps, guidance_scale, shift) to values
more suitable for the Avatar model.
- Ensured the `speech_encoder_path` for Whisper models is passed to the
`HunyuanVideoSampler`.
- Implemented cleanup of temporary downloaded files.
- Enhanced local testing block to cover URL inputs.
- **requirements.txt**:
- Added `requests` to the list of dependencies to support URL downloading.
- **Dockerfile**:
- Added `curl` to the list of installed packages for general utility.
- Model download instructions were previously confirmed to include the
necessary Avatar and Whisper models.
- **runpod.toml**:
- Reviewed and confirmed existing settings. The RTX 3090 GPU and
5-minute job timeout are considered reasonable starting points for
lip-sync tasks.
This enhances the `hyvideo` RunPod deployment to handle more advanced
use cases like lip-sync animation based on external image and audio sources.
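The download-and-cleanup flow for `image_ref_url` and `audio_guide_url` can be sketched like this. It uses `requests` (which this commit adds to `requirements.txt`), but the function names and chunk sizes are assumptions for illustration, not the PR's actual code.

```python
# Sketch of the URL-download and temp-file cleanup flow for lip-sync inputs.
# Function names are illustrative; the actual handler.py may differ.
import os
import tempfile

import requests  # added to requirements.txt by this PR


def download_to_temp(url, suffix):
    """Stream a remote file (e.g. image_ref_url / audio_guide_url)
    to a temporary file and return its path."""
    resp = requests.get(url, stream=True, timeout=60)
    resp.raise_for_status()
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as f:
        for chunk in resp.iter_content(chunk_size=1 << 20):
            f.write(chunk)
    return path


def cleanup(paths):
    """Best-effort removal of downloaded temp files after inference."""
    for p in paths:
        try:
            os.remove(p)
        except OSError:
            pass  # already gone or never created; nothing to do
```

Best-effort cleanup matters on serverless workers: containers may be reused between jobs, so leaked temp files would accumulate on the worker's disk.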
Ensured that the Dockerfile uses an explicit absolute path `COPY handler.py /app/handler.py` for copying the handler script. This improves clarity and robustness of the Docker build process, helping to prevent 'file not found' errors for the handler script within the container.
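In Dockerfile terms, the explicit-path change would look roughly like the excerpt below. The base image, package list, and directory layout here are assumptions for illustration; only the `COPY handler.py /app/handler.py` line reflects the change described above.

```dockerfile
# Hypothetical excerpt; base image and layout are assumptions.
FROM nvidia/cuda:12.1.1-runtime-ubuntu22.04

RUN apt-get update && apt-get install -y python3 python3-pip curl

WORKDIR /app
COPY hyvideo/ /app/hyvideo/
COPY requirements.txt /app/requirements.txt
RUN pip3 install -r /app/requirements.txt

# Absolute destination path removes any ambiguity about the current WORKDIR.
COPY handler.py /app/handler.py
CMD ["python3", "-u", "/app/handler.py"]
```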
Hi @ggmarts04, is this one ready for production?

Owner
Thanks, I am sure lots of users will be happy with this support. Would you mind updating the PR so that I can merge it with WanGP v6?