feat: Prepare hyvideo for RunPod serverless deployment#317

Open
ggmarts04 wants to merge 3 commits into deepbeepmeep:main from ggmarts04:runpod-hyvideo-setup

Conversation

@ggmarts04

This commit introduces the necessary files and configurations to deploy the
`hyvideo` component on RunPod's serverless GPU instances.

Key changes include:

-   **Dockerfile**: Added a Dockerfile that sets up the environment,
    installs dependencies from a refined `requirements.txt`, and downloads
    the required Hunyuan video models and associated text encoders/VAEs
    from Hugging Face. It copies only the `hyvideo` directory and necessary
    root files.

-   **handler.py**: Created a RunPod request handler. This script loads the
    `HunyuanVideoSampler` model and defines a `handler(job)` function
    to process inference requests, generating videos based on input prompts
    and parameters. Output videos are saved within the container.

-   **runpod.toml**: Added a configuration file specifying GPU requirements,
    the Docker image to be used (with a placeholder for you to fill),
    the handler path, and basic scaling settings.

-   **requirements.txt**: Refined the main `requirements.txt` to include
    only dependencies essential for the `hyvideo` model's inference,
    removing UI-specific (Gradio) and other non-essential packages to
    create a leaner deployment. The Dockerfile has been updated to use
    this refined list directly.

These changes provide a foundational setup for running the `hyvideo`
model in a serverless environment on RunPod.
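The handler described above could be sketched roughly as follows. This is a minimal illustration of the shape of a RunPod handler, not the PR's actual code: the `DEFAULTS` keys and values, the output path, and the commented-out model loading are all assumptions.

```python
# Hypothetical sketch of the handler.py structure described above.
# The DEFAULTS keys/values and output path are illustrative assumptions.
DEFAULTS = {"num_inference_steps": 30, "guidance_scale": 7.5, "fps": 24}

# In the real handler, the sampler is loaded once at module import so that
# warm workers can reuse it across jobs, e.g.:
# sampler = HunyuanVideoSampler.from_pretrained("/app/ckpts")

def handler(job):
    """Process one RunPod job: merge defaults, run inference, return a result dict."""
    params = {**DEFAULTS, **job.get("input", {})}
    prompt = params.get("prompt")
    if not prompt:
        return {"error": "Missing required input: 'prompt'"}
    # ... run the sampler with `prompt` and `params`, write the video to disk ...
    output_path = f"/app/output/{job['id']}.mp4"
    return {"output_path": output_path, "params_used": params}

# Registration with the RunPod serverless runtime (needs the `runpod` package):
# import runpod
# runpod.serverless.start({"handler": handler})
```

Merging the job input over a defaults dict keeps per-request overrides optional while giving every run a complete parameter set.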

This commit updates the RunPod serverless deployment setup for `hyvideo`
to support lip-sync video generation using image and audio URLs.

Key changes include:

-   **handler.py**:
    -   Modified to accept `image_ref_url` and `audio_guide_url` as inputs.
    -   Added logic to download content from these URLs using the `requests`
        library and save them to temporary files.
    -   Updated to use the Hunyuan Video Avatar model by default
        (`hunyuan_video_avatar_720_bf16.safetensors`).
    -   Adjusted default parameters (fps, guidance_scale, shift) to values
        more suitable for the Avatar model.
    -   Ensured the `speech_encoder_path` for Whisper models is passed to the
        `HunyuanVideoSampler`.
    -   Implemented cleanup of temporary downloaded files.
    -   Enhanced the local testing block to cover URL inputs.

-   **requirements.txt**:
    -   Added `requests` to the list of dependencies to support URL downloading.

-   **Dockerfile**:
    -   Added `curl` to the list of installed packages for general utility.
    -   Model download instructions were previously confirmed to include the
        necessary Avatar and Whisper models.

-   **runpod.toml**:
    -   Reviewed and confirmed existing settings. The RTX 3090 GPU and
        5-minute job timeout are considered reasonable starting points for
        lip-sync tasks.

This enhances the `hyvideo` RunPod deployment to handle more advanced
use cases like lip-sync animation based on external image and audio sources.
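The download-and-cleanup flow described for `handler.py` might look roughly like this. The PR itself uses the `requests` library; this sketch uses the standard library, and the function names and an injectable `fetch` callable are illustrative assumptions, not the PR's code.

```python
import os
import tempfile
from urllib.request import urlopen  # the PR itself uses `requests` instead


def download_to_tempfile(url, suffix, fetch=None):
    """Fetch `url` and write the bytes to a named temp file; return its path."""
    data = fetch(url) if fetch is not None else urlopen(url).read()
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


def run_lipsync_job(image_ref_url, audio_guide_url, fetch=None):
    """Download both assets, run inference, and always clean up the temp files."""
    temp_paths = []
    try:
        image_path = download_to_tempfile(image_ref_url, ".png", fetch)
        temp_paths.append(image_path)
        audio_path = download_to_tempfile(audio_guide_url, ".wav", fetch)
        temp_paths.append(audio_path)
        # ... pass image_path / audio_path to the Avatar sampler here ...
        return list(temp_paths)
    finally:
        # The try/finally guarantees cleanup even when download or inference fails.
        for p in temp_paths:
            if os.path.exists(p):
                os.remove(p)
```

The `try/finally` mirrors the commit's "cleanup of temporary downloaded files": temp files are removed whether the job succeeds or raises.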

Ensured that the Dockerfile uses an explicit absolute path,
`COPY handler.py /app/handler.py`,
when copying the handler script. This makes the Docker build clearer and
more robust, helping to prevent 'file not found' errors for the handler
script within the container.
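In Dockerfile terms, the change amounts to something like the following; only the `COPY` line is taken from the commit, and the surrounding instructions are illustrative.

```dockerfile
WORKDIR /app
# An explicit absolute destination does not depend on the current WORKDIR,
# so the handler always lands at the path the RunPod config expects:
COPY handler.py /app/handler.py
```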
@phong-diagon1

Hi @ggmarts04, is this one ready for production?

@deepbeepmeep (Owner)

Thanks, I am sure lots of users will be happy with this support. Would you mind updating the PR so that I can merge it with WanGP v6?
