Support AR inference by miguelmartin75 · Pull Request #4 · miguelmartin75/diffusers

miguelmartin75 · 2026-02-05T00:41:53Z

What does this PR do?

Fixes # (issue)

Before submitting

- This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- Did you read the contributor guideline?
- Did you read our philosophy doc (important for complex PRs)?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

…ly in `GlmImagePipeline` (huggingface#13092)

* allow loose input
  Signed-off-by: JaredforReal <w13431838023@gmail.com>
* add tests
  Signed-off-by: JaredforReal <w13431838023@gmail.com>
* format test_glm_image
  Signed-off-by: JaredforReal <w13431838023@gmail.com>

---------

Signed-off-by: JaredforReal <w13431838023@gmail.com>

…e} (huggingface#13066)

* initial conversion script
* cosmos control net block
* CosmosAttention
* base model conversion
* wip
* pipeline updates
* convert controlnet
* pipeline: working without controls
* wip
* debugging
* Almost working
* temp
* control working
* cleanup + detail on neg_encoder_hidden_states
* convert edge
* pos emb for control latents
* convert all chkpts
* resolve TODOs
* remove prints
* Docs
* add siglip image reference encoder
* Add unit tests
* controlnet: add duplicate layers
* Additional tests
* skip less
* skip less
* remove image_ref
* minor
* docs
* remove skipped test in transfer
* Don't crash process
* formatting
* revert some changes
* remove skipped test
* make style
* Address comment + fix example
* CosmosAttnProcessor2_0 revert + CosmosAttnProcessor2_5 changes
* make style
* make fix-copies

…le (huggingface#12524) * drop python 3.8 * remove list, tuple, dict from typing * fold Unions into | * up * fix a bunch and please me. * up * up * up * up * up * up * enforce 3.10.0. * up * up * up * up * up * up * up * up * Update setup.py Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * up. * python 3.10. * ifx * up * up * up * up * final * up * fix typing utils. * up * up * up * up * up * up * fix * up * up * up * up * up * up * handle modern types. * up * up * fix ip adapter type checking. * up * up * up * up * up * up * up * revert docstring changes. * keep deleted files deleted. * keep deleted files deleted. --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* update create pipeline section
* update more
* update more
* more
* add a section on running pipeline modularly
* refactor update_components, remove support for spec
* style
* bullet points
* update the pipeline block
* small fix in state doc
* update sequential doc
* fix link
* small update on quickstart
* add a note on how to run pipeline without the components manager
* Apply suggestions from code review
  Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* remove the supported models mention
* update more
* up
* revert type hint changes

---------

Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal>
Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-161-123.ec2.internal>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* up * up up * update outputs * style * add modular_auto_docstring! * more auto docstring * style * up up up * more more * up * address feedbacks * add TODO in the description for empty docstring * refactor based on dhruv's feedback: remove the class method * add template method * up * up up up * apply auto docstring * make style * rmove space in make docstring * Apply suggestions from code review * revert change in z * fix * Apply style fixes * include auto-docstring check in the modular ci. (huggingface#13004) * initial support: workflow * up up * treeat loop sequential pipeline blocks as leaf * update qwen image docstring note * add workflow support for sdxl * add a test suit * add test for qwen-image * refactor flux a bit, seperate modular_blocks into modular_blocks_flux and modular_blocks_flux_kontext + support workflow * refactor flux2: seperate blocks for klein_base + workflow * qwen: remove import support for stuff other than the default blocks * add workflow support for wan * sdxl: remove some imports: * refactor z * update flux2 auto core denoise * add workflow test for z and flux2 * Apply suggestions from code review * Apply suggestions from code review * add test for flux * add workflow test for flux * add test for flux-klein * sdxl: modular_blocks.py -> modular_blocks_stable_diffusion_xl.py * style * up * add auto docstring * workflow_names -> available_workflows * fix workflow test for klein base * Apply suggestions from code review Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * fix workflow tests * qwen: edit -> image_conditioned to be consistent with flux kontext/2 such * remove Optional * update type hints * update guider update_components * fix more * update docstring auto again --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal> Co-authored-by: yiyi@huggingface.co 
<yiyi@ip-26-0-161-123.ec2.internal> Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* [Bug Fix][Qwen-Image-Edit] Fix Qwen-Image-Edit series on NPU
* Enhance NPU attention handling by converting attention mask to boolean and refining mask checks.
* Refine attention mask handling in NPU attention function to improve validation and conversion logic.
* Clean Code
* Refine attention mask processing in NPU attention functions to enhance performance and validation.
* Remove item() ops on npu fa backend.
* Reuse NPU attention mask by `_maybe_modify_attn_mask_npu`
* Apply style fixes
* Update src/diffusers/models/attention_dispatch.py

---------

Co-authored-by: zhangtao <zhangtao529@huawei.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

…12777)

* split tensors inside the transformer blocks to avoid checkpointing issues
* clean up, fix type hints
* fix merge error
* Apply style fixes

---------

Co-authored-by: s <you@example.com>
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

…ed (huggingface#13149)

* Pin setuptools version for dependencies which explicitly depend on pkg_resources
* Revert setuptools pin as k-diffusion pipelines are now deprecated

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

…ize_gguf_tensor (huggingface#13166)

[gguf] Convert to plain tensor earlier in dequantize_gguf_tensor

Once dequantize_gguf_tensor fetches the quant_type attribute from the GGUFParameter tensor subclass, there is no further need to run the actual dequantize operations on the tensor subclass, so we can convert to a plain tensor right away. This not only makes PyTorch eager mode faster, but also reduces torch.compile tracer compile time from 36 seconds to 10 seconds, because there is a lot less code to trace now.

* switch to transformers main again./ * more * up * up * fix group offloading. * attributes * up * up * tie embedding issue. * fix t5 stuff for more. * matrix configuration to see differences between 4.57.3 and main failures. * change qwen expected slice because of how init is handled in v5. * same stuff. * up * up * Revert "up" This reverts commit 515dd06. * Revert "up" This reverts commit 5274ffd. * up * up * fix with peft_format. * just keep main for easier debugging. * remove torchvision. * empty * up * up with skyreelsv2 fixes. * fix skyreels type annotation. * up * up * fix variant loading issues. * more fixes. * fix dduf * fix * fix * fix * more fixes * fixes * up * up * fix dduf test * up * more * update * hopefully ,final? * one last breath * always install from main * up * audioldm tests * up * fix PRX tests. * up * kandinsky fixes * qwen fixes. * prx * hidream

…gface#13060)

* fix: graceful fallback when attention backends fail to import

## Problem

External attention backends (flash_attn, xformers, sageattention, etc.) may be installed but fail to import at runtime due to ABI mismatches. For example, when `flash_attn` is compiled against PyTorch 2.4 but used with PyTorch 2.8, the import fails with:

```
OSError: .../flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEab
```

The current code uses `importlib.util.find_spec()` to check if packages exist, but this only verifies the package is installed, not that it can actually be imported. When the import fails, diffusers crashes instead of falling back to native PyTorch attention.

## Solution

Wrap all external attention backend imports in try-except blocks that catch `ImportError` and `OSError`. On failure:

1. Log a warning message explaining the issue
2. Set the corresponding `_CAN_USE_*` flag to `False`
3. Set the imported functions to `None`

This allows diffusers to gracefully degrade to PyTorch's native SDPA (scaled_dot_product_attention) instead of crashing.

## Affected backends

- flash_attn (Flash Attention)
- flash_attn_3 (Flash Attention 3)
- aiter (AMD Instinct)
- sageattention (SageAttention)
- flex_attention (PyTorch Flex Attention)
- torch_npu (Huawei NPU)
- torch_xla (TPU/XLA)
- xformers (Meta xFormers)

## Testing

Tested with PyTorch 2.8.0 and flash_attn 2.7.4.post1 (compiled for PyTorch 2.4). Before: crashes on import. After: logs warning and uses native attention.

* address review: use single logger and catch RuntimeError

- Move logger to module level instead of creating per-backend loggers
- Add RuntimeError to exception list alongside ImportError and OSError

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Apply style fixes

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
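The fallback pattern this commit message describes can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the helper name `try_import_backend` is hypothetical, and the standard-library module `math` stands in for a real attention backend so the example runs anywhere.

```python
import importlib
import logging

# Single module-level logger, as the review follow-up in the commit suggests.
logger = logging.getLogger(__name__)


def try_import_backend(module_name: str, attr: str):
    """Hypothetical helper: return (can_use, fn) for an optional backend.

    On any import-time failure, log a warning and return (False, None)
    so the caller can fall back to a native implementation.
    """
    try:
        module = importlib.import_module(module_name)
        if not hasattr(module, attr):
            # Mirror `from module import attr`, which raises ImportError.
            raise ImportError(f"cannot import name {attr!r} from {module_name!r}")
        return True, getattr(module, attr)
    except (ImportError, OSError, RuntimeError) as exc:
        # OSError covers shared-library load failures such as ABI mismatches.
        logger.warning("Backend %s is unavailable, falling back: %s", module_name, exc)
        return False, None


# `math.sqrt` stands in for a backend function that imports successfully.
CAN_USE_DEMO, demo_fn = try_import_backend("math", "sqrt")

# A package that is not installed yields the fallback values.
CAN_USE_MISSING, missing_fn = try_import_backend("backend_that_is_not_installed_xyz", "attn")
```

The point of actually importing, rather than only calling `importlib.util.find_spec()`, is that a package can be present on disk and still fail to load; the flag and the `None` placeholder let callers check availability once and pick the native path.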

miguelmartin75 force-pushed the cosmos/transfer2.5-ar branch 2 times, most recently from 3a26e3e to 743091e on February 10, 2026 02:59

DN6 and others added 6 commits February 10, 2026 12:19

[Docs] Add guide for AutoModel with custom code (huggingface#13099)

bedc67c

update

[SkyReelsV2] Fix ftfy import (huggingface#13113)

5bf248d

fix

[lora] fix non-diffusers lora key handling for flux2 (huggingface#13119)

4d00980

fix non-diffusers lora key handling for flux2

[CI] Refactor Wan Model Tests (huggingface#13082)

c3a4cd1

* update * update * update * update * update * update * update * update

docs: improve docstring scheduling_edm_dpmsolver_multistep.py (huggin…

64e2adf

…gface#13122) Improve docstring scheduling edm dpmsolver multistep

miguelmartin75 force-pushed the cosmos/transfer2.5 branch 3 times, most recently from ddec8fb to 4bbedfb on February 11, 2026 23:56

miguelmartin75 force-pushed the cosmos/transfer2.5-ar branch from 743091e to 55834ed on February 12, 2026 00:13

delmalih and others added 5 commits February 11, 2026 16:39

docs: improve docstring scheduling_flow_match_euler_discrete.py (hugg…

06a0f98

…ingface#13127) Improve docstring scheduling flow match euler discrete

[modular] add tests for robust model loading. (huggingface#13120)

ed77a24

* add tests for robust model loading. * apply review feedback.

Fix LTX-2 Inference when num_videos_per_prompt > 1 and CFG is Enabl…

985d83c

…ed (huggingface#13121) Fix LTX-2 inference when num_videos_per_prompt > 1 and CFG is enabled

[CI] Fix setuptools pkg_resources Errors (huggingface#13129)

427472e

Try to fix setuptools pkg_resources issue on CI

miguelmartin75 force-pushed the cosmos/transfer2.5-ar branch 2 times, most recently from 0d0eeae to bff6af9 on February 12, 2026 22:24

delmalih and others added 11 commits February 12, 2026 14:32

docs: improve docstring scheduling_flow_match_heun_discrete.py (huggi…

5f3ea22

…ngface#13130) Improve docstring scheduling flow match heun discrete

[CI] Fix setuptools pkg_resources Bug for PR GPU Tests (huggingfa…

277e305

…ce#13132) Try to fix setuptools pkg_resources error for PR GPU test workflow

fix cosmos transformer typing. (huggingface#13134)

76af013

feat: implement apply_lora_scale to remove boilerplate. (huggingface#…

8abcf35

…12994) * feat: implement apply_lora_scale to remove boilerplate. * apply to the rest. * up * remove more. * remove. * fix * apply feedback.

[docs] fix ltx2 i2v docstring. (huggingface#13135)

3c1c62e

* fix ltx2 i2v docstring. * up

[Modular] add different pipeine blocks to init (huggingface#13145)

6141ae2

* up * style + copies * fix --------- Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal>

fix MT5Tokenizer (huggingface#13146)

5b00a18

up

fix guider (huggingface#13147)

19ab0ec

fix

asomoza and others added 12 commits February 15, 2026 11:36

[LTX2] Fix wrong lora mixin (huggingface#13144)

b0dc51d

change lora mixin Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

[Pipelines] Remove k-diffusion (huggingface#13152)

59e7a46

* remove k-diffusion * fix copies

[tests] accept recompile_limit from the user in tests (huggingface#13150

e390646

) accept recompile_limit from the user in tests

[core] support device type device_maps to work with offloading. (hugg…

35086ac

…ingface#12811) * support device type device_maps to work with offloading. * add tests. * fix tests * skip tests where it's not supported. * empty * up * up * fix allegro.

[CI] Add ftfy as a test dependency (huggingface#13155)

f81e653

* update * update * update * update * update * update

docs: improve docstring scheduling_flow_match_lcm.py (huggingface#13160)

64734b2

Improve docstring scheduling flow match lcm

[docs] add docs for qwenimagelayered (huggingface#13158)

6875490

* add example * feedback

Fix ftfy import for PRX Pipeline (huggingface#13154)

fe78a7b

* Guard ftfy import with is_ftfy_available * Remove xfail for PRX pipeline tests as they appear to work on transformers>4.57.1 * make style and make quality

[core] Enable CP for kernels-based attention backends (huggingface#12812

99daaa8

) * up * up * up * up --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

miguelmartin75 force-pushed the cosmos/transfer2.5-ar branch from 55b09c0 to 16602ad on February 20, 2026 02:43

sayakpaul and others added 12 commits February 20, 2026 08:35

remove deps related to test from ci (huggingface#13164)

f8d3db9

[CI] Fix new LoRAHotswap tests (huggingface#13163)

db2d7e7

update Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

Support Flux Klein peft (fal) lora format (huggingface#13169)

a80b192

peft (fal) lora format

Fix T5GemmaEncoder loading for transformers 5.x composite T5GemmaConf…

f1e5914

…ig (huggingface#13143)

Allow Automodel to use from_config with custom code. (huggingface#1…

4890e9b

…3123) * update * update

Fix AutoModel typing Import Error (huggingface#13178)

7ab2011

Fix typing import by converting to Python 3.9+ style type hint

[docs] Fix torchrun command argument order in docs (huggingface#13181)

aac94be

Fix torchrun command argument order in docs

AR

734f045

address comments

03b666a

miguelmartin75 force-pushed the cosmos/transfer2.5-ar branch from 16602ad to 03b666a on February 24, 2026 23:59

address comments 2

a66a12a

miguelmartin75 wants to merge 47 commits into cosmos/transfer2.5 from cosmos/transfer2.5-ar

14 participants
