Co-authored-by: Yang Yong (雍洋) <yongyang1030@163.com>
Co-authored-by: qinxinyi <qxy118045534@163.com>
Co-authored-by: yihuiwen <yihuiwen@sensetime.com>
Co-authored-by: qinxinyi <qxy118045534@163.com>
Co-authored-by: gushiqiao <975033167@qq.com>
Feature:
1. Added MLU590 bfloat16 inference, single-GPU and multi-GPU.
2. Added MLU590 int8 inference.
Thanks to HunyuanVideo Team and ModelTC Team.
---------
Co-authored-by: gushiqiao <975033167@qq.com>
Co-authored-by: gushiqiao <77222802+gushiqiao@users.noreply.github.com>
Co-authored-by: chendingyu <chendingyu1@sensetime.com>
Co-authored-by: XHPlus <xhplus@163.com>
Co-authored-by: wangshankun <wangshankun2011@hotmail.com>
Co-authored-by: STwangyingrui <86730325+STwangyingrui@users.noreply.github.com>
Co-authored-by: root <root@pt-80f094c20fc44a8cad096e5f3dbc962e-worker-0.pt-80f094c20fc44a8cad096e5f3dbc962e.ns-devsft-3460edd0.svc.cluster.local>
Added new model links and recommendations for lightweight autoencoders.
Unified `--linear_dtype` and `--linear_quant_dtype` into a single `--linear_type` option.
Updated README_zh.md with new features and model support.
### Single GPU
```bash
python examples/simple_launch.py
```
```python
# examples/simple_launch.py
from lightx2v import LightGenerator

generator = LightGenerator(
    model_path="/path/to/Wan2.1-T2V-1.3B",
    model_cls="wan2.1",
    task="t2v",
)

video_path = generator.generate(
    prompt="Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage.",
    negative_prompt="镜头晃动,色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走",
    seed=42,
    save_result_path="output.mp4",
)
```
### Multi-GPU
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
torchrun --nproc_per_node=8 examples/multi_launch.py
```
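For context, `torchrun` starts one process per GPU and tells each process its position through environment variables. The sketch below is not the contents of `examples/multi_launch.py`; it only illustrates, with a hypothetical `read_torchrun_env` helper, how a launched script can read `RANK`, `LOCAL_RANK`, and `WORLD_SIZE`.

```python
import os


def read_torchrun_env(env=None):
    """Return (rank, local_rank, world_size) as exported by torchrun.

    Hypothetical helper for illustration. It falls back to single-process
    values when the variables are absent, so the same script also runs
    under plain `python`.
    """
    env = os.environ if env is None else env
    rank = int(env.get("RANK", 0))
    local_rank = int(env.get("LOCAL_RANK", 0))
    world_size = int(env.get("WORLD_SIZE", 1))
    return rank, local_rank, world_size


if __name__ == "__main__":
    rank, local_rank, world_size = read_torchrun_env()
    print(f"rank={rank} local_rank={local_rank} world_size={world_size}")
```

With `--nproc_per_node=8` on a single node, each of the eight processes would see a different `LOCAL_RANK` and the same `WORLD_SIZE`.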
---------
Co-authored-by: gushiqiao <975033167@qq.com>
ring_attn: fp8_comm & kv_fusion
---------
Co-authored-by: root <root@pt-72be2ccd01a14fa18a4b18c6c347f823-worker-0.pt-72be2ccd01a14fa18a4b18c6c347f823.ns-devsft-3460edd0.svc.cluster.local>
use_kv_fusion → use_tensor_fusion
---------
Co-authored-by: root <root@pt-72be2ccd01a14fa18a4b18c6c347f823-worker-0.pt-72be2ccd01a14fa18a4b18c6c347f823.ns-devsft-3460edd0.svc.cluster.local>
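Configs written before this rename would still carry the old `use_kv_fusion` key. A minimal sketch of how such a config could be brought forward, assuming configs are plain dictionaries; `migrate_fusion_flag` is a hypothetical helper, not something this commit is known to ship.

```python
def migrate_fusion_flag(config):
    """Return a copy of `config` with `use_kv_fusion` renamed to `use_tensor_fusion`.

    Hypothetical helper for illustration. If both keys are present, the
    value already stored under the new key wins.
    """
    migrated = dict(config)
    if "use_kv_fusion" in migrated:
        old_value = migrated.pop("use_kv_fusion")
        migrated.setdefault("use_tensor_fusion", old_value)
    return migrated


if __name__ == "__main__":
    print(migrate_fusion_flag({"use_kv_fusion": True, "fp8_comm": True}))
```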
# kernel-base text encoder
Integrates the optimized operators from sgl_kernel and enables flash attention.
- flash attention3
- Rmsnorm: use sgl_kernel: from sgl_kernel.elementwise import rmsnorm

# service text encoder
Uses disaggregated deployment: multiple inference processes can share the same encoder service, and the service can handle concurrent requests.
- Triton autotuning: LIGHTLLM_TRITON_AUTOTUNE_LEVEL=1
- lightllm integrates optimized operators such as flash attention3 and rmsnorm

```
================================================================================
COMPARISON SUMMARY
================================================================================
Encoder             | Time (ms) | Speedup | Cosine Sim | End-to-end accuracy
--------------------------------------------------------------------------------
Baseline (HF)       | 92.17     | 1.00x   | 1.0000     | PASS
Kernel (Flash-2)    | 81.23     | 1.13x   | 0.9900     | PASS
Service (Optimized) | 71.21     | 1.29x   | 0.9492     | PASS
================================================================================
```

> The table above compares pure inference time only. Service mode also incurs network communication overhead (about 5 ms), service router overhead, and similar costs.
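The Speedup and Cosine Sim columns can be reproduced roughly as follows. This is a minimal sketch, not the benchmark script behind the table: `speedup` and `cosine_similarity` are hypothetical helpers, and it assumes speedup means baseline time divided by measured time and that the similarity is taken between flattened encoder outputs.

```python
import math


def speedup(baseline_ms, measured_ms):
    """Speedup relative to the baseline (assumed: baseline time / measured time)."""
    return baseline_ms / measured_ms


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    baseline_ms = 92.17  # Baseline (HF) row
    for name, ms in [("Kernel (Flash-2)", 81.23), ("Service (Optimized)", 71.21)]:
        print(f"{name}: {speedup(baseline_ms, ms):.2f}x")
```

Adding the roughly 5 ms of network overhead to the service timing gives `speedup(92.17, 71.21 + 5)`, which is a closer estimate of what a client would observe, before router overhead.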
…ModelTC#791)
Co-authored-by: gushiqiao <975033167>
Co-authored-by: gushiqiao <975033167>
Co-authored-by: gushiqiao <975033167>
ModelTC#810)
…e tiling
---------
Co-authored-by: gushiqiao <975033167>
…TC#776)
`input_info.return_result_tensor` was ignored for all image generation. Outputting the tensor can be useful for post-processing (such as NSFW checking), without reloading the file from disk.

I noticed that video models do not return the tensor directly; they return a map of {"video": tensor} [here](https://github.com/ModelTC/LightX2V/blob/38f9ac0513d0a097df1dd49e95ec4cc73ec426cb/lightx2v/models/runners/default_runner.py#L445). I believe this is for compatibility with ComfyUI. If that's the case, we should only return the tensor and move ComfyUI-specific patterns to the ComfyUI wrapper codebase. What do you think?
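To make the inconsistency concrete: a caller that wants the raw output currently has to handle both a bare tensor and a `{"video": tensor}` map. A minimal sketch of that normalization, using a hypothetical `unwrap_result` helper that is not part of LightX2V; only the `"video"` key comes from the linked runner code.

```python
def unwrap_result(result, key="video"):
    """Return the payload whether `result` is a bare tensor or a {key: tensor} map.

    Hypothetical helper for illustration; any object stands in for the tensor.
    """
    if isinstance(result, dict):
        return result[key]
    return result


if __name__ == "__main__":
    frames = [0.1, 0.2, 0.3]  # stand-in for a tensor
    assert unwrap_result({"video": frames}) is frames
    assert unwrap_result(frames) is frames
```

If runners returned the tensor directly, as proposed, a shim like this would live only in the ComfyUI wrapper.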
Co-authored-by: yihuiwen <yihuiwen@sensetime.com>