Add debug log for troubleshooting #1

Closed
rainyfly wants to merge 2181 commits into develop from test_cc_for_commit

Conversation

@rainyfly
Owner

Summary

  • Add a debug log to the `post_process_normal` function for troubleshooting

Test plan

  • Verify that the debug log is output correctly
  • Confirm that existing functionality is unaffected
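
The two test-plan items can be checked together. The sketch below is only an illustration using Python's standard `logging` module: the body and signature of `post_process_normal`, the logger name, and the `ListHandler` helper are all hypothetical, since this PR does not show the real function.

```python
import logging

logger = logging.getLogger("fastdeploy.debug_sketch")  # hypothetical logger name


def post_process_normal(token_ids):
    """Hypothetical stand-in; the real signature is not shown in this PR."""
    # Debug log added for troubleshooting. It must not change the return value.
    logger.debug("post_process_normal called with %d token ids", len(token_ids))
    return [t for t in token_ids if t >= 0]


class ListHandler(logging.Handler):
    """Collects log messages in memory so a test can inspect them."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

# The result covers "existing functionality is unaffected";
# handler.messages covers "the debug log is output".
result = post_process_normal([3, -1, 7])
```

Using `logger.debug` rather than `print` keeps the extra output off by default and lets it be enabled per logger when a problem needs investigating.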

🤖 Generated with Claude Code

ZhangYulongg and others added 30 commits February 28, 2026 14:54
…dlePaddle#6407)

* Optim GPU Mem Usage

---------

Co-authored-by: huzesen <huzesen@baidu.com>
…addlePaddle#6541)

* fix mtp acceptance rate decline

* [BugFix][Scheduler] Fix can_schedule_block_num_threshold calculation

Fix the calculation of can_schedule_block_num_threshold in
ResourceManagerV1. The original formula using need_prefill_tokens
could lead to incorrect threshold values. Now directly use
num_chunk_new_block for accurate block scheduling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* [Docs] Update code overview documentation

- Add comprehensive FastDeploy code structure overview
- Include detailed module descriptions and development guides
- Add quick development guide for common tasks
- Update both English and Chinese versions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [Docs] Update code overview documentation format

- Convert file path links from [file](path) to `file` inline code format
- Add proper spacing for better readability in markdown tables
- Maintain consistent formatting across English and Chinese docs
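
One way such a link-to-inline-code conversion could be scripted is sketched below. This is not the tool used for the commit (the commit does not say how the change was made); the regular expression and the helper name `links_to_inline_code` are assumptions for illustration.

```python
import re

# Matches a Markdown link such as [config.py](fastdeploy/config.py),
# but not an image, which starts with "![".
LINK_RE = re.compile(r"(?<!!)\[([^\]]+)\]\(([^)\s]+)\)")


def links_to_inline_code(text: str) -> str:
    """Replace each [label](target) link with `label` in inline code format."""
    return LINK_RE.sub(lambda m: f"`{m.group(1)}`", text)


row = "| [config.py](fastdeploy/config.py) | Configuration classes |"
converted = links_to_inline_code(row)
print(converted)
```

The sketch rewrites every link it finds, so a real pass over the docs would need an extra check to leave external URLs untouched.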

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…e/fused_moe_cutlass_backend.py unit test additions (PaddlePaddle#6209)

* [CI] 【Hackathon 10th Spring No.24】Unit test additions for functional module fastdeploy/model_executor/layers/moe/fused_moe_triton_backend.py

* [CI] 【Hackathon 10th Spring No.23】Unit test additions for fastdeploy/model_executor/layers/moe/fused_moe_cutlass_backend.py

* [CI] 【Hackathon 10th Spring No.23】Unit test additions for fastdeploy/model_executor/layers/moe/fused_moe_cutlass_backend.py

* Merge branch 'develop' into 23

* Merge branch 'develop' into 23

* Merge branch 'develop' into 23

* Merge branch 'develop' into 23

---------

Co-authored-by: Jiaxin Sui <95567040+plusNew001@users.noreply.github.com>
…_fp8 and no storage backend (PaddlePaddle#6516)

* [fix] fix cache transfer manager init failed when using block_wise_fp8 and no storage backend

* [fix] fix test_cache_transfer_manager

* [fix] fix test_cache_transfer_manager again

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
* lazy enable_torch_proxy for cutlass

* test init_flash_attn_version
…ocation errors (PaddlePaddle#6531)

* [BugFix] Add safety checks in recycle_gpu_blocks to prevent block allocation errors

- Check prefix tree status before recycling GPU blocks
- Validate gpu_block_ids is a list
- Add overflow check to prevent free block count exceeding total blocks
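
The list-type and overflow checks above can be pictured with a minimal sketch. `BlockPool` is a hypothetical, simplified free list and not the real cache manager; raising `TypeError` and `ValueError` is an assumption, since the commit message does not say how the real code reacts. The prefix tree status check is left out because its semantics are not described here.

```python
class BlockPool:
    """Hypothetical, simplified free list used only to illustrate the checks."""

    def __init__(self, total_blocks: int):
        self.total_blocks = total_blocks
        self.free_blocks = list(range(total_blocks))

    def allocate(self, n: int) -> list:
        if n > len(self.free_blocks):
            raise RuntimeError("not enough free blocks")
        taken, self.free_blocks = self.free_blocks[:n], self.free_blocks[n:]
        return taken

    def recycle_gpu_blocks(self, gpu_block_ids) -> None:
        # Check 1: the ids must arrive as a list (assumed reaction: TypeError).
        if not isinstance(gpu_block_ids, list):
            raise TypeError("gpu_block_ids must be a list")
        # Check 2: recycling must never make the free count exceed the total.
        if len(self.free_blocks) + len(gpu_block_ids) > self.total_blocks:
            raise ValueError("free block count would exceed total blocks")
        self.free_blocks.extend(gpu_block_ids)


pool = BlockPool(total_blocks=4)
taken = pool.allocate(2)
pool.recycle_gpu_blocks(taken)  # back to 4 free blocks
```

Recycling the same ids a second time would now trip the overflow check instead of silently corrupting the free list.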

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [BugFix] Fix AttributeError in recycle_gpu_blocks when prefix_tree_status_signal not initialized

- Add hasattr check before accessing prefix_tree_status_signal
- The signal is only initialized in launch_cache_messager, not in __init__
- Fixes CI test failure in test_prefix_cache_manager.py
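
The pattern behind this fix, guarding an attribute that only exists after a later initialization step, might look like the following. `CacheManagerSketch`, the `"NORMAL"` value, and the return value of `recycle_gpu_blocks` are hypothetical; only the attribute and method names come from the commit message.

```python
class CacheManagerSketch:
    """Hypothetical stand-in showing the hasattr guard."""

    def __init__(self):
        self.recycled = []
        # prefix_tree_status_signal is deliberately NOT created here.

    def launch_cache_messager(self):
        # The signal only comes into existence at launch time.
        self.prefix_tree_status_signal = "NORMAL"  # hypothetical value

    def recycle_gpu_blocks(self, gpu_block_ids):
        # Guard: accessing the attribute directly would raise AttributeError
        # when launch_cache_messager() has not been called yet.
        if hasattr(self, "prefix_tree_status_signal"):
            status = self.prefix_tree_status_signal
        else:
            status = None
        self.recycled.extend(gpu_block_ids)
        return status


manager = CacheManagerSketch()
before_launch = manager.recycle_gpu_blocks([1])
manager.launch_cache_messager()
after_launch = manager.recycle_gpu_blocks([2])
```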

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [BugFix] Reset prefix cache when model weights are updating

- Call self.reset() before setting status to NORMAL in UPDATING state
- Ensure cache consistency when model weights change
- Consistent with CLEARING state handling

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…ransfer benchmark tool (PaddlePaddle#6434)

* fix cache messager performance problem

* dispatch param type
…client.py unit test additions (PaddlePaddle#6158)

* fix codestyle and update unit test coverage workflow

* fix test_engine_client.py: add main_process_metrics mock to prevent KeyError

* fix test_engine_client.py: comprehensive test improvements

* feat: enhance test_engine_client.py with comprehensive test improvements

* fix: resolve test failures in test_engine_client.py

* test: enhance EngineClient test coverage with comprehensive test suite

* test: add comprehensive EngineClient test suite (codestyle checked)
* [BugFix] Fix mtp when token_ids_all is None

* fix bug
…x_cache_manager.py unit test additions (PaddlePaddle#6297)

* test: update prefix cache manager tests

* test: refine prefix cache manager coverage helpers

* style: apply black formatting to test_prefix_cache_manager.py

Co-authored-by: Cursor <cursoragent@cursor.com>

* tests: update test_prefix_cache_manager

Co-authored-by: Cursor <cursoragent@cursor.com>

* update

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
…lePaddle#6501)

* add speculate_pre_process kernel

* reduce one slice

* make d2h async && fix mtp bug for new pre_process

* fix

* add unittest

* fix: code style formatting

* fix

* fix: thread race in speculate_preprocess && rename d2h event
huicongyao and others added 29 commits March 25, 2026 22:54
…addle#6812)

* reformat eagle_get_hidden_states & eagle_get_self_hidden_states

* readability

* fix xpu bug

* fix coverage failure

* change launch params & parallelize position_map compute

* Fix MTP-related bugs in FastDeploy centralized inference

* fix

* refactor mtp hidden_states process

* fix

* add unittest & optimize kernel

* remove useless code

* fix
* [CE]add 21b cpu cache ,glm mtp,glm for rl config

* [CE]add 21b tp2 yaml

* [CE]add 21b mooncake yaml

* add fastdeploy benchmark,paddletest-155

* [CE] adjust vl wint4 config

* [CE]add glm mtp with updatemodel config

* [CE]fix

* fix

* test

* test

* test

---------

Co-authored-by: xiegegege <>
* fix xpu ci bug

* Remove unnecessary blank line in conftest.py

* Update upload-artifact action to version 6

* Update _xpu_8cards_case_test.yml

* fix ci bug

* Change exit code on test failure to 1

* fix ci bug

* fix ci bug

* fix ci bug

* fix ci bug

* Update conftest.py
* [CI]【Hackathon 10th Spring No.43】ernie4_5_mtp unit test additions

* [CI]【Hackathon 10th Spring No.43】add mapping and forward branch coverage

---------

Co-authored-by: cloudforge1 <cloudforge1@users.noreply.github.com>
Co-authored-by: CSWYF3634076 <wangyafeng@baidu.com>
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
* [BugFix] xpu fix speculate schedule cache kernel

* fix code style
…Paddle#7050)

Removed the --ipc=host option from the docker run command.
…addlePaddle#7048)

* [Feature] Support --skip-mm-profiling to skip multimodal token overhead in profiling

## Motivation

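When deploying multimodal models (such as Qwen2.5-VL and ERNIE4.5-VL), `get_max_chunk_tokens`
adds the mm token count on top of the base token count to reserve GPU memory during the profiling stage.

In some scenarios (for example, when the image token count is known to be small, or when saving GPU memory
is a priority), users want to skip this extra multimodal token overhead and profile with the text token count directly.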

## Modifications

- `fastdeploy/engine/args_utils.py`: add a `skip_mm_profiling: bool = False` field to `EngineArgs`,
  and add a `--skip-mm-profiling` launch argument to the parser
- `fastdeploy/config.py`: add `self.skip_mm_profiling = False` to `ModelConfig.__init__`;
  add a `not self.model_config.skip_mm_profiling` check to `FDConfig.get_max_chunk_tokens`,
  so that when the flag is enabled the mm token addition is skipped and the base `num_tokens` is returned directly
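
A boolean launch flag backed by a dataclass field could be wired up roughly as follows, using the standard `argparse` and `dataclasses` modules. `EngineArgsSketch` and the help text are hypothetical; only the field name, its default, and the flag name come from the list above.

```python
import argparse
from dataclasses import dataclass


@dataclass
class EngineArgsSketch:
    """Hypothetical stand-in for EngineArgs with the one new field."""

    skip_mm_profiling: bool = False


parser = argparse.ArgumentParser()
parser.add_argument(
    "--skip-mm-profiling",
    action="store_true",  # flag is False unless passed on the command line
    default=False,
    help="Skip the multimodal token overhead during profiling.",
)

# argparse maps --skip-mm-profiling to the attribute skip_mm_profiling,
# so the parsed namespace can be unpacked straight into the dataclass.
default_args = EngineArgsSketch(**vars(parser.parse_args([])))
flag_args = EngineArgsSketch(**vars(parser.parse_args(["--skip-mm-profiling"])))
```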

## Usage or Command

Add the argument when starting the service:
```bash
--skip-mm-profiling
```

## Checklist

- [x] Add at least a tag in the PR title.
- [x] Format your code, run `pre-commit` before commit.
- [ ] Add unit tests. This feature only passes a configuration parameter through and the logic is simple; existing config unit tests already cover it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* [Refactor] Replace skip_mm_profiling with deploy_modality=text to skip mm profiling

## Motivation

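The original `--skip-mm-profiling` argument overlaps semantically with the existing `deploy_modality` argument:
when deploying in text-only mode (`deploy_modality=text`), there is no need to reserve GPU memory for multimodal tokens in the first place.
Introducing a separate argument adds configuration complexity; reusing `deploy_modality` is more intuitive and consistent.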

## Modifications

- `fastdeploy/engine/args_utils.py`: remove the `EngineArgs.skip_mm_profiling` field and the
  `--skip-mm-profiling` launch argument
- `fastdeploy/config.py`: remove `self.skip_mm_profiling = False` from `ModelConfig.__init__`;
  in `FDConfig.get_max_chunk_tokens`, change the condition to
  `self.deploy_modality != DeployModality.TEXT`,
  so that when deploy_modality is text, `max_num_batched_tokens` is returned directly and the mm token addition is skipped
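
The refactored condition can be sketched as a free function. This is a simplified illustration and not the real `FDConfig.get_max_chunk_tokens`: the `MIXED` enum member and the `mm_max_tokens` parameter are hypothetical names, and the real method computes the mm token count internally in a way this PR does not show.

```python
from enum import Enum


class DeployModality(Enum):
    TEXT = "text"
    MIXED = "mixed"  # hypothetical member; only TEXT is named in the commit


def get_max_chunk_tokens(deploy_modality, max_num_batched_tokens, mm_max_tokens):
    """Return the token count to reserve memory for during profiling."""
    num_tokens = max_num_batched_tokens
    # Only non-text deployments pay the extra multimodal token overhead.
    if deploy_modality != DeployModality.TEXT:
        num_tokens += mm_max_tokens
    return num_tokens


text_budget = get_max_chunk_tokens(DeployModality.TEXT, 8192, 1024)
mixed_budget = get_max_chunk_tokens(DeployModality.MIXED, 8192, 1024)
```

Keying the decision on the existing modality setting means there is one fewer flag that can disagree with the deployment mode.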

## Usage or Command

```bash
# Deploy in text mode, skipping the mm token profiling overhead (replaces the former --skip-mm-profiling)
python -m fastdeploy.entrypoints.openai.api_server \
  --deploy-modality text \
  --model /path/to/model \
  ...
```

## Checklist

- [x] Add at least a tag in the PR title.
- [x] Format your code, run `pre-commit` before commit.
- [ ] Add unit tests. This change is a parameter refactor with a logically equivalent replacement; existing config unit tests already cover it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* add cute cpp fa4

* Remove comments

* Fix merge error

* Move sm_version inside the function

* CI error
…end (PaddlePaddle#7028)

* [BugFix] Fix kv cache int8 dynamic quant on flash and flash_mask backend

* add constexpr and code style clean

* add test

* fix code style

* fix test
* merge text processor

* update

* fix unit test

* merge messages2ids

* fix unit test

* Remove duplicate code

* remove redundant code

* delete code

* fix unit test
…xtra_keys (PaddlePaddle#6929)

* [BugFix][KVCache] Fix mm hash boundary comparison in get_block_hash_extra_keys

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [BugFix][KVCache] Fix test_get_block_hash_extra_keys_boundary_cases assertions

## Motivation

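In the test case `test_get_block_hash_extra_keys_boundary_cases`, the call for Block [4,8)
incorrectly passed `mm_idx=1`, which skipped img0[2,5); but img0 covers token 4, and token 4
belongs to block [4,8), so it should be included in hash_keys. In addition, every assertEqual checked only
hash_keys and did not check the returned mm_idx cursor.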

## Modifications

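- `test_get_block_hash_extra_keys_boundary_cases`:
  - Switch to chained calls, using the mm_idx returned by the previous call as the input to the next, to mimic the real call loop
  - Change the Block [4,8) input from `mm_idx=1` to the previously returned `mm_idx=0`, and the expected value from `[]` to `["hash-0"]`
  - Change all assertions to `assertEqual((mm_idx, hash_keys), (...))` so the cursor is checked as well
- `test_get_block_hash_extra_keys_no_overlap_at_boundaries`:
  - Change the Case B input from `mm_idx=1` to `mm_idx=0` (traverse from the start; img-a takes the continue branch)
  - Add mm_idx checks to all assertions
- `test_get_block_hash_extra_keys_image_crosses_block_boundary`:
  - Add mm_idx checks to all assertions
- `test_get_block_hash_extra_keys_no_mm_inputs`:
  - Add an mm_idx check to the assertion
- `test_get_block_hash_extra_keys_handles_multimodal_segments`:
  - Add mm_idx checks to the call2 and call3 assertions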
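
The chained-call pattern and the [4,8) boundary case can be illustrated with a simplified stand-in. The function below is not the real `get_block_hash_extra_keys`: its parameter list, its use of half-open token ranges sorted by start, and the rule for when the cursor advances are assumptions chosen to match the behaviour described above.

```python
def get_block_hash_extra_keys(mm_positions, mm_hashes, start, end, mm_idx):
    """Return (next_mm_idx, hash_keys) for the block covering tokens [start, end).

    Assumes mm_positions holds sorted, non-overlapping half-open (start, end) ranges.
    """
    hash_keys = []
    next_idx = mm_idx
    for i in range(mm_idx, len(mm_positions)):
        img_start, img_end = mm_positions[i]
        if img_start >= end:
            break  # this image and all later ones begin after the block
        if img_end > start:
            hash_keys.append(mm_hashes[i])  # the two half-open ranges overlap
        if img_end <= end:
            next_idx = i + 1  # image ends inside this block; later blocks skip it
    return next_idx, hash_keys


positions = [(2, 5)]  # img0 covers tokens 2, 3 and 4
hashes = ["hash-0"]

# Chained calls: each call starts from the cursor the previous one returned.
call1 = get_block_hash_extra_keys(positions, hashes, 0, 4, 0)
call2 = get_block_hash_extra_keys(positions, hashes, 4, 8, call1[0])
call3 = get_block_hash_extra_keys(positions, hashes, 8, 12, call2[0])
```

Because img0 still covers token 4, the cursor returned for block [0,4) stays at 0, and block [4,8) picks up `"hash-0"` again. Passing `mm_idx=1` by hand would have skipped it.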

## Usage or Command

```bash
python -m pytest tests/cache_manager/test_prefix_cache_manager.py::TestPrefixCacheManagerCoverage -v -k "get_block_hash_extra_keys"
```

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: chengyanfu <chengyanfu@baidu.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Added debug print statement in post_process_normal function for troubleshooting purposes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rainyfly closed this Mar 30, 2026