
[Hackathon 10th Spring No.50] [RFC] Add MiniCPM4.1-8B model design document #1251

Open
cloudforge1 wants to merge 3 commits into PaddlePaddle:master from cloudforge1:task/050-rfc-minicpm41



@cloudforge1
Contributor

Overview

This RFC describes the design for adding MiniCPM4.1-8B inference support to FastDeploy.

Core strategy

Main contents

  1. Model definition code minicpm4.py (~400-500 lines)
  2. Weight mapping (HuggingFace → FastDeploy)
  3. WINT2/WINT4/WINT8/FP8 quantization support
  4. Estimated completion time: about 1.5 weeks
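
Item 2 (weight mapping) can be pictured with a minimal sketch. It assumes both sides expose weights as flat name-to-tensor dictionaries and that the HuggingFace checkpoint uses Llama-style key names. The target key names and the rename rules below are hypothetical placeholders for illustration; they are not taken from the RFC or from FastDeploy.

```python
import re

# Hypothetical rename rules: (HuggingFace-style pattern, target template).
# Both the source patterns and the target names are assumptions made for
# illustration; the real mapping is whatever the RFC and FastDeploy define.
RENAME_RULES = [
    (r"^model\.embed_tokens\.weight$", "embed_tokens.weight"),
    (r"^model\.layers\.(\d+)\.self_attn\.(q|k|v|o)_proj\.weight$",
     r"layers.\1.attn.\2_proj.weight"),
    (r"^model\.norm\.weight$", "norm.weight"),
]


def map_weight_name(hf_name):
    """Return the mapped name, or None if no rule matches."""
    for pattern, target in RENAME_RULES:
        if re.match(pattern, hf_name):
            return re.sub(pattern, target, hf_name)
    return None


def remap_state_dict(hf_state):
    """Split a checkpoint dict into renamed entries and unmatched keys."""
    mapped, unmapped = {}, []
    for name, tensor in hf_state.items():
        new_name = map_weight_name(name)
        if new_name is None:
            unmapped.append(name)  # surface gaps instead of dropping silently
        else:
            mapped[new_name] = tensor
    return mapped, unmapped
```

Returning the unmatched keys separately makes it easy to check that every checkpoint tensor has been accounted for before loading.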

Related links

@paddle-bot

paddle-bot bot commented Mar 20, 2026

Your PR has been submitted. Thanks for your contribution!
Please check its format and content. For this, you can refer to Template and Demo.

@luotao1
Collaborator

luotao1 commented Mar 24, 2026

@chang-wenbin

@luotao1
Collaborator

luotao1 commented Mar 30, 2026

Reference to the merged RFC: this design is an update based on PR https://github.com/PaddlePaddle/community/pull/1183 (H9 No.74 MiniCPM4.1 RFC by @essos-bot)

Please modify the original RFC directly; there is no need to create a new file.

@cloudforge1
Contributor Author

OK, I will switch to modifying the original RFC file directly and push the update shortly.

Per reviewer feedback (comment #4153770822), modify the original
RFC file from PR PaddlePaddle#1183 directly rather than creating a separate file.

- Delete: 20260708_add_minicpm41_8b_for_fastdeploy.md (new file)
- Update: 20251114_add_minicpmV41_for_fastdeploy.md (existing file)
- Content: H10 updated design with MLA reuse strategy