[Cpp API Compatibility] Fix CUDAContext.h to align with PyTorch #78584
SigureMo merged 7 commits into PaddlePaddle:develop
Your PR was submitted successfully. Thank you for contributing to this open-source project!
Pull request overview
This PR updates the ATen CUDA compat header(s) to better match PyTorch’s include surface so that #include <ATen/cuda/CUDAContext.h> also brings in the CUDA stream definitions needed by downstream builds (e.g., DeepEP).
Changes:
- Add missing transitive includes (ATen/cuda/Exceptions.h, c10/cuda/CUDAStream.h) to ATen/cuda/CUDAContext.h.
- Update CUDAContext.cpp to include ATen/cuda/CUDAContext.h instead of CUDAContextLight.h.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| paddle/phi/api/include/compat/ATen/cuda/CUDAContext.h | Adds missing includes for CUDA stream / exceptions, but changes conditional include behavior. |
| paddle/phi/api/include/compat/ATen/cuda/CUDAContext.cpp | Switches to including the main CUDAContext header. |
    #include <ATen/cuda/CUDAContextLight.h>
    #endif

    #include <ATen/cuda/Exceptions.h>
    #include <c10/cuda/CUDAStream.h>
CUDAContext.h used to only include CUDAContextLight.h when PADDLE_WITH_CUDA/PADDLE_WITH_HIP was enabled. After this change the header unconditionally includes CUDAContextLight.h and c10/cuda/CUDAStream.h, which pulls in CUDA-only APIs (e.g., cudaError_t / cudaStream_t) and will break compilation in non-CUDA/HIP builds that still include this header transitively. Please restore a build-flag guard around the CUDA-specific includes (or provide CPU stubs matching the previous behavior) while still adding c10/cuda/CUDAStream.h for CUDA/HIP builds.
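The guard requested above might look roughly like the following. This is a minimal sketch, not the PR's actual header: `DEMO_WITH_CUDA`, `DEMO_WITH_HIP`, `DEMO_GPU_INCLUDES_ENABLED`, and `gpu_includes_enabled()` are hypothetical names standing in for `PADDLE_WITH_CUDA` / `PADDLE_WITH_HIP`, and the real includes are left as comments so the snippet compiles without a CUDA toolkit.

```cpp
// Sketch only: DEMO_WITH_CUDA / DEMO_WITH_HIP are hypothetical stand-ins for
// PADDLE_WITH_CUDA / PADDLE_WITH_HIP so this file compiles anywhere.
#if defined(DEMO_WITH_CUDA) || defined(DEMO_WITH_HIP)
// GPU build: the CUDA-only headers named in the review would go here, e.g.
//   #include <ATen/cuda/CUDAContextLight.h>
//   #include <ATen/cuda/Exceptions.h>
//   #include <c10/cuda/CUDAStream.h>
#define DEMO_GPU_INCLUDES_ENABLED 1
#else
// CPU-only build: nothing that mentions cudaError_t / cudaStream_t is pulled in.
#define DEMO_GPU_INCLUDES_ENABLED 0
#endif

// Hypothetical helper that reports which branch was compiled.
constexpr bool gpu_includes_enabled() { return DEMO_GPU_INCLUDES_ENABLED != 0; }
```

With neither macro defined, the CUDA-specific section is skipped entirely, which is the behavior the comment asks to preserve for non-CUDA/HIP builds.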
> Please restore a build-flag guard around the CUDA-specific includes (or provide CPU stubs matching the previous behavior) while still adding c10/cuda/CUDAStream.h for CUDA/HIP builds.
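I see that the latest commit restores the outer PADDLE_WITH_CUDA/HIP guard in ATen/cuda/CUDAContext.h, so this compatibility issue is fixed. I will wait for the current CI round to finish before giving final confirmation.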
ShigureNyako left a comment
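As I understand it, the main goal of this PR is to bring the include surface of #include <ATen/cuda/CUDAContext.h> closer to PyTorch's while also filling in the c10::cuda::CUDAStream-related implementation, in order to resolve the build integration problem on the DeepEP side.
I am blocking on one compatibility issue first: the current change makes ATen/cuda/CUDAContext.h include c10/cuda/CUDAStream.h unconditionally, but in non-CUDA/HIP builds c10/cuda/CUDAStream.h does not fully guard its exposure of types such as cudaStream_t, so CPU-only/non-GPU builds risk a header compilation regression. I suggest fixing this first, then checking whether anything else still needs to be addressed.
On CI: several jobs are still running. Linux-IXUCA has failed; from the logs it looks like a downstream patch failed to apply, which is not necessarily caused by this change, but CI is not fully green yet either.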
    #if defined(PADDLE_WITH_CUDA)
    #include <cuda_runtime_api.h>
    #elif defined(PADDLE_WITH_HIP)
    #include <hip/hip_runtime.h>
    #endif

    #include <ostream>
    #include <tuple>

    #include "paddle/phi/api/include/context_pool.h"
    #include "paddle/phi/backends/gpu/gpu_info.h"
    #include "paddle/phi/common/place.h"
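There is a compatibility regression here: with paddle/phi/core/cuda_stream.h removed in this revision, cudaStream_t is only brought in through the runtime header under PADDLE_WITH_CUDA/HIP, yet the CUDAStream class, make_cuda_stream, getStreamFromExternal, and other declarations in the same header still expose cudaStream_t directly in every build. Combined with ATen/cuda/CUDAContext.h now unconditionally doing #include <c10/cuda/CUDAStream.h>, any non-CUDA/HIP build that transitively includes this header will hit an undeclared-type compile error.
I suggest doing at least one of the following:
- Restore the outer build-flag guard in ATen/cuda/CUDAContext.h;
- Or keep a forward declaration / stub like the old implementation's that remains valid in non-CUDA/HIP builds, so that c10/cuda/CUDAStream.h itself is still "safe to include".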
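The second option (a stub that keeps the header includable everywhere) could be sketched as below. This is an illustration under stated assumptions, not the old Paddle implementation: `DEMO_WITH_CUDA`, the `demo` namespace, and `wrap_external_stream` are hypothetical, and the wrapper class is far smaller than the real `c10::cuda::CUDAStream`. The stub mirrors the way the CUDA runtime defines `cudaStream_t` as a pointer to the opaque `CUstream_st` struct.

```cpp
// Sketch only: DEMO_WITH_CUDA is a hypothetical stand-in for PADDLE_WITH_CUDA.
#if defined(DEMO_WITH_CUDA)
#include <cuda_runtime_api.h>  // real definition of cudaStream_t
#else
// CPU-only stub: an opaque pointer type with the same name, so that
// declarations mentioning cudaStream_t still compile without CUDA.
struct CUstream_st;
typedef CUstream_st* cudaStream_t;
#endif

namespace demo {

// Simplified, hypothetical wrapper; the real CUDAStream class has a much
// larger interface (device index, stream pools, priorities, ...).
class CUDAStream {
 public:
  explicit CUDAStream(cudaStream_t raw = nullptr) : raw_(raw) {}
  cudaStream_t raw() const { return raw_; }
  // A null handle is treated as the default stream in this sketch.
  bool is_default() const { return raw_ == nullptr; }

 private:
  cudaStream_t raw_;
};

// Hypothetical helper that wraps an externally created stream handle.
inline CUDAStream wrap_external_stream(cudaStream_t raw) {
  return CUDAStream(raw);
}

}  // namespace demo
```

The trade-off is that a stub keeps declarations compiling in CPU-only builds but does nothing useful at run time there, whereas the outer guard simply hides the API. The review thread indicates the PR ultimately restored the guard.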
/re-run all-failed |
ShigureNyako left a comment
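I focused on re-checking the blocker from the previous round: ATen/cuda/CUDAContext.h has now restored the outer PADDLE_WITH_CUDA/HIP guard, so non-CUDA/HIP builds no longer passively include c10/cuda/CUDAStream.h, and the header includability regression mentioned earlier is resolved.
I also went through the follow-up patches in this round: the CUDAStream-related interface shape is still converging on PyTorch's, and I see no new BREAKING include risk or "half-migrated" state. The latest "fix dcu again" commit mainly makes the runtime API dependency explicit in CUDAException.h, which is build-compatibility cleanup. On CI, Linux-DCU / Build is passing again; Linux-IXUCA still fails, but based on the earlier logs it looks more like a downstream patch/apply or environment problem than a new blocker introduced by this code. A few other jobs are still running, but given the current state of the code I can approve.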
Codecov Report❌ Patch coverage is
Additional details and impacted files
Coverage Diff
| | develop | #78584 | +/- |
|---|---|---|---|
| Coverage | ? | 94.44% | |
| Files | ? | 1 | |
| Lines | ? | 72 | |
| Branches | ? | 0 | |
| Hits | ? | 68 | |
| Misses | ? | 4 | |
| Partials | ? | 0 | |
PR Category
Execute Infrastructure
PR Types
Bug fixes
Description
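Fixes the error where #include <ATen/cuda/CUDAContext.h> did not also #include <c10/cuda/CUDAStream.h>; needed to align with DeepEP's compilation.
getStreamFromPool function; the detailed compatibility architecture documentation has been updated in PFCCLab/PaddleCppAPITest#59.
Does this cause precision changes?
No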