[ROCm 7.0] Add support for AMD CDNA4 and ROCm 7.0 #77641

Open
M4jupitercannon wants to merge 16 commits into PaddlePaddle:develop from M4jupitercannon:amd

M4jupitercannon (Contributor) commented Feb 3, 2026

PR Category

Inference

PR Types

New features

Description

[ROCm 7.0] Add support for AMD CDNA4 and ROCm 7.0
Key changes:

  • Update cmake/hip.cmake: Fix HIP_PATH and CMAKE_MODULE_PATH for ROCm 7.0
  • Update cmake/rccl.cmake: Fix RCCL header include path
  • Update cmake/thrust.cmake: Skip patches when ROCm has native shuffle
  • Fix GPU architectures: Add gfx950, remove unsupported gfx926/gfx928/gfx936
  • Fix hiprand/rocrand include paths for ROCm 7.0 directory structure
  • Adapt to the hipPointerAttribute_t API change: the memoryType field is renamed to type
  • Add HIPCC guards for thrust/rocprim headers in non-device code
  • Use rocblas complex types instead of thrust::complex
  • Create ROCm 7.0 patches for warpctc and warprnnt
  • Disable operators affected by a rocprim trait incompatibility: argsort, mode, randperm

Tested: Paddle compiled successfully with ROCm 7.0.0
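
As a rough illustration of the GPU architecture change listed above, the sketch below applies the same add/remove step to a list of gfx targets. The PR makes this change in CMake, not Python; the helper name `update_gfx_targets` and the baseline list are hypothetical and exist only for this example.

```python
# Hypothetical helper: the PR itself edits CMake files, not Python.
REMOVED_TARGETS = {"gfx926", "gfx928", "gfx936"}  # listed as unsupported in the PR description
ADDED_TARGETS = ["gfx950"]  # added by the PR


def update_gfx_targets(targets):
    """Drop the removed targets, then append the added ones without duplicates."""
    kept = [t for t in targets if t not in REMOVED_TARGETS]
    for target in ADDED_TARGETS:
        if target not in kept:
            kept.append(target)
    return kept


# Assumed baseline list, for illustration only.
baseline = ["gfx906", "gfx926", "gfx928", "gfx936", "gfx942"]
print(update_gfx_targets(baseline))  # ['gfx906', 'gfx942', 'gfx950']
```

Keeping the removals in a set and the additions in an ordered list preserves the original target order, which makes the resulting list easy to compare against the previous one.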

Does this change affect numerical precision?

paddle-bot bot commented Feb 3, 2026

Your PR has been submitted successfully. Thank you for contributing to this open-source project!
Please wait for the CI results first. See the Paddle CI Manual for details.

paddle-bot bot added the contributor (External developers) label Feb 3, 2026
M4jupitercannon marked this pull request as draft February 3, 2026 02:12
M4jupitercannon (Contributor, Author) commented

/rerun all-failed

M4jupitercannon (Contributor, Author) commented

/re-run all-failed

codecov-commenter commented Feb 3, 2026

Codecov Report

❌ Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@1a009f4). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...thon/paddle/utils/cpp_extension/extension_utils.py | 0.00% | 2 Missing ⚠️ |

❌ Your patch status has failed because the patch coverage (0.00%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop   #77641   +/-   ##
==========================================
  Coverage           ?    0.00%           
==========================================
  Files              ?        1           
  Lines              ?        2           
  Branches           ?        0           
==========================================
  Hits               ?        0           
  Misses             ?        2           
  Partials           ?        0           

M4jupitercannon (Contributor, Author) commented

/rerun all-failed

M4jupitercannon (Contributor, Author) commented

/re-run all-failed

M4jupitercannon marked this pull request as ready for review February 4, 2026 06:07
M4jupitercannon (Contributor, Author) commented

/re-run all-failed

M4jupitercannon and others added 3 commits February 24, 2026 10:51
- Revert test_runner.py sys.path/chdir changes that broke XPU tests
- Fix cmake-format issues in warpctc, warprnnt, rccl, third_party, CMakeLists
- Fix trailing whitespace in rccl.cmake and CMakeLists.txt
- Fix clang-format include ordering in allocator_facade.cc, rocprim_traits.h
- Fix cpplint line-length in enforce.h, blas_impl.hip.h, complex.h,
  graph_send_ue_recv_funcs.h, values_vectors_functor.h

M4jupitercannon (Contributor, Author) commented

/re-run all-failed

M4jupitercannon (Contributor, Author) commented

/re-run all-failed

Add a unit test that mocks ROCm mode and asserts `_get_cuda_arch_flags()` returns an empty list so PR coverage includes the new ROCm guard path.

Made-with: Cursor

M4jupitercannon (Contributor, Author) commented

Added a coverage test for the ROCm branch in _get_cuda_arch_flags (test_rocm_returns_empty_flags) to address the Codecov missing lines.

Could one of the required paddle/phi approvers please review and approve this PR?

/re-run all-failed

Apply ruff-compatible multiline formatting in the new ROCm arch-flag unit test to satisfy the pre-commit style gate.

Made-with: Cursor

M4jupitercannon (Contributor, Author) commented

/re-run all-failed
