Conversation
There was a problem hiding this comment.
Pull request overview
This PR fixes macOS compatibility issues with Paddle library linking by adding platform-specific dynamic library resolution logic. The changes add build-time validation to ensure required Paddle libraries exist before attempting to link them.
Changes:
- Introduced
resolve_paddle_library()function to validate library paths at CMake configuration time - Added platform-specific library path detection for macOS, Windows, and Linux
- Restricted Linux-specific
--as-neededlinker flag to prevent macOS linker errors
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| set(PADDLE_LIBPHI_CORE_CANDIDATES | ||
| "${PADDLE_PATH}/libs/libphi_core.dll" | ||
| "${PADDLE_PATH}/libs/phi_core.dll" | ||
| ) |
There was a problem hiding this comment.
On Windows, the library candidates should list .lib files before .dll files. Import libraries (.lib) are typically used for linking, while dynamic libraries (.dll) are loaded at runtime. If both files exist, the current order will select the .dll first, which may not work correctly with CMake's target_link_libraries. Consider reordering to check for .lib files first.
| if(APPLE) | ||
| # Paddle on macOS ships libpaddle as .so, while runtime deps are .dylib. | ||
| set(PADDLE_LIBPADDLE_CANDIDATES "${PADDLE_PATH}/base/libpaddle.so") | ||
| set(PADDLE_LIBCOMMON_CANDIDATES "${PADDLE_PATH}/libs/libcommon.dylib") | ||
| set(PADDLE_LIBPHI_CORE_CANDIDATES "${PADDLE_PATH}/libs/libphi_core.dylib") | ||
| elseif(WIN32) | ||
| set(PADDLE_LIBPADDLE_CANDIDATES | ||
| "${PADDLE_PATH}/base/libpaddle.dll" | ||
| "${PADDLE_PATH}/base/libpaddle.lib" | ||
| ) | ||
| set(PADDLE_LIBCOMMON_CANDIDATES | ||
| "${PADDLE_PATH}/libs/libcommon.dll" | ||
| "${PADDLE_PATH}/libs/common.dll" | ||
| ) | ||
| set(PADDLE_LIBPHI_CORE_CANDIDATES | ||
| "${PADDLE_PATH}/libs/libphi_core.dll" | ||
| "${PADDLE_PATH}/libs/phi_core.dll" | ||
| ) | ||
| else() | ||
| set(PADDLE_LIBPADDLE_CANDIDATES "${PADDLE_PATH}/base/libpaddle.so") | ||
| set(PADDLE_LIBCOMMON_CANDIDATES "${PADDLE_PATH}/libs/libcommon.so") | ||
| set(PADDLE_LIBPHI_CORE_CANDIDATES "${PADDLE_PATH}/libs/libphi_core.so") | ||
| endif() |
There was a problem hiding this comment.
The code should validate that PADDLE_PATH is defined and non-empty before using it to construct library paths. If PADDLE_PATH is not set, the error message from resolve_paddle_library will be confusing (e.g., "Unable to locate Paddle library for PADDLE_LIBPADDLE. Tried: /base/libpaddle.so"). Consider adding a check like if(NOT DEFINED PADDLE_PATH OR PADDLE_PATH STREQUAL "") with a clear error message before this platform-specific code block.
| return() | ||
| endif() | ||
| endforeach() | ||
| message(FATAL_ERROR "Unable to locate Paddle library for ${out_var}. Tried: ${ARGN}") |
There was a problem hiding this comment.
The error message could be more user-friendly. Currently it says "Unable to locate Paddle library for PADDLE_LIBPADDLE" which uses the variable name. Consider changing the message to be more descriptive, for example: "Unable to locate Paddle library '${out_var}'. Searched in: ${ARGN}" or maintain a mapping from variable names to descriptive names like "libpaddle", "libcommon", etc.
| message(FATAL_ERROR "Unable to locate Paddle library for ${out_var}. Tried: ${ARGN}") | |
| message(FATAL_ERROR "Unable to locate Paddle library '${out_var}'. Searched in: ${ARGN}") |
| "${PADDLE_PATH}/base/libpaddle.dll" | ||
| "${PADDLE_PATH}/base/libpaddle.lib" |
There was a problem hiding this comment.
On Windows, the library candidates should list .lib files before .dll files. Import libraries (.lib) are typically used for linking, while dynamic libraries (.dll) are loaded at runtime. If both files exist, the current order will select the .dll first, which may not work correctly with CMake's target_link_libraries. Consider reordering to check for .lib files first.
| "${PADDLE_PATH}/base/libpaddle.dll" | |
| "${PADDLE_PATH}/base/libpaddle.lib" | |
| "${PADDLE_PATH}/base/libpaddle.lib" | |
| "${PADDLE_PATH}/base/libpaddle.dll" |
| set(PADDLE_LIBCOMMON_CANDIDATES | ||
| "${PADDLE_PATH}/libs/libcommon.dll" | ||
| "${PADDLE_PATH}/libs/common.dll" | ||
| ) |
There was a problem hiding this comment.
On Windows, the library candidates should list .lib files before .dll files. Import libraries (.lib) are typically used for linking, while dynamic libraries (.dll) are loaded at runtime. If both files exist, the current order will select the .dll first, which may not work correctly with CMake's target_link_libraries. Consider reordering to check for .lib files first.
|
@ShigureNyako 帮我看看这么改动态链接有没有什么问题 |
ShigureNyako
left a comment
There was a problem hiding this comment.
Thanks for working on the macOS path handling here. I checked this against Paddle's current C++ compatibility layout (libpaddle.so in paddle/base, libcommon / libphi_core in paddle/libs, plus the compat include dirs), and the library-name adaptation itself looks directionally right.
However, I can't approve this yet because the PR claims to fix the macOS runtime FFmpeg loading failure, but the current change only adjusts link-time options and library name resolution. The known macOS failure mode is unresolved @rpath/libav*.dylib lookup during import, and this patch does not add any runtime search-path fix (INSTALL_RPATH, @loader_path, install-name adjustment, etc.).
Also, all green checks here are Linux-only, so there is still no macOS evidence that the new path/linking logic actually fixes import torchcodec.
| target_link_options( | ||
| ${library_name} | ||
| PUBLIC | ||
| "LINKER:-undefined,dynamic_lookup" |
There was a problem hiding this comment.
I don't think this addresses the macOS runtime failure described in the PR.
-undefined,dynamic_lookup only relaxes symbol resolution at link time; it does not add any runtime search path for FFmpeg dylibs. The failure we usually hit on macOS is Library not loaded: @rpath/libavutil*.dylib during import torchcodec / torch.ops.load_library(...).
Upstream torchcodec fixed that class of bug in meta-pytorch#1152 (see issue meta-pytorch#570) by adding an INSTALL_RPATH for Homebrew FFmpeg. This patch does not add any INSTALL_RPATH, @loader_path, or install-name adjustment, so users will still need to set DYLD_LIBRARY_PATH manually.
Can we port the actual runtime search-path fix here, and add at least one macOS validation for import torchcodec?
ShigureNyako
left a comment
There was a problem hiding this comment.
感谢补这个 macOS 兼容性修复。我结合 Paddle 当前 C++ 兼容层和打包布局一起看了一遍:libpaddle 在 paddle/base,libcommon / libphi_core 在 paddle/libs,兼容头也确实来自 include/paddle/phi/api/include/compat/...。因此,这里按平台区分 Paddle 动态库名称和路径的适配方向本身是基本正确的。
但我这里还是不能通过,原因是这版改动没有真正覆盖 PR 描述里的“macOS 运行时加载 ffmpeg 动态库失败”。当前 src/torchcodec/_core/CMakeLists.txt 里的修改,主要是库名解析、把 -Wl,--as-needed 限制到 Linux,以及在 macOS 上补了 LINKER:-undefined,dynamic_lookup。这些都偏向链接阶段处理,不会给 FFmpeg 的 dylib 提供运行时搜索路径;已知的 macOS 失败模式仍然是导入时 @rpath/libav*.dylib 找不到。上游 torchcodec 也是在 issue meta-pytorch#570 / PR meta-pytorch#1152 里通过补 INSTALL_RPATH 才真正解决这类问题,这个 PR 目前还没有等价处理。
另外,现有通过的 CI 也都是 Linux wheel build/test,没有 macOS 构建或最基本的 import torchcodec 验证,因此还缺少直接证据证明修复已经生效。建议把真正的 macOS runtime 搜索路径修复补齐后,再补至少一条 macOS 验证。
fix