
【Hackathon 10th Spring No.2】Add schema parser and related functionality for Torch compatibility #77938

Open
gouzil wants to merge 31 commits into PaddlePaddle:develop from gouzil:feat/add_schema_parser_torch_compatibility

Conversation

gouzil (Member) commented Feb 16, 2026

PR Category

Execute Infrastructure

PR Types

Improvements

Description

Implementation of PaddlePaddle/community#1210

  • Added schema support to the TORCH_LIBRARY registration mechanism, and changed the schema data structure in OperatorRegistry
  • Support for default argument values and assigning arguments by name
  • Support for alias annotations in schemas
  • Added a schema parser and a schema type parser

Unit test adjustments

TODO

  • cinn and paddle/pir/src/core/utils.cc both use the same hash_combine function; consider hoisting it to a shared location
  • Add some e2e tests 6676b1c

Preliminary validation against https://github.com/PFCCLab/paddlecodec

Whether precision changes are introduced

Copilot AI review requested due to automatic review settings February 16, 2026 14:27
paddle-bot bot commented Feb 16, 2026

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI first. See the Paddle CI Manual for details.

@paddle-bot paddle-bot bot added the contributor External developers label Feb 16, 2026
@gouzil gouzil changed the title 【Hackathon 10th Spring No.2】Add schema parser and related functionality for Torch compatibility 【WIP】【Hackathon 10th Spring No.2】Add schema parser and related functionality for Torch compatibility Feb 16, 2026
Copilot AI left a comment (Contributor)

Pull request overview

This PR adds comprehensive schema parser functionality to the TORCH_LIBRARY registration mechanism, enabling PyTorch-compatible operator definitions with full type information, default values, keyword arguments, and alias annotations. This is part of the Hackathon 10th Spring No.2 task and significantly enhances the compatibility layer between PaddlePaddle and PyTorch.

Changes:

  • Implemented schema parsing infrastructure including FunctionSchemaParser and SchemaTypeParser to parse PyTorch-style operator schemas
  • Extended FunctionArgs to support named/keyword arguments and automatic argument normalization based on parsed schemas
  • Added comprehensive type system (FunctionSchema, Argument, AliasInfo, Type hierarchy) to represent parsed schema information

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 1 comment.

Summary per file:

  • test/cpp/compat/torch_library_test.cc: Added 400+ lines of comprehensive tests for schema parsing, including default values, keyword arguments, optional types, tuples, and variadic functions
  • test/cpp/compat/schema_parser_type_test.cc: New test file (206 lines) specifically for testing schema type parser edge cases and torchcodec-like schemas
  • test/cpp/compat/CMakeLists.txt: Moved torch_library_test from GPU-only (nv_test) to general (cc_test) and added schema_parser_type_test
  • paddle/phi/api/include/compat/torch/library.h: Added schema binding to CppFunction, keyword argument support in FunctionArgs, and normalize_args_by_schema logic
  • paddle/phi/api/include/compat/torch/library.cpp: Updated OperatorRegistry to parse and bind schemas to implementations
  • paddle/phi/api/include/compat/torch/csrc/jit/function_schema_parser.h/cpp: New 574-line schema parser implementation handling the full PyTorch schema grammar
  • paddle/phi/api/include/compat/torch/csrc/jit/schema_type_parser.h/cpp: New 240-line type parser for parsing schema types including tuples, optionals, and alias annotations
  • paddle/phi/api/include/compat/torch/csrc/jit/schema_parser_defs.h: Constants and macros for schema parsing (type names, keywords, characters)
  • paddle/phi/api/include/compat/ATen/core/function_schema.h/cpp: New FunctionSchema and Argument classes to represent parsed schemas
  • paddle/phi/api/include/compat/ATen/core/jit_type_base.h: Base Type class hierarchy with SingletonOrSharedTypePtr for the type system
  • paddle/phi/api/include/compat/ATen/core/jit_type.h: Concrete type implementations (TensorType, IntType, FloatType, etc.) and schema-specific types
  • paddle/phi/api/include/compat/ATen/core/alias_info.h: AliasInfo class for representing aliasing metadata in schemas
  • paddle/phi/api/include/compat/ATen/core/type_ptr.h: SingletonTypePtr wrapper for singleton type instances
  • paddle/phi/api/include/compat/ATen/core/functional.h: Utility functions (fmap, filter) for working with collections
  • paddle/phi/api/include/compat/CMakeLists.txt: Updated the build to include the new schema parser source files



const size_t idx = it->second;
if (assigned[idx]) {
throw std::runtime_error("Argument `" + name + "` is already provided");
Copilot AI commented Feb 16, 2026

The error message "Argument `name` is already provided" could be more helpful by specifying that the argument was provided both positionally and by keyword. Consider changing it to: "Argument `" + name + "` provided both positionally and as keyword argument"

Suggested change
throw std::runtime_error("Argument `" + name + "` is already provided");
throw std::runtime_error("Argument `" + name +
"` provided both positionally and as keyword argument");

Copilot AI left a comment (Contributor)

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 1 comment.



std::move(*alias_info))
: nullptr),
kwarg_only_(kwarg_only) {
// this is an softly-enforced invariant for out arguments.
Copilot AI commented Feb 16, 2026
The comment contains a grammatical error. It should be "This is a softly-enforced invariant" instead of "this is an softly-enforced invariant". The article "an" should be "a" before "softly".

Suggested change
// this is an softly-enforced invariant for out arguments.
// This is a softly-enforced invariant for out arguments.


SigureMo (Member) commented:

@gouzil Testing here; this PR for now 🐁

@SigureMo SigureMo left a comment (Member)

Adding two inline comments, mainly suggestions on the testability and stability of named kwargs output.

This review was run with gpt-5.3-codex with @codex via gh-llm; for now this is just gh-llm development testing.

codecov-commenter commented Feb 17, 2026

Codecov Report

❌ Patch coverage is 97.53695% with 25 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@e27e6c2). Learn more about missing BASE report.

Files with missing lines Patch % Lines
...e/compat/torch/csrc/jit/function_schema_parser.cpp 94.62% 18 Missing ⚠️
...i/api/include/compat/ATen/core/function_schema.cpp 93.33% 6 Missing ⚠️
...e/phi/api/include/compat/ATen/core/jit_type_base.h 98.64% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             develop   #77938   +/-   ##
==========================================
  Coverage           ?   97.53%           
==========================================
  Files              ?       11           
  Lines              ?     1015           
  Branches           ?        0           
==========================================
  Hits               ?      990           
  Misses             ?       25           
  Partials           ?        0           

☔ View full report in Codecov by Sentry.

// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
SigureMo (Member) commented:

Since this code is adapted from torch, remember to add the attribution notice:

// #The file has been adapted from pytorch project
// #Licensed under  BSD-style license -
// https://github.com/pytorch/pytorch/blob/main/LICENSE

See ivalue.h for reference.

SigureMo (Member) commented:

Please update the notice following #78590.

Also, where is paddlecodec currently stuck?

ShigureNyako (Contributor) commented:

Please update the notice following #78590.

Also, where is paddlecodec currently stuck?

I compared against the style in #78590; at least these 3 files newly added in this PR still use the old notice style:

  • paddle/phi/api/include/compat/ATen/core/alias_info.h
  • paddle/phi/api/include/compat/ATen/core/function_schema.h
  • paddle/phi/api/include/compat/ATen/core/function_schema.cpp

They currently still read:

  • // #The file has been adapted from pytorch project
  • // #Licensed under BSD-style license -

That is, they have not yet been updated to the #78590 style that drops the extra # and normalizes the spacing in the BSD-style line. The repo still has plenty of other legacy compat files with the same issue, but looking only at what this PR adds, I would fix these 3 first.

On the paddlecodec side, the blocker is still PFCCLab/paddlecodec#2:

  • The current changes there mainly handle the link-time libpaddle path/library naming
  • They do not yet cover the macOS runtime @rpath/libav*.dylib lookup problem that blocked me earlier
  • They also lack even a basic import torchcodec level check on macOS
  • In addition, @SigureMo previously noted that "the goal is to reduce the diff", so that side has both the runtime search path problem and a diff scope that still needs to be narrowed

gouzil (Member, Author) commented:

Also, where is paddlecodec currently stuck?

On the C++ side there are still some torch::tensor features that are unsupported; the Python side is in progress at

PFCCLab/paddlecodec#3

SigureMo (Member) commented:

There is no need to restore everything; that is not something this PR needs to consider. Only consider the parts this PR is responsible for.

If PFCCLab/paddlecodec#3 can be merged once this PR lands, then I think this PR has already reached the bar for merging.

Does PFCCLab/paddlecodec#3 depend on #78521?

ShigureNyako (Contributor) commented Apr 5, 2026:

Does PFCCLab/paddlecodec#3 depend on #78521?

I re-checked the latest head 5cad6631bbbd7ef6c9a65c74fd94b79fcbd8b79a; this new push itself only merges develop in. From the compare view, the upstream commits it brings in already include:

  • 3bb286ab / #78521 [common] Add support for std::optional in error messages
  • e27e6c2e / #78590 [Cpp API Compatibility] Normalize adapted-from-PyTorch declarations

So if PFCCLab/paddlecodec#3 was previously blocked on #78521, then judging by this PR's current branch baseline, that dependency has already been absorbed and is no longer an extra blocker.

While at it, I also re-checked the current state:

  • The notice style of the 3 core files newly added in this PR (alias_info.h / function_schema.h / function_schema.cpp) now follows #78590
  • Check approval still fails, but the log now prints on separate lines correctly; the current failure is missing approvals (cc_test / Chinese / PADDLE_API), not the earlier line-wrapping display issue
  • Other CI jobs are still running


return std::make_tuple(get<Types>(I)...);
}
std::vector<torch::IValue> args_;
std::unordered_map<std::string, torch::IValue> named_args_;
SigureMo (Member) commented:

If named_args_ is an unordered_map, is passing by position still supported for the named_args part? (Does torch support this behavior? Do we support it?)

gouzil (Member, Author) commented:

This actually follows Stack and Kwargs in https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/core/function.h, except that torch merges all Kwargs into the Stack there.

The current implementation supports mixing the two, e.g. schema: f(int a, int b), call: f(1, b=2)

SigureMo (Member) commented:

What is the relationship between args_ and named_args_.values()? Containment, or disjoint?

gouzil (Member, Author) commented:

In this combination, int a is stored in args_ and int b is stored in named_args_. If they overlap, it raises Duplicate keyword argument or Argument xxxx is already provided

SigureMo (Member) commented:

Got it, then: FunctionArgs here represents how arguments are passed, not the function schema or its runtime binding

@@ -0,0 +1,790 @@
// Copyright (c) 2026 PaddlePaddle Authors. All Rights Reserved.
gouzil (Member, Author) commented:

@SigureMo This file consists entirely of schema-related tests copied over from torch. Anything not yet supported has a brief note.

@ShigureNyako ShigureNyako left a comment (Contributor)

Overall, this PR is already fairly complete along the schema parser / type parser / kwargs normalization main line, with substantial test investment; as a "minimally usable Torch schema compatibility implementation" the quality is good.

In this review I mainly focused on two points:

  1. In the C++ call path for instance methods, kwargs / named args are not yet consistent with the Python path; if class methods later start relying on kw-only arguments, behavior may diverge.
  2. In TORCH_LIBRARY_IMPL, def(schema) and def(schema, fn) currently behave inconsistently; please confirm whether the latter is really intended to be supported semantics.

One more suggestion: the current default-value parsing and supported type range look closer to a "controlled subset"; for example, identifier-form default values cover real Torch schemas rather narrowly. Even if this is an intentional trade-off for this PR, it would help to state the current support boundary more explicitly in the PR description or code comments, so downstream users can judge the compatibility range.

{"example_library_with_mdef_cases::overload.name",
{torch::IValue(40), torch::IValue(2)},
42},
// def(schema, fn) inside IMPL block still registers through template
ShigureNyako (Contributor) commented:

These tests now treat def(schema, fn) in TORCH_LIBRARY_IMPL still registering successfully as expected behavior; but in the same PR, def(schema) in an IMPL block is an explicit no-op. That leaves the same-named API semantically inconsistent inside an IMPL block, and it also clashes with the PyTorch intuition that "IMPL only registers implementations". Please confirm whether this behavior is intentional; if the goal is to be closer to PyTorch, the template def should get the same guard, rather than locking in this misuse path as a positive test.

ShigureNyako (Contributor) commented:

Let me add a clearer semantic comparison and a fix direction.

Within the same IMPL block, the two entry points currently behave differently:

TORCH_LIBRARY_IMPL(ns, CPU, m) {
  m.def("foo(int x) -> int");          // currently a no-op
  m.def("bar(int x) -> int", &bar_fn); // currently registers successfully
}

Looking at the call paths, it is roughly:

// non-template def(schema)
Library::def(const std::string& schema) {
  if (kind_ == IMPL) return *this;
  register_schema(...);
}

// template def(schema, fn)
Library::def(const std::string& name_or_schema, Func&& f) {
  if (name_or_schema contains "(") {
    register_schema(...);   // no IMPL guard here
  }
  register_implementation(...);
}

So the situation is not "IMPL blocks do not support def", but "only one overload is unsupported while the other is let through". That makes it hard for users to judge the boundary of TORCH_LIBRARY_IMPL, and it turns what looks like a misuse path into test-locked behavior.

If the goal is to be closer to PyTorch and less ambiguous, I would lean toward making the template def(schema, fn) follow the same policy as def(schema) (reject or ignore its schema-registration semantics), e.g. at minimum add the same kind_ == IMPL guard, then decide whether to no-op or raise an explicit error. The corresponding tests should also be changed to verify "not allowed / has no effect" rather than locking in the current divergence.

@@ -474,13 +539,96 @@ class CppFunction {
if (!func_) {
ShigureNyako (Contributor) commented:

Centralizing schema-based kwargs normalization in CppFunction::call_with_args() is reasonable; but following the call chain, the ClassRegistry::call_method_with_args(qualified_name, method_name, instance, args) overload currently only splices instance and the positional args into full_args, without forwarding args.named_args(). If instance methods later also rely on kw-only / keyword args, this will diverge from the Python path (torch_compat.h preserves kwargs). I suggest adding a C++ unit test for "instance method + kwargs/kwonly" here, so schema normalization does not take effect only on the operator / static method paths.

ShigureNyako (Contributor) commented:

Let me add a more concrete call-flow example of the divergence point I am worried about.

Currently it looks roughly like this:

// Python path: kwargs are preserved
converted = convert_args_kwargs_to_function_args(args, kwargs);
function_args = [instance] + converted.positional + converted.named_args;
ClassRegistry::call_method_with_args(qualified_name, method_name, function_args);

But the C++ overload is currently:

FunctionResult call_method_with_args(..., const IValue& instance, const FunctionArgs& args) {
  FunctionArgs full_args;
  full_args.add_arg(instance);
  for (size_t i = 0; i < args.size(); ++i) {
    full_args.add_arg(args.get_value(i));
  }
  return call_method_with_args(qualified_name, method_name, full_args);
}

That is, only the positional args are forwarded, not args.named_args().

If a schema / call like this shows up later:

// schema: method(Tensor self, *, int? idx=None, str mode="nearest") -> ()
FunctionArgs a;
a.add_arg(instance);
a.add_arg(torch::arg("idx") = 1);
a.add_arg(torch::arg("mode") = "nearest");

the expectation is that idx/mode still reach CppFunction::call_with_args() for schema normalization; but on this C++ path they are lost while assembling full_args.

The fix direction should be straightforward, something like:

FunctionArgs full_args;
full_args.add_arg(instance);
for (size_t i = 0; i < args.size(); ++i) {
  full_args.add_arg(args.get_value(i));
}
for (const auto& [name, value] : args.named_args()) {
  torch::arg keyword(name);
  keyword = value;
  full_args.add_arg(std::move(keyword));
}
return call_method_with_args(qualified_name, method_name, full_args);

I also suggest adding a C++ unit test covering "instance method + kwonly kwargs", which would constrain the Python and C++ paths to behave consistently.

@gouzil gouzil requested a review from ShigureNyako March 17, 2026 01:45
@gouzil gouzil changed the title 【WIP】【Hackathon 10th Spring No.2】Add schema parser and related functionality for Torch compatibility 【Hackathon 10th Spring No.2】Add schema parser and related functionality for Torch compatibility Mar 22, 2026
@SigureMo SigureMo left a comment (Member)

Is there a PR that reduces the diff for paddlecodec?

meta-pytorch/torchcodec@main...PFCCLab:paddlecodec:paddle

@ShigureNyako ShigureNyako left a comment (Contributor)

Overall I agree with the direction: the schema parser / type parser / kwargs normalization main line is fairly complete, and the test volume has clearly caught up. However, two points need to be fixed first, so for now I am giving REQUEST_CHANGES:

  1. In paddle/phi/api/include/compat/ATen/core/alias_info.h, std::hash<c10::AliasInfo> does not write the return value of hash_combine back into hash, so the contributions of beforeSets / afterSets / containedTypes are all lost; the hash effectively only distinguishes isWrite(). This also introduces an extra global hash_combine symbol; upstream PyTorch uses c10::hash_combine.
  2. In paddle/phi/api/include/compat/torch/library.h, def(..., fn) still chooses to "silently ignore" inside a TORCH_LIBRARY_IMPL block. This is more uniform than the previous inconsistent behavior, but semantically it still differs from PyTorch: upstream raises an error when def() is called in an IMPL block, rather than no-op-ing. The current implementation swallows registration errors, which hurts both the debugging experience and compatibility semantics.

I left inline comments on both. I will take another look once they are fixed.

// Define an operator implementation
template <typename Func>
Library& def(const std::string& name_or_schema, Func&& f) & {
if (kind_ == IMPL) {
ShigureNyako (Contributor) commented:

If the goal is to stay close to PyTorch, this should not silently no-op. Upstream Library::_def() raises directly via TORCH_CHECK in an IMPL block, because silently ignoring swallows real registration errors. The current implementation quietly eats the misuse, and the caller only sees the consequence of "nothing got registered"; that matches neither PyTorch's debugging experience nor its semantics. I suggest changing this to raise an error, and changing TestMDefInImplBlockIsNoop below to assert failure accordingly.

SigureMo (Member) commented:

Our premise is that torch code can run; whether this raises an error or not is actually not a big deal.

after_set_hash_seed = after_set_hash_seed ^ symbol_hash;
}

hash_combine(hash, before_set_hash_seed);
ShigureNyako (Contributor) commented:

The return value of hash_combine here is not written back into hash, so the contributions of beforeSets / afterSets / containedTypes are all dropped; the current hash effectively only distinguishes isWrite(). Upstream PyTorch does hash = c10::hash_combine(hash, ...). Also, this helper currently lives in the global namespace, leaking an unqualified hash_combine symbol. I suggest including c10/util/hash.h directly and writing back as upstream does.

SigureMo (Member) commented:

Huh? Seriously, @gouzil?

gouzil (Member, Author) commented:

Indeed, I forgot the write-back; the Boost version used previously wrote back internally. Fixed, with related unit tests added.

ShigureNyako (Contributor) commented:

The return value of hash_combine here is not written back into hash, so the contributions of beforeSets / afterSets / containedTypes are all dropped; the current hash effectively only distinguishes isWrite(). Upstream PyTorch does hash = c10::hash_combine(hash, ...). Also, this helper currently lives in the global namespace, leaking an unqualified hash_combine symbol. I suggest including c10/util/hash.h directly and writing back as upstream does.

Re-checked: this commit now correctly writes the hash_combine return value back into hash, and adds unordered_set / unordered_map regression tests (the new AliasInfoHashWorksInUnorderedContainers in test/cpp/compat/schema_parser_type_test.cc). The core hashing bug is fixed, so I am resolving this thread.

@ShigureNyako ShigureNyako left a comment (Contributor)

I re-checked the latest head ff036b24e814576e51e0309b2d45a64eb00f6d6a.

One point is fixed in this round:

  • The AliasInfo hash implementation now correctly writes back the hash_combine return value, with unordered_set / unordered_map regression tests added; I confirm this core issue is fixed.

But the misuse semantics of def(..., fn) in TORCH_LIBRARY_IMPL are not fixed yet:

  • paddle/phi/api/include/compat/torch/library.h is still a silent no-op of VLOG + return *this
  • TestMDefInImplBlockIsNoop in test/cpp/compat/torch_library_test.cc still treats the no-op as expected

If the goal is to stay close to PyTorch, this should still raise an error rather than silently swallowing the misuse. So my conclusion for now remains REQUEST_CHANGES, with this as the only remaining blocker.

SigureMo (Member) commented:

The misuse semantics of def(..., fn) in TORCH_LIBRARY_IMPL are not fixed yet:

Stand down, kitty

SigureMo (Member) commented:

Is there a PR that reduces the diff for paddlecodec?

meta-pytorch/torchcodec@main...PFCCLab:paddlecodec:paddle

Don't forget this.

SigureMo (Member) commented:

Consider deleting the unsupported ones.


@ShigureNyako ShigureNyako left a comment (Contributor)

LGTMeow 🐾

I re-checked the current head f4936a97a0fcc9b99708830f4beb9d9f9cb52aa4.

The points I previously blocked on have all been closed out this round:

  • The AliasInfo hash write-back issue is fixed, with unordered_set / unordered_map regression tests added;
  • def(schema, fn) in TORCH_LIBRARY_IMPL now has the kind_ == IMPL guard, and the corresponding test was changed to no longer register;
  • The instance-method overload now forwards named_args(), with TestClassMethodKwonlyArgsForwardedThroughInstanceOverload added; the Python / C++ path divergence is closed;
  • The adapted-from-PyTorch notices of the newly added compat core files now follow the #78590 style.

The remaining blockers are mostly CI / specific approvals:

  • Check approval still lacks the corresponding approvals;
  • A few platform / environment CI failures (e.g. IXUCA patch apply, coverage external download, Windows-GPU import paddle failure) still need follow-up.

This round I saw no further evidence of unresolved, deterministic compatibility-semantics issues, so from a code-review perspective I am giving APPROVE.


Labels

contributor External developers


5 participants