Import reduce-constant-prop patterns from stablehlo aggressive folder #2524
SuryanshSS1011 wants to merge 2 commits
Closes EnzymeAD#1084.

Lifts LowerBoolSplatConstantsIntoReduceOpRegion and FoldReduceOpToConstantInitializer from openxla/stablehlo's StablehloAggressiveFolder.cpp into EnzymeHLOOpt's default pipeline, per @wsmoses' suggestion on the issue.

Together with Enzyme's existing AndSimplify/OrSimplify patterns, the two new patterns let `--enzyme-hlo-opt` fold reduce-of-splat-constants with And/Or bodies into a single constant.

Example from the issue:

    %c_0 = stablehlo.constant dense<true> : tensor<1x140xi1>
    %c_7 = stablehlo.constant dense<true> : tensor<i1>
    %0 = stablehlo.reduce(%c_0 init: %c_7) applies stablehlo.and across dimensions = [0, 1] : (tensor<1x140xi1>, tensor<i1>) -> tensor<i1>
    return %0

now folds to `return %c<true>`.

Tested against 23 existing lit tests touching reduce/const/and, with no regressions.
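To see why this fold is sound for And/Or bodies, the reduce can be modeled outside MLIR. The sketch below is a toy model in plain C++, not Enzyme-JAX or StableHLO code; `reduceAll`, `foldSplatReduce`, `andBody`, and `orBody` are hypothetical names introduced for illustration, and the input is assumed to be non-empty.

```cpp
#include <vector>

// Toy model only: none of these names exist in Enzyme-JAX or StableHLO.
using Body = bool (*)(bool, bool);

bool andBody(bool a, bool b) { return a && b; }
bool orBody(bool a, bool b) { return a || b; }

// Reference semantics of a full reduction: apply the body to every
// element of the input, starting from the init value.
bool reduceAll(const std::vector<bool> &input, bool init, Body body) {
  bool acc = init;
  for (bool v : input)
    acc = body(acc, v);
  return acc;
}

// Folded form for a non-empty splat input: and/or are idempotent, so a
// single application of the body to (init, splat) gives the same result
// as applying it once per element.
bool foldSplatReduce(bool splat, bool init, Body body) {
  return body(init, splat);
}
```

For the issue's case (140 elements of `true`, init `true`, `and` body), both functions return `true`, which is the constant the reduce folds to.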
    if (body.getOperations().size() != 2)
      return rewriter.notifyMatchFailure(op, "Incompatible op count in body.");
    if (!isa<stablehlo::AndOp, stablehlo::OrOp>(body.front()))
can we extend this to also support add/mul, including non constants? [and appropriate tests]
A clarification before I extend further: on closer look, this isn't already a pattern in upstream's StablehloAggressiveFolder. The existing LowerBoolSplatConstantsIntoReduceOpRegion only handles idempotent body ops (and/or, where f(x, x) = x, so reducing any number of copies of x gives the same result), and the cascade only terminates correctly for those.
For add/mul the result depends on the number of elements being reduced: reduce(splat<2>, init=0) {add} over 4 elements should fold to 8, not 2. So it needs a new pattern that computes the closed form (init + N*x for add, init * x^N for mul), with overflow and float-precision handling.
That's meaningfully bigger than what's in this PR. Happy to do it as a follow-up if you'd rather keep #2524 focused on the and/or fold lifted from upstream. Let me know.
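The closed forms mentioned above can be sketched for 64-bit integers as follows. This is an illustrative model only, not the proposed pattern: `foldAddReduceOfSplat` and `foldMulReduceOfSplat` are hypothetical helpers, declining to fold on overflow is an assumed policy, and the `__builtin_*_overflow` intrinsics assume GCC or Clang. Floating-point element types are left out, since a closed form there may round differently from sequential accumulation.

```cpp
#include <cstdint>

// Hypothetical helpers, not part of Enzyme-JAX or StableHLO.

// Closed form of reduce(splat<x>, init) {add} over n elements: init + n * x.
// Returns false (assumed policy: decline to fold) if n is negative or the
// result does not fit in int64_t; otherwise writes the result to `out`.
bool foldAddReduceOfSplat(int64_t init, int64_t x, int64_t n, int64_t &out) {
  if (n < 0)
    return false;
  int64_t scaled = 0;
  if (__builtin_mul_overflow(n, x, &scaled))
    return false;
  return !__builtin_add_overflow(init, scaled, &out);
}

// Closed form of reduce(splat<x>, init) {mul} over n elements: init * x^n.
// Computed by repeated multiplication for clarity; conservatively returns
// false as soon as any intermediate product overflows int64_t.
bool foldMulReduceOfSplat(int64_t init, int64_t x, int64_t n, int64_t &out) {
  if (n < 0)
    return false;
  int64_t result = init;
  for (int64_t i = 0; i < n; ++i) {
    if (__builtin_mul_overflow(result, x, &result))
      return false;
  }
  out = result;
  return true;
}
```

With init = 0, x = 2, and n = 4, the add form yields 8, matching the example in the comment above; an and/or-style fold that ignores n would give 2.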
wsmoses left a comment
make sure to also add this to tablegen [see the dev docs]
Thanks for the quick review @wsmoses! I will extend to add/mul, add tests, and wire up the tablegen registration. I might come back with a clarifying question on the "non-constants" scope once I've sketched it out, since the current pattern is structured around splat-constant inputs.
Addresses @wsmoses's "make sure to also add this to tablegen [see the dev docs]" review feedback on EnzymeAD#2524.

Per DEVDOCS.md, a new pattern must be:

1. Defined in EnzymeHLOOpt.cpp (prior commit)
2. Registered in EnzymeHLOOptPass::runOnOperation (prior commit)
3. Exposed as a transform op in TransformOps.td (this commit)
4. Added to the default pass list in EnzymeXLA.cpp (this commit)

After this commit, both LowerBoolSplatConstantsIntoReduceOpRegion and FoldReduceOpToConstantInitializer are reachable both through --enzyme-hlo-opt (already, via runOnOperation) and through --enzyme-hlo-generate-td="patterns=lower_bool_splat_constants_into_reduce_op_region,..." (new, via the transform-interpreter pipeline used by the Python optimization_passes wrapper in primitives.py).

All 24 lit tests touching reduce/const/and still pass.
Summary of the PR
Closes #1084.
Lifts `LowerBoolSplatConstantsIntoReduceOpRegion` and `FoldReduceOpToConstantInitializer` from openxla/stablehlo's `StablehloAggressiveFolder.cpp` into EnzymeHLOOpt's default pipeline, per @wsmoses' suggestion on the issue.

Together with Enzyme's existing `AndSimplify`/`OrSimplify` patterns, the two new patterns let `--enzyme-hlo-opt` fold reduce-of-splat-constants with `And`/`Or` bodies into a single constant.

Example

The reduce from the issue (a `stablehlo.reduce` applying `stablehlo.and` across dimensions [0, 1] of a `dense<true>` `tensor<1x140xi1>`, with a `dense<true>` init) now folds to a single `true` constant.

The fold proceeds in three steps:

1. `LowerBoolSplatConstantsIntoReduceOpRegion` rewrites the reduce body so the splat constants are materialized inside the region.
2. `AndSimplify` folds that to `const_true`.
3. `FoldReduceOpToConstantInitializer` then folds the reduce, whose body now returns a constant.

Scope

- Other `FoldReduceOp*` patterns from the upstream file (`FoldReduceOpReducingZeroDims`, `FoldReduceOpWithRedundantResults`) are out of scope.
- Only `And`/`Or` body ops, matching upstream.

Test plan

- `test/lit_tests/reduce_const_prop.mlir` covering the issue's exact MLIR, a sibling `or` case, and a negative case (non-constant input).
- Existing lit tests run with `bazel test`, no regressions: `addreduceslicefusion(2)`, `and_const_prop`, `and_pad_pad`, `binop_const_lift_computation`, `binopcomplexconstsimplify`, `broadcastreduce`, `concatreduce(2, 3)`, `constpadconcat_to_concat`, `constpropthroughbarrier`, `convert_to_splatted_constants`, `convertconst`, `elementwise_reduce_slice_fuse(2)`, `foldgather`, `foldpad`, `fullreduce_nocrash`, `gatherconstprop`, `is_finite_const_prop`, `log_const_prop`, `math_const_prop`.
- Build (`bazel build -c opt :enzymexlamlir-opt`): no new warnings introduced.