Skip to content

refactor[next-dace]: Add library node for reduce with skip values#2603

Merged
edopao merged 22 commits into
GridTools:mainfrom
edopao:dace_reduce_skip_node
May 27, 2026
Merged

refactor[next-dace]: Add library node for reduce with skip values#2603
edopao merged 22 commits into
GridTools:mainfrom
edopao:dace_reduce_skip_node

Conversation

@edopao
Copy link
Copy Markdown
Contributor

@edopao edopao commented May 15, 2026

DaCe has a library node in its standard library for reduction. However, GT4Py and ICON4Py also need to lower reduction expressions with skip values. The lowering was already done, on main, inside a nested SDFG. All this PR does is to refactor the code that creates the nested SDFG and make an application library node in GT4Py. In this way, the SDFG nodes for regular reduction and the nodes for reduction with skip values are represented in the same way in the SDFG.

No performance regression in ICON4Py blueline:
bench_blueline_stencil_compute

@edopao edopao force-pushed the dace_reduce_skip_node branch from 4e1c0e5 to 308cfc7 Compare May 15, 2026 15:43
@edopao
Copy link
Copy Markdown
Contributor Author

edopao commented May 15, 2026

cscs-ci run dace

@edopao
Copy link
Copy Markdown
Contributor Author

edopao commented May 20, 2026

cscs-ci run default

@edopao edopao force-pushed the dace_reduce_skip_node branch 2 times, most recently from 5fcb578 to bf658c3 Compare May 20, 2026 16:31
FlorianDeconinck and others added 8 commits May 21, 2026 10:06
## Description

Do not apply `open_mp` flags on device compilers - we don't do
multithreading host code for GPU backends (and device is plain wrong).

This flag usage was discovered in the case of an `icx/nvcc` where the
`-qopenmp` intel flag bled into the command line of `nvcc`.

## Requirements

- [ ] All fixes and/or new features come with corresponding tests.
- [ ] Important design decisions have been documented in the appropriate
ADR inside the [docs/development/ADRs/](docs/development/ADRs/README.md)
folder.
- Use same `dace` version for GT4Py cartesian and next.
- Make `dace` a GT4Py dependency, and remove the development groups
`dace-cartesian` and `dace-next`.
- Instead of using the `gridtools` package index, pull the dace
pre-release package from PyPI index.

Note: if in the future a custom (non-pypi) version is needed, cartesian
and next need to create a version from a common branch (and publish to
the gridtools index).
@edopao edopao force-pushed the dace_reduce_skip_node branch from b8f0627 to 14443e2 Compare May 21, 2026 08:07
Copy link
Copy Markdown
Contributor

@philip-paul-mueller philip-paul-mueller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some small things but nothing big.

Comment on lines +29 to +31
wcr = dace_properties.LambdaProperty(default="lambda a, b: a")
identity = dace_properties.Property(default=0)
init = dace_properties.Property(default=0)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these properties should not have a default.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried, but it is not allowed:
Error importing plugin "gt4py.next.type_system.mypy_plugin": Default not properly defined for property [misc]

By the way, the dace standard reduce does the same.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a random question, did you specify a dtype value?
Looking at the error and Property::init() I would say that dtype is missing.
You can also play with allow_none.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that dtype is not known. Yes, I could play with allow_none indeed.

max_neighbors = mask_desc.shape[1]
assert isinstance(max_neighbors, int) or str(max_neighbors).isdigit()

local_dim_index = inedge.data.src_subset.size().index(max_neighbors)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might break in cases where the number of {edges, cell, vertices} is the same as local dimensions.
Probably too paranoid and I do not have an alternative.
But you should probably put a TODO or NOTE here explaining what it is doing and what the problems are.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have an idea. I can check that the size of the subset is 1 in all dimensions except one, the local dimension. I will push a new commit.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should work (except for the case that there is only a single neighbour, which is probably unlikely).

Copy link
Copy Markdown
Contributor

@philip-paul-mueller philip-paul-mueller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small suggestions.

Comment on lines +29 to +31
wcr = dace_properties.LambdaProperty(default="lambda a, b: a")
identity = dace_properties.Property(default=0)
init = dace_properties.Property(default=0)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a random question, did you specify a dtype value?
Looking at the error and Property::init() I would say that dtype is missing.
You can also play with allow_none.

max_neighbors = mask_desc.shape[1]
assert isinstance(max_neighbors, int) or str(max_neighbors).isdigit()

local_dim_index = inedge.data.src_subset.size().index(max_neighbors)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should work (except for the case that there is only a single neighbour, which is probably unlikely).


# We now expand all GT4Py specific library nodes.
# We do this such that we have control over all the Maps that are there.
# TODO(edopao,phimuell): It is probably the right place, but maybe there is a better one.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am actually not sure if it is the best one.
I would maybe also think of moving it at the very end, however, given my experience with the broadcast node, we should probably keep it here for now.

gtx_transformations.FuseHorizontalConditionBlocks(),
validate=True,
validate_all=True,
validate=False,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch.

Copy link
Copy Markdown
Contributor

@philip-paul-mueller philip-paul-mueller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@edopao edopao merged commit 154bd05 into GridTools:main May 27, 2026
24 checks passed
@edopao edopao deleted the dace_reduce_skip_node branch May 27, 2026 13:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants