Skip to content

feat: add lambda expression support#710

Open
limameml wants to merge 6 commits intosubstrait-io:mainfrom
limameml:limame.malainine/add-lambda-support
Open

feat: add lambda expression support#710
limameml wants to merge 6 commits intosubstrait-io:mainfrom
limameml:limame.malainine/add-lambda-support

Conversation

@limameml
Copy link

This PR is to add support for the Lambda Expression

Summary

  • Add lambda expression support to substrait-java core and isthmus
  • Implement Type.Func for function types
  • Add Expression.Lambda and lambda parameter references
  • Full proto round-trip support
  • Calcite (isthmus) integration (single-level lambdas only)

Test plan

  • Added LambdaExpressionRoundtripTest in core
  • Added LambdaExpressionTest in isthmus

closes #687

@CLAassistant
Copy link

CLAassistant commented Feb 24, 2026

CLA assistant check
All committers have signed the CLA.

@benbellick benbellick self-requested a review February 24, 2026 17:00
Copy link
Member

@benbellick benbellick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only just started looking at this but flushing a few comments just to get the ball rolling. On the whole this is looking good though, nice work!

private final Type.Struct rootType;
private final ProtoTypeConverter protoTypeConverter;
private final ProtoRelConverter protoRelConverter;
private final List<Type.Struct> lambdaParameterStack = new ArrayList<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use a stack here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to access lambdaIndex = lambdaParameterStack.size() - 1 - stepsOut here: lambdaParameters = lambdaParameterStack.get(lambdaIndex); when building the Lambda parametre ref that's why I didn't use a stack

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair. Java's stack does extend Vector though so you still get the get method. I'm not so particular on this one.

The only thing I will say is I do think that the whole stack parameter stuff can be confusing for people without a little bit of PL experience, which is why I pushed @Slimsammylim to encapsulate some of that logic and put it in docstring comments. I could see a case for encapsulating this logic in a local private class, though it probably makes more sense to do this only if this logic comes up again.

I expected to see something like this logic elsewhere, as you need to do it to do validation when building lambdas. Though I didn't find anything like this. Does the builder for lambdas in this PR actually validate that the lambda is semantically correct?

Copy link
Member

@bestbeforetoday bestbeforetoday Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it would be nice for clarity to encapsulate this quite specific behaviour in a class (single responsibility principle). It doesn't need to extend one of the full collection interfaces, just provide the methods you need:

  • void push(Type.Struct value)
  • Type.Struct pop() (throw runtime NoSuchElementException if empty)
  • Type.Struct get(int stepsOut) (throw runtime IndexOutOfBoundsException on invalid stepsOut)

Or make it generic if you will reuse it for other types. The caller doesn't need to worry about clever indexing logic or what collection implementation is used internally; just make simple method calls that clearly describe the intent.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @benbellick, for your last point I don't quite understand what validation I should add, doesn't the validation happen when building the LAMBDA_PARAMETER_REFERENCE?


Type.Struct lambdaParameters = lambdaParameterStack.get(lambdaIndex);
return FieldReference.newLambdaParameterReference(
reference.getDirectReference().getStructField().getField(),
Copy link

@Slimsammylim Slimsammylim Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something I'm curious about is why we use reference.getDirectReference().getStructField().getField() here instead of getDirectReferenceSegments(reference.getDirectReference()) like the ROOT_REFERENCE case does.

I'm unfamiliar with substrait-java, but does this current implementation support nested field access through lambda parameters?

*This may be an unimportant issue especially if it's not common to access nested fields in lambdas

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because the ofRoot method takes the struct itself while the newLambdaParameterReference takes the index, like the newRootStructOuterReference methode

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. I am still confused as to whether this supports nested field access through lambda parameters. I messed around w some tests through claude, and I believe it is unsupported. However, this is a nit and I'm not sure it will come up often in practice, so we can come back to this later if it ends up being an issue!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support lambdas

5 participants