[Compiler plugin] join operations support #1139

Merged
merged 2 commits into from
Apr 28, 2025

Conversation

koperagen
Collaborator

@koperagen koperagen commented Apr 22, 2025

In the first commit, I adapted ColumnMatch and ColumnList so they can be embedded into the compiler plugin's column resolution mechanism.
The new utility functions are designed to ensure that all column names the compiler plugin generates for join are exactly the same as at runtime, and that no columns are missing. However, the generated columns can sometimes be nullable where the runtime narrows the type to non-nullable.
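
To illustrate the two points above (matching column names, and nullability that is wider than at runtime), here is a minimal Kotlin sketch. It is not the plugin's code: `Col`, `uniqueName`, and `leftJoinSchema` are hypothetical names, and the numeric-suffix rule for clashing names is an assumption made for illustration only.

```kotlin
// Hypothetical, simplified model; not the plugin's actual utilities.
data class Col(val name: String, val nullable: Boolean)

// Assumed naming rule (illustration only): append the smallest positive
// number that makes the name unique among the names already taken.
fun uniqueName(name: String, taken: Set<String>): String {
    if (name !in taken) return name
    var i = 1
    while ("$name$i" in taken) i++
    return "$name$i"
}

// Schema of a left join: all left columns, then every right column
// that is not a join key.
fun leftJoinSchema(left: List<Col>, right: List<Col>, rightKeys: Set<String>): List<Col> {
    val result = left.toMutableList()
    val taken = left.map { it.name }.toMutableSet()
    for (c in right) {
        if (c.name in rightKeys) continue
        val name = uniqueName(c.name, taken)
        taken += name
        // Unmatched left rows would hold nulls here, so the static type has to
        // be nullable, even if at runtime every row happens to find a match.
        result += Col(name, nullable = true)
    }
    return result
}

fun main() {
    val left = listOf(Col("id", false), Col("name", false))
    val right = listOf(Col("id", false), Col("name", false), Col("city", false))
    println(leftJoinSchema(left, right, rightKeys = setOf("id")))
}
```

The sketch computes names and nullability from the schemas alone, which is why its result can only be as narrow as what is known without looking at the data.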

@koperagen koperagen added the Compiler plugin Anything related to the DataFrame Compiler Plugin label Apr 22, 2025
@koperagen koperagen added this to the 1.0.0-Beta1 (0.16) milestone Apr 22, 2025
@koperagen koperagen self-assigned this Apr 22, 2025
@koperagen koperagen changed the title [Compiler plugin join support [Compiler plugin] join operations support Apr 22, 2025
@koperagen koperagen force-pushed the compiler-plugin-join-support branch 2 times, most recently from 2a17323 to bccd01e Compare April 22, 2025 16:43
Collaborator

@Jolanrensen Jolanrensen left a comment

nice :)


internal data class ColumnMatchApproximation(val left: ColumnsResolver, val right: ColumnsResolver)
internal data class ColumnMatchApproximation(
Collaborator

Could you give me a hint about what "Approximation" means? I see it a lot in the naming, but I could not understand the idea.

Collaborator Author

@koperagen koperagen Apr 28, 2025

In the library we have ColumnMatch, and operations work with this class. We implement the same operation in the compiler plugin, but we cannot cover 100% of usages (for example, when a variable is used instead of a constant), and we cannot refer to DataFrame; instead we have the more limited PluginDataFrameSchema, and so on. So it wouldn't be fair to just name it "ColumnMatch". "Approximation" indicates that it's a "compile-time ColumnMatch". Something like this.
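
As a rough illustration of the "compile-time ColumnMatch" idea, here is a small Kotlin sketch. All names in it (`SchemaColumn`, `SchemaSketch`, `MatchApproximationSketch`, `approximateMatch`) are hypothetical stand-ins, not the plugin's real classes; a `null` key models a join key that is not a compile-time constant.

```kotlin
// Hypothetical, simplified stand-ins; not the plugin's actual classes.
data class SchemaColumn(val name: String, val type: String, val nullable: Boolean)

// Compile-time view of a frame: column names and types only, no row data.
data class SchemaSketch(val columns: List<SchemaColumn>)

// Compile-time counterpart of a runtime column match:
// it can only hold what is known statically.
data class MatchApproximationSketch(val left: String, val right: String)

// Returns null when a key cannot be resolved against the schemas,
// standing in for the usages that cannot be covered at compile time.
fun approximateMatch(
    left: SchemaSketch,
    right: SchemaSketch,
    leftKey: String?,
    rightKey: String?,
): MatchApproximationSketch? {
    // A null key models a variable whose value is unknown at compile time.
    if (leftKey == null || rightKey == null) return null
    val l = left.columns.find { it.name == leftKey } ?: return null
    val r = right.columns.find { it.name == rightKey } ?: return null
    return MatchApproximationSketch(l.name, r.name)
}

fun main() {
    val users = SchemaSketch(listOf(SchemaColumn("id", "Int", false)))
    val orders = SchemaSketch(listOf(SchemaColumn("userId", "Int", false)))
    println(approximateMatch(users, orders, "id", "userId"))
    val keyFromVariable: String? = null
    println(approximateMatch(users, orders, keyFromVariable, "userId"))
}
```

The point of the sketch is only the shape of the limitation: the approximation works on a schema rather than on data, and it gives up when the inputs are not statically known.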

}

internal class ColumnListImpl<C>(override val columns: List<ColumnsResolver<C>>) :
ColumnSet<C>,
Collaborator

wow, formatting looks weird to me

Collaborator

Unfortunately this is the only way the linter allows it

Collaborator

@zaleslaw zaleslaw left a comment

The implementation is clear to me, but I'm not sure that all required test paths are covered. Could you please answer?

@koperagen koperagen force-pushed the compiler-plugin-join-support branch from bccd01e to 9440640 Compare April 28, 2025 10:24
@koperagen koperagen merged commit 6532ce6 into master Apr 28, 2025
6 checks passed
@koperagen koperagen deleted the compiler-plugin-join-support branch April 30, 2025 10:16
Labels
Compiler plugin Anything related to the DataFrame Compiler Plugin
3 participants