[5201] feat(client-python): Implement expressions in python client #5646
clients/client-python/gravitino/api/expressions/named_reference.py
@@ -0,0 +1,76 @@
# Licensed to the Apache Software Foundation (ASF) under one
Why do we call this a named reference?
Is there a case where a reference has no name?
Do we have to differentiate these two types of references?
@xunliu Do you have any idea about the name for this feature?
Hi @tengqm, this is because we need to implement all classes in the Python client.
@mchades Do you have any suggestions?
Why do we call this a named reference?
It represents a reference to a field or column by its name, which is the most common way to refer to columns in SQL: SELECT name FROM table
Is there a case where a reference has no name?
Yes, examples include:
- Positional references: SELECT $1
- Expression results: SELECT (a + b)
- Anonymous subquery columns
Do we have to differentiate these two types of references?
We currently don't have UnnamedReference in Gravitino because we haven't encountered the usage scenarios.
So my suggestion is to keep it as is, matching the Java implementation.
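To make the idea concrete, here is a minimal sketch of what a reference-by-name could look like in Python. It is not the code from this PR: the class layout, the `field` helper, and the dotted string form are all assumptions made for illustration.

```python
from typing import List


class NamedReference:
    """Hypothetical sketch: refers to a field or column by its name parts."""

    def __init__(self, field_name: List[str]):
        # Copy the list so later changes by the caller do not leak in.
        self._field_name = list(field_name)

    def field_name(self) -> List[str]:
        return self._field_name

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, NamedReference):
            return False
        return self._field_name == other._field_name

    def __hash__(self) -> int:
        return hash(tuple(self._field_name))

    def __str__(self) -> str:
        # Assumed display form: nested names joined with dots.
        return ".".join(self._field_name)


def field(*parts: str) -> NamedReference:
    """Hypothetical helper that builds a reference from name parts."""
    return NamedReference(list(parts))


ref = field("student", "name")
```

A positional reference such as SELECT $1 would need an index rather than a name, which is why it would not fit this shape.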
Really appreciate this clarification. Thanks.
self._function_name == other._function_name
and self._arguments == other._arguments
)
return False
Gut feeling is that you may want to leave a TODO here ...
Thanks for your comment. Updated the TODO comment. Does that work for you?
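For context, the fragment under review appears to be the tail of an `__eq__` method. The sketch below shows one way such a method could be written in full. The class name `FuncExpression` and its constructor are hypothetical; only the two compared attributes come from the diff. The matching `__hash__` is an addition, included because Python sets `__hash__` to None on a class that defines `__eq__` alone.

```python
from typing import Any, List


class FuncExpression:
    """Hypothetical stand-in for a function-call expression."""

    def __init__(self, function_name: str, arguments: List[Any]):
        self._function_name = function_name
        self._arguments = list(arguments)

    def __eq__(self, other: object) -> bool:
        # Compare field by field only when the other object has the same type.
        if isinstance(other, FuncExpression):
            return (
                self._function_name == other._function_name
                and self._arguments == other._arguments
            )
        return False

    def __hash__(self) -> int:
        # Assumes every argument is itself hashable.
        return hash((self._function_name, tuple(self._arguments)))


add_1_2 = FuncExpression("add", [1, 2])
```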
Hi @SophieTech88, I will help you improve this PR.
I'd suggest we split this into a few smaller PRs.
pass

@abstractmethod
def data_type(self):
It sounds like the above two methods are to be implemented by subclasses anyway.
If that is true, I don't think we should do a pass here.
We may want to raise an exception instead.
Agreed. I just updated the code to raise NotImplementedError() for those two functions.
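A rough sketch of the pattern discussed here, assuming the two methods are `value` and `data_type` (only `data_type` is visible in the diff). The `IntegerLiteral` subclass and the string it returns as a data type are placeholders, not Gravitino types.

```python
from abc import ABC, abstractmethod


class Literal(ABC):
    """Hypothetical abstract base for literal expressions."""

    @abstractmethod
    def value(self):
        # Reached only if a subclass delegates here via super().
        raise NotImplementedError()

    @abstractmethod
    def data_type(self):
        raise NotImplementedError()


class IntegerLiteral(Literal):
    """Hypothetical concrete subclass that overrides both methods."""

    def __init__(self, value: int):
        self._value = value

    def value(self) -> int:
        return self._value

    def data_type(self) -> str:
        # Placeholder: a real client would return a type object here.
        return "integer"


seven = IntegerLiteral(7)
```

With `@abstractmethod` in place, instantiating `Literal` directly already fails with a TypeError, so the explicit raise mainly guards against a subclass calling the base method through `super()`.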
def __init__(
    self,
    value: Union[int, float, str, datetime, time, date, bool, Decimal, None],
Why do we have a Decimal here?
Because we need to support the Decimal type here.
Got it.
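As a side note on why Decimal may be worth listing next to float: Python's `decimal.Decimal` keeps exact decimal digits, while float is binary floating point. The sketch below assumes a simple holder class named `LiteralImpl`; that name and the `LiteralValue` alias are illustrative and not taken from the PR.

```python
from datetime import date, datetime, time
from decimal import Decimal
from typing import Union

# Same value types as the signature under review.
LiteralValue = Union[int, float, str, datetime, time, date, bool, Decimal, None]


class LiteralImpl:
    """Hypothetical holder for a literal value."""

    def __init__(self, value: LiteralValue):
        self._value = value

    def value(self) -> LiteralValue:
        return self._value


# Built from a string, the Decimal keeps its digits and scale exactly.
price = LiteralImpl(Decimal("0.10"))

# Decimal arithmetic stays exact where binary floats do not.
float_sum_is_exact = (0.1 + 0.2 == 0.3)
decimal_sum_is_exact = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))
```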
I'd suggest we split this into a few smaller PRs.
Agree with @tengqm, the current PR is too large to review.
As this PR is focused on expressions, I suggest moving distributions, sorts, and transforms to separate PRs. This will make it easier for us to review this PR. WDYT? @SophieTech88
What changes were proposed in this pull request?
Implement the expressions from the Java implementation, including:
Convert them to the Python client, and add unit tests for each class.
Why are the changes needed?
We need to support expressions in the Python client.
Fix: #5201
Does this PR introduce any user-facing change?
No
How was this patch tested?
Need to pass all unit tests.