-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Projections follow up #454
Conversation
This is a follow up PR after merging the Projections PR. This is a combination of lots of small things. - Add documentation to properties of `MapEntry` so that they are displayed - Move map#project so that we maintain alphabetical order - Add some corner case tests for projections and aggregations - Used asserCountEqual on projection tests so that the tests will be more durable, even if we add more items to map - Add missing API documentation for projections - Fix API documentation of projections around the return value documentations - Add unit tests for the invalid projection inputs - Make the projections code snippet simpler - Add a code sample for projections
Codecov Report
@@ Coverage Diff @@
## master #454 +/- ##
==========================================
- Coverage 94.45% 94.21% -0.24%
==========================================
Files 345 345
Lines 17545 17545
==========================================
- Hits 16572 16530 -42
- Misses 973 1015 +42
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, but I think using attribute
instead of single_attribute
and multi_attribute
would make our API simpler and more Pythonic.
filtered_ages = employees.project(single_attribute("age"), greater("age", 23)) | ||
|
||
# Prints: "Ages of the filtered employees are [25, 40]" | ||
print("Ages of the filtered employees are %s" % filtered_ages) | ||
|
||
attributes = employees.project(multi_attribute("age", "height")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about replacing single_attribute
and multi_attribute
with just attribute
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But they are two distinct things with distinct return types. One returns list[any]
, the other list[list[any]]
when used with project method.
Also, I find such codes like the below a bit weird in the implementation. IMHO, a separation is better than this
def attribute(*attrs):
if len(attrs) == 1:
return _SingleAttribute(attrs[0])
return _MultiAttribute(attrs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we make it just attribute
, the return type can be list[list[any]]
. I think it's not weird at all to distinguish between _SingleAttribute
and _MultiAttribute
depending on the number of attributes. Those two are implementation details, the user doesn't need to know about them. (I didn't check that, but I guess they are different in the protocol as well, not sure about the reason of that decision)
@@ -48,7 +48,7 @@ def get_class_id(self): | |||
|
|||
|
|||
class _MultiAttributeProjection(_AbstractProjection): | |||
def __init__(self, *attribute_paths): | |||
def __init__(self, attribute_paths): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding types to new classes and functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added type hints to the projections module. I didn't type-hint the code that is coming from the superclass (IdentifiedDataSerializable), should I add type hints to those methods too (like get_class_id)?
Also, I question. We have docstrings that define types for public methods, should we define type hints for them too? I have added type hints to them but I can remove them if you want
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to this SO discussion, it is possible to configure sphinx to recognize types, so writing in docstrings is probably not necessary: https://stackoverflow.com/questions/40071573/python-3-sphinx-doesnt-show-type-hints-correctly But I think we should keep them until we are sure about that.
IMO adding types to only functions/classes added/changed in this PR is OK. We can tackle with adding types to others in a separate PR. I think types in the base class are used for derived classes, but since IdentifiedDataSerializable
is an older class, I think it's OK to not add types to get_class_id et al.
with self.assertRaises(AssertionError): | ||
self.map.aggregate(None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be simplified as: self.assertRaises(AssertionError, lambda: self.map.aggregate(None))
I think most of other assertRaises uses can also be put into a single line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally prefer this style, and try to use it like this across the codebase. Advantages:
- You can easily put breakpoints
- You can put multiple lines without defining a closure
And IMHO, this one is more readable
filtered_ages = employees.project(single_attribute("age"), greater("age", 23)) | ||
|
||
# Prints: "Ages of the filtered employees are [25, 40]" | ||
print("Ages of the filtered employees are %s" % filtered_ages) | ||
|
||
attributes = employees.project(multi_attribute("age", "height")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we make it just attribute
, the return type can be list[list[any]]
. I think it's not weird at all to distinguish between _SingleAttribute
and _MultiAttribute
depending on the number of attributes. Those two are implementation details, the user doesn't need to know about them. (I didn't check that, but I guess they are different in the protocol as well, not sure about the reason of that decision)
@@ -48,7 +48,7 @@ def get_class_id(self): | |||
|
|||
|
|||
class _MultiAttributeProjection(_AbstractProjection): | |||
def __init__(self, *attribute_paths): | |||
def __init__(self, attribute_paths): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to this SO discussion, it is possible to configure sphinx to recognize types, so writing in docstrings is probably not necessary: https://stackoverflow.com/questions/40071573/python-3-sphinx-doesnt-show-type-hints-correctly But I think we should keep them until we are sure about that.
IMO adding types to only functions/classes added/changed in this PR is OK. We can tackle with adding types to others in a separate PR. I think types in the base class are used for derived classes, but since IdentifiedDataSerializable
is an older class, I think it's OK to not add types to get_class_id et al.
This is a follow up PR after merging the Projections PR. This is
a combination of lots of small things.
MapEntry
so that they aredisplayed
be more durable, even if we add more items to map