-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Operator tp.where() #265
Operator tp.where() #265
Conversation
ad7e8cc
to
6f9632d
Compare
Changelog 🔔 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome
### Improvements | ||
|
||
- Print `EventSet` timestamps as datetimes instead of float. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of those improvements are in fact "features".
A feature add a new functionality. An improvement does not: For example, increasing the speed of something is an improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue that this isn't adding a new functionality, rather improving an existing one (user was already able to print an eventset, this just makes it better). The sampling
support on the other hand I would consider a feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are minor changes that don't enable new use cases (the arg. for cumsum
was already available through moving_sum
, since it's an alias).
I'd keep the "Features" section for "new things you can do with the library". Does it make sense?
pb.OperatorDef.Attribute( | ||
key="on_true", | ||
type=pb.OperatorDef.Attribute.Type.ANY, | ||
is_optional=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are both on_true and on_false required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I don't think it's obvious what value to use by default if some of them is not provided.
### Improvements | ||
|
||
- Print `EventSet` timestamps as datetimes instead of float. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue that this isn't adding a new functionality, rather improving an existing one (user was already able to print an eventset, this just makes it better). The sampling
support on the other hand I would consider a feature.
|
||
def where( | ||
self: EventSetOrNode, | ||
on_true: Union[EventSetOrNode, Any], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we create an alias for Union[int, float, str, bool]
? we have T_SCALAR
defined in event_set_ops which is only (int, float), but int that same file in __ne__
's def we join it with (bool, str)
too. Maybe this definition could improve the signature's legibility
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this. T_SCALAR
is actually a tuple, we're not using it in typehints.
This would make sense if we define types in typing.py
like PythonValue
and PythonScalarValue
and use it consistently in the event_set_ops types.
But I think we should discuss it previously, I would not add it in this PR.
@@ -68,6 +68,7 @@ def __init__( | |||
"`window_length` must have exactly one float64 feature." | |||
) | |||
self.add_input("window_length", window_length) | |||
self._window_length = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch :) and great example of the problems associated with having tons of tests on the implementation and not so tons of them on the core op
ac2ea1d
to
59d03b9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome
Also, changes in
normalize_features
:np.str_
(unicode strings) were not casted tonp.bytes_
np.object_
).np.datetime
gets converted tofloat64
instead ofint64
, to keep it consistent with timestamps handling.