-
Notifications
You must be signed in to change notification settings - Fork 750
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add transform objects for temporal point processes (#294) #341
Conversation
I understand both test fails are related to building docs on the GPU instance. Can't tell if they're related to the PR. |
They aren't, we will look into the CI configuration. And thanks for the PR! |
@@ -155,6 +155,42 @@ def __call__(self, ts: np.ndarray, a: int, b: int) -> np.ndarray: | |||
return np.array([b]) | |||
|
|||
|
|||
class ContinuousTimePointSampler: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This abstraction looks very similar to the InstanceSampler
: https://github.com/awslabs/gluon-ts/blob/a97518d436dbb8e2cb6e2958d4ab899e5fa43a9d/src/gluonts/transform.py#L94
The only difference seems the num_instances
constructor argument here. I'm wondering if the two could be condensed somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in continuous time you sample between points in time (float), as opposed to indices (int).
461d8ee
to
a21681d
Compare
Codecov Report
@@ Coverage Diff @@
## master #341 +/- ##
==========================================
+ Coverage 83.7% 83.78% +0.07%
==========================================
Files 159 159
Lines 9403 9458 +55
==========================================
+ Hits 7871 7924 +53
- Misses 1532 1534 +2
|
@lostella Is this a good time to bump this? :) |
Definitey :-) will go through it asap |
May I suggest @lovvge as a reviewer as well? |
84da906
to
5bb91fc
Compare
length of the interval seen before making prediction | ||
future_interval_length | ||
length of the interval that must be predicted | ||
train_sampler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused. It says "does not accept time-series fields" above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm making it a little more verbose now, but the intention was that the Transformation does not take time_series_fields
(not target_field
), which I remember were intended for features?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually the names are not defined in the transformation chain but in the estimator so it does not really matter.
src/gluonts/transform.py
Outdated
|
||
for future_start in sampling_times: | ||
|
||
r = dict() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this necessary? You could modify your data copy (d
) directly without introducing a new dict, no? This would avoid introducing a new variable and the adding the other fields later. Either way, r
should be of type DataEntry
if this is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
d
was initialized outside of the loop here (this may be a spurious copy, now that I look at it). r
is the output dict
.
I'm not a fan of the style in InstanceSplitter
where d is "copied" once per each output. However this is (probably?) a shallow copy. Moreover, features an explicit
del d[ts_field]
I thought this style was a little safer, or maybe I'm missing something here?
In any case, ditto for DataEntry
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason I would regard this as safer is if you have to do multiple transformations that rely on one original entry on in the DataEntry
(which would make sense to have it immutable in that case). I'm not a fan because it introduces more code than necessary, but tastes differ :-).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup. I agree. I thought in flatmap_transformation
this is indeed the intention that multiple entries derive from a single input. Hence the extra code, to stress that the input is immutable.
Just crossing the t's here to make sure :)
5bb91fc
to
442f7ff
Compare
442f7ff
to
a169110
Compare
a169110
to
8bb133a
Compare
Issue #, if available:
This is the first PR setting up the infrastructure for temporal point processes in GluonTS (#294).
Description of changes:
Put simply it introduces the
ContinuousTimeInstanceSplitter
object, which somewhat replicates the logic of anInstanceSplitter
for but for point processes in continuous time. I thought this was a good first PR for me, and a good place to start the process of bringing in TPPs.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.