Unify QuantileOutput and DistributionOutput #3093
Conversation
Thanks @shchur, left some comments but I’ll take a deeper look
```python
(outputs,), loc, scale = prediction_net(*inputs.values())
if scale is not None:
    outputs = outputs * scale[..., None]
if loc is not None:
    outputs = outputs + loc[..., None]
```
I'm wondering: would it be better to call `to_numpy` here? Otherwise the type of the output of `prediction_net` depends on the type of `prediction_net`, and we're using indexing/multiplication/addition without really knowing if it will work. Of course, it's just indexing/multiplication/addition, but I'm wondering if it would be somehow clearer if these objects were `np.ndarray` already at this point.
I agree, your suggestion feels cleaner. Implemented it.
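A minimal sketch of the agreed-upon change, assuming a `to_numpy` helper that converts framework tensors to arrays (the helper name comes from this PR; its exact definition and placement may differ):

```python
import numpy as np


def to_numpy(tensor) -> np.ndarray:
    # Hypothetical, simplified converter: works for CPU torch tensors and
    # numpy-compatible objects; the real helper dispatches per framework.
    return np.asarray(tensor)


(outputs,), loc, scale = prediction_net(*inputs.values())
outputs = to_numpy(outputs)
if scale is not None:
    outputs = outputs * to_numpy(scale)[..., None]
if loc is not None:
    outputs = outputs + to_numpy(loc)[..., None]
```

After the conversion, the scaling logic operates on plain `np.ndarray` objects regardless of which framework `prediction_net` comes from.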
```diff
 yield QuantileForecast(
-    output,
+    output.T,
```
why move the transposition here, from inside the model?
`QuantileForecast` expects an array of shape `[num_quantiles, prediction_length]`, but models usually produce output of shape `[batch_size, prediction_length, *additional_dims]`, in the case of both `DistributionOutput` and `QuantileOutput`. I think it's better to keep the model output shape consistent and do the transpose here.
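A small numpy illustration of the shape convention described above (the concrete dimensions are made up for the example):

```python
import numpy as np

batch_size, prediction_length, num_quantiles = 32, 12, 5

# Unified model output shape: [batch_size, prediction_length, num_quantiles]
batch_output = np.zeros((batch_size, prediction_length, num_quantiles))

# Iterating over the batch yields [prediction_length, num_quantiles] slices;
# QuantileForecast wants [num_quantiles, prediction_length], hence output.T.
for output in batch_output:
    assert output.T.shape == (num_quantiles, prediction_length)
```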
```diff
@@ -77,7 +78,7 @@ def assert_shapes_and_dtypes(tensors, shapes, dtypes):
 TemporalFusionTransformerModel(
     context_length=24,
     prediction_length=12,
-    quantiles=[0.2, 0.25, 0.5, 0.9, 0.95],
+    distr_output=QuantileOutput([0.2, 0.25, 0.5, 0.9, 0.95]),
```
I guess this class now also supports other parametric distribution families, right? Should some test case be added for that?
We have a similar test in https://github.com/awslabs/gluonts/pull/3093/files#diff-636e8492b38a2a549ef86320a8d00ca2b531aff6e31b4d61260ce6ed5ea7d132R104, do you think that's sufficient?
Yes, sorry, somehow I missed that
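For reference, a sketch of how such a test could be parametrized over both output types, reusing the constructor arguments from the diff above (the import paths are assumptions, and the real model constructor may require additional arguments):

```python
import pytest

from gluonts.torch.distributions import QuantileOutput, StudentTOutput

# Assumed import path; may differ across gluonts versions.
from gluonts.torch.model.tft.module import TemporalFusionTransformerModel


@pytest.mark.parametrize(
    "distr_output",
    [StudentTOutput(), QuantileOutput([0.2, 0.25, 0.5, 0.9, 0.95])],
)
def test_tft_model_with_output(distr_output):
    # A real test would also run a forward pass and check output shapes
    # and dtypes, as assert_shapes_and_dtypes does in the modified test file.
    model = TemporalFusionTransformerModel(
        context_length=24,
        prediction_length=12,
        distr_output=distr_output,
    )
    assert model is not None
```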
```diff
@@ -292,7 +292,7 @@ def __init__(
     es_num_samples: int = 50,
     beta: float = 1.0,
 ) -> None:
-    super().__init__(self)
+    super().__init__(self, beta=beta)
```
Then I guess there's no need to set `self.beta` further below (line 306).
Good catch!
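A standalone illustration of the point, using hypothetical class names rather than the actual gluonts loss classes: once `beta` is forwarded to the base class, the subclass no longer needs its own `self.beta = beta` assignment.

```python
class BaseLoss:
    def __init__(self, beta: float = 1.0) -> None:
        self.beta = beta  # stored once, in the base class


class EnergyScoreLoss(BaseLoss):
    def __init__(self, es_num_samples: int = 50, beta: float = 1.0) -> None:
        super().__init__(beta=beta)  # forward beta instead of re-assigning it
        self.es_num_samples = es_num_samples


loss = EnergyScoreLoss(es_num_samples=50, beta=2.0)
assert loss.beta == 2.0
```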
Issue #, if available: closes #3083
Description of changes:
- Unify `DistributionOutput` and `QuantileOutput` under a single `Output` object and remove all losses defined in `gluonts.torch.module.loss`.
- The following models now support both `DistributionOutput` and `QuantileOutput` (see the usage sketch after this list):
  - `SimpleFeedForward`
  - `TemporalFusionTransformer`
  - `DLinear`
  - `PatchTST`
  - `LagTST`
- Unify the signature of the `forward` method for the following models to `Tuple[Tuple[Tensor, ...], Tensor, Tensor]`:
  - `MQCNN` (MXNet)
  - `MQRNN` (MXNet)
  - `TemporalFusionTransformer` (MXNet)
  - `TemporalFusionTransformer` (PyTorch)
- Update `QuantileForecastGenerator` to support the new unified signature of the `forward` method.
- Replace the `predict_to_numpy` method with `to_numpy`.
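A hypothetical usage sketch of the unified interface described above (the estimator name and constructor arguments are assumptions based on the gluonts API and may not match the PR exactly):

```python
from gluonts.torch.distributions import QuantileOutput, StudentTOutput
from gluonts.torch.model.simple_feedforward import SimpleFeedForwardEstimator

# Parametric distribution head:
estimator_distr = SimpleFeedForwardEstimator(
    prediction_length=12,
    distr_output=StudentTOutput(),
)

# Quantile regression head, passed through the very same argument:
estimator_quantile = SimpleFeedForwardEstimator(
    prediction_length=12,
    distr_output=QuantileOutput(quantiles=[0.1, 0.5, 0.9]),
)
```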
cc @kashif
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Please tag this PR with at least one of these labels to make our release process faster: BREAKING, new feature, bug fix, other change, dev setup