Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fixes fieldname issues #292

Merged
merged 3 commits into from
Sep 7, 2019
Merged

fixes fieldname issues #292

merged 3 commits into from
Sep 7, 2019

Conversation

benidis
Copy link
Contributor

@benidis benidis commented Sep 6, 2019

The field names of a dataset are not checked for consistency across the different modules. If a field changes in the FieldName class, there is no error raised, the old fields are never there in the new datasets and therefore ignored, and the new fields are not processed.

This caused issue #283 in the dataset statistics calculation dataset.calc_stats().

This PR includes the following changes:

  • fixes the calc_stats issue (fixes issue The calculation of dataset statistics reads wrong feature fields #283 )
  • moves the FieldName class to a separate module to avoid circular dependencies and to increase visibility for potential future changes (it was in the transform package with all the transformations and it did not really belong there)
  • adds tests for potential change on the FieldName class so people are warned at least that they should not touch it without a larger refactoring
  • adds tests for the calc_stats method

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@szha
Copy link
Member

szha commented Sep 6, 2019

Job PR-292/3 is complete.
Docs are uploaded to http://gluon-ts-staging.s3-accelerate.dualstack.amazonaws.com/PR-292/3/index.html

@codecov-io
Copy link

Codecov Report

Merging #292 into master will increase coverage by 37.12%.
The diff coverage is 97.5%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master     #292       +/-   ##
===========================================
+ Coverage   39.43%   76.56%   +37.12%     
===========================================
  Files         141      142        +1     
  Lines        7904     7940       +36     
===========================================
+ Hits         3117     6079     +2962     
+ Misses       4787     1861     -2926
Impacted Files Coverage Δ
src/gluonts/transform.py 82.89% <ø> (+49.7%) ⬆️
src/gluonts/model/deepstate/_estimator.py 0% <0%> (ø) ⬆️
src/gluonts/model/deep_factor/_estimator.py 0% <0%> (ø) ⬆️
src/gluonts/model/seq2seq/_seq2seq_estimator.py 53.96% <100%> (+0.74%) ⬆️
src/gluonts/model/canonical/_estimator.py 100% <100%> (+44.44%) ⬆️
src/gluonts/dataset/field_names.py 100% <100%> (ø)
src/gluonts/model/gp_forecaster/_estimator.py 100% <100%> (+53.65%) ⬆️
src/gluonts/model/trivial/mean.py 97.29% <100%> (+45.94%) ⬆️
src/gluonts/model/trivial/identity.py 100% <100%> (+35.29%) ⬆️
src/gluonts/dataset/stat.py 96.32% <100%> (+15.24%) ⬆️
... and 97 more

@lostella
Copy link
Contributor

lostella commented Sep 7, 2019

Could you point to the changes in the tests that would catch #283 without this fix? I assume these are in https://github.com/benidis/gluon-ts/blob/af9c7386a4ed1eaf66c44aa04a0b3f7c77f177f9/test/dataset/test_stat.py

@lostella
Copy link
Contributor

lostella commented Sep 7, 2019

Looks good, thanks!

@benidis
Copy link
Contributor Author

benidis commented Sep 7, 2019

@lostella: The issue with #283 was that the fields in calc_stats were hard coded so when we changed them in the FieldName class the function kept the old ones. It was not possible to use FieldName.{field} in calc_stat due to circular dependency. The fix basically removed the circular dependency and uses ts[FieldName.{field}] to reference a field instead of the previous hardcoded ts["field"] that can be obsolete. This would prevent an issue like #283 .

The tests you referred to were obsolete since they were for the older fields and the fixes there reflect that. I also added a test that hardcodes the current fields. This test aims to ensure that a change in the fields will throw an error and should only be done as a part of a bigger refactoring.

@benidis benidis merged commit 2394ce5 into awslabs:master Sep 7, 2019
@benidis benidis deleted the fix_calc_stats branch September 9, 2019 08:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants