make test data smaller #785

Closed
gdementen opened this issue Jun 27, 2019 · 6 comments

Comments

@gdementen
Contributor

For a reason which escapes me, population_session.h5 is 3 MB (while the same data in .xlsx is 11.8 KB), and the demography.* files are a bit large too, making the total size of the larray package larger than I would prefer.

@alixdamman
Collaborator

alixdamman commented Jun 27, 2019

In PR #771, I have started replacing the demography dataset with the demography_eurostat dataset (which is actually population_session renamed).

Is demography used in the unit tests?

@gdementen
Contributor Author

I don't remember...

@alixdamman
Collaborator

I temporarily changed the name of demography, ran the tests and found that demography is only used in the load_example_data() function and in the tutorial.

However, there is still a weird problem with demography_eurostat.h5. I'll take a look.

@alixdamman
Collaborator

alixdamman commented Jun 27, 2019

After some tests, I discovered that demography_eurostat.h5 is huge because the session contains Axis objects with labels of dtype '<U'.
There is clearly a problem when dumping axes with unicode labels to an HDF5 file.
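
To make the diagnosis concrete, here is a minimal sketch of how label arrays with a '<U' dtype could be spotted. It uses plain NumPy only: `has_unicode_labels` is a hypothetical helper (not part of larray), and the `country` and `year` arrays are made-up stand-ins for axis labels. It does not reproduce the HDF5 dump itself.

```python
import numpy as np


def has_unicode_labels(labels):
    """Return True if the label array uses a NumPy unicode ('<U...') dtype.

    Hypothetical helper for illustration; not an larray function.
    """
    return np.asarray(labels).dtype.kind == "U"


# Made-up label arrays standing in for axis labels.
country = np.array(["Belgium", "France", "Germany"])  # str labels -> '<U7'
year = np.array([2013, 2014, 2015])                    # integer labels

print(has_unicode_labels(country))
print(has_unicode_labels(year))

# NumPy stores '<U' strings with 4 bytes per character, whereas a bytes
# ('S') array of the same ASCII labels uses 1 byte per character.
country_bytes = np.array(["Belgium", "France", "Germany"], dtype="S")
print(country.dtype, country.dtype.itemsize)
print(country_bytes.dtype, country_bytes.dtype.itemsize)
```

The itemsize difference is only the in-memory picture; how much of the file size it accounts for once the labels are written to HDF5 is not shown here.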

@gdementen
Contributor Author

Can we solve this?

@alixdamman
Collaborator

We need to convert unicode labels to bytes.
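
A rough sketch of what such a conversion could look like, assuming UTF-8 as the encoding. `encode_labels` and `decode_labels` are hypothetical helper names used only for illustration; this is not the actual fix that went into larray, just the round trip on a plain NumPy array.

```python
import numpy as np


def encode_labels(labels, encoding="utf-8"):
    """Convert unicode ('<U') labels to bytes ('S'); leave other dtypes alone.

    Hypothetical helper for illustration; not an larray function.
    """
    labels = np.asarray(labels)
    if labels.dtype.kind == "U":
        return np.char.encode(labels, encoding)
    return labels


def decode_labels(labels, encoding="utf-8"):
    """Convert bytes ('S') labels back to unicode ('<U') after loading."""
    labels = np.asarray(labels)
    if labels.dtype.kind == "S":
        return np.char.decode(labels, encoding)
    return labels


country = np.array(["Belgium", "France", "Germany"])
stored = encode_labels(country)      # what would be written to disk
restored = decode_labels(stored)     # what the user would get back

print(stored.dtype.kind, restored.dtype.kind)
print((restored == country).all())
```

Decoding on load keeps the change invisible to users, who would still see str labels. One caveat of this sketch: labels that were genuinely bytes before dumping would also come back as str, so a real implementation would need to record the original dtype.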

alixdamman added a commit to alixdamman/larray that referenced this issue Aug 5, 2019
…ctory

- renamed 'population_session' directory and files as 'demography_eurostat'
- made 'demography_eurostat' as new available dataset in function load_example_data()
- fix larray-project#785
@alixdamman alixdamman added this to the 0.32 milestone Aug 13, 2019
alixdamman added a commit to alixdamman/larray that referenced this issue Aug 26, 2019
…ctory

- renamed 'population_session' directory and files as 'demography_eurostat'
- made 'demography_eurostat' as new available dataset in function load_example_data()
- fix larray-project#785
alixdamman added a commit to alixdamman/larray that referenced this issue Sep 23, 2019
…ctory

- renamed 'population_session' directory and files as 'demography_eurostat'
- made 'demography_eurostat' as new available dataset in function load_example_data()
- fix larray-project#785
alixdamman added a commit to alixdamman/larray that referenced this issue Sep 23, 2019
…ctory

- renamed 'population_session' directory and files as 'demography_eurostat'
- included values for years 2016 and 2017 for all arrays of 'demography_eurostat'
- made 'demography_eurostat' as new available dataset in function load_example_data()
- fix larray-project#785
- added the 'Pythonic VS String Syntax' section in the tutorial
@alixdamman alixdamman removed this from the 0.32 milestone Oct 2, 2019
alixdamman added a commit to alixdamman/larray that referenced this issue Oct 7, 2019
- added generate_data.py module to generate the example and test data
- renamed 'population_session' dataset as 'demography_eurostat'
- included values for years 2016 and 2017 for all arrays of 'demography_eurostat'
- fix larray-project#785
- added the 'Pythonic VS String Syntax' section
- updated all existing sections to include changes up to the 0.31 release version
alixdamman added a commit to alixdamman/larray that referenced this issue Oct 8, 2019
- added generate_data.py module to generate the example and test data
- renamed 'population_session' dataset as 'demography_eurostat'
- included values for years 2016 and 2017 for all arrays of 'demography_eurostat'
- fix larray-project#785
- added the 'Pythonic VS String Syntax' section
- updated all existing sections to include changes up to the 0.31 release version