Unblock supported compression libs in pytables #16196

Merged
merged 1 commit into from
May 11, 2017

linebp (Contributor) commented May 2, 2017

codecov bot commented May 2, 2017

Codecov Report

Merging #16196 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #16196      +/-   ##
==========================================
- Coverage   90.39%   90.37%   -0.02%     
==========================================
  Files         161      161              
  Lines       50863    50863              
==========================================
- Hits        45977    45967      -10     
- Misses       4886     4896      +10
Flag        Coverage Δ
#multiple   88.15% <50%> (-0.01%) ⬇️
#single     40.33% <100%> (-0.11%) ⬇️

Impacted Files                     Coverage Δ
pandas/core/generic.py             91.94% <ø> (ø) ⬆️
pandas/io/pytables.py              93.06% <100%> (ø) ⬆️
pandas/io/gbq.py                   25% <0%> (-58.34%) ⬇️
pandas/core/frame.py               97.59% <0%> (-0.1%) ⬇️
pandas/core/indexes/datetimes.py   95.23% <0%> (-0.1%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update fdc2185...13b9d36.

@jreback jreback added the IO HDF5 read_hdf, HDFStore label May 2, 2017
@@ -736,6 +736,114 @@ def test_put_compression_blosc(self):
            store.put('c', df, format='table', complib='blosc')
            tm.assert_frame_equal(store['c'], df)

    def test_complibs(self):
        # Test data
        data = [{'ID': '7082-GTZ-572490',
Contributor

don't specify data like this, just use tm.makeDataFrame

        all_tests = [(lib, lvl) for lib in all_complibs for lvl in all_levels]

        for (lib, lvl) in all_tests:
            with ensure_clean_store(self.path, complib=lib,
Contributor

I would rather you use .to_hdf and .read_hdf here
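
A round trip through `.to_hdf` and `.read_hdf`, as suggested here, could look roughly like the following. This is a minimal sketch rather than the test that was merged: the `round_trip` helper, the frame contents, and the particular libraries and levels are assumptions chosen for illustration, and it needs PyTables installed with zlib and blosc support.

```python
import itertools
import os
import tempfile

import numpy as np
import pandas as pd


def round_trip(df, complib, complevel):
    """Write df with DataFrame.to_hdf and read it back with pd.read_hdf."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "roundtrip.h5")
        df.to_hdf(path, key="df", complib=complib, complevel=complevel)
        return pd.read_hdf(path, key="df")


# Illustrative frame; the PR itself uses tm.makeDataFrame().
df = pd.DataFrame(np.random.default_rng(0).standard_normal((30, 4)),
                  columns=list("ABCD"))

# zlib and blosc are assumed to be available in the local PyTables build.
for lib, lvl in itertools.product(["zlib", "blosc"], [0, 1, 9]):
    result = round_trip(df, lib, lvl)
    pd.testing.assert_frame_equal(result, df)
```

Going through the public `to_hdf`/`read_hdf` entry points exercises the same argument validation that users hit, which is presumably why the reviewer prefers it over opening a store directly.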

jreback (Contributor) commented May 2, 2017

this is going to need a release note for 0.20.1 (which doesn't exist yet, but will soon).

@jreback jreback added this to the 0.20.1 milestone May 2, 2017
@@ -736,6 +736,20 @@ def test_put_compression_blosc(self):
            store.put('c', df, format='table', complib='blosc')
            tm.assert_frame_equal(store['c'], df)

    def test_complibs(self):
        df = tm.makeDataFrame()

Contributor

can you add a test that validates that complevel=0 turns off compression?

Contributor Author

How do I best do that? There is the h5dump utility, which can tell me if it's compressed, or perhaps I can compare the sizes of the files?

Contributor

look in the metadata of the file
there are lots of _ attributes that define things like this
the pytables docs might be helpful
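
One way to read that metadata is through the `filters` attribute that PyTables exposes on each leaf node. The sketch below is an illustration under assumptions, not the test from this PR: the `leaf_filters` helper, the file name, and the frame are made up for the example, and it requires PyTables.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import tables


def leaf_filters(df, complib, complevel):
    """Write df, then return (complevel, complib) for every leaf under /df."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "filters.h5")
        df.to_hdf(path, key="df", complib=complib, complevel=complevel)
        with tables.open_file(path, mode="r") as h5file:
            return [(node.filters.complevel, node.filters.complib)
                    for node in h5file.walk_nodes(where="/df",
                                                  classname="Leaf")]


# Illustrative frame (an assumption; any small numeric frame would do).
df = pd.DataFrame(np.random.default_rng(0).standard_normal((30, 4)),
                  columns=list("ABCD"))

# complevel=0 should leave every leaf uncompressed, whatever complib says.
assert all(level == 0 and lib is None
           for level, lib in leaf_filters(df, "zlib", 0))

# A non-zero level should be recorded together with the library name.
assert all(level == 9 and lib == "zlib"
           for level, lib in leaf_filters(df, "zlib", 9))
```

Checking the stored filter settings avoids comparing file sizes, which can be unreliable for small frames.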

@@ -736,6 +736,20 @@ def test_put_compression_blosc(self):
            store.put('c', df, format='table', complib='blosc')
            tm.assert_frame_equal(store['c'], df)

    def test_complibs(self):
        df = tm.makeDataFrame()
Contributor

add a comment referencing the issue

@@ -4942,7 +4956,7 @@ def test_invalid_complib(self):
                       columns=list('ABCDE'))
        with ensure_clean_path(self.path) as path:
            pytest.raises(ValueError, df.to_hdf, path,
Contributor

this needs to be

with pytest.raises(ValueError):
   ...
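
Filled in, the context-manager form might look like the sketch below. The helper name `check_invalid_complib`, the frame, and the deliberately invalid library name `"foolib"` are assumptions for illustration; the example needs pytest and PyTables installed.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import pytest


def check_invalid_complib(path):
    """Expect a ValueError when complib names an unknown library."""
    # Illustrative frame; only the column names come from the diff above.
    df = pd.DataFrame(np.random.rand(4, 5), columns=list("ABCDE"))
    # "foolib" is a made-up name that no PyTables build should recognise.
    with pytest.raises(ValueError):
        df.to_hdf(path, key="df", complib="foolib")


with tempfile.TemporaryDirectory() as tmpdir:
    check_invalid_complib(os.path.join(tmpdir, "invalid.h5"))
```

Compared with the callable form `pytest.raises(ValueError, df.to_hdf, path, ...)`, the `with` block keeps the call under test written as an ordinary call, keyword arguments included.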

@jreback jreback modified the milestone: 0.20.1 May 5, 2017
linebp (Contributor Author) commented May 11, 2017

It turned out that not all configurations have lzo compression available, so I removed the tests for those scenarios.

Is this looking good? Does the whatsnew file for 0.20.1 still not exist?
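
Whether a given library is usable on the current machine can be checked before building the list of test cases. The sketch below assumes the candidate names from the `complib` docstring in this PR and uses `tables.which_lib_version`; it is an illustration, not the code that was merged.

```python
import tables

# Candidate libraries, taken from the complib docstring in this PR.
candidates = ["zlib", "lzo", "bzip2", "blosc"]

# tables.which_lib_version returns None when the local PyTables build
# does not have the named library available.
available = [lib for lib in candidates
             if tables.which_lib_version(lib) is not None]

print(available)
```

Filtering the list this way lets the same test run on builds with and without lzo instead of failing on the missing library.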

@jreback jreback added this to the 0.20.2 milestone May 11, 2017
jreback (Contributor) commented May 11, 2017

@linebp put this in 0.20.2 Enhancements (0.20.1 doesn't exist and was a combined release with 0.20.0).

ping on green.

complib : {'zlib', 'lzo', 'bzip2', 'blosc', None}, default None
    Specifies the compression library to be used.
    Additional compressors for Blosc are supported (default if no
    compressor specified: 'blosc:blosclz'):
Contributor

add a note that, as of 0.20.2, these additional compressors are available.

complib : {'zlib', 'lzo', 'bzip2', 'blosc', None}, default None
    Specifies the compression library to be used.
    Additional compressors for Blosc are supported (default if no
    compressor specified: 'blosc:blosclz'):
Contributor

same here
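
For reference, the `blosc:<name>` values that the docstring describes can be exercised as in the sketch below. It assumes PyTables is installed and reads the available names from `tables.filters.all_complibs`; the frame, file names, and `complevel=5` are arbitrary choices for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import tables

# Sub-compressors are addressed as "blosc:<name>"; which ones exist
# depends on how the local Blosc/PyTables build was compiled.
blosc_variants = [lib for lib in tables.filters.all_complibs
                  if lib.startswith("blosc:")]

# Illustrative frame (an assumption; any small numeric frame would do).
df = pd.DataFrame(np.random.default_rng(0).standard_normal((30, 4)),
                  columns=list("ABCD"))

with tempfile.TemporaryDirectory() as tmpdir:
    for lib in blosc_variants:
        path = os.path.join(tmpdir, lib.replace(":", "_") + ".h5")
        df.to_hdf(path, key="df", complib=lib, complevel=5)
        pd.testing.assert_frame_equal(pd.read_hdf(path, key="df"), df)
```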

linebp (Contributor Author) commented May 11, 2017

It's all green now.

@jreback jreback merged commit 541e8e8 into pandas-dev:master May 11, 2017
jreback (Contributor) commented May 11, 2017

thanks!

@linebp linebp deleted the unblock_complibs branch May 15, 2017 06:32
pcluo pushed a commit to pcluo/pandas that referenced this pull request May 22, 2017
TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this pull request May 29, 2017
TomAugspurger pushed a commit that referenced this pull request May 30, 2017
stangirala pushed a commit to stangirala/pandas that referenced this pull request Jun 11, 2017
Labels
Enhancement IO HDF5 read_hdf, HDFStore
3 participants