0.7updates #43

dmbates · 2018-07-25T20:51:57Z

No description provided.

alyst · 2018-07-25T23:40:23Z

Thanks! Please feel free to pick up my fixes from this branch. It should pass the tests.

I think we should separate the necessary updates for Julia 0.7 and FileIO 1.0.0 from adding support for .xz/.bzip2, which should go into another PR.

I'm not so sure about contextify() (in my branch I renamed it to RDAContext, since it's essentially a high-level RDAContext ctor):

- it moves part of the format detection logic to context.jl, making it harder to maintain
- contextify() may optionally create a new decompressor IO stream, but that stream is never closed, whereas the underlying original io is explicitly closed by open() do io ... end. The proper way is to close only the decompressor stream. In the previous version we made sure that the decompressed stream is always closed (even if there's an exception).
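To illustrate the second point, here is a minimal sketch of how the decompressor stream could be closed reliably. `with_decompressed` and its `wrap` argument are hypothetical names used only for illustration, not part of RData.jl; `wrap` stands for whatever builds the decompressor stream around the raw `io`.

```julia
# Hypothetical helper (not RData.jl API): run `f` on a decompressor stream
# built around `io`, and always close that stream, even if `f` throws.
function with_decompressed(f::Function, wrap::Function, io::IO)
    stream = wrap(io)   # in real code this would be a codec's decompressor stream
    try
        return f(stream)
    finally
        close(stream)   # assumption: closing the wrapper also closes the wrapped `io`
    end
end

# Stand-in demo: `identity` plays the role of the decompressor constructor,
# so the "decompressor stream" is simply the buffer itself.
io = IOBuffer("payload")
result = with_decompressed(identity, io) do stream
    read(stream, String)
end
```

The `try`/`finally` is what gives the "even if there's an exception" guarantee: the caller never has to remember which of the two streams to close.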

Previously the support of multiple compression formats was discussed in #31. RData is used by e.g. RDatasets, which, in turn, is used in the unit tests of several packages.
So if RData requires all the decompression codecs, it will bring a lot of unnecessary dependencies (including binary ones) to multiple downstream packages.
I think the better way would be to make the compression codecs optional.
It looks like the use of Requires.jl is now approved by the core Julia team (JuliaLang/julia#2025), so we should be able to use it to require a specific codec depending on the RData file.
Actually, compression format detection is quite a common problem; maybe it's possible to provide some generic support for it in FileIO? (cc @RalphAS)
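For reference, format detection of this kind usually comes down to comparing the first few bytes of the file against known magic numbers. The sketch below is only an illustration: `detect_compression` and `COMPRESSION_MAGIC` are made-up names, not existing FileIO or RData functions, and the code assumes the stream supports `mark`/`reset`.

```julia
# Magic numbers of the three formats discussed in this thread.
const COMPRESSION_MAGIC = [
    (:gzip,  UInt8[0x1f, 0x8b]),
    (:bzip2, UInt8[0x42, 0x5a, 0x68]),                    # "BZh"
    (:xz,    UInt8[0xfd, 0x37, 0x7a, 0x58, 0x5a, 0x00]),
]

# Hypothetical helper: peek at the header without consuming it and
# return the detected format, or :none if no magic number matches.
function detect_compression(io::IO)
    mark(io)
    header = read(io, 6)    # reads at most 6 bytes; may be shorter for tiny files
    reset(io)               # rewind so the caller sees the stream from the start
    for (format, magic) in COMPRESSION_MAGIC
        n = length(magic)
        if length(header) >= n && header[1:n] == magic
            return format
        end
    end
    return :none
end
```

Returning a symbol rather than a stream keeps the detection step free of any codec dependency; the caller decides which codec, if any, to load.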

dmbates · 2018-07-27T19:53:52Z

Regarding the different compression schemes, would it be possible to build the detection of the compression scheme into a detection function that would live in FileIO/src/registry.jl and have that always return a stream? It may be possible to have FileIO load the compression package, such as CodecBzip2, on demand. That way the RData package may be able to avoid loading any of the Codec* packages.
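One way to picture "always return a stream" with optional codecs is a small registry that codec glue code fills in when the corresponding package is loaded. This is a rough sketch under that assumption; `DECOMPRESSORS`, `register_decompressor!` and `decompressed_stream` are hypothetical names, not FileIO API.

```julia
# Hypothetical registry: maps a format symbol to a function that wraps a raw
# IO in a decompressor stream. Entries would be added only when a codec
# package is actually available.
const DECOMPRESSORS = Dict{Symbol,Function}()

function register_decompressor!(format::Symbol, wrap::Function)
    DECOMPRESSORS[format] = wrap
    return nothing
end

# Always returns a stream: the raw `io` for uncompressed data, otherwise the
# registered wrapper applied to `io`. Fails with a clear message if the
# needed codec has not been registered.
function decompressed_stream(io::IO, format::Symbol)
    format === :none && return io
    wrap = get(DECOMPRESSORS, format, nothing)
    wrap === nothing && error("no decompressor registered for $format; load the matching codec package first")
    return wrap(io)
end
```

With this shape, a package that only ever reads gzip-compressed files would never need the bzip2 or xz codecs installed.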

I appreciate that the vast majority of .rda or .RData files are compressed with Gzip because that is the default. However, xz compression can make a big difference in file sizes when working with large data sets.

alyst · 2018-08-11T13:41:55Z

Thanks for working on this! Closing as #44 was merged.

dmbates added 3 commits July 24, 2018 07:31

Initial changes julia 0.7

ea3a587

changes for v0.7

87e65c0

Attempt reintegration of FileIO

0d5a56b

andreasnoack mentioned this pull request Aug 10, 2018

Update for Julia 1 #44

Merged

alyst closed this Aug 11, 2018

alyst deleted the 0.7updates branch November 6, 2018 13:18
