Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added file checking and overwrite functionality to write #94

Merged
merged 6 commits into from
Sep 20, 2018

Conversation

ExpandingMan
Copy link
Collaborator

@ExpandingMan ExpandingMan commented Sep 14, 2018

This should fix #93 rather elegantly (thanks to @nalimilan for the suggestion).

Interesting bit of trivia: most of my code which uses Feather was actually already doing this outside of Feather. So, it was rather silly (or cruel) of me to use this as a matter of course but not include this in Feather.jl itself.

Note that this makes it so that by default, feather will not allow you to write over existing files, but you can pass overwrite=true to delete the file before writing.

@nalimilan
Copy link
Member

Cool. I think one of the strategies often used by apps which save new versions of files (e.g. text processing apps) is to rename the original file to something else, write the new one, and only then remove the original one. That's the only way to ensure that if something goes wrong during the operation (crash, power issue...), you'll end up either with the old or the new file, and not with a single half-corrupt file. Maybe that' strategy could make sense here? I guess renaming an mmapped file and removing it afterwards is OK?

@ExpandingMan
Copy link
Collaborator Author

I think it's just a question of how much of this functionality we want to build into Feather.jl itself. Perhaps we could allow users to pass a filename to Feather.write that would be the name to move the old file to, or allow the user to provide a file extension (like .backup) which Feather would append to any files it would otherwise try to overwrite, but I sort of feel at that point it's probably just best for the user to do that outside of Feather.

I think ideally stuff like that would be handled by FileIO (perhaps with us accommodating and interface), though as far as I know it doesn't provide anything like that right now.

@ExpandingMan
Copy link
Collaborator Author

@nalimilan , if you have a firm idea of how you'd like to see this implemented, perhaps raise an issue on FileIO? That would potentially provide the functionality for both Feather.jl and CSV.jl all in one go. Though as I recall FileIO wasn't good at supporting all the additional options that are available for CSV.jl.

@quinnj
Copy link
Member

quinnj commented Sep 14, 2018

The problem here is I don't think windows allows you to rm(file) when file is mmapped, or indeed this would be an easier problem.

@quinnj
Copy link
Member

quinnj commented Sep 14, 2018

That's why you always see things in CSV/Feather tests like

obj = nothing; GC.gc(); GC.gc()
rm(filename)

For windows, you have to make sure that original mmap-ing gets finalized/GC-ed before rm.

@ExpandingMan
Copy link
Collaborator Author

It's taking great self-restraint for me not to share my opinion on windows at this moment.

I don't know, how about having this as a linux only option? On windows presumably you can't get the error that instigated this PR since it wouldn't let you write to the file in the first place.

@ExpandingMan
Copy link
Collaborator Author

Ok, how about this then? I think this should give pretty reasonable behavior in all 4 cases (whether file exists, OS).

src/sink.jl Outdated
catch e
@error(string("Unable to delete file $filename. It's possible that it's ",
"already open and being used by this or another process."))
throw(e)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be rethrow?

@ExpandingMan
Copy link
Collaborator Author

By the way, can we drop 0.6 now? If so, say the word, and I'll take 0.6 stuff out of this PR.

@quinnj
Copy link
Member

quinnj commented Sep 18, 2018

I say we drop 0.6.

@codecov-io
Copy link

Codecov Report

Merging #94 into master will decrease coverage by 1.85%.
The diff coverage is 33.33%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #94      +/-   ##
==========================================
- Coverage   97.22%   95.37%   -1.86%     
==========================================
  Files           4        4              
  Lines         108      108              
==========================================
- Hits          105      103       -2     
- Misses          3        5       +2
Impacted Files Coverage Δ
src/sink.jl 94.28% <33.33%> (-5.72%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 00099a1...48267f7. Read the comment docs.

@ExpandingMan
Copy link
Collaborator Author

I removed a bunch of 0.6 stuff. So, if we merge this, we can tag a 1.0-only 0.5 at some point.

@ExpandingMan
Copy link
Collaborator Author

The error on appveyor looks like the same 32-bit alignment error we'd been seeing.

@quinnj : it'll be up to your vote if we merge this. Yes, the windows behavior isn't great, but it seems to me that this will still be an overall improvement.

Copy link
Member

@quinnj quinnj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do it; it's certainly better than current behavior.

@quinnj quinnj merged commit 6a7177d into JuliaData:master Sep 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Updating feather file crashes Julia and wipes file
4 participants