ZarrIosp claims any zipped dataset #1319

Closed
tdrwenski opened this issue Mar 21, 2024 · 8 comments · Fixed by #1320

Comments

@tdrwenski
Contributor

From issue #1307:

When using netcdfAll with the optional cdm-zarr code included, any zipped dataset you try to open gets claimed by ZarrIosp as a validFile. The zip is apparently opened, but if it's not actually zarr, then it is reported as empty.
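To illustrate the kind of check that avoids this, here is a minimal sketch in Python (not the netcdf-java code, and not necessarily how #1320 fixes it). It treats a zip as Zarr only if it holds at least one Zarr v2 metadata file. The function name `looks_like_zarr_zip` and the marker set are assumptions made for this example.

```python
import zipfile

# Zarr v2 metadata file names; assumption: only v2 stores are of interest here.
ZARR_V2_MARKERS = {".zgroup", ".zarray", ".zattrs", ".zmetadata"}


def looks_like_zarr_zip(source) -> bool:
    """Return True if the zip contains at least one Zarr v2 metadata file.

    `source` is a path or a binary file-like object. This is a hypothetical
    helper for illustration, not part of any library.
    """
    with zipfile.ZipFile(source) as zf:
        for name in zf.namelist():
            # Compare only the last path component, so nested groups count too.
            basename = name.rstrip("/").rsplit("/", 1)[-1]
            if basename in ZARR_V2_MARKERS:
                return True
    return False
```

With a check along these lines, a zip holding only a netCDF file would not be claimed, no matter how many entries it has.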

@tdrwenski
Contributor Author

@rschmunk, feel free to add anything I missed here. I believe it's clear and should be straightforward to fix.

tdrwenski self-assigned this Mar 21, 2024

@rschmunk
Contributor

rschmunk commented Mar 21, 2024

@tdrwenski, I've been trying to look into this but keep getting sidetracked. I think it has something to do with the file or the archive somehow getting tagged as a directory but I haven't figured that bit out yet.

@tdrwenski
Contributor Author

@rschmunk, I believe it's fixed now, but let us know if you still have issues!

@rschmunk
Contributor

@tdrwenski, at first look, that seems to have done the trick.

@rschmunk
Contributor

rschmunk commented Mar 30, 2024

@tdrwenski, an FYI/warning/whatever in case this issue gets reported again:

If for some reason the zip process decides to include filesystem metadata along with the compressed dataset, then there will be more than one entry in the zip, and netcdfAll will decide that the zip archive must be a compressed zarr archive.

I discovered this because I just tried to open a zipped netCDF file and was startled that my app, which is using a freshly built netcdfAll, reported it was a zipped zarr archive. I deleted the zip archive, re-zipped the NC file again at the command line, and tried again; this time my app successfully uncompressed and opened it as a netCDF file.

Further testing revealed that if you are on a Mac, control-click a dataset icon on the desktop, and select Compress in the contextual menu, the resulting zip file will have two entries in it. In the case I just tested, the data file was named eccc2016.nc and the desktop compression command included a __MACOSX/._eccc2016.nc metadata entry in the archive.
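For anyone who wants to check whether an archive is affected, the following Python sketch lists the entries of a zip while skipping macOS Finder metadata. It is an illustration only: the helper name `data_entries` is made up for this example, and the filter assumes the metadata follows the `__MACOSX/` and `._` naming seen above.

```python
import zipfile


def data_entries(source):
    """List the zip's file entries, skipping macOS Finder metadata.

    Hypothetical helper: assumes metadata lives under __MACOSX/ or in
    files whose name starts with "._", as in the case described above.
    """
    with zipfile.ZipFile(source) as zf:
        return [
            name
            for name in zf.namelist()
            if not name.endswith("/")  # skip directory entries
            and not name.startswith("__MACOSX/")
            and not name.rsplit("/", 1)[-1].startswith("._")
        ]
```

If the returned list has a single entry but `zf.namelist()` has more, the extra entries are metadata of the kind described here.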

@tdrwenski
Contributor Author

Thanks for that extra info @rschmunk, I was not aware of this. We may need a more robust fix for this issue then. I will reopen this issue so we don't forget, but not sure if we will get to it right away.

tdrwenski reopened this Apr 1, 2024

@rschmunk
Contributor

rschmunk commented Apr 1, 2024

@tdrwenski, what I encountered was enough of an edge case that I don't think there's a rush. I would expect most people zipping their datasets are going to do so from the command line or by script.

@tdrwenski
Contributor Author

On second thought, I think this is enough of an edge case that we don't need to handle it now. Users can delete those resource fork files (the __MACOSX/ entries) from their zips to work around it. We can always revisit it if more people are zipping their files this way.
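One possible way to apply that workaround by script is sketched below in Python. It copies an archive to a new zip and leaves out everything under `__MACOSX/`. The function name `strip_macos_metadata` is hypothetical, and the sketch writes a new archive rather than editing the original in place.

```python
import zipfile


def strip_macos_metadata(src, dst):
    """Copy the zip at `src` to `dst`, omitting entries under __MACOSX/.

    Hypothetical helper for illustration. `src` and `dst` may be paths or
    binary file-like objects; the original archive is left untouched.
    """
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for info in zin.infolist():
            if info.filename.startswith("__MACOSX/"):
                continue
            # Passing the ZipInfo keeps the entry's name, timestamp and
            # compression method.
            zout.writestr(info, zin.read(info.filename))
```

For example, `strip_macos_metadata("eccc2016.nc.zip", "eccc2016-clean.zip")` (file names made up here) would produce an archive containing only the data file.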
