make h5 handler default for nc4 files #1002

Open
Mikejmnez opened this issue Nov 20, 2024 · 3 comments

Comments

@Mikejmnez

Mikejmnez commented Nov 20, 2024

Problem

I want to serve .nc4 datasets that contain DAP4 types such as Int64s and Groups. By default, in hyrax:snapshot, .nc4 files are assigned to the netCDF handler. When I try to inspect the DMR of a .nc4 file, I get a Hyrax 500 error. To actually serve the file, I need to assign the h5 handler to .nc4 files via the site.conf file (see OPENDAP/hyrax_guide#17 (comment)). This is less than ideal.

(potential) solution

Assign the h5 data handler to any .nc4 file by default in the bes.conf file. I think this should work, since netCDF4 is a subset of HDF5...

@ndp-opendap
Contributor

Maybe fix the netcdf_handler to actually support netcdf-4?

Part of the issue is that in the distant past people made the decision that netcdf-3 and netcdf-4 files would share the same suffix, .nc.

You can get the behavior you want by injecting a site.conf file into your Docker container at startup:

-v /Users/schmoo/conf/site.conf:/etc/bes/site.conf

And placing (at least) this inside:

# First reset the TypeMatch
BES.Catalog.catalog.TypeMatch=

# Then rebuild the TypeMatch expression reassigning .nc4 to the hdf5 handler
BES.Catalog.catalog.TypeMatch+=csv:.*\.csv(\.bz2|\.gz|\.Z)?$;
BES.Catalog.catalog.TypeMatch+=reader:.*\.(dds|dods|data_ddx|dmr|dap)$;
BES.Catalog.catalog.TypeMatch+=dmrpp:.*\.(dmrpp)(\.bz2|\.gz|\.Z)?$;
BES.Catalog.catalog.TypeMatch+=ff:.*\.dat(\.bz2|\.gz|\.Z)?$;
BES.Catalog.catalog.TypeMatch+=gdal:.*\.(tif|TIF)$|.*\.grb\.(bz2|gz|Z)?$|.*\.jp2$|.*/gdal/.*\.jpg$;
BES.Catalog.catalog.TypeMatch+=h4:.*\.(hdf|HDF|eos|HDFEOS)(\.bz2|\.gz|\.Z)?$;

BES.Catalog.catalog.TypeMatch+=h5:.*\.(nc4|NC4|HDF5|h5|he5)(\.bz2|\.gz|\.Z)?$;

BES.Catalog.catalog.TypeMatch+=nc:.*\.nc(\.bz2|\.gz|\.Z)?$;
BES.Catalog.catalog.TypeMatch+=ncml:.*\.ncml(\.bz2|\.gz|\.Z)?$;
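
As a quick sanity check of the expressions above, here is a minimal Python sketch that reports which handler names match a given path. It assumes Python's `re` module behaves closely enough to the regex engine the BES uses for these particular patterns, and that a pattern has to match the whole path. `TYPE_MATCH` and `handlers_for` are hypothetical names used only for illustration; they are not part of Hyrax.

```python
import re

# (handler, regex) pairs copied from the site.conf TypeMatch entries above.
TYPE_MATCH = [
    ("csv", r".*\.csv(\.bz2|\.gz|\.Z)?$"),
    ("reader", r".*\.(dds|dods|data_ddx|dmr|dap)$"),
    ("dmrpp", r".*\.(dmrpp)(\.bz2|\.gz|\.Z)?$"),
    ("ff", r".*\.dat(\.bz2|\.gz|\.Z)?$"),
    ("gdal", r".*\.(tif|TIF)$|.*\.grb\.(bz2|gz|Z)?$|.*\.jp2$|.*/gdal/.*\.jpg$"),
    ("h4", r".*\.(hdf|HDF|eos|HDFEOS)(\.bz2|\.gz|\.Z)?$"),
    ("h5", r".*\.(nc4|NC4|HDF5|h5|he5)(\.bz2|\.gz|\.Z)?$"),
    ("nc", r".*\.nc(\.bz2|\.gz|\.Z)?$"),
    ("ncml", r".*\.ncml(\.bz2|\.gz|\.Z)?$"),
]


def handlers_for(path):
    """Return the names of all handlers whose regex matches the whole path.

    Assumption: whole-path matching with Python's `re` approximates what
    the BES does; this is an illustration, not the BES implementation.
    """
    return [name for name, pattern in TYPE_MATCH if re.fullmatch(pattern, path)]


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise the patterns.
    for path in ("/data/example.nc4", "/data/example.nc", "/data/example.nc4.gz"):
        print(path, handlers_for(path))
```

If a path comes back with more than one handler name, the expressions overlap and it is worth tightening them before deploying the site.conf.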

@Mikejmnez
Author

Mikejmnez commented Nov 20, 2024

> fix the netcdf_handler to actually support netcdf-4?

Yes, I think your proposed solution is better, though likely more involved (I am not sure about that).

For now, I am adding an example to the Hyrax Guide on serving nc4 files with the h5 handler via the site.conf file; see hyrax_guide/issues#17.

@jgallagher59701
Member

This is probably obvious, but I'll say it for completeness' sake: the TypeMatch values above are regular expressions and are applied to the whole path of the file. So if all the netCDF4 files are in one subtree and all the netCDF3 files are in another, the fact that they end in the same extension is not a problem for a particular server installation.
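
To illustrate the subtree idea, here is a minimal Python sketch in which the same .nc extension is routed to different handlers based on the directory in the path. The directory names `netcdf3/` and `netcdf4/`, the two patterns, and the `handler_for` helper are all hypothetical, and Python's `re` is assumed to be a reasonable stand-in for the BES regex engine on these simple patterns.

```python
import re

# Hypothetical patterns keyed on directory rather than extension. In a
# site.conf these would correspond to entries of the form
#   BES.Catalog.catalog.TypeMatch+=h5:.*/netcdf4/.*\.nc(\.bz2|\.gz|\.Z)?$;
#   BES.Catalog.catalog.TypeMatch+=nc:.*/netcdf3/.*\.nc(\.bz2|\.gz|\.Z)?$;
SUBTREE_TYPE_MATCH = [
    ("h5", r".*/netcdf4/.*\.nc(\.bz2|\.gz|\.Z)?$"),
    ("nc", r".*/netcdf3/.*\.nc(\.bz2|\.gz|\.Z)?$"),
]


def handler_for(path):
    """Return the first handler whose regex matches the whole path, or None."""
    for name, pattern in SUBTREE_TYPE_MATCH:
        if re.fullmatch(pattern, path):
            return name
    return None


if __name__ == "__main__":
    # Same extension, different subtrees (paths are made up for the example).
    print(handler_for("/data/netcdf4/model_output.nc"))
    print(handler_for("/data/netcdf3/model_output.nc"))
    print(handler_for("/data/other/model_output.nc"))
```

The two patterns only stay unambiguous as long as no path contains both directory names, so the layout of the data tree matters as much as the expressions themselves.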

Also, while regexes with ORs are possible, the OR operator is a performance killer. For a regex with two or more OR clauses, think nested loops.
