Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ncdispatch.h needs to be split into an installed, public header, and an uninstalled private header #1426

Closed
edhartnett opened this issue Jul 3, 2019 · 16 comments · Fixed by #1452

Comments

@edhartnett
Copy link
Contributor

As I try to use the user-defined format feature to support PIO/netCDF integration, I see that user-defined formats will not work unless the ncdispatch.h file is installed with netcdf.h.

This is not a big deal, but the ncdispatch.h file contains a bunch of internal stuff too - stuff that should not be exposed in a public header.

I will submit a PR shortly.

@DennisHeimbigner
Copy link
Collaborator

DennisHeimbigner commented Jul 3, 2019

This is clearly an error. I assume that netcdf.h is making references
to something defined in ncdispatch.h, which it should not do.
Can you identify the minimum set of items in ncdispatch.h that
are being referenced? Splitting ncdispatch.h is a bad idea BTW.

@edhartnett
Copy link
Contributor Author

No, it's not an error. In order to use the user-defined format feature, the user must have access to the definition of the dispatch table.

However, this is not referenced from netcdf.h. It's only needed for users (like me, with PIO) who are writing a dispatch layer and using nc_def_user_format() to use it.

Yes, I can and will separate the minimum items that need to be in the public file.

@DennisHeimbigner
Copy link
Collaborator

I see. You are correct, sorry. For consistency,
this new file should probably be called netcdf_dispatch.h
to be consistent with e.g. netcdf_parallel.h etc.

@edhartnett
Copy link
Contributor Author

OK, will do.

@edwardhartnett
Copy link
Contributor

OK, we now have a netcdf_dispatch.h and it seems to be working well.

But there is a problem in that the open and create calls both use pointers to NC (the final parameter for both calls):

  extern int
   PIO_NCINT_open(const char *path, int mode, int basepe, size_t *chunksizehintp,
                 void *parameters, const NC_Dispatch *, NC *);

   extern int
   PIO_NCINT_create(const char* path, int cmode, size_t initialsz, int basepe,
                   size_t *chunksizehintp, void *parameters,
                   const NC_Dispatch *dispatch, NC *nc_file);

The code in dfile.c is:

    /* Create the NC* instance and insert its dispatcher */
    if((stat = new_NC(dispatcher,path,omode,&model,&ncp))) goto done;

    /* Add to list of known open files */
    add_to_NCList(ncp);

    /* Assume open will fill in remaining ncp fields */
    stat = dispatcher->open(ncp->path, omode, basepe, chunksizehintp,
			    parameters, dispatcher, ncp);
    if(stat == NC_NOERR) {
	if(ncidp) *ncidp = ncp->ext_ncid;
    } else {
	del_from_NCList(ncp);
        free_NC(ncp);
    }

So the code is written to allow dispatch layers to modify the contents of NC. But do they?

The HDF5 layer only sets int_ncid (which may not be necessary) and check the value of (nc->model->iosp == NC_IOSP_HTTP) in open.

I think without too much trouble, we could remove the NC from the open/create call, and just use ncid. That way, we would not have to expose the NC struct to people using the dispatch layer.

@DennisHeimbigner what do you think? Shall I give this a whack and see how it looks?

@DennisHeimbigner
Copy link
Collaborator

The design is this. It is assumed that the NC structure
is created in nc.c/new_NC and mostly filled independent of any
dispatcher. new_NC is called by e.g. NC_open in libdispatch/dfile.c
again independent of any dispatcher.
Once created, the NC structure is passed to the dispatcher specific
open or create functions (as you note). These functions create
their own dispatcher specific state and hook it the NC structure
by inserting it into NC->dispatchdata. As far as I know, the only field
modified by the dispatcher is that field. It is possible, however,
that some of them also modify the internal id (int_ncid) field
and the mode field.

The goal was to centralize as much of the NC manipulation as possible.

@DennisHeimbigner
Copy link
Collaborator

Questions remain:

  1. Does the dispatcher code (e.g. open) have access to the information
    contained in the NC structure in some other way?
  2. The dispatcher specific state needs to be linked into the NC structure;
    how to do this?
  3. Who uses int_id and what would they do instead?

@DennisHeimbigner
Copy link
Collaborator

Another way to look at it is that the dispatch->open function
must return multiple values (its internal state, int_ncid, etc).
Currently we are using the NC structure to hold those multiple
return values.

@edwardhartnett
Copy link
Contributor

edwardhartnett commented Jul 28, 2019

1 - If the function parameter is ncid (from ext_ncid) then any dispatch layer that needs to can look up the NC. But those that don't need to, don't have to know about NC. Surprisingly (for me) I think this includes libhdf5, so perhaps no dispatch layer has to know about NC.

2 - Not sure what you mean by dispatch-specific state. You mean all the other values in NC? Seems like we are not really setting them in the dispatch layer much.

3 - int_ncid is used by libsrc. When built with --disable-netcdf-4, then int_ncid is all there is. But when netCDF-4 is used, int_ncid is the internal ID of classic files (which also get a different ext_ncid). This is so that the user can open a netCDF-4 file, then open a classic file, and not have conflicting ncids. I am going to investigate whether it even needs to be set for libhdf5 code. I believe it does not.

(Also int_ncid is used by the pnetcdf layer, libsrcp. It's for the same reason as it is used by the classic C layer. Pnetcdf assigns its own IDs and we must keep them separate from the IDs assigned by the netCDF-4 code, to prevent conflicts.)

@edwardhartnett
Copy link
Contributor

I have examined the code and indeed setting int_ncid by the libhdf5 layer is unnecessary. I have removed it. PR up shortly. This is worth doing independently from the dispatch work, just to reduce code complexity and confusion...

@DennisHeimbigner
Copy link
Collaborator

2 - Not sure what you mean by dispatch-specific state...
If you look in the other dispatch directories you will
see that all of them set nc->dispatchdata.
For example,
libsrc4/nc4internal.c: nc->dispatchdata = h5;
where h5 is instance of NC_FILE_INFO_T

WRT int_ncid, it is set in most of the dispatchers
(search for '->int_ncid'), but it is not clear if it is necessary
for it to be part of struct NC as opposed to the dispatch specific state.
I suspect we can move it from struct NC into the dispatch state
with only minor problems.

@edwardhartnett
Copy link
Contributor

I believe int_ncid is not necessary for most dispatch layers, but it certainly is for pnetcdf and libsrc.

My ides is not to eliminate it, or move it, but simply to first remove it wherever it is not really needed. This will simplify the situation.

It is set (but not used, apparently) in libdap2 and libdap4. Can it be removed there too?

@edwardhartnett
Copy link
Contributor

OK, removing int_ncid worked fine. Now we have one remaining use of NC on libhdf5. In nc4_open_file() we have this code:

#ifdef ENABLE_BYTERANGE
    /* See if we want the byte range protocol */
    if(nc->model->iosp == NC_IOSP_HTTP) {
	h5->http.iosp = 1;
	/* Kill off any conflicting modes flags */
	mode &= ~(NC_WRITE|NC_DISKLESS|NC_PERSIST|NC_INMEMORY);
	parameters = NULL; /* kill off parallel */
    } else
	h5->http.iosp = 0;
#endif /*ENABLE_BYTERANGE*/

(I have asked some questions about --ebable-byterange in #1446.)

One immediate thought is that the flag checking can probably be done in libdispatch in this case (with the rest of the flag checking that occurs there).

Parameters should not silently be set to NULL. If the user attempts this with parallel I/O they should get an error (parallel programming is already too hard!). This too can probably be checked in libdispatch.

As for the setting of iosp, I have looked at how this is used, but can't yet figure out if this could be set at the libdispatch layer.

@edwardhartnett
Copy link
Contributor

OK, I see how NC * can be easily taken out of the create/open dispatch functions.

Create the NC object, but instead of passing a pointer to it, pass the ncid that has been assigned to the file (the ext_ncid).

If the code needs it (as the open does for HDF5, due to the enable-byterange stuff, or as the pnetcdf open and create need to), then either find_nc_h5() or NC_check_id() is used to find the NC from the ncid.

In this way, the NC struct can be hidden within netCDF internals, and not exposed in the dispatch layer.

@edwardhartnett
Copy link
Contributor

@DennisHeimbigner when you say "dispatch state" do you mean the dispatchdata field in NC?

Seems like only the pnetcdf layer uses that.

@DennisHeimbigner
Copy link
Collaborator

I do not believe that is accurate. All of the dispatch directories use it
directly or indirectly. Note that for libhdf5, it is set up in libsrc4
(which is probably not the right way to do it).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants