Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OPeNDAP+HTTPS: certificates not found #2337

Closed
Alexander-Barth opened this issue May 5, 2022 · 23 comments
Closed

OPeNDAP+HTTPS: certificates not found #2337

Alexander-Barth opened this issue May 5, 2022 · 23 comments

Comments

@Alexander-Barth
Copy link
Contributor

Alexander-Barth commented May 5, 2022

With the new NetCDF 4.8.1 updated in Julia, we have an issue to accessing OPeNDAP resources over HTTPs. It seems that no certificate authority is loaded by default. In the new version of NetCDF 4.8.1 we also link against a new version of libCURL (now 7.81.0 previously 7.73.0).

  • the version of the software with which you are encountering an issue
    NetCDF 4.8.1

  • environmental information (i.e. Operating System, compiler info, java version, python version, etc.)

Ubuntu 20.04 and x86_64-linux-gnu-gcc (GCC) 4.8.5 (compilation is done within a docker container)

  • a description of the issue with the steps needed to reproduce it

nc_open (called by NCDataset) with a HTTPS url results in the following error:

julia> using NCDatasets
julia> NCDataset("https://erddap.ifremer.fr/erddap/griddap/SDC_GLO_CLIM_TS_V2_1")
Error:curl error: SSL peer certificate or SSH remote key was not OK
curl error details: 
Warning:oc_open: Could not read url
ERROR: NetCDF error: Opening path https://erddap.ifremer.fr/erddap/griddap/SDC_GLO_CLIM_TS_V2_1: NetCDF: I/O failure (NetCDF error code: -68)
Stacktrace:
 [1] nc_open(path::String, mode::UInt16)
[...]

If I define a ~/.ncrc file with the following content HTTP.SSL.CAINFO, then OPENDAP over HTTPS works:

HTTP.SSL.CAINFO=/etc/ssl/certs/ca-certificates.crt

I am wondering if somebody can shed some light how it worked before (as in NetCDF 4.7.4/libcurl 7.73.0 such configuration file was not needed) and if it is possible to restore the previous behavior.

As noted by @visr, some libraries (like GDAL, Proj4) expose a function to set this path programmatically. This solution would also be ideal for NetCDF in Julia.

(Any insight on how this is solved in e.g. anaconda or R would be very useful.)

Reference:
Alexander-Barth/NCDatasets.jl#173
JuliaPackaging/Yggdrasil#4843

@DennisHeimbigner
Copy link
Collaborator

I can't see anything in the libcurl change log: https://curl.se/changes.html
Perhaps this is an OPENSSL change?

@visr
Copy link

visr commented May 5, 2022

I'm not sure, but I suspect this change might be on the julia side since other libraries that use curl also need to set this path on Linux.

As noted by @visr, some libraries (like GDAL, Proj4) expose a function to set this path programmatically.

I should've looked better but it seems this already exists; ocset_curlopt. And if I'm not mistaken ocopen needs to be called first to create the OCstate.

I tried using these two functions, but got stuck on an assertion error from netcdf:

drc.c:80: ncrc_setrchome: Assertion `ncrc_globalstate && ncrc_globalstate->home' failed.

More details are at Alexander-Barth/NCDatasets.jl#174

@DennisHeimbigner
Copy link
Collaborator

You should not be using any oc function; they were made internal long ago
and can only be used in the context of libncdap.
Why is your use of .ncrc not acceptable?

@Alexander-Barth
Copy link
Contributor Author

Alexander-Barth commented May 6, 2022

For me (as a user of netcdf in my research work), it is certainly acceptable to manually define this configuration file .ncrc.

But as a developer of a julia wrapper, I am a bit worried about all the users hitting this error and not knowing what to do. Some of them will use NetCDF as an indirect dependencies and are not aware that a library down the software stack requires some configuration. The error message seems to indicate that there is a problem with the server certificate.

We considered to use the environment variable DAPRCFILE (pointing to a temporary configuration file), but that could be problematic too.

@DennisHeimbigner
Copy link
Collaborator

Good point. As a developer, I sometimes forget to take the naive user into account.
In include/ncrc.h there is a technically internal operation to insert a new key/value pair
into the internal copy of .ncrc,

EXTERNL int NC_rcfile_insert(const char* key, const char* value, const char* hostport, const char* path);

You might try it as an experiment to see if it works for your problem. You can set the last two
parameters to NULL. You will also need to declare the function signature in your code.
I will try to think of the proper way to expose equivalent functionality.

@visr
Copy link

visr commented May 6, 2022

Thanks for the tip! I tried NC_rcfile_insert, but have not yet been successful. The certificates are not found, but that seems not too surprising, given that when I try to check the value with NC_rclookup, I get a NULL string back.

The code is here:
https://github.com/Alexander-Barth/NCDatasets.jl/blob/9817458ebd95cd90606181846b1fd76b2ab76f18/src/NCDatasets.jl#L40-L52

Which prints

NC_rcfile_insert returns 0
lookup = Cstring(0x0000000000000000)

Am I forgetting something, or is it supposed to work like this?

@Alexander-Barth
Copy link
Contributor Author

Thanks @visr, for your test with NC_rcfile_insert and NC_rclookup.

I had look with the gdb debugger while running NC_rcfile_insert from a julia session.
When reaching this line https://github.com/Unidata/netcdf-c/blob/v4.8.1/libdispatch/drc.c#L532, just before returning, globalstate->rcinfo.triples was surprisingly still a null pointer.

549	    return ret;
(gdb) p ret
$1 = 0
(gdb) p globalstate->rcinfo.triples
$2 = (NClist *) 0x0
(gdb) p rc
$3 = (NClist *) 0x11e2550

However, when I force globalstate->rcinfo.triples=rc before returning,

(gdb) p globalstate->rcinfo.triples=rc
$4 = (NClist *) 0x11e2550

subsequent lookup do find the HTTP.SSL.CAINFO key:

julia> lookup = unsafe_string(@ccall(libnetcdf.NC_rclookup(key::Cstring, hostport::Cstring, path::Cstring)::Cstring))
"/etc/ssl/certs/ca-certificates.crt"

(And also OPENDAP over HTTPS works without defining a configuration file)

Could it be that NC_rcfile_insert should update globalstate->rcinfo.triples when a new list was created with nclistnew?

Maybe something similar to this?
https://github.com/Unidata/netcdf-c/blob/v4.8.1/libdispatch/drc.c#L338

@Alexander-Barth
Copy link
Contributor Author

Alexander-Barth commented Jun 7, 2022

With this PR defa153 we can now use the private API NC_rcfile_insert to set HTTP.SSL.CAINFO. I am wondering if there is any chance to have a public API for this option?

@DennisHeimbigner
Copy link
Collaborator

Thanks for the reminder. I plan to expand that API to a little better
access to the rc file: e.. insert, find, remove

@Alexander-Barth
Copy link
Contributor Author

Unfortunately, I see this failure now when trying this on NetCDF 4.9.0.

julia: drc.c:117: ncrc_setrchome: Assertion `ncg && ncg->home' failed.
signal (6): Aborted
in expression starting at /home/runner/work/_temp/4a1f6304-e640-43[7](https://github.com/Alexander-Barth/NCDatasets.jl/runs/6857358120?check_suite_focus=true#step:9:8)6-[8](https://github.com/Alexander-Barth/NCDatasets.jl/runs/6857358120?check_suite_focus=true#step:9:9)503-7bb4267810ff:1
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7fa[9](https://github.com/Alexander-Barth/NCDatasets.jl/runs/6857358120?check_suite_focus=true#step:9:10)d6d43728)
__assert_fail at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
ncrc_setrchome at /workspace/srcdir/netcdf-c-4.9.0/libdispatch/drc.c:117
NC_rcload at /workspace/srcdir/netcdf-c-4.9.0/libdispatch/drc.c:187
ncrc_initialize at /workspace/srcdir/netcdf-c-4.9.0/libdispatch/drc.c:[10](https://github.com/Alexander-Barth/NCDatasets.jl/runs/6857358120?check_suite_focus=true#step:9:11)1
NC_rcfile_insert at /workspace/srcdir/netcdf-c-4.9.0/libdispatch/drc.c:573
init_certificate_authority at /home/runner/work/NCDatasets.jl/NCDatasets.jl/src/netcdf_c.jl:2[13](https://github.com/Alexander-Barth/NCDatasets.jl/runs/6857358120?check_suite_focus=true#step:9:14)4

It seems that ncg->home is still NULL after calling NC_getglobalstate():

#4  0x00007fff97dacd35 in ncrc_setrchome () at drc.c:117
117	    assert(ncg && ncg->home);
(gdb) p ncg
$1 = (NCglobalstate *) 0x14cda50
(gdb) p ncg->home
$2 = 0x0

Maybe I need to call now NCDISPATCH_initialize to ensure that the global state is fully initialized?

@DennisHeimbigner
Copy link
Collaborator

This is windows/mingw, correct?

@Alexander-Barth
Copy link
Contributor Author

This issue is now on Linux (Ubuntu 20.04), also reproduced on CI:

https://github.com/Alexander-Barth/NCDatasets.jl/runs/6857358120?check_suite_focus=true

It might be also present on other platforms, but I did not reached the same point on Windows or Mac OS (all 3 fail, but they fail earlier).

@DennisHeimbigner
Copy link
Collaborator

Can you run the following command in the docker to see if the file is accessible at all?

wget "https://erddap.ifremer.fr/erddap/griddap/SDC_GLO_CLIM_TS_V2_1.dds"

or you can use curl if you prefer
In any case, post the result.

@Alexander-Barth
Copy link
Contributor Author

This is the output inside the docker container:

sandbox:${WORKSPACE}/srcdir/netcdf-c-4.9.0 # wget "https://erddap.ifremer.fr/erddap/griddap/SDC_GLO_CLIM_TS_V2_1.dds"
--2022-06-13 19:13:59--  https://erddap.ifremer.fr/erddap/griddap/SDC_GLO_CLIM_TS_V2_1.dds
Resolving erddap.ifremer.fr (erddap.ifremer.fr)... 134.246.142.39
Connecting to erddap.ifremer.fr (erddap.ifremer.fr)|134.246.142.39|:443... connected.
HTTP request sent, awaiting response... 200 200
Length: 2417 (2.4K) [text/plain]
Saving to: 'SDC_GLO_CLIM_TS_V2_1.dds'

SDC_GLO_CLIM_TS_V2_1.dds                  100%[===================================================================================>]   2.36K  --.-KB/s    in 0s      

2022-06-13 19:13:59 (149 MB/s) - 'SDC_GLO_CLIM_TS_V2_1.dds' saved [2417/2417]

sandbox:${WORKSPACE}/srcdir/netcdf-c-4.9.0 # head SDC_GLO_CLIM_TS_V2_1.dds
Dataset {
  Float64 time[time = 12];
  Float64 depth[depth = 45];
  Float64 latitude[latitude = 641];
  Float64 longitude[longitude = 1440];
  GRID {
    ARRAY:
      Float32 Temperature[time = 12][depth = 45][latitude = 641][longitude = 1440];
    MAPS:
      Float64 time[time = 12];

We use docker to compile the NetCDF and use the resulting binary libnetcdf.so.XY without docker. I see this error actually outside of the docker environment (same results of the wget test).

@DennisHeimbigner
Copy link
Collaborator

Thanks. The fact that it fails on linux is odd. I can see why it might fail on windows.
Did you try with an older version of libcurl?

@Alexander-Barth
Copy link
Contributor Author

Julia also depends on libcurl, so the libcurl version is kind of fixed. Calling the private NC_rcfile_insert to set HTTP.SSL.CAINFO as you suggested here is indeed a good approach (even better of course would be a public API :-)) . This worked well in NetCDF 4.8.1 but in NetCDF 4.9.0 there is an additional assert which now fails:

$ git checkout v4.8.1
$ grep -r 'ncg->home' . # no output
# grep assert ./libdispatch/drc.c # no output
$ git checkout v4.9.0
$ grep -r 'ncg->home' .
./libdispatch/drc.c:    assert(ncg && ncg->home);  # <- our failing assert
./libdispatch/drc.c:	tmp = ncg->home;

If I call NCDISPATCH_initialize from Julia before calling NC_rcfile_insert, ncg->home gets correctly initialized and NC_rcfile_insert now also work in with NetCDF 4.9.0, but I would like to minimize the number of private functions that we have to call.

DennisHeimbigner added a commit to DennisHeimbigner/netcdf-c that referenced this issue Jun 14, 2022
re: Unidata#2380
re: Unidata#2337

This PARTIALLY fixes some HOME problems because under Windows,
the HOME environment variable may not be set. In that case, use the
USERPROFILE environment variable instead.
@DennisHeimbigner
Copy link
Collaborator

Here is the proposed API for managing the .rc table:

Add two functions to netcdf.h to allow programs to walk and
modify the internal .rc tables. This should fix the above issue
by allowing HTTP.CAINFO to be set to the certificates directory.
Note that the changes should be performed as early as possible
in the program because some of the .rc table entries may get
cached internally and changing the entry after that caching
occurs may have no effect.

The new signatures are as follows:

  1. Iterate over the entries in the .rc table
int nc_rc_ith(int index, char* const * keyp, char* const* valuep, char* const* hostportp, char* const* urlpathp);

@param index which table entry to return
@param keyp table entry key field -- may be NULL
@param valuep table entry value field -- may be NULL
@param hostportp table entry hostport field (of form host | host:port) -- may be NULL
@param urlpathp table entry urlpath field -- may be NULL
@return NC_NOERR if no error
@return NC_EINVAL if index out of range
  1. Append/Overwrite/Delete Entry in .rc table
int nc_rc_replace(const char* key, const char* hostport, const char* urlpath, const char* value);

@param key table entry key field -- may not be NULL
@param value table entry value field -- NULL => delete
@param hostport table entry hostport field (of form host | host:port) -- may be NULL
@param urlpath table entry urlpath field -- may be NULL
@return NC_NOERR if no error

@Alexander-Barth
Copy link
Contributor Author

This API seems perfect to me! (but I cannot tell if there are enough const and * in the function signature as my C knowledge is quite limited ;-) )

I am wondering if the strings passed to nc_rc_replace can be freed safely after this function returned or are they expected to stay in memory (which is a bit more difficult to with the garbage collector in julia (and other languages)) ?

@DennisHeimbigner
Copy link
Collaborator

Yeh, const are confusing. I think I will change it so that the return values
are copies of the internal strings so the user must free them.
I will change the documenation to note that.

For the replace fcn, the strings are copied internally, so the user
can reclaim the arguments if desired. I will change the documenation
to not that.

@Alexander-Barth
Copy link
Contributor Author

Yes, handling the memory as you propose makes the most sense to me too!

DennisHeimbigner added a commit to DennisHeimbigner/netcdf-c that referenced this issue Jun 17, 2022
re: Unidata#2337
re: Unidata#2407

Add two functions to netcdf.h to allow programs to get/set
selected entries into the internal .rc tables. This should fix
the above issues by allowing HTTP.CAINFO to be set to the
certificates directory.  Note that the changes should be
performed as early as possible in the program because some of
the .rc table entries may get cached internally and changing the
entry after that caching occurs may have no effect.

The new signatures are as follows:

1. Get the value of a simple .rc entry of the form "key=value".
Note that caller must free the returned value, which might be NULL.
````
char* nc_rc_get(char* const * key);

@param key table entry key
@return value if .rc table has entry of the form key=value
@return NULL if no such entry is found.
````

2. Insert/Overwrite the specified key=value pair in the .rc table.
````
int nc_rc_set(const char* key, const char* value);

@param key table entry key -- may not be NULL
@param value table entry value -- may not be NULL
@return NC_NOERR if no error
@return NC_EINVAL if error
````

Addendum:

re: Unidata#2407

Modify dhttp.c to use the .rc entry HTTP.CAINFO if defined.
@DennisHeimbigner
Copy link
Collaborator

I changed the API to significantly simplify it and to cover the most
common cases.

@Alexander-Barth
Copy link
Contributor Author

Unfortunately, I also see this failed assertion when I call nc_rc_set from Julia (first call to the NetCDF library after loading the library) using the PR #2408.

#2337 (comment)

$ LANG=C julia-1.8 --eval 'ENV["JULIA_DEBUG"] = "NCDatasets"; using NCDatasets; NCDataset("https://opendap.puertos.es/thredds/dodsC/tidegauge_alge/2022/06/MIR2Z_Algeciras_Alge_3541_202206_analysis_event.nc4")'
┌ Debug: nc_rc_set: set HTTP.SSL.CAINFO to /etc/ssl/certs/ca-certificates.crt
└ @ NCDatasets ~/.julia/dev/NCDatasets/src/netcdf_c.jl:2174
julia-1.8: drc.c:171: ncrc_setrchome: Assertion `ncg && ncg->home' failed.

signal (6): Aborted
in expression starting at none:1
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7fa61c5d7489)

(OS: Linux and $HOME defined)

DennisHeimbigner added a commit to DennisHeimbigner/netcdf-c that referenced this issue Jun 21, 2022
re: Unidata#2380
re: Unidata#2337

This PARTIALLY fixes some HOME problems because under Windows,
the HOME environment variable may not be set. In that case, use the
USERPROFILE environment variable instead.
@Alexander-Barth
Copy link
Contributor Author

I am closing this issue because it is likely to be the same as: #2486

The option -std=c99 (used here) results that the function strdup is undefined and any operation on strings like using the home path leads in a crash.

Thank you very much for implementing the new API and your work in NetCDF in general !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants