Chinese character in the path using netcdf4 dataset #997
Comments
This may be related to #941, which in turn is related to HDF5 unicode filename support (https://forum.hdfgroup.org/t/non-english-characters-in-hdf5-file-name/4627/8) |
h5py/h5py#839 suggests that this may have been fixed in HDF5 1.10.6 (at least for Windows). What version of HDF5 are you using? |
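One way to answer the version question is to ask the Python package itself. The sketch below is not from the thread: `library_versions` is a hypothetical helper name, and it assumes the installed netCDF4 package exposes the `__version__`, `__netcdf4libversion__`, and `__hdf5libversion__` attributes. It returns `None` when netCDF4 is not installed.

```python
def library_versions():
    """Return the netCDF4, netcdf-c and HDF5 versions, or None if netCDF4 is missing."""
    try:
        import netCDF4
    except ImportError:
        return None
    return {
        "netCDF4": netCDF4.__version__,               # Python package version
        "netcdf-c": netCDF4.__netcdf4libversion__,    # linked netCDF C library
        "hdf5": netCDF4.__hdf5libversion__,           # linked HDF5 library
    }


print(library_versions())
```

Pasting the printed dictionary into a bug report shows which C libraries the Python package was built against.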
The following works for me:
import netCDF4
filename="delta_\u0394.nc"
print(filename)
nc = netCDF4.Dataset(filename,'w')
nc.filename = filename
nc.close()
nc = netCDF4.Dataset(filename)
print(nc)
delta_Δ.nc
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
filename: delta_Δ.nc
dimensions(sizes):
variables(dimensions):
groups:
Can you modify this example to use the utf-8 filename causing you problems? |
Thank you very much! I am going to try. |
I use netCDF4 1.5.3. |
import netCDF4 as np
Above is my code. I think it is simple, but I got the error: |
What encoding are you using? |
UTF8 |
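When answering an encoding question like this, it can help to report what Python itself sees. The following is a minimal diagnostic sketch, not something posted in the thread: `describe_path` is a hypothetical helper, and the directory name is the second path component used in the scripts in this thread.

```python
import sys


def describe_path(path):
    """Collect basic encoding facts about a path string."""
    return {
        # Encoding Python uses to convert str paths to OS-level bytes
        "filesystem_encoding": sys.getfilesystemencoding(),
        # True if any character falls outside the ASCII range
        "has_non_ascii": any(ord(ch) > 127 for ch in path),
        # The UTF-8 byte sequence for the path
        "utf8_bytes": path.encode("utf-8"),
    }


dirname = b'\xe4\xb8\x93\xe9\xa2\x98\xe5\xae\x9e\xe9\xaa\x8c\xe5\x85\xab'.decode('utf-8')
info = describe_path(dirname + "/V2019151040600.L2_SNPP_OC.nc")
print(info["filesystem_encoding"], info["has_non_ascii"])
```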
I think this is a Windows filename encoding issue - I can create a file with this filename and read it back in on macOS and Linux. Unfortunately, I don't have access to Windows. |
Disregard my last message - I can now reproduce this on macOS and Linux. Here's the script:
import netCDF4, os
dirpath1 =\
b'\xe6\xb5\xb7\xe6\xb4\x8b\xe6\x8e\xa2\xe6\xb5\x8b\xe6\x8a\x80\xe6\x9c\xaf\xe4\xb8\x93\xe9\xa2\x98\xe5\xae\x9e\xe9\xaa\x8c'.decode('utf-8')
dirpath2 =\
b'\xe4\xb8\x93\xe9\xa2\x98\xe5\xae\x9e\xe9\xaa\x8c\xe5\x85\xab'.decode('utf-8')
dirpath = os.path.join(dirpath1,dirpath2)
os.makedirs(dirpath,exist_ok=True)
filename=os.path.join(dirpath,'V2019151040600.L2_SNPP_OC.nc')
print(filename)
nc = netCDF4.Dataset(filename,'w')
nc.filename = filename
nc.close()
nc = netCDF4.Dataset(filename)
print(nc.filename)
which produces
[mac28:~/python] jwhitaker% python3.7 unicode_filename.py
Curiously, the file is created - but can't be read. ncdump produces the same error when given the full path to the file, so it isn't an issue in the Python interface.
[mac28:
h5dump does work though - so it seems like it's an issue with |
The following h5py script works, so I don't think this is an HDF5 issue:
import h5py, os
dirpath1 =\
b'\xe6\xb5\xb7\xe6\xb4\x8b\xe6\x8e\xa2\xe6\xb5\x8b\xe6\x8a\x80\xe6\x9c\xaf\xe4\xb8\x93\xe9\xa2\x98\xe5\xae\x9e\xe9\xaa\x8c'.decode('utf-8')
dirpath2 =\
b'\xe4\xb8\x93\xe9\xa2\x98\xe5\xae\x9e\xe9\xaa\x8c\xe5\x85\xab'.decode('utf-8')
dirpath = os.path.join(dirpath1,dirpath2)
os.makedirs(dirpath,exist_ok=True)
filename=os.path.join(dirpath,'V2019151040600.L2_SNPP_OC.h5')
print(filename)
f = h5py.File(filename,'w')
dset = f.create_dataset("mydataset", (100,), dtype='i')
f.close()
f = h5py.File(filename,'r')
print(f)
[mac28:~/python] jwhitaker% python3.7 unicode_filename_h5py.py |
From https://support.hdfgroup.org/HDF5/doc/Advanced/UsingUnicode/index.html:
A simple workaround in this case (since the Chinese characters are in the directory names, not the filename itself) is to create the directory in Python, change the working directory, and then create the file in that directory:
import netCDF4, os
dirpath1 =\
b'\xe6\xb5\xb7\xe6\xb4\x8b\xe6\x8e\xa2\xe6\xb5\x8b\xe6\x8a\x80\xe6\x9c\xaf\xe4\xb8\x93\xe9\xa2\x98\xe5\xae\x9e\xe9\xaa\x8c'.decode('utf-8')
dirpath2 =\
b'\xe4\xb8\x93\xe9\xa2\x98\xe5\xae\x9e\xe9\xaa\x8c\xe5\x85\xab'.decode('utf-8')
dirpath = os.path.join(dirpath1,dirpath2)
os.makedirs(dirpath,exist_ok=True)
os.chdir(dirpath)
filename='V2019151040600.L2_SNPP_OC.nc'
print(filename)
nc = netCDF4.Dataset(filename,'w')
nc.filename = filename
nc.close()
nc = netCDF4.Dataset(filename)
print(os.getcwd())
print(nc.filename)
Still curious as to how h5py manages to make it work, digging further... |
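The chdir workaround above can be wrapped so the working directory is restored afterwards. This is only a sketch: `working_directory` and `open_by_basename` are hypothetical helper names, and it assumes that a handle opened by basename stays usable once the previous directory is restored, which the thread does not confirm. Like the workaround itself, it only helps when the non-ASCII characters are in the directory names, not in the filename.

```python
import contextlib
import os


@contextlib.contextmanager
def working_directory(path):
    """Temporarily change the current working directory."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)  # always restore, even if opening fails


def open_by_basename(path, opener, *args, **kwargs):
    """Call opener(basename, ...) from inside the file's own directory."""
    dirpath, name = os.path.split(os.path.abspath(path))
    with working_directory(dirpath):
        return opener(name, *args, **kwargs)
```

With netCDF4 installed, usage would look like `nc = open_by_basename(filename, netCDF4.Dataset)`. Passing the opener as an argument keeps the helper independent of any particular library.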
A potential solution (bug fix in the C lib) is being discussed at Unidata/netcdf-c#1666 |
@LiNinghui-AI Hi, has this been solved? I have a Chinese path here and it still doesn't work. |
I tried an indirect method: |
I need to open a nc file with netCDF4, but the path to the file contains Chinese characters and netCDF4.Dataset returns the error "No such file or directory". If I use "os.path.isfile", the file is found. I've tried decoding and encoding the path (in utf-8) without result.
Is there any option to use in the Dataset call?
Thanks.
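For read-only access, one possible stopgap is to copy the file to a temporary location with an ASCII-only name and open the copy. This sketch is not a method proposed in the thread: `open_via_ascii_copy` is a hypothetical helper, and it assumes the system temporary directory itself has an ASCII-only path, which may not hold on every machine.

```python
import os
import shutil
import tempfile


def open_via_ascii_copy(path, opener, *args, **kwargs):
    """Copy path to an ASCII-named temp file and open the copy.

    Returns (handle, tmpdir); the caller should close the handle and
    then remove tmpdir with shutil.rmtree.
    """
    tmpdir = tempfile.mkdtemp(prefix="nc_ascii_")
    suffix = os.path.splitext(path)[1]            # keep the extension, e.g. ".nc"
    tmp_path = os.path.join(tmpdir, "copy" + suffix)
    shutil.copyfile(path, tmp_path)               # Python file APIs accept the Unicode path
    return opener(tmp_path, *args, **kwargs), tmpdir
```

With netCDF4 installed, usage would look like `nc, tmpdir = open_via_ascii_copy(filename, netCDF4.Dataset)`, followed by `nc.close()` and `shutil.rmtree(tmpdir)`. The copy costs time and disk space for large files, so this is a stopgap rather than a fix.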