Allow users to specify the use of the DIRECT driver #2236

Open
hmaarrfk opened this issue Feb 26, 2022 · 16 comments

Comments

@hmaarrfk
Contributor

This is somewhat of a follow-up to #2177.

In order to achieve maximum performance on writing large data blocks, I had to do a few things:

  1. Ensure the datasets were aligned in the user's RAM (out of scope for netcdf-c).
  2. Ensure the destination in the file was aligned to a block boundary (see "Allow users to specify data alignment", #2177).
  3. Use the direct driver to bypass many of the caching mechanisms in the operating system.
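
The first two points come down to keeping both the memory buffer and the file offset on a block boundary. A minimal sketch of that arithmetic in plain C follows. The helper names (`align_up`, `is_aligned`, `alloc_aligned`) are made up for illustration and are not part of netcdf-c or HDF5, and the right block size depends on the device and filesystem.

```c
#define _POSIX_C_SOURCE 200112L  /* for posix_memalign under strict -std modes */
#include <stdint.h>
#include <stdlib.h>

/* Round `offset` up to the next multiple of `block`.
 * Assumes `block` is a power of two. */
size_t align_up(size_t offset, size_t block)
{
    return (offset + block - 1) & ~(block - 1);
}

/* Return 1 if `ptr` sits on a `block`-byte boundary, else 0. */
int is_aligned(const void *ptr, size_t block)
{
    return ((uintptr_t)ptr & (block - 1)) == 0;
}

/* Allocate at least `nbytes`, rounded up to whole blocks, starting on a
 * block boundary. Returns NULL on failure; release with free(). */
void *alloc_aligned(size_t nbytes, size_t block)
{
    void *buf = NULL;
    if (posix_memalign(&buf, block, align_up(nbytes, block)) != 0)
        return NULL;
    return buf;
}
```

For example, with an assumed 4096-byte block, `align_up(4097, 4096)` gives 8192, and a buffer from `alloc_aligned(1000, 4096)` occupies one full block.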

With #2177 close to being fixed (#2206), I'm hoping that we can let users select different virtual file drivers.

The DIRECT driver fapl documentation is:
https://support.hdfgroup.org/HDF5/doc/RM/H5P/H5Pset_fapl_direct.htm

Without the DIRECT driver, on Linux, I'm limited to about 1GB/s.
With the DIRECT driver, on Linux, I can reach read/write speeds of up to 3GB/s, limited only by the PCIe interface and the SSD.

A secondary effect is that it bypasses the operating system file cache, so the operating system doesn't need to spend time evicting it (or other files) during large disk reads/writes.

Thank you for considering this feature.

I presume that the changes would have to provide a path similar to

        if (H5Pset_fapl_mpio(fapl_id, comm, info) < 0)
            BAIL(NC_EPARINIT);

in the files libhdf5/hdf5create.c and libhdf5/hdf5open.c.

Let me know how I can help get this through.

For reference, here is my PR to h5py that exposes the same feature: h5py/h5py#2041

@DennisHeimbigner
Collaborator

Since we use a couple of file drivers already, this is certainly possible.
I will have to investigate a reasonable way to specify the use of some
file driver. Is the DIRECT driver already part of HDF5 or is it your
own custom driver?

@DennisHeimbigner
Collaborator

Is PR 2066 a possible solution when it gets integrated?

@hmaarrfk
Contributor Author

Do you mean to ask:
"Is a solution similar to PR2206 an acceptable solution?"

The answer is yes.

@DennisHeimbigner
Collaborator

A couple of questions:
1. Is the DIRECT driver already part of HDF5 or is it your own custom driver?
2. Does it take any parameters?
Is there a pointer to the source code for this driver?

@DennisHeimbigner
Collaborator

https://docs.hdfgroup.org/hdf5/v1_8/group___f_a_p_l.html#title14

Is this the driver to which you refer?

@hmaarrfk
Contributor Author

> https://docs.hdfgroup.org/hdf5/v1_8/group___f_a_p_l.html#title14
>
> Is this the driver to which you refer?

yes.

I guess my original reference to the official documentation was missed.

@hmaarrfk
Contributor Author

The difference is subtle, but eventually you find it deep in the source code.

H5FDsec2 on the left, H5FDdirect on the right:
[image]

I think there are other differences too, but the O_DIRECT flag is what gives you more direct access to the hardware.

https://github.com/HDFGroup/hdf5/blob/develop/src/H5FDsec2.c#L329
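
To make the difference concrete, here is a rough sketch of what the extra flag looks like at the `open(2)` level. The function names are hypothetical and the flag sets are illustrative, not the exact flags HDF5 composes. On Linux with glibc, `O_DIRECT` is only visible when `_GNU_SOURCE` is defined, so the sketch guards on it.

```c
#define _GNU_SOURCE   /* glibc exposes O_DIRECT only with this defined */
#include <fcntl.h>

/* Flags for an ordinary buffered open (illustrative, sec2-like). */
int buffered_open_flags(void)
{
    return O_RDWR | O_CREAT;
}

/* The same flags plus O_DIRECT where the platform defines it. */
int direct_open_flags(void)
{
    int flags = buffered_open_flags();
#ifdef O_DIRECT
    flags |= O_DIRECT;   /* ask the kernel to bypass the page cache */
#endif
    return flags;
}
```

A file opened with `O_DIRECT` generally requires the buffer address, the file offset, and the transfer size to be block-aligned, which is why the alignment work in #2177 matters here.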

@DennisHeimbigner
Collaborator

Sorry, do you want Direct or do you want SEC2?

@hmaarrfk
Contributor Author

SEC2 is the default. Direct is the one I'm requesting.

@hmaarrfk
Contributor Author

(I've definitely been in the HDF5 weeds for too long, sorry for the jargon)

@DennisHeimbigner
Collaborator

I see. Anyway, this will require some thinking. A solution will not come soon.

@hmaarrfk
Contributor Author

hmaarrfk commented Mar 1, 2022

Cross referencing other benchmarks I ran:
#2206 (comment)

@hmaarrfk
Contributor Author

hmaarrfk commented Mar 1, 2022

I must say that something I found rather annoying when building support for this in h5py was that the direct driver symbols are not always visible.

This means that you have to selectively enable (and disable!) them at compile time.

@DennisHeimbigner
Collaborator

> I must say that something I found rather annoying when building support for this in h5py was that the direct driver symbols are not always visible.
> This means that you have to selectively enable (and disable!) them at compile time.

Not sure I follow; can you elaborate?

@hmaarrfk
Contributor Author

hmaarrfk commented Mar 1, 2022

On conda forge, I compiled hdf5 with --enable-direct-vfd

https://github.com/conda-forge/hdf5-feedstock/blob/master/recipe/build.sh#L12

Running nm on the resulting library shows the direct driver symbols:

$ nm -gC libhdf5.so | grep -i direct
00000000000af760 T H5D__chunk_direct_read
00000000000b24e0 T H5D__chunk_direct_write
00000000004066d4 B H5_direct_block_blk_free_list
00000000003149f0 T H5FD_direct_init
00000000004066d8 B H5_H5HF_direct_t_reg_free_list
00000000004066e8 B H5_H5HF_indirect_ent_t_seq_free_list
00000000004066dc B H5_H5HF_indirect_filt_ent_t_seq_free_list
00000000004066e0 B H5_H5HF_indirect_ptr_t_seq_free_list
00000000004066e4 B H5_H5HF_indirect_t_reg_free_list
0000000000403f80 D H5HF_FSPACE_SECT_CLS_INDIRECT
000000000016c8d0 T H5HF__sect_indirect_add
0000000000314e80 T H5Pget_fapl_direct    <---- This symbol is for the direct driver
0000000000314bd0 T H5Pset_fapl_direct    <---- This symbol is for the direct driver
00000000002fcec0 T H5Z_can_apply_direct
00000000002fcfd0 T H5Z_set_local_direct

However, the h5py package installable through PyPI bundles an HDF5 library that was not compiled with that flag:

(pypi) ✘-1 ~/mambaforge/envs/pypi/lib/python3.9/site-packages/h5py.libs
15:36 $ nm -gC libhdf5-346dbfc8.so.200.1.0 | grep -i direc
00000000000b49b0 T H5D__chunk_direct_read
00000000000b7bd0 T H5D__chunk_direct_write
00000000006230a0 D H5_direct_block_blk_free_list
00000000006230e0 D H5_H5HF_direct_t_reg_free_list
00000000006231e0 D H5_H5HF_indirect_ent_t_seq_free_list
00000000006231a0 D H5_H5HF_indirect_filt_ent_t_seq_free_list
0000000000623160 D H5_H5HF_indirect_ptr_t_seq_free_list
0000000000623220 D H5_H5HF_indirect_t_reg_free_list
00000000006232e0 D H5HF_FSPACE_SECT_CLS_INDIRECT
00000000001703c0 T H5HF__sect_indirect_add
000000000031a260 T H5Z_can_apply_direct
000000000031a350 T H5Z_set_local_direct

This means that at compile time you have to detect the capabilities of the underlying hdf5 library, and use macros to enable the appropriate sections of code within the netcdf library (or some other dynamic linking mechanism).
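
A minimal sketch of such a guard, assuming the macro route. `H5_HAVE_DIRECT` is the macro HDF5 defines in its public configuration header when built with `--enable-direct-vfd`, and `H5Pset_fapl_direct` takes an alignment, a block size, and a copy buffer size. The `WITH_HDF5` switch and both function names are hypothetical, used only so the sketch compiles with or without HDF5 installed. This is not how netcdf-c does it today.

```c
#include <stddef.h>

/* WITH_HDF5 is a hypothetical opt-in for this sketch: define it (and link
 * against HDF5) to compile the real call path. */
#ifdef WITH_HDF5
#include <hdf5.h>
#endif

/* Return 1 if this build can honor a request for the DIRECT driver, else 0. */
int direct_vfd_supported(void)
{
#if defined(WITH_HDF5) && defined(H5_HAVE_DIRECT)
    return 1;
#else
    return 0;
#endif
}

#ifdef WITH_HDF5
/* Hypothetical helper: select the DIRECT driver on a file access property
 * list. Returns 0 on success, -1 if the driver is unavailable or the call
 * fails. */
int set_direct_vfd(hid_t fapl_id, size_t alignment,
                   size_t block_size, size_t cbuf_size)
{
#ifdef H5_HAVE_DIRECT
    return H5Pset_fapl_direct(fapl_id, alignment, block_size, cbuf_size) < 0
               ? -1 : 0;
#else
    (void)fapl_id; (void)alignment; (void)block_size; (void)cbuf_size;
    return -1;   /* HDF5 was built without --enable-direct-vfd */
#endif
}
#endif
```

With a guard like this, a caller can check `direct_vfd_supported()` first and report a clear error instead of failing at link time when the underlying HDF5 lacks the driver.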

@DennisHeimbigner
Collaborator

I see.
