Use hdf5=1.14 and conda med and medcoupling #57

Draft
wants to merge 7 commits into
base: main

Krande
Contributor

@Krande Krande commented Apr 3, 2024

Hey @ldallolio, I thought we could try to use the latest conda libmed and medcoupling packages and use HDF5 1.14 as dependencies for Code Aster.

I managed to configure and build locally, however it seems we're having an issue with H5 native float types. I am making this PR in the hopes that we might be able to resolve the HDF5 issue!

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@Krande Krande requested a review from ldallolio as a code owner April 3, 2024 14:19
@conda-forge-webservices
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@Krande
Contributor Author

Krande commented Apr 3, 2024

@conda-forge-admin, please rerender

@Krande
Contributor Author

Krande commented Apr 4, 2024

Hey @hmaarrfk, I hope it's okay I tagged you here directly.

We've started to integrate the packages you helped us with (libmed and medcoupling) here in the FE solver Code Aster.

I have managed to update the build script to accommodate the necessary dependencies, and it now compiles just fine.

However, we're seeing a runtime issue with what seems to be the function H5T__init_native_float_types. I wanted to get your opinion on how to best approach debugging this. Remember, that we

Is it possible to print out all the native type definitions (i.e. could the 64-bit integer, real and double sizes set in libmed, medcoupling and Code Aster play a role here)? And would it make sense for me to compile the dependencies myself against HDF5 with debugging symbols in order to dig into the underlying issue, or would you advise a different approach?
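
For the first question, a rough first look at the sizes involved could be sketched as below. It uses ctypes to print the sizes of some native C types on the current platform. This is only a proxy: it reports what the Python interpreter's C ABI uses, not what libmed, medcoupling, Code Aster or HDF5 were actually compiled with, and `native_type_sizes` is a made-up helper name for illustration.

```python
import ctypes


def native_type_sizes():
    """Return the size in bytes of some native C types on this platform."""
    # Assumption: the interpreter's C ABI is representative of the build
    # environment. It says nothing about compile-time flags of other libs.
    candidates = {
        "int": ctypes.c_int,
        "long": ctypes.c_long,
        "long long": ctypes.c_longlong,
        "float": ctypes.c_float,
        "double": ctypes.c_double,
        "long double": ctypes.c_longdouble,
        "size_t": ctypes.c_size_t,
    }
    return {name: ctypes.sizeof(ctype) for name, ctype in candidates.items()}


sizes = native_type_sizes()
for name, size in sizes.items():
    print(f"{name:12s} {size} bytes")
```

The values for `long` and `long double` are the ones most likely to differ between platforms, so those are the ones worth comparing against whatever the dependencies assume.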

For reference, the error message excerpt below is from this

stderr: Traceback returned by GNU libc (last 25 stack frames):
/home/<shortened this>/placeholder_/lib/aster/libbibc.so(print_trace_+0x33) [0x7fc15b7f18f3]
/home/<shortened this>/placeholder_/lib/aster/libbibfor.so(utmess_core_+0xa48) [0x7fc15cc0b8a8]
/home/<shortened this>/placeholder_/lib/aster/libbibfor.so(utmess_+0x760) [0x7fc15cc0aa60]
/home/<shortened this>/placeholder_/lib/aster/libbibfor.so(utmfpe_+0x39) [0x7fc15cd23019]
/home/<shortened this>/placeholder_/lib/aster/libbibc.so(hanfpe+0x9) [0x7fc15b7e8379]
/lib64/libc.so.6(+0x36400) [0x7fc15db94400]
/home/<shortened this>/placeholder_/lib/libhdf5.so.310(H5T__init_native_float_types+0x932) [0x7fc157adbea2]
/home/<shortened this>/placeholder_/lib/libhdf5.so.310(H5T_init+0x4b) [0x7fc157a6afeb]
/home/<shortened this>/placeholder_/lib/libhdf5.so.310(H5VL_init_phase2+0x143) [0x7fc157afa513]
/home/<shortened this>/placeholder_/lib/libhdf5.so.310(H5_init_library+0x2b3) [0x7fc157851a23]
/home/<shortened this>/placeholder_/lib/libhdf5.so.310(H5get_libversion+0xc6) [0x7fc1578520b6]
/home/<shortened this>/placeholder_/lib/aster/libbibc.so(lihdfv_+0x4e) [0x7fc15b7ed86e]
/home/<shortened this>/placeholder_/lib/aster/libbibfor.so(entete_+0xf6) [0x7fc15cc641f6]
/home/<shortened this>/placeholder_/lib/aster/libbibfor.so(ibmain_+0x137) [0x7fc15c9f5ee7]
/home/<shortened this>/placeholder_/lib/aster/libbibcxx.so(_Z11jeveux_initi+0x10) [0x7fc15b3b7ce0]
/home/<shortened this>/placeholder_/lib/aster/libbibcxx.so(+0x3bec42) [0x7fc15b3bec42]
/home/<shortened this>/placeholder_/lib/aster/libbibcxx.so(+0x1a5c9d) [0x7fc15b1a5c9d]
/home/<shortened this>/placeholder_/bin/python(+0x201626) [0x55a3c9682626]
/home/<shortened this>/placeholder_/bin/python(_PyObject_MakeTpCall+0x253) [0x55a3c9661323]
/home/<shortened this>/placeholder_/bin/python(_PyEval_EvalFrameDefault+0x716) [0x55a3c966ee36]
/home/<shortened this>/placeholder_/bin/python(+0x2303a4) [0x55a3c96b13a4]
/home/<shortened this>/placeholder_/bin/python(+0x22fbe0) [0x55a3c96b0be0]
/home/<shortened this>/placeholder_/bin/python(_PyEval_EvalFrameDefault+0x49f9) [0x55a3c9673119]
/home/<shortened this>/placeholder_/bin/python(+0x2303a4) [0x55a3c96b13a4]
/home/<shortened this>/placeholder_/bin/python(+0x22fb8e) [0x55a3c96b0b8e]

Command '/home/conda/feedstock_root/build_artifacts

The C function in Code Aster that calls H5get_libversion can be found here
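
To read a trace like the one above, it can help to pick out the frame sitting directly below the libc signal frame, since the frames above it (hanfpe, utmfpe_, utmess_, print_trace_) belong to the error handler. A minimal sketch of that heuristic follows; the regular expression and the helper names are assumptions for illustration, not part of Code Aster or HDF5, and the sample is a four-line excerpt of the trace.

```python
import re

# One frame looks like: /path/to/lib.so(symbol+0xOFFSET) [0xADDRESS]
# The symbol may be empty, e.g. /lib64/libc.so.6(+0x36400) [...]
FRAME_RE = re.compile(
    r"^(?P<lib>.+?)\((?P<sym>[^()+]*)\+(?P<off>0x[0-9a-f]+)\)\s+"
    r"\[(?P<addr>0x[0-9a-f]+)\]$"
)


def parse_frames(trace):
    """Parse backtrace lines into dicts; lines that do not match are skipped."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.groupdict())
    return frames


def frame_below_signal(frames):
    """Heuristic: return the frame right after an anonymous libc frame."""
    for index, frame in enumerate(frames[:-1]):
        basename = frame["lib"].rsplit("/", 1)[-1]
        if basename.startswith("libc.so") and not frame["sym"]:
            return frames[index + 1]
    return None


SAMPLE = """\
/home/<shortened this>/placeholder_/lib/aster/libbibc.so(hanfpe+0x9) [0x7fc15b7e8379]
/lib64/libc.so.6(+0x36400) [0x7fc15db94400]
/home/<shortened this>/placeholder_/lib/libhdf5.so.310(H5T__init_native_float_types+0x932) [0x7fc157adbea2]
/home/<shortened this>/placeholder_/lib/libhdf5.so.310(H5T_init+0x4b) [0x7fc157a6afeb]
"""

frames = parse_frames(SAMPLE)
culprit = frame_below_signal(frames)
print(culprit["sym"], culprit["off"])
```

On the excerpt this points at H5T__init_native_float_types, which matches the reading that the floating-point exception handler fired while HDF5 was initialising its native float types.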

@hmaarrfk

hmaarrfk commented Apr 4, 2024

Is it possible to print out all the native type definitions (i.e. could the 64-bit integer, real and double sizes set in libmed, medcoupling and Code Aster play a role here)? And would it make sense for me to compile the dependencies myself against HDF5 with debugging symbols in order to dig into the underlying issue, or would you advise a different approach?

Unfortunately, my advice is to stay away from any library that switches data type definitions based on a compilation flag. libmed seems to do that, so my suggestion would be to work with upstream to get that "fixed".
If you want specific datatypes, the library should be explicit about them: uint32 for 32-bit unsigned, uint64 for 64-bit unsigned. Scientific and performant computing just requires knowledge of the datatype size.

changing the value of my_datatype is just dangerous. I have nightmares from when I wrote code like that ^_^.
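
To illustrate why a typedef that changes size with a build flag is risky, here is a minimal sketch. `pick_my_datatype` is a hypothetical stand-in for such a flag-dependent typedef (it is not libmed's actual code); the same value comes out differently depending on which width was picked, whereas the explicit types always have a known size.

```python
import ctypes


def pick_my_datatype(use_64bit_ints):
    """Hypothetical stand-in for a typedef toggled by a compile flag."""
    return ctypes.c_int64 if use_64bit_ints else ctypes.c_int32


value = 2**31  # one past the largest signed 32-bit integer

# ctypes integer types do no overflow checking, so the 32-bit variant wraps.
as_64 = pick_my_datatype(True)(value).value
as_32 = pick_my_datatype(False)(value).value

print("64-bit build sees:", as_64)
print("32-bit build sees:", as_32)

# Explicit fixed-width types have the same size everywhere.
print(ctypes.sizeof(ctypes.c_uint32), ctypes.sizeof(ctypes.c_uint64))
```

Two libraries built with different settings of such a flag would disagree about the layout of every array of `my_datatype` they exchange, which is the kind of mismatch that is hard to spot at link time.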

Have a look at https://github.com/HDFGroup/hdf5/blob/3424bc9756e00a8630b774449638bf5f498ef015/src/H5Tinit_float.c#L455 to see if anything jumps out as potentially problematic.

There are people with much more knowledge of the HDF5 inner workings; you can ping them at @conda-forge/hdf5. I unfortunately am only a mere user of HDF5.

--enable-hdf5 \
--embed-hdf5 \

I feel like you want to keep these "enables" right?

Contributor Author

Well, if you are referring to the "--embed-*" flags, I believe they look for statically compiled libs, whereas med, hdf5 and mumps are, I believe, shared libraries.

From the waf config file for med, `waftools/med_cfg.py`:

group.add_option(
    "--embed-hdf5",
    dest="embed_hdf5",
    default=None,
    action="store_true",
    help="Embed HDF5 libraries as static library",
)
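
For what it's worth, the behaviour of that option can be reproduced standalone. A minimal sketch, assuming waf hands this to an optparse-style parser (the group title below is made up for the example):

```python
from optparse import OptionGroup, OptionParser

parser = OptionParser()
# Hypothetical group title; only the option itself is taken from med_cfg.py.
group = OptionGroup(parser, "HDF5 options")
group.add_option(
    "--embed-hdf5",
    dest="embed_hdf5",
    default=None,
    action="store_true",
    help="Embed HDF5 libraries as static library",
)
parser.add_option_group(group)

# Without the flag the value stays None; with it, it becomes True.
opts_default, _ = parser.parse_args([])
opts_embed, _ = parser.parse_args(["--embed-hdf5"])

print(opts_default.embed_hdf5)
print(opts_embed.embed_hdf5)
```

The `None` default lets the build script tell "not requested" apart from an explicit choice, and per the help text a `True` value means HDF5 is expected as a static library.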

I quickly checked what happens with "--embed-hdf5", and the configuration step failed with:

['/home/conda/feedstock_root/build_artifacts/code-aster_1712244556525/_build_env/bin/x86_64-conda-linux-gnu-cc', '-Wl,--export-dynamic', '-Wl,--no-keep-memory', 'test.c.1.o', '-o/home/conda/feedstock_root/build_artifacts/code-aster_1712244556525/work/build/std/.conf_check_928bdad62d1e191bbcbdf50847880f00/testbuild/release/testprog', '-Wl,-Bstatic', '-lhdf5', '-Wl,-Bdynamic', '-lz', '-Wl,--no-as-needed', '-lscotch', '-lscotcherr', '-lscotcherrexit', '-lz', '-ldl', '-lm', '-Wl,-O2', '-Wl,--sort-common', '-Wl,--as-needed', '-Wl,-z,relro', '-Wl,-z,now', '-Wl,--disable-new-dtags', '-Wl,--gc-sections', '-Wl,--allow-shlib-undefined', '-Wl,-rpath,/home/conda/feedstock_root/build_artifacts/code-aster_1712244556525/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib', '-Wl,-rpath-link,/home/conda/feedstock_root/build_artifacts/code-aster_1712244556525/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib', '-L/home/conda/feedstock_root/build_artifacts/code-aster_1712244556525/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib']
err: /home/conda/feedstock_root/build_artifacts/code-aster_1712244556525/_build_env/bin/../lib/gcc/x86_64-conda-linux-gnu/12.3.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find -lhdf5: No such file or directory
collect2: error: ld returned 1 exit status
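
The '-Wl,-Bstatic', '-lhdf5' pair in the command above asks the linker for a static archive only, so this error is what one would expect if the environment ships just the shared library (that it does is my assumption here). A small sketch to check which flavours exist in a given lib directory; `available_flavours` is a hypothetical helper using Linux naming conventions, and the demo uses a temporary directory rather than the real build prefix:

```python
import tempfile
from pathlib import Path


def available_flavours(libdir, name):
    """Report whether lib<name>.a and/or lib<name>.so* exist in libdir."""
    libdir = Path(libdir)
    return {
        "static": (libdir / f"lib{name}.a").is_file(),
        "shared": any(libdir.glob(f"lib{name}.so*")),
    }


# Demo on a throwaway directory that only contains a shared library.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "libhdf5.so.310").touch()
    demo = available_flavours(tmp, "hdf5")

print(demo)
```

Pointing the helper at the host environment's lib directory would show whether "--embed-hdf5" can work at all with the packaged hdf5.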

sorry i misread the flag. enable vs embed

@Krande
Contributor Author

Krande commented Apr 5, 2024

Is it possible to print out all the native type definitions (i.e. could the 64-bit integer, real and double sizes set in libmed, medcoupling and Code Aster play a role here)? And would it make sense for me to compile the dependencies myself against HDF5 with debugging symbols in order to dig into the underlying issue, or would you advise a different approach?

Unfortunately, my advice is to stay away from any library that switches data type definitions based on a compilation flag. libmed seems to do that, so my suggestion would be to work with upstream to get that "fixed". If you want specific datatypes, the library should be explicit about them: uint32 for 32-bit unsigned, uint64 for 64-bit unsigned. Scientific and performant computing just requires knowledge of the datatype size.

changing the value of my_datatype is just dangerous. I have nightmares from when I wrote code like that ^_^.

Have a look at https://github.com/HDFGroup/hdf5/blob/3424bc9756e00a8630b774449638bf5f498ef015/src/H5Tinit_float.c#L455 to see if anything jumps out as potentially problematic.

There are people with much more knowledge of the HDF5 inner workings; you can ping them at @conda-forge/hdf5. I unfortunately am only a mere user of HDF5.

Thank you for replying so quickly, @hmaarrfk! I always appreciate your valuable input.

Yeah, I see your point on the dangers of switching data types during compilation, especially when it comes to aligning these packages with other existing packages on conda-forge.

I suspect any upstream change to data type switching is unlikely to happen in the near future. So right now, I would settle for identifying the root issue, and then, based on that, evaluate the best course of action going forward.

@conda-forge/hdf5, it would be greatly appreciated if any of you could point us in the right direction for debugging the root cause of the above-mentioned issue related to H5T__init_native_float_types.

Let me know what kind of information you would need to help, and I'll provide it asap.

Best Regards
Kristoffer

@hmaarrfk

hmaarrfk commented Apr 5, 2024

I suspect any upstream change to data type switching is unlikely to happen in the near future.

you have to speak up!

they are also the best ones to help you.

@Krande
Contributor Author

Krande commented Apr 5, 2024

I suspect any upstream change to data type switching is unlikely to happen in the near future.

you have to speak up!

they are also the best ones to help you.

Don't worry, I'm already on it :) I have gotten confirmation that they currently only support 64 bit integers and that they plan on making the datatype explicit in the future (see discussion here for reference).

I'll continue digging into the root cause of the issue we're seeing in Code Aster on Linux, and in parallel I have started working on compiling Code Aster for Windows. The latter requires an MSVC Fortran compiler, and it also requires the Windows variants of hdf5, libmed and medcoupling to be compiled with an MSVC Fortran compiler. So there's still a lot of work ahead of me :)

Thanks again for your continued support. I really appreciate it!

Best Regards
Kristoffer

@hmaarrfk

hmaarrfk commented Apr 5, 2024

I have gotten confirmation that they currently only support 64 bit integers and that they plan on making the datatype explicit in the future (see discussion here for reference).

perfect!

@Krande
Contributor Author

Krande commented Apr 8, 2024

Hm, I came across this: HDFGroup/hdf5#3831 and tried compiling against hdf5=1.14.0 (with local builds of libmed and medcoupling as well). To my surprise, the tests now pass. Hopefully hdf5=1.14.4 solves this issue.

I also found that I will have to bring back the static compilation and embedding of mumps. Will add my changes to this PR once I've cleaned it up a bit.

@hmaarrfk Do you know if I can enforce a version of hdf5 other than 1.14.3 for this package, even though libmed and medcoupling are compiled against v1.14.3?

Update:

It might not be necessary to recompile against an older version of HDF5, as the 1.14.4 milestone seems to be scheduled for April 11th. So I guess I can either close this PR and re-open it once 1.14.4 is out, or just let this PR wait until then.

…d-medc

# Conflicts:
#	.ci_support/linux_64_numpy1.22python3.10.____cpython.yaml
#	.ci_support/linux_64_numpy1.22python3.8.____cpython.yaml
#	.ci_support/linux_64_numpy1.22python3.9.____cpython.yaml
#	.ci_support/linux_64_numpy1.23python3.11.____cpython.yaml
#	recipe/meta.yaml
@Krande Krande marked this pull request as draft June 15, 2024 13:25
@conda-forge-webservices
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.
