prov/psm3: --disable-shared causes "No HALs registered" error #8979

Closed
raffenet opened this issue May 31, 2023 · 3 comments

@raffenet
Contributor

raffenet commented May 31, 2023

Describe the bug
Error message from psm3 provider during initialization.

To Reproduce

./autogen.sh
./configure --prefix=$PWD/i --disable-shared
make -j; make install
FI_PROVIDER=psm3 i/bin/fi_info
compute-386-08% FI_PROVIDER=psm3 i/bin/fi_info
compute-386-08:pid2800529.fi_info: No HALs registered
fi_getinfo: -61 (No data available)

Expected behavior
No error message. If psm3 cannot work with --disable-shared, then configure should disable the provider.

Environment:
Linux (Ubuntu 22.04)

@acgoldma
Contributor

acgoldma commented Jun 1, 2023

I have been able to reproduce this. Oddly, it looks like the constructors defined in other files are running, but the ones in the HALs are not.

@acgoldma
Contributor

acgoldma commented Jun 2, 2023

Comment from #8972:

It looks like __psmi_hal_verbs_constructor can be found with grep in libfabric.a, but not in fi_info.

It looks like constructors are not always included when linking a static lib to an executable.
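For context, a static archive is searched member by member, and the linker only pulls in an object file if it defines a symbol that is still undefined at that point. An object whose only job is to run a constructor may therefore be left out entirely. The sketch below shows the registration pattern in a single file; the `demo_*` names are hypothetical and are not psm3's actual registry or API. In a single translation unit the constructor always runs. The failure can only appear once the constructor lives in its own archive member that nothing else references, and whether that is what happens to the psm3 HAL objects is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a HAL registry; not psm3's real data structure. */
#define DEMO_MAX_HALS 4
static const char *demo_hals[DEMO_MAX_HALS];
static size_t demo_hal_count;

static void demo_register_hal(const char *name)
{
    assert(name != NULL);
    if (demo_hal_count < DEMO_MAX_HALS)
        demo_hals[demo_hal_count++] = name;
}

/*
 * GCC/Clang attribute: this function runs before main(), but only if the
 * object file that contains it is part of the final link. If it sat alone
 * in a member of a static archive and no other code referenced a symbol
 * from that member, the linker would skip the member and this would never run.
 */
__attribute__((constructor))
static void demo_hal_verbs_constructor(void)
{
    demo_register_hal("verbs");
}

/* Number of HALs registered so far. */
size_t demo_registered_hals(void)
{
    return demo_hal_count;
}

/* Returns 1 if a HAL with this name was registered, 0 otherwise. */
int demo_hal_is_registered(const char *name)
{
    for (size_t i = 0; i < demo_hal_count; i++)
        if (strcmp(demo_hals[i], name) == 0)
            return 1;
    return 0;
}
```

A caller that finds `demo_registered_hals()` returning 0 at startup is in the same position as the provider here: nothing registered, so there is nothing to initialize.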

I was able to fix this for fi_info by adding this:

 util_fi_info_SOURCES = \
        util/info.c
 util_fi_info_LDADD = $(linkback)
+util_fi_info_LDFLAGS = -Wl,--whole-archive,src/.libs/libfabric.a,--no-whole-archive

 util_fi_strerror_SOURCES = \
        util/strerror.c

This will link every object file in the archive instead of just the ones needed to resolve the symbols fi_info references.

Not sure why some constructors worked and others did not.
Specifically, all psm3 constructors except the ones that register the HALs worked fine.

Structurally, __psmi_gethostname_lock_constructor() is set up the same way as the HAL constructors, yet it runs fine.

As a workaround for now, the above patch should work.
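One possible alternative to linking the whole archive, sketched here purely as an illustration, is to register each HAL from an init function that the executable already references, so the linker has a reason to pull the HAL object in. All `demo_*` names below are hypothetical; this is not how psm3 is structured today, and whether it would suit psm3 is untested.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical HAL descriptor and registry, for illustration only. */
struct demo_hal {
    const char *name;
};

#define DEMO_MAX_HALS 4
static const struct demo_hal *demo_table[DEMO_MAX_HALS];
static size_t demo_count;

/* Idempotent: registering the same HAL twice leaves a single entry. */
static void demo_register(const struct demo_hal *hal)
{
    assert(hal != NULL);
    for (size_t i = 0; i < demo_count; i++)
        if (strcmp(demo_table[i]->name, hal->name) == 0)
            return;
    if (demo_count < DEMO_MAX_HALS)
        demo_table[demo_count++] = hal;
}

/* In a real tree this part would live in the HAL's own source file. */
static const struct demo_hal demo_verbs_hal = { "verbs" };

void demo_hal_verbs_register(void)
{
    demo_register(&demo_verbs_hal);
}

/*
 * In a real tree this would live in code the executable already calls.
 * The explicit call creates an undefined reference to
 * demo_hal_verbs_register, which makes the linker extract the HAL's
 * object file from a static archive without --whole-archive.
 */
size_t demo_hal_init(void)
{
    demo_hal_verbs_register();
    return demo_count;
}
```

The trade-off is that every new HAL needs a line in the init function, whereas constructors register themselves without a central list.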


This issue is stale because it has been open 360 days with no activity. Remove stale label or comment, otherwise it will be closed in 7 days.

github-actions bot added the stale label on May 28, 2024
github-actions bot closed this as not planned on Jun 4, 2024