Compiled PyFlex does not work on Ubuntu 20 #30

Open
Skylion007 opened this issue Mar 14, 2022 · 8 comments

Comments

@Skylion007

I have been trying to compile the latest version of SoftGym on an Ubuntu 20 machine, but I have been unable to load the compiled pyflex.so from either the system Python or a conda interpreter. The error I keep getting is that the symbol __powf_finite is not defined, which seems to be related to the libc version.

I am using CUDA 11.6, PyBind 2.9.1, and Ubuntu 20. I have tested this on Python 3.9, 3.8, and 3.7 and get the same error on each. I also tried compiling with clang, but I got several errors that prevented compilation altogether.

@DanielTakeshi

DanielTakeshi commented Mar 14, 2022

Can you copy and paste your full error message so that we can better diagnose the problem?
Also, please list the exact steps you followed so that we can reproduce it.

@Skylion007
Author

When trying to import pyflex:

ImportError: .....pyflex.so: undefined symbol: __powf_finite

@Skylion007
Author

@DanielTakeshi Any updates?

@FranBesq

FranBesq commented Mar 24, 2022

Have you compiled PyFlex with docker?
What steps have you followed exactly?

@denkiwakame

denkiwakame commented Mar 25, 2022

@Skylion007

Hi, I ran into the same issue today with PyFleX and found that the precompiled static library NvFlexExtReleaseCUDA uses the __powf_finite function, which recent glibc releases (2.31 and later, including the one shipped with Ubuntu 20.04) no longer provide for linking; see google/filament#2146 (comment)

$ strings ../../lib/linux64/NvFlexExtReleaseCUDA_x64.a | grep finite
__powf_finite

Unfortunately, we cannot easily recompile NVIDIA FleX (it is proprietary software).
I tried the following workaround and it worked locally (outside docker).

  • create libc_compat.c containing only the following lines (math.h is included so that powf is declared)
#include <math.h>
float __powf_finite(float x, float y) { return powf(x, y); }
  • and then link it into the binary in CMakeLists.txt
add_library(libc_compat ${ROOT}/bindings/libc_compat/libc_compat.c)
...
target_link_libraries(${EXAMPLE_BIN} PRIVATE ${ROOT}/lib/linux64/NvFlexExtReleaseCUDA_x64.a)
target_link_libraries(${EXAMPLE_BIN} PRIVATE libc_compat)
$ cmake -H. -Bbuild
$ make -j -C build

That is, I provided my own definition of __powf_finite and linked it in so that NvFlexExtReleaseCUDA can resolve the symbol.
It should work for you as well. I hope this helps.

info

  • g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
  • conda 4.5.11 Python3.7.0
  • Ubuntu20.04

@Skylion007
Author

@denkiwakame It would be really useful if we could detect this error by checking the libc version and automatically apply this fix. Would you be willing to look into opening a PR?
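A libc version check of the kind suggested here could look roughly like the sketch below. It assumes glibc (gnu_get_libc_version is a glibc-specific function) and assumes that 2.31 is the first release in which the _finite symbols can no longer be linked; the helper names needs_finite_shim and running_glibc_needs_finite_shim are hypothetical and do not exist in PyFleX or SoftGym.

```c
#include <stdio.h>
#include <gnu/libc-version.h>   /* glibc-specific: gnu_get_libc_version() */

/* Hypothetical helper: returns 1 if the given glibc version string is
 * 2.31 or newer (shim assumed to be needed), 0 if it is older, and -1 if
 * the string cannot be parsed as "major.minor". */
int needs_finite_shim(const char *version)
{
    int major = 0, minor = 0;
    if (version == NULL || sscanf(version, "%d.%d", &major, &minor) != 2)
        return -1;
    return (major > 2 || (major == 2 && minor >= 31)) ? 1 : 0;
}

/* Hypothetical helper: applies the check to the glibc this process runs on. */
int running_glibc_needs_finite_shim(void)
{
    return needs_finite_shim(gnu_get_libc_version());
}
```

This is a run-time check. A build system would more likely decide at configure time, for example by testing whether a small program referencing __powf_finite still links.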

@denkiwakame

denkiwakame commented Apr 5, 2022

@Skylion007

I don't mind creating a PR, but in my humble opinion this is not a "fix" but a "temporary workaround".

  • 1️⃣ I created a dummy library, compatible with the old libc, for NvFlexExtReleaseCUDA; its function simply falls back to powf instead of the original __powf_finite.
  • 2️⃣ In my understanding, we can properly fix the issue only if we recompile NVIDIA FleX without `-ffast-math` (which may cause a performance issue) https://bugzilla.redhat.com/show_bug.cgi?id=1803203
    • or recompile NVIDIA FleX against the latest libc
    • ... neither of which is possible for us, since NVIDIA open-sources only the demo code https://github.com/NVIDIAGameWorks/FleX
    • The problem is caused by neither SoftGym nor PyFleX, but by the precompiled NVIDIA FleX, which depends on an older libc and CUDA 9.
  • 3️⃣ It seems that the original authors only support Ubuntu 16.04 or 18.04 (in docker). We should not extend the supported platforms unless the maintainers are eager to do so, as it would put a bit too much on their plate.
    • (side note) As far as I tested locally, we don't even need a cuda-docker environment when compiling (all we need is libcudart9.1.a, statically linked into the Python binding alongside NvFleX).

Btw, have you resolved the problem? I only applied a simple workaround, so I would appreciate it if you found a better solution :D

@adla700

adla700 commented Sep 12, 2024

I had the same problem today while running rlvlmf:
ImportError: /home/adhula/PyFleX/bindings/build/pyflex.cpython-39-x86_64-linux-gnu.so: undefined symbol: __powf_finite

Did you find a simple shortcut solution for it?
