Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cx_freeze pulling in wrong dynamic libraries #2261

Closed
Fox214 opened this issue Feb 15, 2024 · 31 comments
Closed

cx_freeze pulling in wrong dynamic libraries #2261

Fox214 opened this issue Feb 15, 2024 · 31 comments

Comments

@Fox214
Copy link

Fox214 commented Feb 15, 2024

Hi, I was debugging another build issue (still working on that) but discovered when turning off silent that some of my shared object files were coming from the wrong location.

Specifically: libcrypto.so.1.1 and libssl.so.1.1

Background:
Python is build on a network file system with --with-openssl and the Modules/Setup file is edited (OPENSSL= same path as clo configure option) additionally LDFLAGS=-Wl,-z,origin,-rpath="$$\ORIGIN" is passed to configure.
this puts these files in the lib directory and everything is linked together correctly.

At this point the entire python install is portable, and can be moved to any location and works. (and it is moved out of the build area and write bits are disabled)

When cx_freeze is used to build though its still pulling these files from the original location (and because they're the same everything is still working)

I debugged the problem at the bottom of freezer.py - I added the print statements for clarity:

def _default_bin_path_includes(self) -> list[str]:
        bin_path = [
            sysconfig.get_config_var("LIBDIR"),
            sysconfig.get_config_var("DESTLIB"),
        ]
        print("sys.path",sys.path)
        print("LIBDIR",sysconfig.get_config_var("LIBDIR"))
        print("DESTLIB",sysconfig.get_config_var("DESTLIB"))
        return self._validate_bin_path(bin_path)

the sysconfig vars are returning the original location of the install, sys.path is returning the proper (new) location, if the location is not moved I would expect these to match. Can sys.path (or some subset) at least be added (preferably ahead) to this? its validated anyways the fact that its possible to dynamically link to files outside our build area is an issue.

My sense is sysconfig.get_config_var is a subset of sys.path if python is in the original location and completely incorrect if it has been moved.

This is for linux (presumably all platforms)

@Fox214
Copy link
Author

Fox214 commented Feb 28, 2024

Any update on this? I will probably be rebuilding python and updating cx_freeze in March sometime and would prefer to not have to remember to update this myself.

@marcelotduarte
Copy link
Owner

marcelotduarte commented Feb 29, 2024

Due to the characteristics of your project and the way you compile Python, it ends up having differences with what cx_Freeze was designed for. cx_Freeze is designed to run in "compiled from sources" mode and in binary mode. I added this binary mode after I took over maintenance, and I use the manylinux project as base. You do something intermediate, so I think you would also have to do this intermediation for cx_Freeze.
I had already suggested this to you: #2010 (comment)
Here is more information: https://cx-freeze.readthedocs.io/en/stable/development/index.html#building-redistributable-binary-wheels

But first you can test an option that I recently added and I think it meets what you are asking:

Can sys.path (or some subset) at least be added (preferably ahead) to this?

https://cx-freeze--2253.org.readthedocs.build/en/2253/setup_script.html#cmdoption-arg-include_path

You can test the patch in the latest development build:
pip install --force --no-cache --pre --extra-index-url https://marcelotduarte.github.io/packages/ cx_Freeze
For conda-forge the command is:
conda install -y --no-channel-priority -S -c https://marcelotduarte.github.io/packages/conda cx_Freeze

@Fox214
Copy link
Author

Fox214 commented Mar 12, 2024

The new option appears to work except for the main libpython.3.12.so, with silent False:
copying /path/to/CPython/3.12.0/linux64/lib/libpython3.12.so.1.0 -> /path/to/lib-linux64/lib/libpython3.12.so.1.0

(I copied CPython to OddPYTHON to test, and every other reference in the log file was to OddPYTHON)

Also this version appears to fix a problem with 3.12 relative to the use of the opcode module. So thank you.

@Fox214
Copy link
Author

Fox214 commented Mar 12, 2024

FWIW, right after posting my last comment I, realized I just lazily created include_path by ','.join(sys.path) and /path/to/OddPYTHON/3.12.0/linux64/lib was not in it, but I added it and got the same result (it was first in the path).

@Fox214
Copy link
Author

Fox214 commented Mar 26, 2024

I ran into another problem with dev build:
6.15.6 (current prod):
copying /depotbld/RHEL7.0/gcc-9.5.0/lib64/libstdc++.so.6.0.28 -> /path/to/lib-linux64/lib/libstdc++.so.6.0.28
linking /path/to/lib-linux64/lib/libstdc++.so.6 -> libstdc++.so.6.0.28
copying /depotbld/RHEL7.0/gcc-9.5.0/lib64/libstdc++.so.6.0.28 -> /path/to/lib-linux64/lib/libstdc++.so.6.0.28
linking /path/to/lib-linux64/lib/libstdc++.so.6 -> libstdc++.so.6.0.28

dev build:
[delay] linking /path/to/lib-linux64/lib/libstdc++.so.6 -> ../../../../../depotbld/RHEL8.0/gcc-12.2.0/lib64/libstdc++.so.6.0.30
copying /depotbld/RHEL8.0/gcc-12.2.0/lib64/libstdc++.so.6.0.30 -> /path/to/lib-linux64/lib/libstdc++.so.6.0.30
linking /path/to/lib-linux64/lib/libstdc++.so.6 -> ../../../../../depotbld/RHEL8.0/gcc-12.2.0/lib64/libstdc++.so.6.0.30

Couple issues: /depotbld is a root directory and the number of ../ is not correct to make it relative, it should be a local link, its going to no where.

I can potentially debug this further if needed. (note this seems to happen whether I use "include_path" or not)

@Fox214
Copy link
Author

Fox214 commented Apr 3, 2024

Any updates on this? (either issue)

@Fox214
Copy link
Author

Fox214 commented Apr 4, 2024

FYI, I ended up combing the _pre_copy_hook and _post_freeze_hook methods from the released version (6.15.16 I think) in freezer.py (Freezer and LinuxFreezer classes) and that seems to fix the issue, I did not do any deep analysis.

Typically .so files are symlinked locally because a file like libstdc++.so.6.0.30 has a SONAME (readelf -d libstdc++.so.6.0.30 | grep SONAME) of libstdc++.so.6 which means files searching RPATH will look for libstdc++.so.6 not libstdc++.so.6.0.30, these are linked locally.

There may be a case here of links referencing links as well as this line: symlink = Path(os.path.relpath(real_source, source.parent)) I suspect is problematic because os.getcwd (where cx_freeze started, about 8 directories from root) is completely and totally different the origin location of the file (shown above which is 4 directories from root). in this case its linking the link to a relative version of the source file.

For simplicity:
/root/source/dir/libstdc++.so.6 -> libstdc++.so.6.0.30
/root/source/dir/libstdc++.so.6.0.30 (depth 3)

/path/to/my/cxFreeze/dir (this is what os.getcwd()reports - depth 5)
/path/to/my/cxFreeze/dir/lib-linux64/lib (here is linking libstdc++.so.6 > ../../../root/source/dir/libstdc++.so.6.0.30 - depth 7)

so the "relpath" is reporting a depth of 3 and using it in a depth 7 location. Although in either case its linking to outside lib-linux64 (which is where I want my stuff "frozen" to, so it will break if I move the location)

Unsure of the purpose of the change but If the goal here is to reduce the output size I would humbly suggest copy everything and then just create hardlinks for files which have the same checksum. (not sure how patchelf processing might affect that though). (this is unlikely to be the fastest way, but probably the most effective).

@marcelotduarte
Copy link
Owner

Sorry, I haven't had time to look into this yet. You can propose a patch if you want, or even part of the code you modified, just for reference.

@Fox214
Copy link
Author

Fox214 commented Apr 4, 2024

Is there a place I can put the freezer.py file I have (its the only one I changed)? I just fixed the linking issue, I also updated sys.path as I described in the initial post (it fixes the issue although you can of course not accept that change). I did notice the libpython3.12.so file seems to only get copied if I use "include_path" so I am thinking its not needed anyways. Maybe your option can suppress it? I am fine with using the option.

Otherwise I can describe/paste code from my changes here (its 4 functions, none are more than 10 lines) I don't think you want me to put all 1300ish lines here.

@marcelotduarte
Copy link
Owner

Check https://cx-freeze.readthedocs.io/en/stable/development/index.html, mainly
https://cx-freeze.readthedocs.io/en/stable/development/index.html#submitting-pull-requests
But a diff -u can be used to show the changes.
A change in sys.path must be made in your setup.py using include_path. I do not accept hardcoded changes.

@Fox214
Copy link
Author

Fox214 commented Apr 5, 2024

Sorry I am terrible with git, and I don't see anywhere to submit a pull request. (I don't have the repository cloned locally)

I do have a solution though: I am passing sys.path into bin_path_includes, this seems to work (although sometimes it still copies the original file, but I think if it were deleted it would copy from the new location due to the isfile() check)

so include_path seems to be redundant to bin_path_includes.

Otherwise I only needed to revert _pre_copy_hook() and _post_freeze_hook() methods in Freezer and LinuxFreezer classes in freezer.py (presumably the same would need to be done for DarwinFreezer and WinFreezer as well (although i cannot test those platforms) to what is in the current released code. 6.15.16 (please do not release this version of dev code it breaks badly, although a 3.12 supported release would be good)

diff -u freeze.py.2309 (dev) freezer.py (myversion)

image

image

image

image

Screen shots of tkdiff posted much better than the output of diff -u, the code is just a paste of the released version.

@marcelotduarte
Copy link
Owner

I prefer diff -u that generates like this: https://github.com/marcelotduarte/cx_Freeze/pull/2322/files

I don't have the repository cloned locally

You don't need to clone locally, you can clone directly on GitHub using fork button.
Documentation: Collaborating with pull requests.

@Fox214
Copy link
Author

Fox214 commented Apr 8, 2024

Ok I think this is at least viewable, I although I forked under my user ID not sure if I did that right or not:
Fox214/cx_Freeze@main...Fox214-patch-1

@marcelotduarte
Copy link
Owner

I think I understand. PR #2138 with macOS support may have broken Linux support. Interesting that my tests continue to pass. If I reverse it like you did it breaks on macOS. I will try to revert to Linux only.

@Fox214
Copy link
Author

Fox214 commented Apr 9, 2024

As I said here the 'new' code I think is one of 2 issues (or possibly both):

  1. symlinks pointing to symlinks
  2. relpath and os.getcwd seem to be mismatched

Perhaps I can debug some more tomorrow. In either case ../ extending beyond the root directory is bad though. (the code becomes non-portable)

@Fox214
Copy link
Author

Fox214 commented Apr 9, 2024

Ok I got a minor (seemingly) fix that works with the new code (for me at least).

I think the root of the problem is I am using a version of libstdc++.so than is stored in /lib64 on my system. This new version needs to be referenced in the rpath of everything copied by Cx freeze.

I suspect to replicate my issue if you have access to a newer version of libstdc++.so you can:
LD_LIBRARY_PATH=/path/to/newlib;

This will cause the code in freezer.py to now copy that file rather than skip it when its in /lib64.

Then there was 2 issues:

  1. _pre_copy_hook has: symlink = real_source.relative_to(source.parent), this would fail because /path/to/newlib is an independent directory structure, but then the except clause just forced it (but that is also wrong), so I returned the original inputs (as this more closely mimiced the original behavior) which led to:
  2. It only copied it once even though many packages link to this file, so I sort of stumbled into another portion of code and discovered its using relative paths from the wrong location (similar to above) so if it fails the same check again, it won't get copied (added to excludes) but I flip the source target so we get the path to the relative version .

There's still a bit of mystery to me as I was not getting quite what I expected, however its working and the build is dramatically smaller (my wrapper counts the files and uses du for the size):

Orig: INFO: Output library is 4935 files and 902 MB
with my fixes: INFO: Output library is 4911 files and 467 MB

Note: might be good to show what patchelf is doing if silent is off. (either in fix_rpath or parse.py run_rpath)

Fox214/cx_Freeze@main...Fox214-patch-2

@marcelotduarte
Copy link
Owner

marcelotduarte commented Apr 9, 2024

Hey! Based on what you had done, I made this modification:
97a74a2

I'll try to have time to do more tests tomorrow. But I hope you can test it. This modification does not break macos.
The one you did now, I think it breaks macos using pyqt6. I would have to test it.

EDIT:
You can test the patch in the latest development build (dev100):
pip install --force --no-cache --pre --extra-index-url https://marcelotduarte.github.io/packages/ cx_Freeze

@Fox214
Copy link
Author

Fox214 commented Apr 10, 2024

That did not work. I'm getting the same error I got before I added the 2nd change (redefining relative in the _post_copy_hook) in patch 2.

Some files are not getting the correct libstdc++.so.6 version (they're pulling from /lib64, which is too old, need the version copied).

it did appear to copy everything, but its not linked correctly. In fact upon closer inspection the original problem remains. its copying libstdc++.so.6.0.30 but libstdc++.so.6 is not being linked to it (its a relative path, but it goes back to root at the wrong depth and then tries to link outside the build area) I'll see if I can come up with a simple version to test.

@Fox214
Copy link
Author

Fox214 commented Apr 10, 2024

I don't have a simple case that fails but your pandas case: https://github.com/marcelotduarte/cx_Freeze/tree/main/samples/pandas

If I run without LD_LIBRARY_PATH set, it does not pull in the libstdc++ file (it doesn't need to).
but if I run with LD_LIBRARY_PATH=/path/to/somedir/withlibstd:/path/to/python/3.12.1/lib set it will create the problem.

I think we do this to make sure some files are consistent across various OS (types/versions)

"withlibstd" needs to have libstdc++.so.6 symlink to libstdc++.so.6.30 then in ./lib-linux64/lib/pandas/_libs/window/ you will see libstdc++.so.6 link to outside the build (probably dead, but if /path/to/somedir/withlibstd is at teh same depth as the file after its copied it may work)

Whether the run fails or not will depend on weather the code needs anything from the newer version or perhaps some other combinations of factors (I can't do it in a simple case)

@Fox214
Copy link
Author

Fox214 commented Apr 10, 2024

I was able to successfully modify the pandas case and get it to fail. I needed to add a simple pybind case to it as well. but upon further testing I discovered I can only reproduce the problem if I build the pybind piece with a newer version of g++ than is installed on the host in which I am running. It seems that it will not use the newer symbols in libstdc++ if the version of g++ is older (this is the only explanation I can think of) in both cases the link is broken (which is the problem) but in the case with the older g++ the only libraries used are in /lib64/libstdc++.so.6 so it "works" (in quotes because its not linking to the file it should, it just has a working version it can fall back to).

So if you have multiple versions of gcc available, I think I can give you my test case and describe how to reproduce the problem.

Likewise if you just want the case to see the broken link that should happen regardless, although it will likely work if you just run it, but working because it fell back to the wrong file, (or perhaps coincidentally ends up with the correct number of ../ in the path) is not good.

Maybe if you have cx freeze exit if it creates a dead link?

@Fox214
Copy link
Author

Fox214 commented Apr 10, 2024

Ok I think this is the most concise way to reproduce the issue (or catch it):

  1. modify freezer.py (Freezer class line ~400):
   def _post_freeze_hook(self) -> None:
        """Post-Freeze work (can be overridden)."""
        for target, symlink, symlink_is_directory in self._symlinks:
            if self.silent < 1:
                print(f"linking {target} -> {symlink}")
            if not target.exists():
                target.symlink_to(symlink, symlink_is_directory)
            target_file = target.parent / symlink
            if not target_file.exists():
                print(f"ERROR: {target_file} does not exist!")
                sys.exit(1)

I just added the check to make sure the symlink exists (last 4 lines above), (it could error out sooner if done at creation, this could potentially be an option too)

I just ran the pandas sample case as is to test, but I create a directory:
myldpath

I put 1 file in it (libstdc++.so.6.0.30) note the version of this file won't matter, you can just copy it off /lib64.

Symlink libstdc++.so.6 to libstdc++.so.6.0.30

Now symlink a new directory "linkldpath" to "myldpath"
export LD_LIBRARY_PATH=linkldpath

I it won't matter what the names or relative/abs. paths of myldpath or linkldpath are cx freeze should now fail with the new error I added above:
linking /path/to/test_cxLink/lib-linux64/lib/kiwisolver/libstdc++.so.6 -> ../myldpath/libstdc++.so.6.0.30
ERROR: /path/to/test_cxLink/lib-linux64/lib/kiwisolver/../myldpath/libstdc++.so.6.0.30 does not exist!

(its possible for the correct number of ../ to show up for some placements but unlikely for all, and may not work for any)

Upon staring at that I think the ultimate source of my problem, for the most concise fix:
line 240 in the code you gave me last night:
symlink = Path(os.path.relpath(real_source, source.parent))
to:
symlink = Path(os.path.relpath(real_source, source.parent.resolve()))

I did not even realize the source of the problem is LD_LIBRARY_PATH directory was pointing to a link itself. I would recommend looking at patch2 anyways just because of the massive file reduction. I don't have any pyqt6 projects but I do have a pyqt5 one. I can test anything you need though. Just resolving that seems to fix it, although to catch it you would need to have a case with the link dir and the error message, since even with bad links things can still work for various reasons.

@Fox214
Copy link
Author

Fox214 commented Apr 15, 2024

Are you ok with adding the extra .resolve()? I think that resolves it (no pun intended)

@marcelotduarte
Copy link
Owner

I can't keep up with you on this thread, but I'll try to point out a few things...

so include_path seems to be redundant to bin_path_includes.

They can complement each other. The include_path is used to search for modules*, packages, and dependencies. bin_path_includes is used for dependencies only.

  • .py, .pyd (on Windows), and .cpython-*.so (on Linux) modules.

(please do not release this version of dev code it breaks badly, although a 3.12 supported release would be good)

I'm holding off on the release for this, but I think I found a few other things in the meantime to fix.

Note: might be good to show what patchelf is doing if silent is off. (either in fix_rpath or parse.py run_rpath)

Noted.

I don't have a simple case that fails but your pandas case: https://github.com/marcelotduarte/cx_Freeze/tree/main/samples/pandas

I ended up discovering that this example is failing in Python 3.12, I already have a solution, but it's strange because it's not detecting some numpy.core modules and in Python 3.11 it works, it detects it. I even made a PR #2336, because it covers more of the code.

Are you ok with adding the extra .resolve()? I think that resolves it (no pun intended)

If it resolves, why not? ;-)

@marcelotduarte
Copy link
Owner

Please test the patch in the latest development build (dev108):
pip install --force --no-cache --pre --extra-index-url https://marcelotduarte.github.io/packages/ cx_Freeze

@marcelotduarte
Copy link
Owner

Note: might be good to show what patchelf is doing if silent is off. (either in fix_rpath or parse.py run_rpath)

PR #2340

Available in dev110

@Fox214
Copy link
Author

Fox214 commented Apr 19, 2024

I did not get a version with the .resolve added when I used the pip command above.

We're sticking with 3.11 for now, so no urgency on this, but I can keep testing for you.

@marcelotduarte
Copy link
Owner

Release 7.0.0 is out!
Documentation

@Fox214
Copy link
Author

Fox214 commented Apr 22, 2024

version 7.0.0 is good, thank you so much

@Fox214 Fox214 closed this as completed Apr 22, 2024
@Fox214
Copy link
Author

Fox214 commented Apr 22, 2024

Just out of curiosity were you able to replicate the issue with any of your testcases? I think you just need to symlink a directory

@marcelotduarte
Copy link
Owner

No. I cannot replicate it. I just made 3 PR based on our conversations. #2340, #2342 and #2345
Please test if this issue should stay closed.
I just released the version because some people need some features under testing and also python 3.12 support.

@Fox214
Copy link
Author

Fox214 commented Apr 23, 2024

No, its good it can stay closed. If you were able to replicate it would be good to have in regression to avoid seeing in the future, but its functional now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants