Fix library duplication in tensorflow v.2.3 and avoid packaging unnecessary files #50

Merged (5 commits) on Sep 25, 2020

@rokm rokm commented Sep 23, 2020

This is a follow-up to the previous tensorflow hook cleanup:

  • Fix #49 (Unable to package Tensorflow 2.3) by explicitly excluding `tensorflow.python._pywrap_tensorflow_internal`.
  • Avoid packaging tensorflow's development headers and XLA AOT runtime sources, and avoid duplicating the `libtensorflow_framework` shared library (the unused copy picked up via `datas` ends up in the `tensorflow` folder, while the one picked up by dependency analysis ends up in the program root).

On some tensorflow versions (2.3.0 thus far), the
_pywrap_tensorflow_internal extension module ends up duplicated.
We pick it up as both an extension module (placed in
`tensorflow/python` directory) and as a shared library (placed in
program's root directory). This increases the program size, and
also prevents the program from running under macOS.

In the problematic versions, the extension module seems to be
picked up as `tensorflow.python._pywrap_tensorflow_internal`,
while in others it is picked up as just `_pywrap_tensorflow_internal`.
So we try to work around the duplication by removing the problematic
entry from `hiddenimports` and adding it to `excludedimports`.

Fixes pyinstaller#49.

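The workaround described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `split_hidden_imports` and `PROBLEM_MODULE` are hypothetical names, and the input list is made up. Only the module name and the hook attributes `hiddenimports` and `excludedimports` come from the text above.

```python
# Hypothetical helper (not the hook's actual code): move one problematic
# module name from the hidden-import list to the excluded-import list.
PROBLEM_MODULE = "tensorflow.python._pywrap_tensorflow_internal"


def split_hidden_imports(hiddenimports, problem_module=PROBLEM_MODULE):
    """Return (kept, excluded) lists for use in a PyInstaller hook."""
    kept = [name for name in hiddenimports if name != problem_module]
    # Exclude the module only if it was actually collected under this name;
    # on other tensorflow versions the entry is absent and nothing changes.
    excluded = [problem_module] if len(kept) != len(hiddenimports) else []
    return kept, excluded


# Example with a made-up list of collected submodules.
collected = [
    "tensorflow",
    "tensorflow.python",
    "tensorflow.python._pywrap_tensorflow_internal",
]
hiddenimports, excludedimports = split_hidden_imports(collected)
```

In a real hook the input list would come from whatever submodule collection the hook already performs; the sketch only shows the remove-then-exclude step.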
Avoid collecting tensorflow's development headers in `include`
directory and XLA AOT runtime sources (`xla_aot_runtime_src`).

Also prevent `libtensorflow_framework` shared library from being
picked up as data to prevent its duplication (it should be
correctly picked up as a shared library by dependency scanner).
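One way to picture the data filtering described above is the sketch below. It assumes `datas` is a list of `(source_path, destination_dir)` tuples, as PyInstaller hooks use; the helper names `is_unwanted_data` and `filter_datas`, the example paths, and the prefix match on the library file name are assumptions made for illustration, not the PR's actual implementation.

```python
# Hypothetical filter over hook `datas` entries of the form (src, dest).
def is_unwanted_data(src, dest):
    """Return True for entries the commit message says should be skipped."""
    dest_parts = dest.replace("\\", "/").strip("/").split("/")
    # Development headers live under tensorflow/include.
    if dest_parts[:2] == ["tensorflow", "include"]:
        return True
    # XLA AOT runtime sources.
    if "xla_aot_runtime_src" in dest_parts:
        return True
    # The shared library should come from dependency analysis, not datas.
    # Matching on the file name prefix is an assumption of this sketch.
    filename = src.replace("\\", "/").rsplit("/", 1)[-1]
    return filename.startswith("libtensorflow_framework")


def filter_datas(datas):
    """Keep only the entries that are not flagged by is_unwanted_data."""
    return [(src, dest) for src, dest in datas if not is_unwanted_data(src, dest)]


# Example with made-up paths.
example_datas = [
    ("/site/tensorflow/include/tensorflow/c/c_api.h", "tensorflow/include/tensorflow/c"),
    ("/site/tensorflow/xla_aot_runtime_src/rt.cc", "tensorflow/xla_aot_runtime_src"),
    ("/site/tensorflow/libtensorflow_framework.so.2", "tensorflow"),
    ("/site/tensorflow/python/keras/data.json", "tensorflow/python/keras"),
]
filtered = filter_datas(example_datas)
```

With this filter only the last example entry survives, so the library is no longer copied into the `tensorflow` folder as data.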
@rokm rokm force-pushed the tensorflow-duplicate-library branch from 79cb40d to 70ae5e9 Compare September 23, 2020 10:22
@bwoodsend

Is there any sense in adding the test code from #49 to the test suite? Or could we already reproduce with our current ones?

@rokm
rokm commented Sep 23, 2020

The example given in #49 is actually (almost) identical to one of our tests (training on the MNIST dataset).

But the issue on macOS can be reproduced by just importing tensorflow.

@rokm rokm marked this pull request as ready for review September 24, 2020 08:52
@rokm rokm requested review from a team and bwoodsend and removed request for a team September 24, 2020 08:52
@bwoodsend bwoodsend left a comment

Really appreciate you taking the time to explain what you're doing in the comments.

@Legorooj Legorooj merged commit fb16f21 into pyinstaller:master Sep 25, 2020
@rokm rokm deleted the tensorflow-duplicate-library branch October 14, 2020 17:30