Is this version compiled with SSE4, AVX etc. #34
Comments
We do not currently compile this package from source. We repackage Google's wheels from PyPI, so you would have to ask them how they build it.
For anyone wondering: as of right now, the PyPI package is built without optimizations.
Thanks for the info. We have attempted to build this from source and likely will again. That said, we still need to settle on an acceptable set of CPU instructions that will run on a range of hardware. Unless TensorFlow has some way of determining what the hardware can support at runtime, this will likely mean not having every optimization enabled. There will at least be a recipe that one can tweak, and room for discussing how better to handle additional instructions in a reasonable way. xref: #12
I just ran into a problem relating to this. On one of my machines, Update: A snippet from running on another machine; seems that TensorFlow has some kind of CPU capability detection but it isn't actually wired up to anything yet:
Update 2: I'll also add that one of the big HPC clusters I use has a bunch of nodes that do not support AVX (I even wrote a tool to figure out which of my binaries were killing things), so the use of AVX here is a substantial pain point. On the other hand, I totally see why we want to use fancy CPU instructions if available. It's hard to see how to balance things without someone doing the difficult work of generating code that can choose the right implementation at runtime.
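To make the "choose the right implementation at runtime" idea concrete, here is a minimal sketch of the dispatch pattern in Python. It is only an illustration: `dot_generic`, `dot_avx_stand_in`, `KERNELS`, and `select_kernel` are hypothetical names, not anything TensorFlow provides, and the "AVX" kernel is a plain-Python stand-in for code that a real build would compile with AVX enabled.

```python
def dot_generic(a, b):
    """Baseline implementation that runs on any CPU."""
    return sum(x * y for x, y in zip(a, b))


def dot_avx_stand_in(a, b):
    """Stand-in for a kernel that would be compiled with AVX (hypothetical)."""
    return sum(x * y for x, y in zip(a, b))


# Candidates ordered from most to least specialized. Each entry pairs the
# CPU features a kernel needs with the kernel itself. The last entry needs
# nothing, so it always matches.
KERNELS = [
    (frozenset({"avx"}), dot_avx_stand_in),
    (frozenset(), dot_generic),
]


def select_kernel(cpu_flags):
    """Return the first kernel whose required features are all available."""
    available = set(cpu_flags)
    for required, kernel in KERNELS:
        if required <= available:
            return kernel
    raise RuntimeError("no suitable kernel found")


if __name__ == "__main__":
    # The flag sets here are made up for the example.
    print(select_kernel({"sse2"}).__name__)
    print(select_kernel({"sse2", "avx"}).__name__)
```

The selection would happen once at startup, so a machine without AVX never executes an AVX instruction. The hard part mentioned above is building and shipping several compiled variants of each kernel, which this sketch does not address.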
TensorFlow does not have dynamic code paths based on CPU capabilities. Whatever target microarchitecture is selected at build time is the minimum CPU required at run time. Starting with the 1.5.1 release, the wheels available on PyPI use AVX instructions. We repackage those wheels for the conda package, so it has the same AVX requirement. If you need conda packages which do not require AVX the conda packages in
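For anyone who wants to check a machine before installing, here is a rough sketch that reads the CPU feature flags on Linux. It assumes an x86 system where `/proc/cpuinfo` has a `flags` line. The helper names `parse_cpu_flags` and `missing_features` are made up for this example, and other platforms would need a different source for the flags.

```python
from pathlib import Path


def parse_cpu_flags(cpuinfo_text):
    """Return the feature flags found in /proc/cpuinfo-style text as a set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # On x86 Linux the feature list is on a line labeled "flags".
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(flags, required=("avx",)):
    """Return the required feature names that are absent from flags."""
    return [name for name in required if name not in flags]


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if not cpuinfo.exists():
        print("No /proc/cpuinfo here; cannot determine CPU features.")
    else:
        missing = missing_features(parse_cpu_flags(cpuinfo.read_text()))
        if missing:
            print("CPU lacks:", ", ".join(missing))
        else:
            print("CPU reports AVX support.")
```

If this reports that AVX is missing, a build that requires AVX would be expected to crash with an illegal instruction error on that machine.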
@jjhelmus Ah, good to know that the
But I can't tell which packages require AVX.
Hello,
When I install TensorFlow via "conda install tensorflow", running scripts with it displays several warnings about possible optimizations. These would speed up TensorFlow significantly, and I think they are supported by many modern CPUs.
Does this version have SSE etc. enabled?