
Warm up layers models when they are loaded. #224

Closed
nsthorat opened this issue Apr 24, 2018 · 8 comments
@nsthorat (Contributor)

If we pass a tensor of zeros through the model when it loads, it will compile all the shader programs and upload the weights to the GPU, making the first real inference much faster.

This should be an easy win.

@bileschi (Contributor)

Can you explain a little more? Something like calling `model.predict` internally as the last step in `model.compile`?

@nsthorat (Contributor, Author)

Actually, I was thinking `model.predict(tf.zeros(inputLayer.shape));` as the last step of `tf.loadModel`.
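The proposal above might look roughly like the sketch below. The `resolveWarmupShape` helper is hypothetical (not part of TF.js), and the commented-out lines assume TF.js-style APIs (`tf.loadModel`, `model.predict`, `tf.zeros`); the wiring is illustrative, not the actual implementation. Since an input's batch dimension is often unknown (`null`), the helper resolves it to a concrete size first:

```javascript
// Hypothetical helper: resolve unknown (null) dimensions in an input shape
// to a concrete batch size so a real zeros tensor can be built.
function resolveWarmupShape(inputShape, batchSize = 1) {
  return inputShape.map((dim) => (dim == null ? batchSize : dim));
}

// Sketch of the proposed warm-up step after loading (illustrative only):
//   const model = await tf.loadModel(url);
//   const shape = resolveWarmupShape(model.inputs[0].shape);
//   model.predict(tf.zeros(shape)).dispose(); // compiles shaders, uploads weights

console.log(resolveWarmupShape([null, 100, 100])); // [1, 100, 100]
```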

@bileschi (Contributor)

Seems reasonable. Quick question: in general a model might have input layer(s) with incomplete shapes, e.g. `[?, 100, 100]`, where the unknown dimension is typically the batch size. If we guess the batch size and are incorrect, does the warm-up still help, or does the GPU memory need to be re-allocated because of the shape change?

@dsmilkov (Contributor)

Good question. In that case, we should warm up with the most common inference case, batchSize = 1. If we happen to be wrong, some of the GPU programs will be re-compiled, but that's fine.

@nsthorat (Contributor, Author)

We should warm up with batchSize = 1 since it will be faster. Using batchSize > 1 is unlikely to change which programs are compiled or which weights are uploaded, and most users will be running inference with a batchSize of 1 anyway.

@nsthorat nsthorat added the P1 label Oct 24, 2018
@nsthorat nsthorat added P2 and removed P1 labels Oct 24, 2018
@caisq caisq added type:feature New feature or request type:performance labels Feb 12, 2019
nsthorat pushed a commit that referenced this issue Aug 19, 2019
* 0.3.x: Update the publish-npm script to allow publishing from the release branch. (#203)

DEV

* Upgrade 0.3.x to 0.15.3 (#210)



* Fix win GPU packaging. (#208) (#211)

Turns out that the Windows GPU builds for TensorFlow 1.12 lack the expected directory structure and the eager headers. A bug has been filed with core TF, but we should bake in some fixes for this.

This PR simply refactors the downloading logic into a new file. I'd like to use this logic in the node-gles package as well (maybe worth releasing it as a stand-alone package in the future).

After the refactoring, I check the directory structure on Windows. If the folder structure is missing but the required tensorflow.dll exists, I re-create the directory structure and move/download the proper header files.

The screenshot below shows the contents of TF 1.12 Windows GPU:
![capture](https://user-images.githubusercontent.com/306276/53048799-719f4b80-344a-11e9-9004-3eef2446a246.PNG)


* Bump 0.3.1

* Add TensorBoard callback for model training: tf.node.tensorBoard() (#202) (#213)

FEATURE

See example screenshot:
![image](https://user-images.githubusercontent.com/16824702/52491877-19d52a80-2b96-11e9-8c24-5a403c2450d3.png)

Fixes #686

* [0.3.x] Upgrade nyc package to fix lodash security issue. (#218) (#219)

Bump for 0.3.x so we can get a security release spun.

https://github.com/tensorflow/tfjs-node/network/alert/yarn.lock/lodash/open


* Bump to 0.3.2

* Upgrade TS libraries and change binding from typings file to plain
TypeScript definition.

* Upgrade TS dependencies

* save

* Fix deps-stage

* save

* Revert TS changes and keep binary staging fixes.

* Don't use a definition file for the bindings.

This causes many issues and doesn't help with redistribution. Exporting a local definition file on top of what else is exported doesn't appear to be a commonly supported TypeScript use case. This fix simply moves the definitions into a normal TypeScript file.

This should fix: #1092

* save

* Add typescript integration project.

* save

* Add license
@rikkitook

A little side note: we noticed that on full TensorFlow with GPU enabled, consecutive calls with different batch sizes lead to the session re-warming. For example, the first call with batchSize = 1 is slow, and the first call with batchSize = 5 on the same session is also slow.
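If inference is expected at several batch sizes, the observation above suggests each size may trigger its own (re)compilation, so each could be warmed up separately. A minimal sketch, assuming the same TF.js-style APIs as above; `warmupShapes` is a hypothetical helper, not part of any library:

```javascript
// Hypothetical helper: build one concrete input shape per expected batch
// size, replacing the unknown (null) dimension with each size in turn.
function warmupShapes(inputShape, batchSizes) {
  return batchSizes.map((b) =>
    inputShape.map((dim) => (dim == null ? b : dim))
  );
}

// Illustrative use with a loaded model (TF.js-style calls, not executed here):
//   for (const shape of warmupShapes(model.inputs[0].shape, [1, 5])) {
//     model.predict(tf.zeros(shape)).dispose();
//   }

console.log(warmupShapes([null, 100], [1, 5])); // [[1, 100], [5, 100]]
```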

@gaikwadrahul8 (Contributor)

Hi, @nsthorat

Thank you for opening this issue for tracking purposes. Since this issue has been open for a long time, the code/debug information in it may no longer be relevant to the current state of the code base.

The TensorFlow.js team is constantly improving the framework by fixing bugs and adding new features. We suggest you try the latest TensorFlow.js version with the latest compatible hardware configuration, which could potentially resolve the issue. We can keep the issue open if it is still relevant; please confirm whether we should.

Thank you for your support and cooperation.

@gaikwadrahul8 (Contributor)

Hi, @nsthorat

We have not received any confirmation from you, so we are closing this issue now. If the issue is still relevant, please let us know and we'll re-open it, or feel free to create a new issue after trying the latest version of TensorFlow.js. Thank you!
