
Provided weight data has no target variable: batch_normalization #755

Closed
rajeev-samalkha opened this issue Oct 3, 2018 · 28 comments

@rajeev-samalkha

To get help from the community, check out our Google group.

TensorFlow.js version

0.13

Browser version

Chrome Version 69.0.3497.100

Describe the problem or feature request

I converted a Keras model to tfjs using the Python converter utility with no errors. But when I try to load the model in tfjs, I get the following error:

tfjs@0.13.0:2 Uncaught (in promise) Error: Provided weight data has no target variable: batch_normalization_1_2/gamma
    at new t (tfjs@0.13.0:2)
    at loadWeightsFromNamedTensorMap (tfjs@0.13.0:2)
    at t.loadWeights (tfjs@0.13.0:2)
    at tfjs@0.13.0:2
    at tfjs@0.13.0:2
    at Object.next (tfjs@0.13.0:2)
    at i (tfjs@0.13.0:2)

Code to reproduce the bug / link to feature request

Running it on a local machine.
model = await tf.loadModel(<path_to_model.json>)

@bileschi
Contributor

bileschi commented Oct 4, 2018

Hi
Can you please share your (original) model and the commands used to convert & load?
Thanks

@davidsoergel
Member

If that weight is an extra one that is lying around for some reason but is not actually needed, you can call tf.loadModel(..., strict=false) to disable the error.

Of course, if the weight is needed, doing this would leave you with a broken model. In that case, as @bileschi said, we'd need to see the original Keras model to determine whether there is a conversion bug.

@davidsoergel davidsoergel added question usage question or debugging support comp:layers labels Oct 5, 2018
@bileschi
Contributor

bileschi commented Oct 9, 2018

@rajeev-samalkha is this issue resolved? If so, feel free to close. Thank you.

@rajeev-samalkha
Author

Folks, sorry for delayed response.

When I take out the BatchNorm layer, it seems to work fine (I had to retrain the model). Is there any difference between batch norm in tf.keras and tfjs? I have used the tensorflowjs.converters utility. The model itself is quite simple: a few Conv2D layers interspersed with MaxPool/Dropout.

Regards
Rajeev

@bileschi
Contributor

TFJS BatchNorm maintains up to 4 weights:
gamma, beta, movingMean, and movingVariance, which match those in keras-team/keras

https://github.com/keras-team/keras/blob/master/keras/layers/normalization.py#L93

I wonder if tf.keras is saving additional tensors, possibly for optimization or related to the momentum for training?

Looking at the tensorflow/keras implementation, I see that it is somewhat more complex, including a 'fused' batch-norm implementation that reaches into C.
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/layers/normalization.py

Can you list out the weights in your model from the Python code?

@rajeev-samalkha
Author

Do you need the weights for all the layers or just the BatchNorm one?

@bileschi
Contributor

bileschi commented Oct 10, 2018 via email

@Raza25

Raza25 commented Oct 24, 2018

I get the same error while loading a Keras-converted model in the browser. The following error pops up:

[screenshot: embeddings_issue]

As for the model weights:

[screenshot: model_issue_embedding]

Note: Some of my converted models with embeddings in them work fine in the browser, but sometimes this error shows up. @bileschi @davidsoergel kindly post a proper fix for this issue.

@ashsharma28

Same error here. Someone please help.

tfjs@0.13.0:2 Uncaught (in promise) Error: Provided weight data has no target variable: conv2d/kernel
at new t (tfjs@0.13.0:2)
at loadWeightsFromNamedTensorMap (tfjs@0.13.0:2)
at t.loadWeights (tfjs@0.13.0:2)
at tfjs@0.13.0:2
at tfjs@0.13.0:2
at Object.next (tfjs@0.13.0:2)
at i (tfjs@0.13.0:2)

[screenshots]

@rajeev-tbrew

I hit the same error without batch norm. Appreciate your help.

[screenshot of the error]

@rajeev-tbrew

Folks

I think I found why we are getting this error. The error can happen for any layer. Steps to reproduce:

  1. Load the model in TensorFlow using tf.keras.
  2. Load the same model again (i.e., load the model more than once).
  3. Use tfjs.converters to convert the Keras model, and you get this error.

It seems every layer name changes in the model.json file (it will differ from the name shown by model.summary()). For example, one of the layers in my model was 'conv2d_6', but it got named 'conv2d_6_2' when I loaded the model twice. The actual weights (presumably in the shard file) still expect 'conv2d_6' in my case.

So until we get a fix, please make sure you load your model only once before doing the tfjs conversion. Hope this helps.
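The double-load behavior described above can be simulated without TensorFlow. The sketch below is a hypothetical illustration (the NameRegistry class is invented for this example, not the actual Keras implementation) of the kind of name uniquification that produces the mismatched suffix:

```python
# Hypothetical sketch of name uniquification; NameRegistry is an
# illustration, not the real Keras/TensorFlow implementation.
from collections import defaultdict

class NameRegistry:
    """Hands out unique names by appending _2, _3, ... on collisions."""
    def __init__(self):
        self._counts = defaultdict(int)

    def unique(self, base):
        self._counts[base] += 1
        n = self._counts[base]
        return base if n == 1 else f"{base}_{n}"

registry = NameRegistry()
print(registry.unique("conv2d_6"))  # first load  -> conv2d_6
print(registry.unique("conv2d_6"))  # second load -> conv2d_6_2
```

The second load's layer gets the suffixed name, while the saved weight shards still reference the original name, hence the mismatch at load time.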

@hardikmodi1

I have tried that too, loading the model exactly once, but the same error still prevails.

@stephenrt42

stephenrt42 commented Nov 24, 2018

Not sure if this helps, but when I converted my h5 model using the Python code

import tensorflowjs as tfjs
from keras.models import load_model

modelk = load_model('./input/model.h5')
tfjs.converters.save_keras_model(modelk, './output/')

I would receive the following error:

errors.ts:48 Uncaught (in promise) Error: Provided weight data has no target variable: dense_1_7/kernel
    at new t (errors.ts:48)
    at loadWeightsFromNamedTensorMap (container.ts:190)
    at t.loadWeights (container.ts:759)
    at models.ts:285
    at index.ts:79
    at Object.next (index.ts:79)
    at i (index.ts:79)

But if I convert the h5 model using the tensorflowjs_converter command line tool my tfjs json model file will load without any problems.
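One way to get the command-line tool's behavior from inside a training script is to shell out to the converter in a fresh subprocess, so no stale variables from the current Python session are involved. A minimal sketch, with placeholder paths and a helper function invented for this example:

```python
# Sketch: run tensorflowjs_converter in a fresh subprocess so the conversion
# does not see any stale variables from the current Python session.
# build_converter_cmd and the paths are illustrative placeholders.
import subprocess

def build_converter_cmd(h5_path, out_dir):
    # Same flags as the CLI invocation shown elsewhere in this thread.
    return ["tensorflowjs_converter", "--input_format", "keras", h5_path, out_dir]

cmd = build_converter_cmd("./input/model.h5", "./output/")
# subprocess.run(cmd, check=True)  # uncomment to actually run the converter
print(" ".join(cmd))
```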

@desenmeng

model summary:

[screenshot of the model summary]

const tf = require('@tensorflow/tfjs');
require('@tensorflow/tfjs-node');
const path = require('path');

async function load(){
  await tf.loadModel(`file://${path.join('xxx', 'model.json')}`);
}

load()

error

(node:72812) UnhandledPromiseRejectionWarning: Error: Provided weight data has no target variable: conv2d_10_1/kernel

@caisq
Contributor

caisq commented Dec 17, 2018

@demohi Can you try setting the strict argument to false, i.e.,

  await tf.loadModel(`file://${path.join('xxx', 'model.json')}`, false);

Also, this might be a bug in loadModel. Can you provide the weight and JSON file to us so we may try reproducing this issue on our end? Thanks.

@desenmeng

@caisq Thank you for your reply. It works.

You can convert this Keras model to reproduce the bug.

@caisq
Contributor

caisq commented Dec 22, 2018

@demohi I'm looking into this issue now. It seems the cause has to do with the following facts:

  • The model has a layer with the name conv2d_10, however
  • One of the weights for that layer is named conv2d_10_1/kernel in the model.h5 file, so there is an extra _1 suffix.

This is the reason the weight loading fails and you get the error. Can you tell me a little about how the model is saved on the Python side? Is it possible that there are multiple instances of the model existing in Python memory?

I think we need to fix this issue regardless of what happens on the Python side, as Python Keras / TensorFlow can load this sort of model correctly. But I just want to understand the conditions under which this kind of name mismatches happen. Thanks.
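The failure mode can be sketched in a few lines of plain Python: a strict loader looks each saved weight name up in the model's variable map and raises when the (suffixed) name is absent. This is only an illustration of the behavior, not the actual tfjs source:

```python
# Illustration only: a strict weight loader in the spirit of tfjs's
# loadWeightsFromNamedTensorMap. Not the actual tfjs implementation.
def load_weights(saved_weights, model_variables, strict=True):
    for name, data in saved_weights.items():
        if name not in model_variables:
            if strict:
                raise ValueError(
                    "Provided weight data has no target variable: " + name)
            continue  # strict=False silently skips orphan weights
        model_variables[name] = data
    return model_variables

variables = {"conv2d_10/kernel": None}
saved = {"conv2d_10_1/kernel": [1.0]}  # extra "_1" suffix from a double load
load_weights(saved, variables, strict=False)  # skips the orphan, no error
# load_weights(saved, variables)  # strict=True would raise ValueError
```

With strict loading the suffixed name never matches the model's variable, which is exactly the error in this thread; with strict=False the orphan weight is skipped, which is only safe if the model does not actually need it.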

@desenmeng

@caisq

I use the colab to train this model with keras.

// keras model
model.save('xxx.h5')

@caisq
Contributor

caisq commented Dec 22, 2018

@demohi At the risk of asking too much, I wonder whether you could try running the same code from a Python file (or reset the state of the Colab kernel and run the code from scratch, making sure that each code block is run only once). I expect the name mismatch to disappear in those cases.

Again, don't feel obliged to try that. But if you do have time to try it and let me know, it would be wonderful.

We'll work on a fix in the meantime.

@ekchatzi-pointr

I am facing the same issue. It appears to happen when I run the tensorflowjs_converter command (with os.system) while the model is loaded from the same input Keras .h5 file. If I run tensorflowjs_converter separately from the shell after the Python program finishes, it works fine.

@caisq caisq assigned davidsoergel and unassigned caisq Feb 12, 2019
@caisq caisq added the P2 label Feb 12, 2019
@shafay07

Same error with

Error: Provided weight data has no target variable: Conv1_1/kernel

Could someone fix it?
I am converting mobilenet_v2 to TensorFlow.js for a browser classification task.

@nsthorat
Contributor

nsthorat commented Feb 25, 2019 via email

@shafay07

@demohi At the risk of asking too much, I wonder whether you could try running the same code from a Python file (or reset the state of the Colab kernel and run the code from scratch, making sure that each code block is run only once). I expect the name mismatch to disappear in those cases.

Again, don't feel obliged to try that. But if you do have time to try it and let me know, it would be wonderful.

We'll work on a fix in the meantime.

This helped me solve the problem. I had to run my Google Colab notebook again after clearing the runtime, making sure I executed each code block once, and then I converted the model using

tensorflowjs_converter --input_format keras ./my_model.h5 ./my_model_as_tfjs

It is working perfectly fine in the browser now :)

@jamesmf

jamesmf commented Mar 13, 2019

I was experiencing this in the following scenario:

  • train a model using model.fit(..., callbacks=[ModelCheckpoint])
  • load the best model (not just the model weights) using model = load_model(ckpt_path)
  • convert and save using tfjs.converters.save_keras_model

This might be obvious, but the issue was that the call to load_model was creating a whole new set of layers without removing the old tf variables. Keras was showing the proper layer.name, but I was still seeing the mismatch.

The underlying tf.Variable objects had name collisions with the first model, and therefore got a suffix of _1 (like char_embedding_1/embeddings:0 instead of char_embedding/embeddings:0). You can see these names with something like

for layer in model.layers:
    print(layer.weights)

To solve my version of the issue (where a copy of the same model existed at some point and I loaded a new one), you can reset the tf session entirely before loading:

import keras.backend as K
...
model.fit(data, callbacks=[...])
K.clear_session()  # this resets the session containing the stale, not-best version of the model
model = load_model(ckpt_path)
tfjs.converters.save_keras_model(model, out_dir)

@oakkas

oakkas commented May 2, 2019

The way I solved the same problem (provided weight data has no target variable: conv1_1/kernel) was by clearing all output and cache in my Jupyter notebook, loading the model (model = load_model('./tf_files/keras/modelKeras2.h5')), and converting with tfjs (tfjs.converters.save_keras_model(model, './tfjsModelConverted/model6')).

Hope it helps...

@shashwatsahay

shashwatsahay commented May 6, 2019

I was experiencing this in the following scenario:

  • train a model using model.fit(..., callbacks=[ModelCheckpoint])
  • load the best model (not just the model weights) using model = load_model(ckpt_path)
  • convert and save using tfjs.converters.save_keras_model

This might be obvious, but the issue was that the call to load_model was creating a whole new set of layers without removing the old tf variables. Keras was showing the proper layer.name, but I was still seeing the mismatch.

The underlying tf.Variable objects had name collisions with the first model, and therefore got a suffix of _1 (like char_embedding_1/embeddings:0 instead of char_embedding/embeddings:0). You can see these names with something like

for layer in model.layers:
    print(layer.weights)

To solve my version of the issue (where a copy of the same model existed at some point and I loaded a new one), you can reset the tf session entirely before loading:

import keras.backend as K
...
model.fit(data, callbacks=[...])
K.clear_session()  # this resets the session containing the stale, not-best version of the model
model = load_model(ckpt_path)
tfjs.converters.save_keras_model(model, out_dir)

This particular way worked for me, but instead of the

keras.backend.clear_session()

call I used the TensorFlow Keras API, since clearing the session directly through the keras module throws an error with TensorFlow. The following is what I used:

tf.keras.backend.clear_session()

@anilsathyan7

anilsathyan7 commented Sep 30, 2019

Here is how I converted the model using Google Colab (IPython). The Python API seems to work for this version at least; no need to set the strict parameter in this case.

Once you have saved the entire model as an h5 file, upload it to Colab and run the script to generate the tfjs model.

# install, then restart the runtime after installation
!pip install tensorflowjs==1.2.6

# imports
import os
from keras.models import load_model
import tensorflow as tf
import tensorflowjs as tfjs

# load model
tf.compat.v1.disable_eager_execution()
model = load_model('/content/model.h5')  # path to model

# create directory
!mkdir model

# convert model
tfjs.converters.save_keras_model(model, '/content/model')

# zip the result
!zip -r model.zip /content/model

Then download and verify.

@rthadur
Contributor

rthadur commented Sep 30, 2019

@anilsathyan7 thank you, closing this issue.

@rthadur rthadur closed this as completed Sep 30, 2019
IvanMamontov added a commit to griddynamics/ar-video-conf-mobile-demo that referenced this issue May 19, 2020