CNN Predict: Key b-1 not found in checkpoint #65

Open
bikramkhastgir opened this issue Jun 27, 2018 · 14 comments

@bikramkhastgir
Contributor

bikramkhastgir commented Jun 27, 2018

Hi @brightmart ,

I trained the CNN using 'train-zhihu4-only-title-all.txt'. When I run the predict script on "test-zhihu6-title-desc.txt" with the word2vec file "zhihu-word2vec-title-desc.bin-100", I get the following error:

Restoring Variables from Checkpoint
2018-06-27 20:49:22.480037: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key b-1 not found in checkpoint
Traceback (most recent call last):
File "/home/user/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
return fn(*args)
File "/home/user/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/user/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: Key b-1 not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_INT32, DT_INT32, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "p7_TextCNN_predict.py", line 77, in <module>
saver.restore(sess, tf.train.latest_checkpoint(FLAGS.ckpt_dir))
File "/home/user/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1802, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/home/user/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/home/user/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/home/user/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/home/user/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key b-1 not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_INT32, DT_INT32, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

Caused by op 'save/RestoreV2', defined at:
File "p7_TextCNN_predict.py", line 74, in <module>
saver = tf.train.Saver()
File "/home/user/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1338, in __init__
self.build()
File "/home/user/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1347, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/home/user/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1384, in _build
build_save=build_save, build_restore=build_restore)
File "/home/user/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 835, in _build_internal
restore_sequentially, reshape)
File "/home/user/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 472, in _AddRestoreOps
restore_sequentially)
File "/home/user/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 886, in bulk_restore
return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
File "/home/user/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1463, in restore_v2
shape_and_slices=shape_and_slices, dtypes=dtypes, name=name)
File "/home/user/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/user/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
op_def=op_def)
File "/home/user/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

NotFoundError (see above for traceback): Key b-1 not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_INT32, DT_INT32, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]


Python: 2.7 ... Can you help me figure this out, given that there is no b-1 key in the checkpoint?

Thank you..

@brightmart
Owner

brightmart commented Jun 27, 2018 via email

@bikramkhastgir
Contributor Author

Hi,

The 'pretrain word embedding' flag is set to False. The utf-8 error occurred because the vocab pickle file was being opened in 'r' and 'a' mode instead of 'rb' and 'ab'. I have made those changes. Now only the "Key b-1 not found in checkpoint" error remains, without the utf-8 error. Any suggestions?

Thanks for your time,
Bikram
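
For readers hitting the same utf-8 error: a minimal sketch of the 'rb'/'wb' fix described above, assuming Python 3. The vocabulary contents and the file name `vocab_cache.pik` are hypothetical placeholders, not taken from the repository.

```python
import os
import pickle
import tempfile

# Hypothetical vocabulary mapping; the real cache contents are not shown in this thread.
vocab = {"PAD": 0, "UNK": 1, "hello": 2}

# Hypothetical file name, written to a temporary directory.
cache_path = os.path.join(tempfile.mkdtemp(), "vocab_cache.pik")

# Pickle data is bytes, so write with 'wb' (or 'ab' to append).
with open(cache_path, "wb") as f:
    pickle.dump(vocab, f)

# Read it back with 'rb'. In text mode ('r') Python 3 tries to decode the
# bytes as text, which is one place a utf-8 error can come from.
with open(cache_path, "rb") as f:
    restored = pickle.load(f)

print(restored == vocab)
```

This only addresses the pickle error; it does not affect the missing-key error from the checkpoint.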

@bikramkhastgir
Contributor Author

The contents of the "checkpoint" file are:


model_checkpoint_path: "model.ckpt-9"
all_model_checkpoint_paths: "model.ckpt-5"
all_model_checkpoint_paths: "model.ckpt-6"
all_model_checkpoint_paths: "model.ckpt-7"
all_model_checkpoint_paths: "model.ckpt-8"
all_model_checkpoint_paths: "model.ckpt-9"


That is everything that gets saved during training.
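
Note that the "checkpoint" file is only an index of saved checkpoint paths; variable names such as b-1 are stored in the model.ckpt-9 files themselves, so they will not appear in this listing. As a rough sketch, the index can be read with plain Python. The helper name `parse_checkpoint_state` is hypothetical and is not part of TensorFlow.

```python
import re

# Text copied from the "checkpoint" file quoted above.
CHECKPOINT_TEXT = '''model_checkpoint_path: "model.ckpt-9"
all_model_checkpoint_paths: "model.ckpt-5"
all_model_checkpoint_paths: "model.ckpt-6"
all_model_checkpoint_paths: "model.ckpt-7"
all_model_checkpoint_paths: "model.ckpt-8"
all_model_checkpoint_paths: "model.ckpt-9"
'''


def parse_checkpoint_state(text):
    """Return (latest_path, all_paths) from the text of a 'checkpoint' file."""
    latest = None
    all_paths = []
    for line in text.splitlines():
        # Each line looks like: field_name: "value"
        match = re.match(r'\s*(\w+):\s*"(.*)"\s*$', line)
        if not match:
            continue
        field, value = match.groups()
        if field == "model_checkpoint_path":
            latest = value
        elif field == "all_model_checkpoint_paths":
            all_paths.append(value)
    return latest, all_paths


latest, all_paths = parse_checkpoint_state(CHECKPOINT_TEXT)
print(latest)
print(all_paths)
```

To see the variable names actually stored in model.ckpt-9, TensorFlow 1.x should offer `tf.train.list_variables(ckpt_dir)`, assuming the installed version provides it; comparing that list against the names the predict graph requests is one way to locate a missing key.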

@brightmart
Owner

Do you still get the same error?

@bikramkhastgir
Contributor Author

Yes... Shall I upload the files so that you can try to reproduce the error on your system?

@brightmart
Owner

brightmart commented Jun 30, 2018 via email

@bikramkhastgir
Contributor Author

Hi @brightmart ,

Thank you for your help. The uploaded files are at the following URLs:

https://anonfile.com/oa3ef0f3bb/data_util.py
https://anonfile.com/p931f4fcb5/p7_TextCNN_predict.py
https://anonfile.com/q13af0fcb9/p7_TextCNN_train.py
https://anonfile.com/r033faf2b3/p8_TextRNN_model.py
https://anonfile.com/s533f5f5b1/p7_TextCNN_model.py
https://anonfile.com/tc3afbfbbe/data_util_zhihu.py
https://anonfile.com/u53af4feb9/p8_TextRNN_train.py

I used data_util only when training the CNN. The training file for both the CNN and the RNN is either 'train-zhihu4-only-title-all.txt', downloaded from the Zhihu URL, or 'sample_multiple_label.txt' from your repo.

The RNN also throws a "key not found" error, but during training. The two errors differ slightly: the CNN fails while predicting, and the RNN fails while training.

Note: The 'use word embedding' flag is set to False while training the CNN. I am using

Regards,

@bikramkhastgir
Contributor Author

I am using TensorFlow 1.8.0.

@kevinsay

kevinsay commented Aug 2, 2018

@bikramkhastgir Regarding "Not found: Key b-1 not found in checkpoint": I modified the CNN program to train on single labels, and this error also appears when I predict. Has it been solved? I need your help.

@bikramkhastgir
Contributor Author

No, @kevinsay ... I couldn't figure out exactly which routine needs the key b-1, so I am still hoping @brightmart can figure it out.

@switchhh

switchhh commented Dec 17, 2018

@bikramkhastgir Hi! Did you solve this problem? I hit the same error when I predict...

@switchhh

@bikramkhastgir Hi, I think I found a way to solve this problem. The filter_nums array is different in train and predict. The name of b in the model is defined by b-%s, where %s is the filter_num....

@bikramkhastgir
Contributor Author

@switchhh Were you able to run it?
I can see that "num_filters" is 128 for both train and predict, and "filter_sizes" = [6,7,8].

@switchhh

switchhh commented Dec 27, 2018

@bikramkhastgir Yes, I can run it. The problem is that filter_sizes is different in train and predict (sorry for writing it wrong earlier). In train it is [6,7,8], but in predict it is [1,2,3,4,5,6,7,8] in my version. I checked just now and the bug has been fixed, so you can try it again. Sorry for my poor English.
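
To illustrate why a filter_sizes mismatch would surface as exactly "Key b-1 not found", here is a small sketch using the b-%s naming and the two filter_sizes lists mentioned in this thread. The helper `bias_names` is hypothetical and only mimics the naming pattern; it is not the repository's code.

```python
def bias_names(filter_sizes):
    # Names follow the "b-%s" pattern described in this thread; the helper
    # itself is hypothetical and only imitates that naming.
    return ["b-%s" % size for size in filter_sizes]


# Values reported in the thread for training and prediction.
train_filter_sizes = [6, 7, 8]
predict_filter_sizes = [1, 2, 3, 4, 5, 6, 7, 8]

# Names that would have been saved in the checkpoint during training.
saved = set(bias_names(train_filter_sizes))

# Names the predict graph would ask the saver to restore.
requested = bias_names(predict_filter_sizes)

# Any requested name that was never saved cannot be restored.
missing = [name for name in requested if name not in saved]
print(missing)  # → ['b-1', 'b-2', 'b-3', 'b-4', 'b-5']
```

Under these assumptions the first missing name is b-1, which matches the error message. Using the same filter_sizes in both the train and predict scripts makes the two name sets identical.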


4 participants