
Pretrained EfficientNet on GPU throws an error: Key efficientnet-b5/blocks_0/conv2d/kernel/RMSProp not found in checkpoint #652

Open
mijung-kim opened this issue Jan 7, 2020 · 4 comments

Comments

@mijung-kim

Ubuntu 16.04 LTS
TF 1.15
Python 3.7
Using docker

Command to reproduce (run with my own data):
$ CUDA_VISIBLE_DEVICES=0 python main.py --data_dir $MY_CUSTOM_DATA --num_label_classes=2 --model_dir=efficientnet-b5 --model_name=efficientnet-b5

I tried the pre-trained efficientnet-b1, b4, and b5 checkpoints, and each one fails with the same error, shown below. Please let me know if you have found a solution.

tensorflow.python.framework.errors_impl.NotFoundError: Key efficientnet-b5/blocks_0/conv2d/kernel/RMSProp not found in checkpoint
[[{{node save/RestoreV2}}]]

@2696120622

@mijung-kim
I have encountered the same error:
"NotFoundError: Key efficientnet-lite0/blocks_0/conv2d/kernel/RMSProp not found in checkpoint" and
"NotFoundError: Key efficientnet-b0/blocks_0/conv2d/kernel/RMSProp not found in checkpoint"
with efficientnet-lite and efficientnet respectively.
Do you know how to fix it?

@wheemyungshin

Same error here!

@2696120622

2696120622 commented Apr 9, 2020

It looks like the released ckpt was trained using the 'sgd' optimizer. I fixed this error by changing optimizer_name to 'sgd' in main.py when restoring params from the released ckpt.

optimizer = utils.build_optimizer(learning_rate, 'sgd')
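
To check which keys a checkpoint actually contains before changing the optimizer, something like the sketch below may help. It is not from the repo: the helper names `split_missing_keys` and `list_checkpoint_keys` and the list of slot suffixes are assumptions made for illustration. The only TensorFlow call used is `tf.train.list_variables`, which returns (name, shape) pairs for a checkpoint.

```python
from typing import Iterable, List, Tuple

# Suffixes that TF1 optimizers commonly append to their slot variables.
# This tuple is an illustrative assumption, not something defined by the repo.
OPTIMIZER_SLOT_SUFFIXES = ("/RMSProp", "/RMSProp_1", "/Momentum")


def split_missing_keys(
    graph_keys: Iterable[str], checkpoint_keys: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Split keys the graph wants but the checkpoint lacks into two groups.

    Returns (missing_optimizer_slots, missing_other). Hypothetical helper.
    """
    available = set(checkpoint_keys)
    missing = [key for key in graph_keys if key not in available]
    slots = [key for key in missing if key.endswith(OPTIMIZER_SLOT_SUFFIXES)]
    other = [key for key in missing if not key.endswith(OPTIMIZER_SLOT_SUFFIXES)]
    return slots, other


def list_checkpoint_keys(checkpoint_path: str) -> List[str]:
    """Read variable names from a checkpoint (hypothetical wrapper)."""
    import tensorflow as tf  # imported lazily so the helper above runs without TF

    return [name for name, _shape in tf.train.list_variables(checkpoint_path)]


if __name__ == "__main__":
    # Toy key lists shaped like the errors in this thread; with a real
    # checkpoint, checkpoint_keys would come from list_checkpoint_keys(path).
    graph_keys = [
        "efficientnet-b5/blocks_0/conv2d/kernel",
        "efficientnet-b5/blocks_0/conv2d/kernel/RMSProp",
        "global_step",
    ]
    checkpoint_keys = ["efficientnet-b5/blocks_0/conv2d/kernel"]
    slots, other = split_missing_keys(graph_keys, checkpoint_keys)
    print("missing optimizer slots:", slots)
    print("missing other keys:", other)
```

If only optimizer slot keys turn out to be missing, that would be consistent with the ckpt having been saved without RMSProp state; keys such as global_step showing up in the second list would point to a different mismatch.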

@bnascimento

bnascimento commented Jan 12, 2021

Thanks guys! That got me a bit further, but after that I ran into another issue:

WARNING:tensorflow:Reraising captured error
W0112 22:04:47.559992 140622358390592 error_handling.py:149] Reraising captured error
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: 2 root error(s) found.
  (0) Not found: Key global_step not found in checkpoint
         [[{{node save/RestoreV2}}]]
  (1) Not found: Key global_step not found in checkpoint
         [[{{node save/RestoreV2}}]]
         [[save/RestoreV2/_403]]
0 successful operations.
0 derived errors ignored.

Any idea how to solve it? I am using TF 2.3.
Does this work only with TF 1.15?
