Hi, I get an error message like this: #19
Comments
Is the GPU memory too small? |
This looks like a data issue, since the complaint is a KeyError,
probably related to the image id.
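One way to check this kind of mismatch offline is sketched below. It assumes a COCO-style annotation structure (an `annotations` list whose entries carry `id` and `area`) and mirrors the `id_to_ann[i]['area']` lookup from the traceback; the helper names `build_id_to_ann` and `find_missing_ids` are hypothetical and not part of the repo.

```python
import json


def build_id_to_ann(coco):
    """Map annotation id -> annotation dict for a COCO-style structure."""
    return {ann["id"]: ann for ann in coco.get("annotations", [])}


def find_missing_ids(coco, ids):
    """Return ids (input order, deduplicated) that have no annotation entry."""
    id_to_ann = build_id_to_ann(coco)
    seen = set()
    missing = []
    for i in ids:
        if i not in id_to_ann and i not in seen:
            seen.add(i)
            missing.append(i)
    return missing


if __name__ == "__main__":
    # Tiny in-memory stand-in for an annotation file (values are made up).
    # With a real file one would call json.load on the opened file instead.
    coco = json.loads(
        '{"annotations": ['
        '{"id": 1, "image_id": 10, "area": 4.0},'
        '{"id": 2, "image_id": 11, "area": 9.0}]}'
    )
    # The ids would normally come from the records fed to the input pipeline.
    print(find_missing_ids(coco, [1, 351529, 2, 415619]))
```

If this reports missing ids, the annotation file and the training records were probably generated from different data versions.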
On Wed, Oct 12, 2022 at 1:03 AM ross-Hr wrote:
Is the GPU memory too small?
|
It was an annotations error. I reloaded the annotations to solve it.
My tf==2.10.0
|
This looks like the specified checkpoint (either the pretrained checkpoint, or
a checkpoint restored from the last training run in the same model directory)
differs from the configured architecture/encoder. Please check that the
architecture/encoder variant, depth, dim, etc. match.
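A rough way to locate such a mismatch is to compare variable names and shapes between two checkpoints, for example the pretrained one and one saved from a freshly built model with the current config. The sketch below is only an illustration: `diff_variable_shapes` is a hypothetical helper, and the names and shapes in the example are made up. The real input could come from `dict(tf.train.list_variables(path))`, which returns `(name, shape)` pairs.

```python
def diff_variable_shapes(ckpt_shapes, model_shapes):
    """Compare two {variable_name: shape} mappings.

    Returns a tuple:
      (names only in ckpt_shapes,
       names only in model_shapes,
       {name: (ckpt_shape, model_shape)} for names whose shapes differ)
    """
    ckpt_names = set(ckpt_shapes)
    model_names = set(model_shapes)
    only_in_ckpt = sorted(ckpt_names - model_names)
    only_in_model = sorted(model_names - ckpt_names)
    mismatched = {
        name: (tuple(ckpt_shapes[name]), tuple(model_shapes[name]))
        for name in sorted(ckpt_names & model_names)
        if tuple(ckpt_shapes[name]) != tuple(model_shapes[name])
    }
    return only_in_ckpt, only_in_model, mismatched


if __name__ == "__main__":
    # Hypothetical shapes; in practice something like
    #   ckpt_shapes = dict(tf.train.list_variables(pretrained_dir))
    #   model_shapes = dict(tf.train.list_variables(fresh_model_dir))
    ckpt_shapes = {
        "decoder/dec_layers/5/mlp/dense1/bias": [1024],
        "decoder/dec_layers/5/mlp/dense2/kernel": [1024, 256],
    }
    model_shapes = {
        "decoder/dec_layers/5/mlp/dense2/kernel": [2048, 512],
    }
    only_ckpt, only_model, mismatched = diff_variable_shapes(
        ckpt_shapes, model_shapes)
    print("only in checkpoint:", only_ckpt)
    print("only in model:", only_model)
    print("shape mismatch:", mismatched)
```

Names that appear on only one side suggest a different depth or variant, while shape mismatches suggest a different dim.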
On Mon, Oct 17, 2022 at 6:30 PM ross-Hr wrote:
It was an annotations error. I reloaded the annotations to solve it.
But the new error looks like this:
W1018 09:27:13.350448 139820555069248 checkpoint.py:213] Value in checkpoint could not be found in the restored object: (root).optimizer's state 'v' for (root).model.decoder.decoder.dec_layers.5.mlp.mlp_layers.0.dense1.bias
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer's state 'v' for (root).model.decoder.decoder.dec_layers.5.mlp.mlp_layers.0.dense2.kernel
W1018 09:27:13.350490 139820555069248 checkpoint.py:213] Value in checkpoint could not be found in the restored object: (root).optimizer's state 'v' for (root).model.decoder.decoder.dec_layers.5.mlp.mlp_layers.0.dense2.kernel
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer's state 'v' for (root).model.decoder.decoder.dec_layers.5.mlp.mlp_layers.0.dense2.bias
W1018 09:27:13.350531 139820555069248 checkpoint.py:213] Value in checkpoint could not be found in the restored object: (root).optimizer's state 'v' for (root).model.decoder.decoder.dec_layers.5.mlp.mlp_layers.0.dense2.bias
My tf==2.10.0
|
I cloned the repo with git and did not change anything,
but I still get the above error. |
Well, I changed the code in model.py. |
@chentingpc |
You should be able to use pdb in the code when running in eager mode. |
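As a minimal sketch of that suggestion, the lookup from the traceback can be wrapped so that a debugger stops exactly where an id is missing. This assumes eager execution has been turned on (for instance with `tf.config.run_functions_eagerly(True)`, and `tf.data.experimental.enable_debug_mode()` for input pipelines); the function below is a hypothetical stand-in written in plain Python, not the repo's `get_area`.

```python
def get_area(ids, id_to_ann, debug=False):
    """Look up the 'area' of each id, with an optional debugger stop.

    With debug=True, execution pauses in pdb at the first missing id so
    that `i`, `ids`, and `id_to_ann` can be inspected interactively.
    """
    areas = []
    for i in ids:
        if i not in id_to_ann:
            if debug:
                breakpoint()  # drops into pdb; only useful in eager mode
            raise KeyError(
                f"id {i} not found in annotation map "
                f"({len(id_to_ann)} entries)")
        areas.append(float(id_to_ann[i]["area"]))
    return areas


if __name__ == "__main__":
    # Made-up annotation map for illustration.
    id_to_ann = {1: {"area": 4.0}, 2: {"area": 9.5}}
    print(get_area([2, 1], id_to_ann))
    try:
        get_area([1, 351529], id_to_ann)
    except KeyError as err:
        print("lookup failed:", err)
```

In graph mode the Python body is traced rather than run step by step, so the breakpoint would not behave as expected there.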
2022-10-12 15:43:57.254005: W tensorflow/core/grappler/optimizers/data/slack.cc:103] Could not find a final `prefetch` in the input pipeline to which to introduce slack.
I1012 15:43:57.996680 140468541171456 api.py:459] train_step begins...
I1012 15:44:07.279798 140468532778752 api.py:459] train_step begins...
INFO:tensorflow:batch_all_reduce: 369 all-reduces with algorithm = nccl, num_packs = 1
I1012 15:44:10.852259 140499206152832 cross_device_ops.py:897] batch_all_reduce: 369 all-reduces with algorithm = nccl, num_packs = 1
I1012 15:44:17.169317 140468541171456 api.py:446] Trainable variables:
I1012 15:44:17.426999 140468541171456 api.py:446] vit/stem_conv/kernel:0 (16, 16, 3, 768)
I1012 15:44:17.432081 140468541171456 api.py:446] vit/stem_conv/bias:0 (768,)
I1012 15:44:17.436969 140468541171456 api.py:446] vit/stem_ln/gamma:0 (768,)
....
INFO:tensorflow:batch_all_reduce: 369 all-reduces with algorithm = nccl, num_packs = 1
I1012 15:44:31.484436 140499206152832 cross_device_ops.py:897] batch_all_reduce: 369 all-reduces with algorithm = nccl, num_packs = 1
I1012 15:44:37.695064 140468532778752 api.py:459] train_step ends...
I1012 15:44:38.920633 140468541171456 api.py:459] train_step ends...
2022-10-12 15:45:08.671253: W tensorflow/core/framework/op_kernel.cc:1768] UNKNOWN: KeyError: 351529
Traceback (most recent call last):
  File "/root/anaconda3/envs/pix2seq/lib/python3.9/site-packages/tensorflow/python/ops/script_ops.py", line 271, in __call__
    ret = func(*args)
  File "/root/anaconda3/envs/pix2seq/lib/python3.9/site-packages/tensorflow/python/autograph/impl/api.py", line 642, in wrapper
    return func(*args, **kwargs)
  File "/tmp/__autograph_generated_filecefzj46v.py", line 22, in get_area
    retval__1 = ag__.converted_call(ag__.ld(np).asarray, ([ag__.ld(id_to_ann)[ag__.ld(i)]['area'] for i in ag__.ld(ids)],), dict(dtype=ag__.ld(np).float32), fscope_1)
  File "/tmp/__autograph_generated_filecefzj46v.py", line 22, in <listcomp>
    retval__1 = ag__.converted_call(ag__.ld(np).asarray, ([ag__.ld(id_to_ann)[ag__.ld(i)]['area'] for i in ag__.ld(ids)],), dict(dtype=ag__.ld(np).float32), fscope_1)
KeyError: 351529
2022-10-12 15:45:08.671413: W tensorflow/core/framework/op_kernel.cc:1768] UNKNOWN: KeyError: 415619
Traceback (most recent call last):
  File "/root/anaconda3/envs/pix2seq/lib/python3.9/site-packages/tensorflow/python/ops/script_ops.py", line 271, in __call__
    ret = func(*args)
  File "/root/anaconda3/envs/pix2seq/lib/python3.9/site-packages/tensorflow/python/autograph/impl/api.py", line 642, in wrapper
    return func(*args, **kwargs)
  File "/tmp/__autograph_generated_filecefzj46v.py", line 22, in get_area
    retval__1 = ag__.converted_call(ag__.ld(np).asarray, ([ag__.ld(id_to_ann)[ag__.ld(i)]['area'] for i in ag__.ld(ids)],), dict(dtype=ag__.ld(np).float32), fscope_1)
  File "/tmp/__autograph_generated_filecefzj46v.py", line 22, in <listcomp>
    retval__1 = ag__.converted_call(ag__.ld(np).asarray, ([ag__.ld(id_to_ann)[ag__.ld(i)]['area'] for i in ag__.ld(ids)],), dict(dtype=ag__.ld(np).float32), fscope_1)
KeyError: 415619
My GPU setup is 2 x RTX 3070 with 8 GB of memory each.