Doesn't continue training from checkpoint. #17

Open
ailgun opened this issue Aug 5, 2018 · 3 comments
Comments

@ailgun

ailgun commented Aug 5, 2018

Hello,

I'm having a minor issue when trying to continue an interrupted training.

It is creating these files in /tmp/ as it's supposed to:
[screenshot: files created in /tmp/]
And these under model/NewPokemon in my working directory:
[screenshot: files under model/NewPokemon]

In other code I've worked on, I would import the saved weights and biases with something like:

saver = tf.train.import_meta_graph('model.meta')
saver.restore(sess, tf.train.latest_checkpoint('./'))

With this code I just couldn't get it to continue from where it left off. Training starts again from 0 when I run PokeGAN.py, even though I'm running it from inside the directory and the folder structure seems correct.

I haven't modified the code, except that mine saves the model every epoch instead of every 500: I just changed the modulo from i%500 to i%1.
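
One thing that may be relevant here: saver.restore only restores variable values, so the Python loop counter i still starts at 0 unless the script works out where to resume. Below is a rough sketch of recovering that starting epoch from the saved files. It assumes (this is a guess, not confirmed from PokeGAN.py) that checkpoints are written with the epoch number as the file prefix, so the directory contains files such as 12.index and 12.meta. The helper names latest_saved_step and resume_range are made up for illustration.

```python
import os
import re
import tempfile


def latest_saved_step(checkpoint_dir):
    """Return the highest N among files named 'N.index', or None if there are none.

    Assumption: checkpoints were saved with the epoch number as the prefix.
    """
    steps = []
    for name in os.listdir(checkpoint_dir):
        match = re.fullmatch(r"(\d+)\.index", name)
        if match:
            steps.append(int(match.group(1)))
    return max(steps) if steps else None


def resume_range(checkpoint_dir, total_epochs):
    """Epochs still to run: start right after the last saved one, else from 0."""
    if not os.path.isdir(checkpoint_dir):
        return range(total_epochs)
    last = latest_saved_step(checkpoint_dir)
    start = 0 if last is None else last + 1
    return range(start, total_epochs)


# Small demo with empty placeholder files standing in for real checkpoints.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("3.index", "3.meta", "10.index", "10.meta", "checkpoint"):
        open(os.path.join(demo_dir, name), "w").close()
    print(resume_range(demo_dir, 20).start)  # prints 11
```

The training loop would then iterate over resume_range('./model/NewPokemon', EPOCH) instead of range(EPOCH), after restoring the weights from the path returned by tf.train.latest_checkpoint for that same directory. EPOCH here stands for whatever the script calls its total epoch count.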

Any help would be much appreciated.

Thanks very much!

@b3nh

b3nh commented Jun 3, 2019

@ailgun did you ever resolve this problem and get training to continue after an interruption? I'm facing a similar issue.

@ailgun
Author

ailgun commented Jun 4, 2019

@arklabco Nope, sorry! I decided to learn more by following a few online tutorials and building a simple GAN myself, while using NVIDIA's code for actual training :-)

@schlonja

I have encountered the same problem. Has anyone solved it?
