Training loss #26
Comments
You can always use TensorBoard though!
@008karan Hi Karan. I am not a TF user. Can you please tell me how to use TensorBoard in this case? I see a tfevents file in the model directory, but it does not seem to be written for TensorBoard. The script used
I trained my model on GPU, and using TensorBoard works the same way here. You will find events.out files in your checkpoint folder. Just run TensorBoard on that folder.
@008karan Thank you Karan! It works for me now with tensorboard==1.15.0. Do you know how the author continuously gets the evaluation metrics, as in this thread? I can only get the evaluation metrics at the end of training.
In TensorBoard you get the loss and the learning rate. I think you can add whatever you want to the logs to see it in TensorBoard!
When I trained ELECTRA small on my Spanish corpus, the loss was shown when training on GPU. Now I have access to TFRC and trained it on a TPU pod, and the loss is not shown. Of course, I can get it from the TensorBoard events, but it would be great to log it by default when running on TPU.
You can set the TensorFlow log level to INFO and the output will be much more verbose, including printing the loss.
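A minimal sketch of that suggestion, assuming the training script does not reset the log level later on. TensorFlow sends its Python-side messages through the standard-library logger named "tensorflow", so the level can be raised without touching TensorFlow itself. The helper name `enable_tf_info_logging` is made up for illustration and is not part of the ELECTRA code; where to call it in the training script is also an assumption.

```python
import logging


def enable_tf_info_logging() -> logging.Logger:
    """Raise the verbosity of TensorFlow's Python logger to INFO.

    Hypothetical helper for illustration; not part of the ELECTRA repo.
    """
    # TensorFlow routes its Python log messages through the stdlib logger
    # named "tensorflow", so it can be configured before TF is imported.
    tf_logger = logging.getLogger("tensorflow")
    tf_logger.setLevel(logging.INFO)
    return tf_logger


# Assumption: this runs before the estimator is built and training starts.
logger = enable_tf_info_logging()

# With TensorFlow imported, the following call should have the same effect:
#   tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.INFO)
```

At INFO level the estimator's periodic log lines, which include the loss, should show up next to the progress line.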
Hello,
I was wondering whether it is possible to add some loss metrics to the training cycle? The only thing I see while training an ELECTRA model is
1275000/3000000 = 42.5%, SPS: 3.1, ELAP: 9:24:02, ETA: 6 days, 11:55:19
which says nothing about how good the model is. I'm trying to add some code to the estimator, but it seems to me that it would be much easier if all the metrics were shown, so I can see how well the model is doing at this stage.
I'm training a non-English model, so I want better insight into how my model is performing at the moment.
Thanks