No train metrics show up in scalar plots #2911

Closed
zzb3886 opened this issue Nov 8, 2019 · 10 comments

@zzb3886

zzb3886 commented Nov 8, 2019

Consider Stack Overflow for getting support using TensorBoard—they have
a larger community with better searchability:

https://stackoverflow.com/questions/tagged/tensorboard

Do not use this template for setup, installation, or configuration
issues. Instead, use the “installation problem” issue template:

https://github.com/tensorflow/tensorboard/issues/new?template=installation_problem.md

To report a problem with TensorBoard itself, please fill out the
remainder of this template.

Environment information (required)

Please run diagnose_tensorboard.py (link below) in the same
environment from which you normally run TensorFlow/TensorBoard, and
paste the output here:

https://raw.githubusercontent.com/tensorflow/tensorboard/master/tensorboard/tools/diagnose_tensorboard.py

Diagnostics

Diagnostics output
--- check: autoidentify                                                                                                                                                                       
INFO: diagnose_tensorboard.py version 4725c70c7ed724e2d1b9ba5618d7c30b957ee8a4                                                                                                                


--- check: general                                                                                                                                                                            
INFO: sys.version_info: sys.version_info(major=3, minor=6, micro=8, releaselevel='final', serial=0)                                                                                           
INFO: os.name: posix                                                                                                                                                                          
INFO: os.uname(): posix.uname_result(sysname='Linux', nodename='master1', release='4.15.0-66-generic', version='#75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019', machine='x86_64')                
INFO: sys.getwindowsversion(): N/A                                                                                                                                                            


--- check: package_management                                                                                                                                                                 
INFO: has conda-meta: False                                                                                                                                                                   
INFO: $VIRTUAL_ENV: None                                                                                                                                                                      


--- check: installed_packages                                                                                                                                                                 
INFO: installed: tensorboard==2.0.1                                                                                                                                                           
INFO: installed: tensorflow==2.0.0                                                                                                                                                            
INFO: installed: tensorflow-estimator==2.0.1                                                                                                                                                  


--- check: tensorboard_python_version                                                                                                                                                         
INFO: tensorboard.version.VERSION: '2.0.1'                                                                                                                                                    


--- check: tensorflow_python_version                                                                                                                                                          
/usr/lib/python3/dist-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.25.3) or chardet (3.0.4) doesn't match a supported version!                                     
  RequestsDependencyWarning)                                                                                                                                                                  
INFO: tensorflow.__version__: '2.0.0'                                                                                                                                                         
INFO: tensorflow.__git_version__: 'v2.0.0-rc2-26-g64c3d38'                                                                                                                                    


--- check: tensorboard_binary_path                                                                                                                                                            
INFO: which tensorboard: b'/home/bz/.local/bin/tensorboard\n'                                                                                                                                 


--- check: readable_fqdn                                                                                                                                                                      
INFO: socket.getfqdn(): 'master1.bz'                                                                                                                                                          


--- check: stat_tensorboardinfo                                                                                                                                                               
INFO: directory: /tmp/.tensorboard-info                                                                                                                                                       
INFO: os.stat(...): os.stat_result(st_mode=16895, st_ino=14562209, st_dev=2430, st_nlink=2, st_uid=1000, st_gid=1000, st_size=4096, st_atime=1572563590, st_mtime=1573247067, st_ctime=1573247067)
INFO: mode: 0o40777                                                                                                                                                                           


--- check: source_trees_without_genfiles                                                                                                                                                      
INFO: tensorboard_roots (1): ['/home/bz/.local/lib/python3.6/site-packages']; bad_roots (0): []                                                                                     

--- check: full_pip_freeze                                                                                                                                                                    
INFO: pip freeze --all:                                                                                                                                                                       
absl-py==0.8.1                                                                                                                                                                                
asn1crypto==0.24.0                                                                                                                                                                            
astor==0.8.0                                                                                                                                                                                  
attrs==17.4.0                                                                                                                                                                                 
Automat==0.6.0                                                                                                                                                                                
bleach==2.1.2                                                                                                                                                                                 
cachetools==3.1.1                                                                                                                                                                             
certifi==2018.1.18                                                                                                                                                                            
chardet==3.0.4                                                                                                                                                                                
click==6.7                                                                                                                                                                                    
colorama==0.3.7                                                                                                                                                                               
command-not-found==0.3                                                                                                                                                                        
configobj==5.0.6                                                                                                                                                                              
constantly==15.1.0                                                                                                                                                                            
cryptography==2.1.4                                                                                                                                                                           
decorator==4.1.2                                                                                                                                                                              
distro-info===0.18ubuntu0.18.04.1                                                                                                                                                             
entrypoints==0.2.3.post1                                                                                                                                                                      
eventkit==0.8.5                                                                                                                                                                               
gast==0.2.2                                                                                                                                                                                   
google-auth==1.7.0                                                                                                                                                                            
google-auth-oauthlib==0.4.1                                                                                                                                                                   
google-pasta==0.1.8                                                                                                                                                                           
grpcio==1.25.0                                                                                                                                                                                
h5py==2.10.0                                                                                                                                                                                  
html5lib==0.999999999                                                                                                                                                                         
httplib2==0.9.2                                                                                                                                                                               
hyperlink==17.3.1                                                                                                                                                                             
ib-insync==0.9.53                                                                                                                                                                             
idna==2.6                                                                                                                                                                                     
incremental==16.10.1                                                                                                                                                                          
ipykernel==4.8.2                                                                                                                                                                              
ipython==5.5.0                                                                                                                                                                                
ipython-genutils==0.2.0                                                                                                                                                                       
ipywidgets==6.0.0                                                                                                                                                                             
Jinja2==2.10                                                                                                                                                                                  
joblib==0.14.0                                                                                                                                                                                
jsonschema==2.6.0                                                                                                                                                                             
jupyter-client==5.2.2                                                                                                                                                                         
jupyter-console==5.2.0                                                                                                                                                                        
jupyter-core==4.4.0                                                                                                                                                                           
Keras==2.3.1                                                                                                                                                                                  
Keras-Applications==1.0.8                                                                                                                                                                     
Keras-Preprocessing==1.1.0                                                                                                                                                                    
keyring==10.6.0                                                                                                                                                                               
keyrings.alt==3.0                                                                                                                                                                             
language-selector==0.1                                                                                                                                                                        
Markdown==3.1.1                                                                                                                                                                               
MarkupSafe==1.0                                                                                                                                                                               
mistune==0.8.3                                                                                                                                                                                
nbconvert==5.3.1                                                                                                                                                                              
nbformat==4.4.0                                                                                                                                                                               
nest-asyncio==1.0.0                                                                                                                                                                           
netifaces==0.10.4                                                                                                                                                                             
notebook==5.2.2                                                                                                                                                                               
numpy==1.17.3                                                                                                                                                                                 
oauthlib==3.1.0                                                                                                                                                                               
opt-einsum==3.1.0       
PAM==0.4.2                                                                                                                                                                                    
pandas==0.24.2                                                                                                                                                                                
pandocfilters==1.4.2                                                                                                                                                                          
pexpect==4.2.1                                                                                                                                                                                
pickleshare==0.7.4                                                                                                                                                                            
pip==19.3.1                                                                                                                                                                                   
prompt-toolkit==1.0.15                                                                                                                                                                        
protobuf==3.10.0                                                                                                                                                                              
pyasn1==0.4.2                                                                                                                                                                                 
pyasn1-modules==0.2.1                                                                                                                                                                         
pycrypto==2.6.1                                                                                                                                                                               
Pygments==2.2.0                                                                                                                                                                               
pygobject==3.26.1                                                                                                                                                                             
pyOpenSSL==17.5.0                                                                                                                                                                             
pyserial==3.4                                                                                                                                                                                 
python-apt==1.6.4                                                                                                                                                                             
python-dateutil==2.8.0                                                                                                                                                                        
python-debian==0.1.32                                                                                                                                                                         
pytz==2019.1                                                                                                                                                                                  
pyxdg==0.25                                                                                                                                                                                   
PyYAML==3.12                                                                                                                                                                                  
pyzmq==16.0.2                                                                                                                                                                                 
requests==2.18.4                                                                                                                                                                              
requests-oauthlib==1.3.0                                                                                                                                                                      
requests-unixsocket==0.1.5                                                                                                                                                                    
rsa==4.0                                                                                                                                                                                      
scikit-learn==0.21.3                                                                                                                                                                          
scipy==1.3.1                                                                                                                                                                                  
SecretStorage==2.3.1                                                                                                                                                                          
selenium==3.141.0                                                                                                                                                                             
service-identity==16.0.0                                                                                                                                                                      
setuptools==41.6.0                                                                                                                                                                            
simplegeneric==0.8.1                                                                                                                                                                          
six==1.13.0                                                                                                                                                                                   
sklearn==0.0                                                                                                                                                                                  
ssh-import-id==5.7                                                                                                                                                                            
systemd-python==234                                                                                                                                                                           
tdameritrade==0.0.7                                                                                                                                                                           
tensorboard==2.0.1                                                                                                                                                                            
tensorflow==2.0.0                                                                                                                                                                             
tensorflow-estimator==2.0.1                                                                                                                                                                   
termcolor==1.1.0                                                                                                                                                                              
terminado==0.7                                                                                                                                                                                
testpath==0.3.1                                                                                                                                                                               
tornado==4.5.3                                                                                                                                                                                
tqdm==4.32.2                                                                                                                                                                                  
traitlets==4.3.2                                                                                                                                                                              
Twisted==17.9.0                                                                                                                                                                               
ufw==0.36                                                                                                                                                                                     
unattended-upgrades==0.1                                                                                                                                                                      
urllib3==1.25.3                                                                                                                                                                               
vboxapi==1.0                                                                                                                                                                                  
wcwidth==0.1.7                                                                                                                                                                                
webencodings==0.5                                                                                                                                                                             
Werkzeug==0.16.0                                                                                                                                                                              
wheel==0.33.6                                                                                                                                                                                 
wrapt==1.11.2                                                                                                                                                                                 
zope.interface==4.3.2                                                                                                                                                                         
                                                                                                                                                                                              

For browser-related issues, please additionally specify:

  • Browser type and version (e.g., Chrome 64.0.3282.140):
  • Screenshot, if it’s a visual issue:

Issue description

I just upgraded tensorflow to 2.0. In training, I noticed tensorboard now has two runs for each experiment, including train and validation. However, only validation has scalar value curves. Train metric plots are always empty.

I can reproduce this issue by using the script in tensorboard get started guide: https://www.tensorflow.org/tensorboard/get_started. The script prints out reasonable train and val metrics as it should, but I'm just not getting the right plots.
Screenshot from 2019-11-08 13-21-36

@zzb3886
Author

zzb3886 commented Nov 10, 2019

I tried this a few more times. It looks like TensorBoard 2.0 has trouble updating the train metrics on its own. If I kill TensorBoard and restart it, it then shows both train and validation metrics. If training is still ongoing, the validation metrics keep updating, whereas the train metrics stay stuck.

@rmothukuru

rmothukuru commented Nov 11, 2019

@zzb3886,
I ran the script provided at https://www.tensorflow.org/tensorboard/get_started and could see the graphs for both training and validation. Here is the Gist.

Can you please provide more details about your issue?

Regarding "If the training is still ongoing, the validation metrics will be updated whereas the train metrics are stuck":

The graphs are updated from the event files stored during training. So it is recommended to view and analyze the graphs after training has completed, rather than during training. Please let me know your opinion on this.
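
Because the graphs are built from event files, one way to tell whether the train metrics are missing on disk or simply not being reloaded is to list the event files under the log directory and check how recently each one was written. The following is only a sketch: `list_event_files` and the `logs` path are hypothetical names, not part of TensorBoard, and the code assumes only that event file names contain `tfevents`.

```python
import os
import time


def list_event_files(log_dir):
    """Return (run, file name, size in bytes, seconds since last write) per event file."""
    now = time.time()
    rows = []
    for root, _dirs, files in os.walk(log_dir):
        for name in sorted(files):
            # Event files carry this marker in their file name.
            if "tfevents" not in name:
                continue
            path = os.path.join(root, name)
            stat = os.stat(path)
            # The run name is the sub-directory, e.g. "train" or "validation".
            run = os.path.relpath(root, log_dir)
            rows.append((run, name, stat.st_size, now - stat.st_mtime))
    return sorted(rows)


if __name__ == "__main__":
    # "logs" is a placeholder; point it at the directory passed to --logdir.
    for run, name, size, age in list_event_files("logs"):
        print(f"{run}: {name} {size} bytes, last written {age:.0f}s ago")
```

If the file under the train run keeps growing while its plot stays empty, the data is being written and the problem is more likely on the reading side.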

@dgrahn
Contributor

dgrahn commented Nov 14, 2019

I'm encountering the same problem. The train scalar isn't updated until TensorBoard is restarted.

@zzb3886
Author

zzb3886 commented Nov 14, 2019

With the script, the problem occurs when TensorBoard is started before training starts.

@dgrahn
Contributor

dgrahn commented Nov 14, 2019 via email

@wchargin
Contributor

Probably a duplicate of #2084; can you please try the workaround listed
in that issue and see if it resolves the problem?
#2084 (comment)

@psybuzz
Contributor

psybuzz commented Nov 15, 2019

Thanks for the report. I can confirm this was working with tf-nightly-2.0-preview==2.0.0.dev20190306 and broken in tf-nightly-2.0-preview==2.0.0.dev20190307. Bisected to tensorflow/tensorflow@c66b603

So it seems Keras callbacks have been affected by this bug since March, sadly. I'm going to triage this to someone who knows more context.

Googlers, see cl/237090182

@zzb3886
Author

zzb3886 commented Nov 15, 2019

Adding profile_batch=0 to the keras callback resolves it.
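
For anyone landing here, a minimal sketch of what that workaround might look like. `profile_batch`, `log_dir` and `histogram_freq` are real arguments of `tf.keras.callbacks.TensorBoard`; the `logs/fit` path, the `TENSORBOARD_KWARGS` name and the `make_tensorboard_callback` helper are illustrative assumptions, not taken from this thread.

```python
# Keyword arguments for tf.keras.callbacks.TensorBoard.
# profile_batch=0 turns off profiling, which is the workaround reported above.
TENSORBOARD_KWARGS = {
    "log_dir": "logs/fit",  # placeholder path, not from the issue
    "histogram_freq": 1,
    "profile_batch": 0,
}


def make_tensorboard_callback(**overrides):
    """Build the callback; TensorFlow is imported lazily, only when this is called."""
    import tensorflow as tf

    kwargs = {**TENSORBOARD_KWARGS, **overrides}
    return tf.keras.callbacks.TensorBoard(**kwargs)


# Usage (requires TensorFlow and an already compiled Keras model):
# model.fit(x_train, y_train, epochs=5,
#           validation_data=(x_test, y_test),
#           callbacks=[make_tensorboard_callback()])
```

Turning profiling off means no profile data is collected for the run, so this trades the profiler for live train curves.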

@psybuzz
Contributor

psybuzz commented Nov 15, 2019

Duplicate of #2084

@psybuzz psybuzz marked this as a duplicate of #2084 Nov 15, 2019
@psybuzz psybuzz closed this as completed Nov 15, 2019
@AI-P-K

AI-P-K commented Sep 3, 2020

I have the following code and I can't get TensorBoard to show me anything other than epoch_accuracy and epoch_loss. Can anyone help me? I have followed the steps above and it is still not working.

This is the command I run in the terminal: tensorboard --logdir='logs/'

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, Conv2D, MaxPooling2D
import pickle
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import time
from datetime import datetime
from packaging import version
import os

X = pickle.load(open("X.pickle", "rb"))
y = pickle.load(open("y.pickle", "rb"))

# Convert to an array before scaling so this also works if X was pickled as a list.
X = np.array(X) / 255.0
y = np.array(y)

dense_layers = [0, 1, 2]
layer_sizes = [32, 64, 128]
conv_layers = [1, 2, 3]

# Train one model per combination, each logging to its own run directory.
for dense_layer in dense_layers:
    for layer_size in layer_sizes:
        for conv_layer in conv_layers:
            NAME = "{}-conv-{}-nodes-{}-dense-{}".format(conv_layer, layer_size, dense_layer, int(time.time()))
            tensorboard = tf.keras.callbacks.TensorBoard(log_dir='/users/silviumarc/pycharmprojects/classifier/logs/{}'.format(NAME), update_freq='epoch', profile_batch=0, histogram_freq=1)
            print(NAME)

            model = Sequential()
            model.add(Conv2D(layer_size, (4, 4), input_shape=X.shape[1:]))
            model.add(Activation("relu"))
            model.add(MaxPooling2D(pool_size=(2, 2)))

            # Additional convolutional blocks beyond the first one.
            for _ in range(conv_layer - 1):
                model.add(Conv2D(layer_size, (4, 4)))
                model.add(Activation("relu"))
                model.add(MaxPooling2D(pool_size=(2, 2)))

            model.add(Flatten())
            for _ in range(dense_layer):
                model.add(Dense(layer_size))
                model.add(Activation("relu"))

            model.add(Dense(64))
            model.add(Activation('relu'))

            model.add(Dense(1))
            model.add(Activation("sigmoid"))

            model.compile(loss="binary_crossentropy",
                          optimizer='adam',
                          metrics=['accuracy'])

            model.fit(X, y, batch_size=7, epochs=2, validation_split=0.5, callbacks=[tensorboard])
