
Non-Sensical Print Values For Metrics #44

Open

onurdanaci opened this issue Aug 30, 2019 · 4 comments

@onurdanaci
Hi,
While training a binary classifier in Keras (TF-GPU backend) with the binary_crossentropy loss and binary labels, using your keras_metrics package, I've been getting lots of f1-score values printed as 0.0000e+00.

That made no sense, since I was seeing the loss drop and the accuracy rise (for both train and validation). So I decided to dig into it and started printing everything available in keras_metrics.

It turns out all the metrics from keras_metrics are printed as 0.0000e+00!

True positives, true negatives, false positives, false negatives, precision, recall, and f1-score (each the binary variant, e.g. binary_recall), all of them are printed as 0.0000e+00.

There must either be a bug, or a catastrophic division (e.g. 0/0) propagating through the calculations.
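For reference, I attach the metrics roughly like this (a sketch from memory; the optimizer is a placeholder, `model` is my already-built network, and the factory names are as I recall them from keras_metrics 1.1.0):

```python
import keras_metrics as km

model.compile(
    optimizer="adam",  # placeholder; not necessarily what I actually use
    loss="binary_crossentropy",
    metrics=[
        km.binary_true_positive(),
        km.binary_true_negative(),
        km.binary_false_positive(),
        km.binary_false_negative(),
        km.binary_precision(),
        km.binary_recall(),
        km.binary_f1_score(),
    ],
)
```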

I have the following module versions:

sklearn '0.21.3'
tensorflow gpu '0.21.3'
keras '2.2.4'
numpy '1.16.4'
scipy '1.3.1'
pandas '0.25.0'
keras_metrics '1.1.0'

Do you know what the issue may be? It stalled my progress big time before I even realized it.

Best,
Onur

@ybubnov
Member

ybubnov commented Aug 30, 2019

Hi @onurdanaci, thank you for posting an issue. Is it possible to share the model and data with which I could reproduce the issue? From my previous investigations, the data values are the most likely cause, but it could be a bug in the implementation as well.

@onurdanaci
Author

Hi @ybubnov, unfortunately I am legally not allowed to disclose the data.

But the training set is N_data = 240k samples with N_features = 18k each, plus 240k binary class labels. The validation set has the same structure with N_data = 30k.

And each data point (of 18k features) is min-max normalized to the [0, 1] scale with scikit-learn.
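Roughly like this (a sketch; the X_train/X_val names are illustrative):

```python
from sklearn.preprocessing import MinMaxScaler

# Fit on the training set only, then reuse the fitted scaler for
# validation, so every feature lands on the [0, 1] scale.
scaler = MinMaxScaler(feature_range=(0, 1))
X_train = scaler.fit_transform(X_train)
X_val = scaler.transform(X_val)
```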

@onurdanaci
Author

Model is, more or less:

Layer (type)                 Output Shape              Param #
=================================================================
conv1d_63 (Conv1D)           (None, 18488, 128)        384
_________________________________________________________________
batch_normalization_19 (Batc (None, 18488, 128)        512
_________________________________________________________________
activation_19 (Activation)   (None, 18488, 128)        0
_________________________________________________________________
max_pooling1d_30 (MaxPooling (None, 9244, 128)         0
_________________________________________________________________
conv1d_64 (Conv1D)           (None, 9244, 64)          24640
_________________________________________________________________
max_pooling1d_31 (MaxPooling (None, 4622, 64)          0
_________________________________________________________________
conv1d_65 (Conv1D)           (None, 4622, 32)          6176
_________________________________________________________________
max_pooling1d_45 (MaxPooling (None, 2311, 32)          0
_________________________________________________________________
conv1d_41 (Conv1D)           (None, 2309, 1)           97
_________________________________________________________________
flatten_43 (Flatten)         (None, 2309)              0
_________________________________________________________________
dense_249 (Dense)            (None, 512)               1182720
_________________________________________________________________
dense_250 (Dense)            (None, 512)               262656
_________________________________________________________________
dense_251 (Dense)            (None, 64)                32832
_________________________________________________________________
dense_252 (Dense)            (None, 1)                 65
=================================================================
Total params: 1,510,082
Trainable params: 1,478,370
Non-trainable params: 31,712

But the first 7 layers are transferred from the encoder of an autoencoder, so they are frozen and not trained.
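The freezing looks roughly like this (a sketch; assumes `model` is the assembled network and that the slice covers exactly the 7 transferred layers):

```python
# Freeze the transferred encoder layers so only the rest trains.
for layer in model.layers[:7]:
    layer.trainable = False

# Changes to `trainable` only take effect after recompiling.
model.compile(optimizer="adam", loss="binary_crossentropy")
```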

@ybubnov
Member

ybubnov commented Sep 2, 2019

Well, I can't say what exactly is happening. But if you say the binary output is within the range [0, 1], then everything should be fine, and the sum TP + TN + FP + FN should give you the total number of input vectors.

Under the hood, keras-metrics simply casts the model's predictions at the output layer to integer values and performs simple calculations on them. When the validation data is outside the [0, 1] range, the result can be anything: Keras allows feeding arbitrary data for validation, and so does keras-metrics.
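As an illustration, here is the idea in plain NumPy (a sketch of the computation, not the actual keras-metrics source):

```python
import numpy as np

def binary_counts(y_true, y_pred):
    # Threshold sigmoid outputs at 0.5 and count the four
    # confusion-matrix cells, which is what keras-metrics does in spirit.
    y_hat = (y_pred >= 0.5).astype(int)
    tp = int(np.sum((y_true == 1) & (y_hat == 1)))
    tn = int(np.sum((y_true == 0) & (y_hat == 0)))
    fp = int(np.sum((y_true == 0) & (y_hat == 1)))
    fn = int(np.sum((y_true == 1) & (y_hat == 0)))
    return tp, tn, fp, fn

# With labels in {0, 1}, the four counts sum to the sample count.
y_true = np.array([0, 1, 1, 0])
y_pred = np.array([0.2, 0.9, 0.4, 0.6])
assert sum(binary_counts(y_true, y_pred)) == len(y_true)

# With labels outside {0, 1} (e.g. -1/1), the counts stop adding up.
y_bad = np.array([-1, 1, 1, -1])
print(binary_counts(y_bad, y_pred))  # (1, 0, 0, 1): sums to 2, not 4
```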
