
the multi classification problem #2

Open
zorrocai opened this issue Apr 28, 2018 · 11 comments

Comments

@zorrocai

I have written DeepLDA in PyTorch. When it comes to the test module, the classification method confuses me a lot. I want to use a one-vs-rest approach for multi-class classification, but it seems that your classification is a new and interesting method. Could you explain more about it, and what is the theory behind it?

@zorrocai
Author

Are your per-class mean hidden representations computed from just the samples in a mini-batch? And is your accuracy computed the same way?

@dmatte
Contributor

dmatte commented May 1, 2018

I have written DeepLDA in PyTorch.

That's great! I saw that you also opened an issue about getting the gradients of the eigenvalue decomposition in PyTorch. Hopefully that will be available soon.

When it comes to the test module, the classification method confuses me a lot. I want to use a one-vs-rest approach for multi-class classification, but it seems that your classification is a new and interesting method. Could you explain more about it, and what is the theory behind it?

One-vs-rest is not required, as we build on top of multiclass LDA.

Computing the individual class probabilities itself is also nothing new. In fact, it is identical to what you do in the classical (non-deep) version of LDA. You can also find this in the sklearn implementation of LDA, which in turn builds on sklearn's LinearClassifierMixin.
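To make that concrete, here is a minimal sketch of such a classifier on the hidden representations (not the code from this repository; feats, labels and the function names are placeholders, with feats an N x d matrix of hidden features and labels an integer class vector):

import torch

def fit_lda_classifier(feats, labels, n_classes, eps=1e-4):
    # per-class means and a shared within-class covariance of the hidden features
    d = feats.shape[1]
    means = torch.stack([feats[labels == c].mean(0) for c in range(n_classes)])
    centered = feats - means[labels]
    cov = centered.t() @ centered / (feats.shape[0] - n_classes) + eps * torch.eye(d)
    # classical LDA discriminant (uniform class priors dropped):
    # w_c = cov^-1 mu_c,  b_c = -0.5 * mu_c^T cov^-1 mu_c
    W = means @ torch.inverse(cov)          # (n_classes, d)
    b = -0.5 * (W * means).sum(1)           # (n_classes,)
    return W, b

def lda_predict(feats, W, b):
    return (feats @ W.t() + b).argmax(1)    # highest discriminant score wins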

Are your per-class mean hidden representations computed from just the samples in a mini-batch? And is your accuracy computed the same way?

For computing the updates we use batch statistics (you could also use running averages, similar to what is done in batch normalization, for example). For evaluation and testing, the statistics are re-computed on the entire training set to get more reliable estimates for both means and covariances.
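As a sketch of that evaluation path (model, train_loader and n_classes are placeholders; fit_lda_classifier is the helper from the sketch above):

@torch.no_grad()
def full_set_statistics(model, train_loader, n_classes):
    # one pass over the whole training set to collect hidden representations,
    # then re-fit the LDA statistics on all of them
    feats, labels = [], []
    for x, y in train_loader:
        feats.append(model(x))
        labels.append(y)
    return fit_lda_classifier(torch.cat(feats), torch.cat(labels), n_classes)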

@zorrocai
Author

Thanks for your guidance. I have been preparing for the IELTS test these days. Actually, instead of using the eigenvalue decomposition in LDA first, I tried to learn the projection matrix A directly with backpropagation, and it seems that the results were just as good.
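One way to set this up, as a rough sketch (the dimensions, names, and the trace-ratio objective below are just one possible choice): keep A as a trainable parameter and optimize an LDA-style trace ratio by gradient descent instead of solving the generalized eigenproblem.

import torch

d, k = 64, 9                                  # hidden dimension and projection dimension (placeholders)
A = torch.nn.Parameter(0.01 * torch.randn(d, k))
opt = torch.optim.Adam([A], lr=1e-3)

def neg_trace_ratio(Sb, Sw, A, eps=1e-4):
    # maximize trace((A^T Sw A)^-1 A^T Sb A) by minimizing its negative
    Sw_p = A.t() @ Sw @ A + eps * torch.eye(A.shape[1])
    Sb_p = A.t() @ Sb @ A
    return -torch.trace(torch.inverse(Sw_p) @ Sb_p)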

@zorrocai
Author

zorrocai commented May 23, 2018

I have trained AlexNet with the LDA-eigenvalues loss, but there wasn't any real improvement in train accuracy.
Sample outputs of my code are listed below:

epoch 13 avg_train_loss: -0.999989
('LDA-Eigenvalues (Train):', '[0. 0. 0. 0. 0. 0. 0. 0. 0.]')
Ratio min(eigval)/max(eigval): 0.005, Mean(eigvals): 0.000
train accuracy: 12.085143
epoch 14 avg_train_loss: -0.999989
('LDA-Eigenvalues (Train):', '[0. 0. 0. 0. 0. 0. 0. 0. 0.]')
Ratio min(eigval)/max(eigval): 0.006, Mean(eigvals): 0.000
train accuracy: 12.023333
epoch 15 avg_train_loss: -0.999990
('LDA-Eigenvalues (Train):', '[0. 0. 0. 0. 0. 0. 0. 0. 0.]')
Ratio min(eigval)/max(eigval): 0.007, Mean(eigvals): 0.000
train accuracy: 11.959375

I wonder whether the poor performance may be caused by the bad LDA eigenvalues, or whether there are some other tricks?

@zorrocai
Author

Actually, I ran into the following situation: S_B/S_W becomes a negative identity matrix during training, so the eigenvalues are all 1. This condition may prevent further training.
[screenshot]
I really don't know the reason...

@dmatte
Contributor

dmatte commented May 28, 2018

Did you try this with our Theano version or with your PyTorch implementation?

Did you train AlexNet on ImageNet? This won't work, as you have 1000 classes. This implies that the covariances in the objective function have a size of 1000 x 1000. You would need a huge mini-batch size (1001 * 1000) to get stable estimates for the covariance matrices, and the model would not fit into your GPU memory.
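A toy check of that rank argument (the numbers below are made up): the within-class scatter estimated from N samples and C classes has rank at most N - C, so a small batch cannot yield a full-rank 1000 x 1000 covariance.

import torch

N, C, d = 32, 10, 64                          # toy batch size, class count, feature dimension
X = torch.randn(N, d)
y = torch.arange(N) % C                       # make sure every class is present
means = torch.stack([X[y == c].mean(0) for c in range(C)])
Sw = (X - means[y]).t() @ (X - means[y])
print(torch.linalg.matrix_rank(Sw))           # at most N - C = 22, far below d = 64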

@zorrocai
Author

With my PyTorch implementation, and I only trained on the MNIST dataset.

@zorrocai
Author

zorrocai commented May 28, 2018

I found that the wrong position of the line S_W += lambda*I caused the problem. I had added this line before S_B = S_T - S_W. After I moved S_W += lambda*I to after S_B = S_T - S_W, the problem was gone (a sketch of the corrected ordering is at the end of this comment). But the eigenvalues are still poor after one epoch, like this:
('LDA-Eigenvalues (Train):', '[-0.01 -0.01 -0.01 -0.01 -0.01 -0.01 -0.01 -0.01 1.68]')
Ratio min(eigval)/max(eigval): -0.007, Mean(eigvals): 0.176

Unlike your Theano code, which gets much bigger eigenvalues:
LDA-Eigenvalues (Train): [ 5.83 7.17 7.45 8.01 8.67 11.22 11.81 14.82 18.66]
Ratio min(eigval)/max(eigval): 0.312, Mean(eigvals): 10.403
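For reference, the corrected ordering as a sketch (S_t and S_w stand for the total and within-class scatter matrices, lam for the regularization constant):

S_b = S_t - S_w                               # form S_B from the un-regularized S_W ...
S_w = S_w + lam * torch.eye(S_w.shape[0])     # ... and only then add lam*I to S_W
# adding lam*I first silently turns S_B into S_T - (S_W + lam*I) = S_B - lam*I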

@xuanhanyu

Thanks for your guidance. I have been preparing for the IELTS test these days. Actually, instead of using the eigenvalue decomposition in LDA first, I tried to learn the projection matrix A directly with backpropagation, and it seems that the results were just as good.

Eigenvalue decomposition is non-differentiable in PyTorch. What should I do?

@webzerg

webzerg commented May 10, 2020

@zorrocai do you mind sharing your PyTorch implementation code? Thanks

@zjyLibra

zjyLibra commented Jul 1, 2021

Why can't Theano use the GPU on the server? I have tried many ways to configure the environment. I would like your PyTorch implementation code too. Thanks, can you share it? @zorrocai
