
Model and Parameters For LSP Dataset #23

Open
kazunaritakeichi opened this issue Jun 27, 2016 · 22 comments

@kazunaritakeichi

I trained on the LSP dataset with the default parameters.
The test error is large.
Do you have an appropriate model or parameters for the LSP dataset?

Thanks!

@lunzueta

Hi @ktak199, I just did the same thing today and I also saw that the error was much higher than in the case of FLIC. I guess we should check in more detail which parameters are used in the original paper (https://arxiv.org/pdf/1312.4659v3.pdf). I'm now training on MPII with the same default parameters as for FLIC, to see what happens, but I'll return to the training/testing with LSP afterwards. I'll tell you if I get better results after tuning the parameters. Please let me know too if you have better luck after tuning them.

@kazunaritakeichi
Author

Hi @lunzueta, thank you! OK, I'll also try it and let you know!

@lunzueta

Hi @ktak199. This time I tested with MPII and the default parameters. The tests have less error in general than in the case of LSP, but they are still quite bad compared to FLIC. So I guess that in both cases dataset-specific parameters should be used. During training I observed that in both cases the model tended to overfit quite quickly.

@kazunaritakeichi
Author

Hi @lunzueta. One way to fight overfitting may be tuning the dropout parameters. http://stats.stackexchange.com/questions/109976/in-convolutional-neural-networks-how-to-prevent-the-overfitting

@lunzueta

@ktak199 Dropout is already included in the implementation, with the same value mentioned in the paper:
h = F.dropout(F.relu(self.fc6(h)), train=self.train, ratio=0.6)
h = F.dropout(F.relu(self.fc7(h)), train=self.train, ratio=0.6)
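In case it helps to compare values (the 0.6 above vs. the 0.9 tried further down), here is a minimal sketch, in the same old Chainer style as the lines above, of how the ratio could be exposed as a constructor argument instead of being hard-coded. The class name, layer sizes and n_joints default are my own assumptions, not the repository's code:

import chainer
import chainer.functions as F
import chainer.links as L

class FCHead(chainer.Chain):
    # Hypothetical fragment: only the fully connected tail of the network,
    # with the dropout ratio taken from the constructor so that different
    # values can be tried without editing the forward pass.
    def __init__(self, n_joints=14, dropout_ratio=0.6):
        super(FCHead, self).__init__(
            fc6=L.Linear(9216, 4096),  # 9216 = 256 * 6 * 6 from AlexNet conv5
            fc7=L.Linear(4096, 4096),
            fc8=L.Linear(4096, n_joints * 2),
        )
        self.dropout_ratio = dropout_ratio
        self.train = True

    def __call__(self, h):
        h = F.dropout(F.relu(self.fc6(h)), train=self.train, ratio=self.dropout_ratio)
        h = F.dropout(F.relu(self.fc7(h)), train=self.train, ratio=self.dropout_ratio)
        return self.fc8(h)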

Now I'm training LSP with the following parameter changes:

  • crop_pad_sup=1.0 -> I think this is the σ parameter mentioned in the paper, which is set to 1 for LSP
  • lcn=0 -> in other contexts I found that using this kind of contrast normalization was worse than not using it, so I'm deactivating it to see what happens (a rough sketch of what this normalization does follows after this list)
  • lr=0.0001 -> in the paper they say this is the most important parameter to set. I'm changing this value in a similar way as I've done in other contexts, to see what happens

For now, in this training I'm getting the following graph:
log
It looks a bit better than in the previous training, but not good enough yet...
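For reference, this is roughly the kind of per-image local contrast normalization that the lcn flag toggles. It's my own approximation (the function name and the Gaussian kernel are assumptions), not the repository's implementation:

import numpy as np
from scipy.ndimage import gaussian_filter

def local_contrast_normalize(img, sigma=3.0, eps=1e-5):
    # Rough per-channel LCN: subtract a local (Gaussian-weighted) mean and
    # divide by a local standard deviation. The kernel size and details are
    # guesses; the repository's transform code may differ.
    img = img.astype(np.float32)
    out = np.empty_like(img)
    for c in range(img.shape[2]):
        local_mean = gaussian_filter(img[:, :, c], sigma)
        centered = img[:, :, c] - local_mean
        local_std = np.sqrt(gaussian_filter(centered ** 2, sigma))
        out[:, :, c] = centered / (local_std + eps)
    return out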

@kazunaritakeichi
Author

kazunaritakeichi commented Jun 29, 2016

@lunzueta
The σ parameter is set to 1.0 for FLIC and 2.0 for LSP in the paper, and the lr parameter is set to 0.0005 for both datasets, isn't it?
I don't know yet whether lcn should be 0 or 1.

I'm testing with the following parameter:
cropping=0 -> "For LSP we use the full image as initial bounding box since the humans are relatively tightly cropped by design."
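In case it clarifies what that flag changes, here is a hypothetical sketch of how the initial bounding box could be chosen. The helper name, the pad_ratio argument and the exact expansion rule are my own assumptions, not the paper's or the repository's:

import numpy as np

def initial_bbox(img, joints, cropping=True, pad_ratio=0.5):
    # Hypothetical helper: with cropping disabled (as quoted above for LSP)
    # the whole image is the initial bounding box; otherwise the tight box
    # around the visible joints is expanded by pad_ratio on each side.
    h, w = img.shape[:2]
    if not cropping:
        return 0.0, 0.0, float(w), float(h)
    visible = joints[(joints[:, 0] >= 0) & (joints[:, 1] >= 0)]
    x_min, y_min = visible.min(axis=0)
    x_max, y_max = visible.max(axis=0)
    pad_x = pad_ratio * (x_max - x_min)
    pad_y = pad_ratio * (y_max - y_min)
    return (max(0.0, x_min - pad_x), max(0.0, y_min - pad_y),
            min(float(w), x_max + pad_x), min(float(h), y_max + pad_y))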

@lunzueta

Hi @ktak199. Yes, you are right about σ, I said it wrong. I've continued doing some more tests changing the parameters (crop vs. no crop, local contrast vs. no local contrast, etc.), but I'm not getting, let's say, "normal" results with LSP. The result I normally get in the tests is a very small avatar (compared to the actual body size) in the middle of the image. I'm a bit stuck with this too. Now I'm trying to do this same training using the caffe branch instead of the master branch, to see whether it could be something related to the deep learning framework. I'll let you know. Good luck with your tests too, I hope we can get something closer to the expected results.

@yutuofish2

Hi @lunzueta
I am running on MPII with the dropout ratio set to 0.9. The other parameters are left at their defaults. Currently the test loss has started to converge, but it is still high.

image

image

@kazunaritakeichi
Author

kazunaritakeichi commented Jun 29, 2016

@lunzueta
This is log.png (with cropping set to 0).
The test loss is increasing...
log

@lunzueta

Thanks for sharing this @yutuofish2. I see you are training for more than 600 epochs; I wonder how many would be a good number, but your training looks much better than what I was getting.

@yutuofish2

@ktak199
You would need to modify the function fliplr() in transform.py. The authors fixed this problem about 10 hours ago. However, it seems that there are still some bugs ...
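For anyone hitting the same thing, this is the kind of behavior a correct fliplr() for pose data has to implement. It's a generic sketch under my own assumptions (function name and argument layout), not the actual fix in transform.py:

import numpy as np

def fliplr_pose(img, joints, symmetric_pairs):
    # Illustrative horizontal flip: the x coordinates have to be mirrored AND
    # the left/right joint labels swapped, so that e.g. a "left wrist" target
    # still points at a left wrist after the flip. Occluded joints marked with
    # negative coordinates (as in LSP) would need extra handling omitted here.
    # img: H x W x C array; joints: N x 2 array of (x, y);
    # symmetric_pairs: list of (left_index, right_index) tuples for the dataset.
    h, w = img.shape[:2]
    img = img[:, ::-1, :].copy()            # mirror the image horizontally
    joints = joints.copy()
    joints[:, 0] = w - 1 - joints[:, 0]     # mirror the x coordinates
    for l, r in symmetric_pairs:            # swap left/right joint labels
        joints[[l, r]] = joints[[r, l]]
    return img, joints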

@lunzueta

lunzueta commented Jul 1, 2016

This time I trained a model with LSP, just changing the optimizer to 'MomentumSGD' and keeping the rest of the parameters the same. I got the following results, which still aren't good enough:
log
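For completeness, the optimizer swap itself looks roughly like this in Chainer v1-style code; the placeholder model and the exact lr/momentum values are assumptions, not the settings used in the run above:

import chainer.links as L
from chainer import optimizers

model = L.Linear(10, 10)  # placeholder for the actual DeepPose/AlexNet chain
optimizer = optimizers.MomentumSGD(lr=0.0005, momentum=0.9)  # illustrative values
optimizer.setup(model)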
Good to know that there have been some new fixes in the code. I'll try them next. Thanks for that @mitmul!

@kazunaritakeichi
Author

kazunaritakeichi commented Jul 1, 2016

I tried the newer version (shell/train_lsp.sh).
Below is the result:
log

@lunzueta

lunzueta commented Jul 2, 2016

@ktak199 I was also doing the same thing, but I was still at epoch 200, and I'm getting a similar graph:
log
So, what do you think might be happening? Maybe it's too early and we should wait until epoch 1000? Just in case, meanwhile I'm going to train with FLIC again on another PC to see whether it still trains as before.

@mitmul
Owner

mitmul commented Jul 2, 2016

Sorry for the inconvenience; there may be some fatal bugs in the data processing part. I'm trying to find them now and will update once I can fix them and confirm that the training runs correctly. So please wait, or try to find the bugs and send PRs. Thanks.

@lunzueta

lunzueta commented Jul 2, 2016

Thank you very much for taking care of this issue @mitmul. I'm learning a lot from all this :-)

@kazunaritakeichi
Author

Thank you so much @mitmul!
I'll study the paper and the code so that I can contribute.

@lunzueta

lunzueta commented Jul 2, 2016

Could the problem be, in the case of LSP, that some joint positions have negative values (indicating that they are occluded) and these make the training go crazy? I say this because I've retrained with FLIC for a few epochs and it looked to be converging normally. The only difference I see is those negative values.
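If that turns out to be the cause, one common workaround is to mask the occluded joints out of the loss. A rough sketch under my own assumptions (not the repository's loss function):

import numpy as np
from chainer import Variable
import chainer.functions as F

def masked_mean_squared_error(pred, target):
    # Illustrative loss: joints whose target coordinates are negative (the
    # occluded ones in LSP) contribute nothing to the error, so they can't
    # pull the regression toward meaningless values.
    # pred: chainer Variable of shape (batch, n_joints * 2)
    # target: numpy float32 array of the same shape
    mask = (target >= 0).astype(np.float32)           # 1 = visible, 0 = occluded
    diff = (pred - Variable(target)) * Variable(mask)
    return F.sum(diff * diff) / max(float(mask.sum()), 1.0)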

@lunzueta

lunzueta commented Jul 3, 2016

Well... I started a new training with MPII, which has all the body joint positions marked on the image, and after about 130 epochs of training I got this graph, which has a strange outlier and doesn't seem to converge:
log
And this kind of result from it, which is always the same pose:
test_130_tiled_pred
So, certainly, I guess we should review in detail how the data is processed.

@kazunaritakeichi
Author

kazunaritakeichi commented Jul 3, 2016

I tried with the FLIC dataset.
I got a similar result to MPII, @lunzueta.

@lunzueta

Hi guys. Based on the code provided in the caffe branch, I've done some tests with MPII (I attach the Caffe net and solver files I used for that), and after training for some hundreds of epochs it seems to give responses that make more sense (not always the same mean pose as shown above). To generate the LMDB-format data, I used the same functions provided in this code (cropping, etc.), but without applying the local contrast normalization (because that wasn't possible to reproduce in Caffe), so I don't think the failure is there. The AlexNet architecture defined in Chainer format also seems to be correct. So, taking this into account, where could the failure be? (I still couldn't find it.)

deeppose.zip

@aspenlin

Hi @lunzueta @yutuofish2, may I ask which Python script you use to plot the images with the joint positions on them? The only one I can find is evaluate_flic.py, but it still doesn't seem right.
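While that gets sorted out, the overlay itself is only a few lines of matplotlib. This is a generic sketch (the helper name and arguments are mine, not from the repository):

import matplotlib.pyplot as plt

def show_joints(img, joints, limbs=None):
    # Overlay joint positions (and, optionally, limb connections) on an image.
    # img: H x W x 3 array; joints: list of (x, y) pairs;
    # limbs: optional list of (joint_index_a, joint_index_b) pairs to connect.
    plt.imshow(img)
    xs = [x for x, y in joints]
    ys = [y for x, y in joints]
    plt.scatter(xs, ys, c='r', s=20)
    if limbs:
        for a, b in limbs:
            plt.plot([joints[a][0], joints[b][0]],
                     [joints[a][1], joints[b][1]], 'g-')
    plt.axis('off')
    plt.show()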
