How to do regression? #512
As a heads-up, it does not really make sense to model classification as a regression problem: why should the distance between a "1" and a "9" be larger than that between a "2" and a "3", if we don't take the semantics of the digits into account? That being said, your network probably suffers from a large learning rate; decreasing it would eliminate the nan error, but again, you probably won't get anything useful out of mnist regression.
Hi, thank you for the comment! I haven't found any regression example yet (if you know of one, please tell me!), so I will continue and update my progress :) Thank you very much!
Hi, since I just need the accuracy and don't care about the logprob, I just commented them out.
I think using accuracy for regression is misleading. Instead of modifying, replace
Hi,
Glad it works :)
@XucongZhang Does your regression work with many labels? I am trying to do regression with caffe. In my problem, the label is not a single value but a vector of floats.
@thuanvh Hi, glad someone mentioned that! Yes, I also use multi-label. I used [https://github.com//pull/147]. The example of how to generate the data is located in /caffe-dev/src/caffe/test/test_data/generate_sample_data.py.
Thanks @XucongZhang, I am using your suggestion, and I will let you know the result later.
Hi @XucongZhang, I have a problem using HDF5 files. Loading HDF5 file 0.h5 works, and then I get an error at the loss layer (euclidean_loss_layer.cpp): Check failed: bottom[0]->channels() == bottom[1]->channels() (136 vs. 68000). Why is the label dimension 68000?
Hi @thuanvh, I think you made a mistake generating the .h5 file. For example, I have 1280 samples, each image is 36*60, the label is 3 float values, and the batch size is 128. I got this: Loading HDF5 file /home/XXX/data.h5. This is just my experience and understanding, for your reference.
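As a rough sketch of the layout described above (1280 samples of 36x60 grayscale images with 3-float labels), here is how the .h5 file could be written with h5py; the file name, dataset names, and random data are illustrative only:

```python
import h5py
import numpy as np

num_samples, height, width = 1280, 36, 60

# Caffe's HDF5 data layer expects float data shaped (N, C, H, W).
data = np.random.rand(num_samples, 1, height, width).astype(np.float32)

# Multi-target regression: one row of 3 float labels per sample.
labels = np.random.rand(num_samples, 3).astype(np.float32)

with h5py.File('data.h5', 'w') as f:
    f.create_dataset('data', data=data)
    f.create_dataset('label', data=labels)
```

If the label array is accidentally flattened or transposed, the channel-mismatch check in the Euclidean loss layer fails with errors like the one above.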
Hi @XucongZhang
I tried to add some log lines to the Euclidean loss layer file, and the loss value is calculated correctly. layers {
That functionality requires the #522 patch. So either wait until it is integrated in Maybe it is a problem with the weight or bias initialization; try bias=0.1.
@XucongZhang if you are willing to put together a regression example and a PR, we will be happy to integrate it in Caffe.
I applied the patch to master. During training the loss is still nan. I wrote a log line in the Euclidean loss for logging and found that after the first test, the loss becomes nan. Here is the log I get. As you see, after the line
@thuanvh For the train part, I got actual numbers after decreasing the base_lr in the solver file. And for the loss, I just changed the loss layer of the train net like this:
And for the test, I deleted the prob layer and also changed the accuracy layer like this:
I also modified the source code in caffe-dev/src/caffe/layers/accuracy_layer.cpp to compute the Euclidean distance instead of the class difference. Then it works.
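The actual change above was made in C++ inside accuracy_layer.cpp, but the quantity it computes is just the mean per-sample Euclidean distance between predictions and labels. A minimal numpy sketch (function name and sample values are my own):

```python
import numpy as np

def mean_euclidean_distance(predictions, labels):
    """Mean per-sample L2 distance, the regression 'accuracy' described above."""
    diff = predictions - labels
    return float(np.mean(np.sqrt(np.sum(diff ** 2, axis=1))))

# Two 2-D predictions against zero targets: distances are 0 and 5.
pred = np.array([[0.0, 0.0], [3.0, 4.0]])
gt = np.array([[0.0, 0.0], [0.0, 0.0]])
print(mean_euclidean_distance(pred, gt))  # → 2.5
```

Unlike classification accuracy, lower is better here, so a high value from an unmodified accuracy layer on a regression net is meaningless.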
Why don't we use the output (top) of the loss layer in the train part, while in the test part we need its output?
The loss is of course crucial, but it's unimportant apart from its role in Look at the lenet_consolidated_solver.prototxt for an example of reporting
Hi, starting from the LeNet Mnist example, I made small changes step by step based on your suggestions about regression to make sure the loss is not nan. Now it is a normal number and I am running the training. I think the problem I had was the scale of the input image data: after scaling to [0, 1], the loss is not nan.
Hi @thuanvh, would you please share your method to convert your dataset into HDF5 format? For example, I have a directory containing all the images and a txt file containing a label for each image. How could I convert the data and the labels into an HDF5 file? Thank you.
@XucongZhang Hi, I am using caffe for image regression. I saw your comments and got the data prepared. In my case, I need to set the lr to 1e-8 to make the nan disappear, but then the training does not converge. I use images to do localization; any advice on that?
@buaawelldon Hi, from my experience, you can also normalize your labels and change the initialization of the filters to avoid nan. I suspect that with such a small learning rate you cannot learn anything.
Thanks, I will try your advice.
@XucongZhang Can I ask you a question? I load the data in HDF5 format and have 390 .h5 files. The question is that the log file outputs
It outputs "I1127 10:36:23.443383 37867 hdf5_data_layer.cpp:49] Successully loaded 128 rows"
Yes, it will output the operations. I know it can be pretty annoying, and
Hello, the training step runs correctly and creates a model file, but when I try to classify a new image with the model in Python, the result is F0304 12:30:04.102885 40208 layer.hpp:347] Check failed: ExactNumBottomBlobs() == bottom.size() (2 vs. 1) EUCLIDEAN_LOSS Layer takes 2 bottom blob(s) as input. Do you have any idea about the problem?
Hi, from my point of view, first of all you don't need the Euclidean loss for And the error means you only passed one bottom blob, but it seems not the
Hi souzou! I got the same error as you; have you figured out the problem? Thank you!
@bearpaw Hello, I have the same problem as you. Did you solve this problem, and can you share the answer with me? Thanks a lot!
@sjtujulian Hi, I use the HDF5 layer to handle multi-label data. Please refer to the official caffe HDF5 demo.
@bearpaw Thank you very much!
@XucongZhang Why is the "num_output" of the ip2 layer 1 instead of 2? I think the .h5 file contains (x, y) pairs, so it's N*2.
What's your problem? Can you describe it in detail?
@XucongZhang Hi, Xucong! I'd like to do regression using caffe, training a network that predicts an optic flow magnitude map and an optic flow direction map of a single image using a multi-task EuclideanLoss. The ground truth is a grayscale optic flow magnitude image and a 2D optic flow vector. I previously did classification with caffe, and declaring the ground truth was easy, but in this case I don't understand how to put this ground truth into an HDF5 file and what the txt file it refers to should contain. To predict those maps, I modified AlexNet's fully connected layers to be fully convolutional and defined two EuclideanLosses at the end, but it is wrong. Could you let me know how to do regression in this case, as you have already done it? Another question is how to specify the dimensions of the output, which for the optic flow vector prediction has two channels, x and y. I also didn't get how to use the HDF5 data layer and how to turn the training data into list.txt. the deploy.prototxt: name: "AlexNet" the solver.prototxt: Thanks a lot already.
@smajida I create multi-label hdf5 files through the MATLAB demo provided by caffe (just look at /caffe/matlab/hdf5creation). It basically takes your data and writes it into a .h5 in "chunks" (batch size). Just be careful, because if your file is around 10GB caffe might complain that it is too big; you then have to divide your data into smaller .h5 files. Afterwards, your txt file will simply point to the files, e.g. train1.h5 and so on. I place my .h5 files in the same folder as my prototxt files so caffe can easily find them. By the way, you might get some ideas from @XucongZhang's train/test prototxt file here: https://www.mpi-inf.mpg.de/fileadmin/inf/d2/xucong/MPIIGaze/train_test.prototxt I still don't exactly understand why we use two outputs in the last inner product layer (if we're regressing, shouldn't it be one?).
@chriss2401 thanks a lot. Basically I have 500 .png images with dimension (800,800,3) as training data, with values 0 to 255, and the labels are grayscale .png images with dimension (800,800). So based on the hdf5 demo, I changed the code for the first 4 training images and 4 labels, but store2hdf5 complains about the dimensions. Here is the error: batch no. 1 Error in demo (line 49). The data_disk size is (500,500,3,4) and label_disk is (500,500,4). How should I solve this problem?
@smajida your data should be formatted in the following way. Training: 800 800 3 500 (rows, columns, channels, number; don't forget to permute, since caffe and matlab use a different format). Labels: 800 800 500. That way, when the check at line 15 happens (assert(lab_dims(end)==num_samples)), both arrays will give the same number (500).
By the way, when testing my model, my accuracy is incredibly high. Did anyone else get this?
@chriss2401 I tried it; now my data is exactly
@chriss2401 that is interesting, because I had also read that, but surprisingly it worked: not only for my one-channel label image, but also when I concatenated the optical flow (u,v) vector to it. Thanks for your help. Now I have a question: you mentioned permuting because the order in caffe and matlab differs. With the order I have prepared, caffe doesn't work, does it? Could you tell me in which order I should prepare the hdf5 file? As for the regression, the last layer's output was set to 1; when I set it to 2 it complained (500000 vs 25000), which setting the output back to 1 solved, but I have another problem before the final regression training. layer { type: 'EuclideanLoss' name: 'loss' top: 'loss' layer { Is that a problem of the dimension order in which I built the hdf5?
@smajida as you may know, when you imread an image in MATLAB the dimensions are HxWxC (height, width, channels). Caffe works with WxHxC. Therefore, before using the hdf5 functions to write your data, you should permute it in the following way: im = permute(im,[2 1 3]); Of course you have a fourth dimension, but you get the idea. Once you flip the width and height you should be good to go. As for the regression problem, I'm not 100 percent sure. @XucongZhang's project has an accuracy layer, but when I use it in my project (regression through an RNN) it outputs some crazy values (like 20+). So I removed it, as some other people mentioned here, and I'm looking at the results. But I'm not sure how it will work, because your GT size is different from your data (800 800 versus 800 800 3). Right now I'm using a GT of 10x4 and data of 10x4x256x28x28 (a scalar for each image).
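For anyone preparing the data in Python instead of MATLAB, the same axis swap can be done with numpy; the array shape here is a made-up example, and only the axis order matters:

```python
import numpy as np

# A dummy HxWxC image (800x600 RGB), ordered as MATLAB's imread would give it.
im_hwc = np.zeros((800, 600, 3), dtype=np.float32)

# Swap height and width to get WxHxC, mirroring MATLAB's permute(im, [2 1 3]).
im_whc = np.transpose(im_hwc, (1, 0, 2))
print(im_whc.shape)  # → (600, 800, 3)
```

With a fourth sample dimension, the permutation tuple just grows by one entry, e.g. (1, 0, 2, 3).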
@chriss2401 Oops, sorry, while you were replying I was editing my previous post. I managed to get the net running, trained on four images; of course the loss is on the scale of e+7. I hope that with the real dataset and pretrained models I will get a much lower loss. As for the permutation, I got the gist. I can compute accuracy after training outside caffe; for now I just want the net to start training, but it would be good to have an accuracy layer for this case. In my case the labels are 500x500 images. I resized the images to 500 instead of 800 to be able to use a similar semantic-label model for finetuning as well. How should I change the accuracy layer to make it suitable for this case? Any ideas or suggestions? I know the L2 norm, but I don't know how to access the label blobs and the blobs of the layer before the loss. Thanks.
@chriss2401 just a small question: should I subtract the mean image from the images before making the hdf5 files? And in my case, would dividing the images by 255 before making the hdf5 be better?
@smajida the HDF5 layer doesn't have any preprocessing like the other data layers. You should do your preprocessing before storing the data in HDF5 format.
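A small sketch of what "preprocessing before storing" could look like: scale to [0, 1] and subtract the mean image before writing the .h5. The shapes, file name, and random data are placeholders, not anyone's actual setup:

```python
import h5py
import numpy as np

# Dummy batch of uint8 images, shape (N, C, H, W), values 0-255.
raw = np.random.randint(0, 256, size=(16, 3, 32, 32), dtype=np.uint8)

# Preprocess before storing, since the HDF5 data layer will not do it:
# scale to [0, 1], then subtract the per-pixel mean image.
data = raw.astype(np.float32) / 255.0
data -= data.mean(axis=0, keepdims=True)

# Dummy 2-float regression targets per sample.
labels = np.random.rand(16, 2).astype(np.float32)

with h5py.File('train.h5', 'w') as f:
    f.create_dataset('data', data=data)
    f.create_dataset('label', data=labels)
```

Scaling inputs like this is the same fix that made the nan loss disappear earlier in the thread; normalizing the labels too can help for the same reason.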
@souzou have you ever solved your problem? If so, can you share your solution? Thanks.
Hello, the training step runs correctly and creates a model file, but when I use this model to classify the test images, the results differ from the label values. Could you give me some advice? Thank you very much.
@XucongZhang, has the code been updated? I can't find the loop you changed in accuracy_layer.cpp; could you tell me the line? Also, my data has 128-D features and the label is a single float (not an image), with 100,000 samples. How should I set up the params? Hope you can help me, thanks!
I have given the source inside hdf5_data_param, which contains the filenames in .h5 format. My questions are
@MohsenFayyaz89 Thanks, but it has been a while since I migrated to Python layers, and I am not using HDF5 layers any more.
@XucongZhang hello, I use caffe to estimate depth from a single image, which is a regression problem. However, when I use the trained model to predict, the model outputs the same value for any input. Hope you can help me, thanks!
Hi all, I am new to FCN. How can I use both SoftmaxWithLoss and SmoothL1Loss layers in the FCN code? Can anyone please help me with this? Thank you.
Hi,
I am trying to modify the mnist example to be a regression network. I just changed the loss layer from "SOFTMAX_LOSS" to "EUCLIDEAN_LOSS" and the "num_output" of the ip2 layer to 1 instead of 10. But I got results like this:
I0617 15:26:45.970600 10216 solver.cpp:141] Iteration 0, Testing net (#0)
I0617 15:26:47.555521 10216 solver.cpp:179] Test score #0: 1
I0617 15:26:47.555577 10216 solver.cpp:179] Test score #1: 0
I0617 15:26:51.046875 10216 solver.cpp:274] Iteration 100, lr = 0.00992565
I0617 15:26:51.047067 10216 solver.cpp:114] Iteration 100, loss = nan
I0617 15:26:54.535904 10216 solver.cpp:274] Iteration 200, lr = 0.00985258
I0617 15:26:54.536092 10216 solver.cpp:114] Iteration 200, loss = nan
I0617 15:26:58.024719 10216 solver.cpp:274] Iteration 300, lr = 0.00978075
I0617 15:26:58.024911 10216 solver.cpp:114] Iteration 300, loss = nan
I0617 15:27:01.514154 10216 solver.cpp:274] Iteration 400, lr = 0.00971013
I0617 15:27:01.514345 10216 solver.cpp:114] Iteration 400, loss = nan
I0617 15:27:05.003473 10216 solver.cpp:274] Iteration 500, lr = 0.00964069
I0617 15:27:05.003661 10216 solver.cpp:114] Iteration 500, loss = nan
I0617 15:27:05.003675 10216 solver.cpp:141] Iteration 500, Testing net (#0)
I0617 15:27:06.572185 10216 solver.cpp:179] Test score #0: 1
I0617 15:27:06.572234 10216 solver.cpp:179] Test score #1: nan
I0617 15:27:10.060245 10216 solver.cpp:274] Iteration 600, lr = 0.0095724
I0617 15:27:10.060436 10216 solver.cpp:114] Iteration 600, loss = nan
Can anyone help me with it, or could you give me an example of doing regression with Caffe?
Thank you very much!
Updates:
Just in case anyone wants to work on multi-label regression problem, please refer to our project webpage:
https://www.mpi-inf.mpg.de/departments/computer-vision-and-multimodal-computing/research/gaze-based-human-computer-interaction/appearance-based-gaze-estimation-in-the-wild/
In the "Method" part, you will find my configuration file as well as the Matlab code convert .mat to .h5
P.S. Thanks to #1746, that is a time saver!