
Multi label regression in Caffe #1765

Closed

olddocks opened this issue Jan 21, 2015 · 17 comments

@olddocks

I am extracting 30 facial keypoint values (x, y coordinates) from an input image, as in the Kaggle facial keypoints competition.

How do I set up Caffe to run a regression and produce a 30-dimensional output?

Input: 96x96 image
Output: 30 values (a 30-dimensional output)

How do I set up Caffe accordingly? I am using EUCLIDEAN_LOSS (sum of squares) to get the regressed output. Here is a simple logistic regressor model using Caffe, but it is not working. It looks like the accuracy layer cannot handle multi-label output.

I0120 17:51:27.039113  4113 net.cpp:394] accuracy <- label_fkp_1_split_1
I0120 17:51:27.039135  4113 net.cpp:356] accuracy -> accuracy
I0120 17:51:27.039158  4113 net.cpp:96] Setting up accuracy
F0120 17:51:27.039201  4113 accuracy_layer.cpp:26] Check failed: bottom[1]->channels() == 1 (30 vs. 1) 
*** Check failure stack trace: ***
    @     0x7f7c2711bdaa  (unknown)
    @     0x7f7c2711bce4  (unknown)
    @     0x7f7c2711b6e6  (unknown)

Here is the layer file:

name: "LogReg"
layers {
  name: "fkp"
  top: "data"
  top: "label"
  type: HDF5_DATA
  hdf5_data_param {
    source: "train.txt"
    batch_size: 100
  }
  include: { phase: TRAIN }
}

layers {
  name: "fkp"
  type: HDF5_DATA
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "test.txt"
    batch_size: 100
  }

  include: { phase: TEST }
}

layers {
  name: "ip"
  type: INNER_PRODUCT
  bottom: "data"
  top: "ip"
  inner_product_param {
    num_output: 30
  }
}
layers {
  name: "loss"
  type: EUCLIDEAN_LOSS
  bottom: "ip"
  bottom: "label"
  top: "loss"
}

layers {
  name: "accuracy"
  type: ACCURACY
  bottom: "ip"
  bottom: "label"
  top: "accuracy"
  include: { phase: TEST }
}

I have seen this topic but really can't get a grasp of it. I see that the stable version of Caffe can handle only 1 or 2 outputs.

@jingweiz

I think you should also use EUCLIDEAN_LOSS instead of ACCURACY for the accuracy layer, according to this: #512

@olddocks
Author

The problem is that the loss layer with EUCLIDEAN_LOSS outputs only 1 value; it cannot output 30 values. Let's say the INNER_PRODUCT layer gives 30 outputs: the loss layer computes the sum over all 30 outputs to produce the loss. That's not what I want. I want to compute 30 losses for the 30 outputs.

@pannous

pannous commented Jan 21, 2015

The input blob 'label' is only one-dimensional; it needs to be 30-dimensional as well.
You can prepare the data accordingly. There was once a pull request for this, but it is now out of date:
#144
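
For reference, here is a minimal sketch of preparing such an HDF5 file with h5py so that the label blob comes out 30-dimensional. The shapes assume the 96x96 Kaggle images described above; the data could equally be stored flattened (N x 9216), since it only feeds an inner product layer.

import h5py
import numpy as np

# Sketch: N grayscale 96x96 images and 30 target values per image.
# Fill `data` and `label` from the competition CSV before writing.
N = 4934
data = np.zeros((N, 1, 96, 96), dtype=np.float32)   # Caffe blobs are N x C x H x W
label = np.zeros((N, 30), dtype=np.float32)         # 30-dimensional label blob

with h5py.File('facialkp-train.hd5', 'w') as f:
    f['data'] = data
    f['label'] = label

# train.txt (the hdf5_data_param source) just lists HDF5 file paths, one per line.
with open('train.txt', 'w') as f:
    f.write('facialkp-train.hd5\n')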

@olddocks
Author

The input HDF5 data is already 30-dimensional. I read somewhere that only HDF5 can handle multiple input labels.

[screenshot from 2015-01-21 17 31 13]

If I remove the accuracy layer from above and connect the IP layer directly to the Euclidean loss layer, it runs, but I am not entirely convinced. Am I doing it right? I see the loss value but it is not converging, hmmm :( I am still only training, not using the test data yet.

pbu@pbu-OptiPlex-740-Enhanced:~/Desktop$ ./facialkp.sh
I0121 17:32:16.788698  2711 caffe.cpp:103] Use CPU.
I0121 17:32:17.163740  2711 caffe.cpp:107] Starting Optimization
I0121 17:32:17.163918  2711 solver.cpp:32] Initializing solver from parameters: 
base_lr: 0.01
display: 100
max_iter: 10000
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 5000
snapshot: 10000
snapshot_prefix: "/home/pbu/Desktop/tmp"
solver_mode: CPU
net: "/home/pbu/Desktop/facialkp.prototxt"
I0121 17:32:17.164019  2711 solver.cpp:67] Creating training net from net file: /home/pbu/Desktop/facialkp.prototxt
I0121 17:32:17.184345  2711 net.cpp:275] The NetState phase (0) differed from the phase (1) specified by a rule in layer data
I0121 17:32:17.184592  2711 net.cpp:39] Initializing net from parameters: 
name: "LogReg"
layers {
  top: "data"
  top: "label"
  name: "fkp"
  type: HDF5_DATA
  hdf5_data_param {
    source: "train.txt"
    batch_size: 100
  }
  include {
    phase: TRAIN
  }
}
layers {
  bottom: "data"
  top: "ip"
  name: "ip"
  type: INNER_PRODUCT
  inner_product_param {
    num_output: 30
  }
}
layers {
  bottom: "ip"
  bottom: "label"
  top: "loss"
  name: "loss"
  type: EUCLIDEAN_LOSS
}
state {
  phase: TRAIN
}
I0121 17:32:17.195250  2711 net.cpp:67] Creating Layer fkp
I0121 17:32:17.195312  2711 net.cpp:356] fkp -> data
I0121 17:32:17.195348  2711 net.cpp:356] fkp -> label
I0121 17:32:17.195379  2711 net.cpp:96] Setting up fkp
I0121 17:32:17.195487  2711 hdf5_data_layer.cpp:57] Loading filename from train.txt
I0121 17:32:17.195663  2711 hdf5_data_layer.cpp:69] Number of files: 1
I0121 17:32:17.195683  2711 hdf5_data_layer.cpp:29] Loading HDF5 filefacialkp-train.hd5
I0121 17:32:18.079417  2711 hdf5_data_layer.cpp:49] Successully loaded 4934 rows
I0121 17:32:18.079493  2711 hdf5_data_layer.cpp:81] output data size: 100,9216,1,1
I0121 17:32:18.079552  2711 net.cpp:103] Top shape: 100 9216 1 1 (921600)
I0121 17:32:18.079574  2711 net.cpp:103] Top shape: 100 30 1 1 (3000)
I0121 17:32:18.079612  2711 net.cpp:67] Creating Layer ip
I0121 17:32:18.079637  2711 net.cpp:394] ip <- data
I0121 17:32:18.079668  2711 net.cpp:356] ip -> ip
I0121 17:32:18.079705  2711 net.cpp:96] Setting up ip
I0121 17:32:18.081114  2711 net.cpp:103] Top shape: 100 30 1 1 (3000)
I0121 17:32:18.081243  2711 net.cpp:67] Creating Layer loss
I0121 17:32:18.081267  2711 net.cpp:394] loss <- ip
I0121 17:32:18.081290  2711 net.cpp:394] loss <- label
I0121 17:32:18.081318  2711 net.cpp:356] loss -> loss
I0121 17:32:18.081346  2711 net.cpp:96] Setting up loss
I0121 17:32:18.089938  2711 net.cpp:103] Top shape: 1 1 1 1 (1)
I0121 17:32:18.089965  2711 net.cpp:109]     with loss weight 1
I0121 17:32:18.090049  2711 net.cpp:170] loss needs backward computation.
I0121 17:32:18.090068  2711 net.cpp:170] ip needs backward computation.
I0121 17:32:18.090085  2711 net.cpp:172] fkp does not need backward computation.
I0121 17:32:18.090101  2711 net.cpp:208] This network produces output loss
I0121 17:32:18.090122  2711 net.cpp:467] Collecting Learning Rate and Weight Decay.
I0121 17:32:18.090142  2711 net.cpp:219] Network initialization done.
I0121 17:32:18.090158  2711 net.cpp:220] Memory required for data: 3710404
I0121 17:32:18.090208  2711 solver.cpp:41] Solver scaffolding done.
I0121 17:32:18.090230  2711 solver.cpp:160] Solving LogReg
I0121 17:32:18.090246  2711 solver.cpp:161] Learning Rate Policy: step
I0121 17:32:18.172677  2711 solver.cpp:209] Iteration 0, loss = 40128.5
I0121 17:32:18.172803  2711 solver.cpp:224]     Train net output #0: loss = 40128.5 (* 1 = 40128.5 loss)
I0121 17:32:18.172847  2711 solver.cpp:445] Iteration 0, lr = 0.01
I0121 17:32:21.068056  2711 solver.cpp:209] Iteration 100, loss = 164.532
I0121 17:32:21.068186  2711 solver.cpp:224]     Train net output #0: loss = 164.532 (* 1 = 164.532 loss)
I0121 17:32:21.068215  2711 solver.cpp:445] Iteration 100, lr = 0.01
I0121 17:32:23.937835  2711 solver.cpp:209] Iteration 200, loss = 124.259
I0121 17:32:23.937965  2711 solver.cpp:224]     Train net output #0: loss = 124.259 (* 1 = 124.259 loss)
I0121 17:32:23.937993  2711 solver.cpp:445] Iteration 200, lr = 0.01
I0121 17:32:26.836241  2711 solver.cpp:209] Iteration 300, loss = 132.833
I0121 17:32:26.836433  2711 solver.cpp:224]     Train net output #0: loss = 132.833 (* 1 = 132.833 loss)
I0121 17:32:26.836462  2711 solver.cpp:445] Iteration 300, lr = 0.01
I0121 17:32:29.702158  2711 solver.cpp:209] Iteration 400, loss = 161.586
I0121 17:32:29.702286  2711 solver.cpp:224]     Train net output #0: loss = 161.586 (* 1 = 161.586 loss)
I0121 17:32:29.702313  2711 solver.cpp:445] Iteration 400, lr = 0.01
I0121 17:32:32.559435  2711 solver.cpp:209] Iteration 500, loss = 105.922
I0121 17:32:32.559563  2711 solver.cpp:224]     Train net output #0: loss = 105.922 (* 1 = 105.922 loss)
I0121 17:32:32.559590  2711 solver.cpp:445] Iteration 500, lr = 0.01
I0121 17:32:35.418654  2711 solver.cpp:209] Iteration 600, loss = 149.592
I0121 17:32:35.418781  2711 solver.cpp:224]     Train net output #0: loss = 149.592 (* 1 = 149.592 loss)
I0121 17:32:35.418810  2711 solver.cpp:445] Iteration 600, lr = 0.01
I0121 17:32:38.276181  2711 solver.cpp:209] Iteration 700, loss = 131.396
I0121 17:32:38.276309  2711 solver.cpp:224]     Train net output #0: loss = 131.396 (* 1 = 131.396 loss)
I0121 17:32:38.276336  2711 solver.cpp:445] Iteration 700, lr = 0.01
I0121 17:32:41.135144  2711 solver.cpp:209] Iteration 800, loss = 159.562
I0121 17:32:41.135272  2711 solver.cpp:224]     Train net output #0: loss = 159.562 (* 1 = 159.562 loss)
I0121 17:32:41.135298  2711 solver.cpp:445] Iteration 800, lr = 0.01
I0121 17:32:43.993687  2711 solver.cpp:209] Iteration 900, loss = 130.259
I0121 17:32:43.993819  2711 solver.cpp:224]     Train net output #0: loss = 130.259 (* 1 = 130.259 loss)
I0121 17:32:43.993846  2711 solver.cpp:445] Iteration 900, lr = 0.01
I0121 17:32:46.854629  2711 solver.cpp:209] Iteration 1000, loss = 107.512
I0121 17:32:46.854954  2711 solver.cpp:224]     Train net output #0: loss = 107.512 (* 1 = 107.512 loss)
I0121 17:32:46.854984  2711 solver.cpp:445] Iteration 1000, lr = 0.01

@pannous

pannous commented Jan 23, 2015

Did you resolve your issue?
It looks like the network is "predicting zeros"?

@olddocks
Author

Thanks @pannous
Nope, there is a problem with the dimensions of the input data and labels, it seems :)

@shelhamer
Member

EUCLIDEAN_LOSS is the right loss for linear regression -- the scalar output is the sum of squared errors across dimensions, which is the total loss for the regression. It can learn predictions of any blob dimensionality (scalar, vector of 30 like your case, or matrix).
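
Concretely, the single reported value is roughly the following computation over the whole batch (a numpy sketch of the 1/(2N) sum of squared differences, not the layer's actual code):

import numpy as np

def euclidean_loss(pred, target):
    # Squared error summed over every output dimension, halved and
    # averaged over the batch -- one scalar, however many outputs there are.
    n = pred.shape[0]
    return np.sum((pred - target) ** 2) / (2.0 * n)

pred = np.random.randn(100, 30).astype(np.float32)    # e.g. the "ip" top blob
target = np.random.randn(100, 30).astype(np.float32)  # the 30-dimensional labels
print(euclidean_loss(pred, target))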

Please ask modeling questions on the caffe-users mailing list.

@olddocks
Author

I fixed the labels and data dimensions, and now there is another problem: the network predicts the same output in the fc7 layer, no matter what the input is.

I am continuing this thread here: https://groups.google.com/forum/#!topic/caffe-users/o4cpDNylo3Q

@nayef

nayef commented Feb 18, 2015

@olddocks how did you fix the labels and data dimensions? Could you please share the code?

@olddocks
Author

@nayef I fixed the issue by adding bias=1 to the layers. Adding another IP layer also helped.

@nayef

nayef commented Feb 18, 2015

So no change was needed in data dimensions?

@olddocks
Author

Yes, I changed the dimensions as needed. It worked 👍

@nayef

nayef commented Feb 18, 2015

What change did you make in dimensions?

@Franck-Dernoncourt
Contributor

@olddocks I'd also be interested in a more thorough explanation of how you fixed the issue. Thanks!

@olddocks
Author

@stardust2602

Hi @olddocks, I am dealing with a similar regression problem. However, my .csv file contains an image filename and 3 real-valued labels per row. How can I create the HDF5 input file? Also, if I use a popular network such as AlexNet or VGG to train a regression problem, what would be the appropriate changes to the prototxt files?
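
A rough sketch of one way to build that HDF5 file with h5py; the CSV layout, file names, and 227x227 crop size here are assumptions, not taken from this thread:

import csv
import cv2
import h5py
import numpy as np

# Assumed CSV format: one "filename,v1,v2,v3" row per image.
with open('labels.csv') as csvfile:
    rows = list(csv.reader(csvfile))

data = np.zeros((len(rows), 3, 227, 227), dtype=np.float32)
label = np.zeros((len(rows), 3), dtype=np.float32)

for i, (fname, v1, v2, v3) in enumerate(rows):
    img = cv2.imread(fname).astype(np.float32) / 255.0
    img = cv2.resize(img, (227, 227))
    data[i] = img.transpose(2, 0, 1)        # HWC -> CHW, as Caffe expects
    label[i] = [float(v1), float(v2), float(v3)]

with h5py.File('train_data.h5', 'w') as f:
    f['data'] = data
    f['label'] = label

For a pretrained classification net, the usual prototxt changes would then be to read data/label from an HDF5_DATA layer, set the final inner product's num_output to 3, and replace the softmax loss with EUCLIDEAN_LOSS, as discussed above.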

@wusongbeckham

Hi @olddocks, I am doing facial keypoint detection on this dataset: http://mmlab.ie.cuhk.edu.hk/archive/CNN_FacePoint.htm. I used a similar network to yours, but the output for different images is always the same. Could you give me some suggestions about this problem?
Part 1: data processing

import os
import numpy as np
import h5py
import cv2
import math

num_cols = 1
num_rows = 3466  # 10000 images for train and 3466 images for validation
height = 39
width = 39
labeldim = 10
total_size = num_cols * num_rows * height * width
data = np.zeros([num_rows, num_cols, height, width])
data = data.astype(np.float32)
label = np.zeros([num_rows, labeldim])
label = label.astype(np.float32)

dirname = '/home/deep-learning/caffe/matlab/facial_point_estimation/dataset/face/train/'
filename = '/home/deep-learning/caffe/matlab/facial_point_estimation/dataset/face/train/testImageList.txt'
f = open(filename)
line = f.readline()
i = 0
while line:
    print i
    content = line.split(' ')
    content[1:] = [float(j) for j in content[1:]]
    imgname = dirname + content[0]
    img = cv2.imread(imgname)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # crop the face using the bounding box columns, then resize to 39x39
    face = img[int(content[3]):int(content[4]), int(content[1]):int(content[2])]
    face = cv2.resize(face, (height, width))
    face = face * (1. / 255)
    face = face.astype(np.float32)
    data[i, 0, :, :] = face

    facewidth = int(content[2]) - int(content[1]) + 1
    faceheight = int(content[4]) - int(content[3]) + 1

    # normalize the 5 (x, y) keypoints to [-1, 1] within the 39x39 crop
    facepoint = content[5:]
    facepoint[0::2] = [((float(j - int(content[1])) / (float(facewidth) / 39.0)) - 19.5) / 19.5 for j in facepoint[0::2]]
    facepoint[1::2] = [((float(j - int(content[3])) / (float(faceheight) / 39.0)) - 19.5) / 19.5 for j in facepoint[1::2]]
    for j in facepoint:
        assert (1 >= j >= 0 or 0 >= j >= -1)
    label[i, :] = facepoint
    line = f.readline()
    i += 1
f.close()

with h5py.File(os.getcwd() + '/test_data.h5', 'w') as f:
    f['data'] = data
    f['label'] = label

with open(os.getcwd() + '/test_data_list.txt', 'w') as f:
    f.write(os.getcwd() + '/test_data.h5\n')
