
[RFC] the fcn-xs example for image segmentation #975

Merged
merged 7 commits into apache:master on Dec 30, 2015

Conversation

tornadomeet
Contributor

This is the refactored version of #940. It requires a change in mshadow; see PR dmlc/mshadow#87.

@tqchen
Member

tqchen commented Dec 21, 2015

@winstywang can you review this?

@tqchen
Member

tqchen commented Dec 21, 2015

One question: can we reuse (or enhance) model.fit for FCN training instead of using the current solver?

@tornadomeet
Contributor Author

@tqchen, currently we cannot use model.fit. The biggest problem is that the input images may have different sizes, so we have to bind at every batch (here the minibatch size equals 1, and for efficiency I only replace the data and label at every bind), whereas model.fit assumes the input data size stays the same throughout training.
Another reason is that mxnet does not yet support image-segmentation IO, so there is no appropriate DataIter to pass to model.fit. It would be great if mxnet supported image-segmentation IO (on the C++ side) for distributed training in the future! I did some work on this before, but it seems a bit hard for me to add.
The example here is based on https://github.com/BVLC/caffe/wiki/Model-Zoo#fully-convolutional-semantic-segmentation-models-fcn-xs

| model   | lr (fixed) | epoch |
| ------- | ---------- | ----- |
| fcn-16s | 1e-12      | 27    |
| fcn-8s  | 1e-14      | 19    |

The number of training images is only 2027, and the number of validation images is 462.
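
For readers following along, here is a rough sketch of the per-image binding described above (a minimal sketch using the old mxnet symbolic API as I understand it; `fcn_sym`, `arg_params`, and the `softmax_label` name are illustrative assumptions, not code from this PR):

```python
import mxnet as mx

def forward_backward_one_image(fcn_sym, arg_params, img, lbl, ctx=mx.cpu()):
    """Run one forward/backward pass for a single image (batch size 1).

    img: NDArray of shape (1, 3, H, W); lbl: NDArray of shape (1, H * W).
    Because H and W differ between images, the executor is (re)bound per
    sample, and only the data and label arrays are swapped in each time.
    """
    exe = fcn_sym.simple_bind(ctx=ctx, data=img.shape, grad_req="write")
    for name, arr in exe.arg_dict.items():
        if name == "data":
            img.copyto(arr)
        elif name == "softmax_label":
            lbl.copyto(arr)
        elif name in arg_params:
            arg_params[name].copyto(arr)   # reuse the current weights
    exe.forward(is_train=True)
    exe.backward()
    return exe.outputs[0], exe.grad_arrays
```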
Contributor

Do you have results from the VOC submission server? We had better include those numbers.

Contributor Author

Not yet.
In this example I used a subset of the Pascal VOC dataset and randomly chose 70% of the images for training, so it is not the standard train/val split.
To evaluate on the VOC submission server I would need to retrain the model. I'll do it after the New Year holiday, so please give me about two weeks (one week of it for training :)).

Contributor

Cool, after merging this PR I could do this.

@winstywang
Contributor

I have reviewed the code, and it is generally good. One major issue is that it uses a different input size for every training image. We could randomly crop the input to a fixed size instead; that would speed up training significantly and make the code shorter and cleaner.
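
As a rough illustration of the fixed-size random crop suggested here (a minimal sketch, assuming the image is an (H, W, C) NumPy array and the label map is (H, W); the function name and crop size are illustrative, not part of the example code):

```python
import numpy as np

def random_crop_pair(image, label, crop_size=320):
    """Randomly crop an (H, W, C) image and its (H, W) label map to a fixed square size.

    Assumes H and W are both >= crop_size; smaller images would need padding first.
    """
    h, w = label.shape
    top = np.random.randint(0, h - crop_size + 1)   # random top-left corner
    left = np.random.randint(0, w - crop_size + 1)
    img_crop = image[top:top + crop_size, left:left + crop_size, :]
    lbl_crop = label[top:top + crop_size, left:left + crop_size]
    return img_crop, lbl_crop
```

With a fixed crop size the network can be bound once and fed batches larger than one, which is what would make the training loop simpler.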

@tornadomeet
Contributor Author

@winstywang, thanks. I have replied to the corresponding review comments, and I'll fix the cut_off_size bug first.

@tornadomeet
Contributor Author

@winstywang, I updated the example as follows:

  1. Fixed the cut_off_size bug in data.py.
  2. Changed ElementWiseSum to the direct "+" symbol.
  3. Added an h_w parameter to the Crop operator, so it can be used as a general cropping operator.
  4. Fixed/added some descriptions.
  5. Updated the new symbol files on yun.baidu for use.

PS: although I added h_w for setting the crop height and width explicitly, I also kept the "crop_like" way of specifying them. The reason is that even if we crop the input data to a fixed size, we would still need to calculate the crop h_w manually in some other places, for example in fcn16s and fcn8s (https://github.com/tornadomeet/mxnet/blob/b8b9700fcdb1d7fb4c79a3d271da178639453b8a/example/fcn-xs/symbol_fcnxs.py#L179 and https://github.com/tornadomeet/mxnet/blob/b8b9700fcdb1d7fb4c79a3d271da178639453b8a/example/fcn-xs/symbol_fcnxs.py#L186): we need to know the sizes of score_pool4c and score_pool3c, so if some hyper-parameter of the CNN structure changes (stride, pad, ...) we would have to recalculate those sizes by hand. The crop_like symbol computes the crop size automatically, without having to consider such parameter changes; see the sketch below.
What do you think about this?
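
To make the contrast concrete, here is a minimal sketch of the two styles of cropping (the keyword names num_args and h_w appear in this PR's discussion, but the exact call signature and the crop size are assumptions, not the definitive operator API):

```python
import mxnet as mx

data = mx.symbol.Variable("data")          # input image symbol, size may vary
bigscore = mx.symbol.Variable("bigscore")  # upsampled score map, slightly larger than data

# (1) crop_like style: the crop size is taken from the shape of the second
#     input symbol, so it adapts automatically when strides/pads change upstream.
score_like = mx.symbol.Crop(*[bigscore, data], num_args=2, name="score")

# (2) h_w style: the crop height/width are given explicitly and must be
#     recalculated by hand whenever the network hyper-parameters change.
score_hw = mx.symbol.Crop(bigscore, num_args=1, h_w=(320, 320), name="score_hw")
```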

"Input data should be 4D in batch-num_filter-y-x";
std::vector<int> crop_shape;
if (param_.num_args == 1) {
std::cout << "ok1" << std::endl;
Contributor

Remove this debug info

@winstywang
Contributor

Two more comments, @tornadomeet:
In this application, using crop_like is appropriate. What I meant is that when implementing a new op we had better make it as general as possible; the current implementation is good.

After fixing these two issues we can merge this PR, though some further effort is needed to make it better, including but not limited to:

  1. Using batches for training.
  2. Multi-GPU support.
  3. Adding evaluation on VOC.

@tornadomeet
Contributor Author

@winstywang, thanks~ This example should be enhanced further in the future, especially batch training.

fix description

fix baiduyun address

remove debug info
winstywang added a commit that referenced this pull request Dec 30, 2015
[RFC] the fcn-xs example for image segmentation
@winstywang winstywang merged commit 0826a1f into apache:master Dec 30, 2015
} else {
SoftmaxGrad(grad, out, label);
}
grad *= param_.grad_scale;
Contributor

@tornadomeet Why did you remove the scaling by 1/s3[2]? This causes the gradient to explode for large output maps.

Contributor Author

@piiswrong, sorry for the inconvenience; I removed it by accident. Could you please restore it? I am out of the office today, thank you~
Also, in the fcn-xs example I do not use the scaling by 1/s3[2] because a very small learning rate is used (just like Caffe does: one image is regarded as one instance, no matter how many labeled pixels it contains), so the gradient does not explode there.
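
For reference, a minimal NumPy sketch of the gradient scaling being discussed (the names are illustrative, and I am assuming s3[2] in the C++ code corresponds to the number of spatial positions per image):

```python
import numpy as np

def softmax_seg_grad(prob, label, grad_scale=1.0, normalize=True):
    """Gradient of per-pixel softmax cross-entropy for one image.

    prob:  (num_pixels, num_classes) softmax probabilities
    label: (num_pixels,) integer class labels
    """
    grad = prob.copy()
    grad[np.arange(label.size), label] -= 1.0  # p - one_hot(y)
    if normalize:
        grad /= label.size                     # the 1/s3[2]-style scaling
    return grad * grad_scale
```

With normalize=False the per-image gradient grows with the number of pixels, which is why either this scaling or a very small learning rate is needed.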

@zeakey
Contributor

zeakey commented Mar 25, 2016

@tornadomeet @winstywang
I ran the fcn demo but an error was raised:
raise ValueError('Must specify all the arguments in %s' % arg_key)
in python/mxnet/symbol.py line 585, because the name "bigscore_bias" is not in "arg_names".

I wonder whether this is a bug or I just haven't set fcnxs_model_prefix properly?

@tornadomeet
Contributor Author

@zeakey #1627

@zeakey
Contributor

zeakey commented Mar 26, 2016

@tornadomeet OK, thank you for everything! It's tricky for people new to mxnet.
