[Vote] Softmax and Loss Convention #434
I vote for 2
I vote for 2 with the following name change:
+1 for 2
+1 for 2
Sorry, currently traveling. I think 1 is a more unified behavior but requires more changes to the internals (e.g. attach-loss, etc.). I'm generally OK with either design.
@hjk41 can you make a Windows build for the most recent version?
pluskid added a commit to dmlc/MXNet.jl that referenced this issue on Nov 2, 2015: all sides updated
According to discussions in #426, we agreed that in all cases the multi-output and single-output softmax can be combined into one class, maybe overloaded by the shape of the label.
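As an illustration of what "overloaded by shape of label" could mean, here is a minimal NumPy sketch (the function names and shape conventions are hypothetical, not the MXNet operators): the class axis stays fixed, and the label's shape decides whether the op is treated as single-output or multi-output.

```python
import numpy as np

def softmax(x, axis=1):
    # numerically stable softmax along the class axis
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def combined_softmax(data, label):
    # Hypothetical combined op, dispatched on label shape:
    #   data (batch, classes)          + label (batch,)         -> single-output
    #   data (batch, classes, d1, ...) + label (batch, d1, ...) -> multi-output
    # The forward pass is identical; the label shape only decides how the
    # loss/gradient is applied per element in backward.
    expected = (data.shape[0],) + data.shape[2:]
    if label.shape != expected:
        raise ValueError("label shape %r does not match data shape %r"
                         % (label.shape, data.shape))
    return softmax(data, axis=1)

# single-output: 4 samples, 3 classes
p1 = combined_softmax(np.random.randn(4, 3), np.array([0, 2, 1, 1]))
# multi-output (e.g. per-pixel): 4 samples, 3 classes, 5x5 spatial map
p2 = combined_softmax(np.random.randn(4, 3, 5, 5), np.zeros((4, 5, 5), dtype=int))
```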
Option 1:
- `SoftmaxOutput` for output without gradient attached.
- `Softmax` remains the same, with loss already attached.

Option 2:
- `SoftmaxOutput` for output, with the specific backward behavior of cross-entropy loss (`MulticlassProbOutput` as an alternative name, to be clear to the user).
- `Softmax` behaves normally (only takes the input) and can propagate gradient back from any output source (being able to compose as an internal node).
- `CrossEntropyLoss` can be composed with anything, including `Softmax`, to get the loss in forward and the gradient in backward (see the sketch below).

Please edit this post to add more options.
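For concreteness, here is a small NumPy sketch of the backward semantics the two conventions imply (illustrative only, assuming plain logits/labels and a batch-averaged loss; this is not the actual operator code): `SoftmaxOutput` hard-codes the cross-entropy gradient `p - one_hot(y)` with respect to the logits, while a plain `Softmax` composed with `CrossEntropyLoss` reaches the same gradient through composition.

```python
import numpy as np

def softmax(x):
    shifted = x - x.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def one_hot(label, num_classes):
    return np.eye(num_classes)[label]

# Convention "SoftmaxOutput": forward is softmax, backward hard-codes the
# cross-entropy gradient w.r.t. the input logits: dL/dx = (p - one_hot(y)) / n.
def softmax_output_forward(x):
    return softmax(x)

def softmax_output_backward(prob, label):
    return (prob - one_hot(label, prob.shape[1])) / prob.shape[0]

# Convention "Softmax + CrossEntropyLoss": Softmax is a normal node that can
# back-propagate any incoming gradient, and the loss is a separate composable op.
def softmax_backward(prob, grad_out):
    # Jacobian-vector product of softmax: p * (g - sum(g * p))
    dot = (grad_out * prob).sum(axis=1, keepdims=True)
    return prob * (grad_out - dot)

def cross_entropy_backward(prob, label):
    # dL/dprob for L = -mean(log(prob[range(n), label]))
    n = prob.shape[0]
    grad = np.zeros_like(prob)
    grad[np.arange(n), label] = -1.0 / (n * prob[np.arange(n), label])
    return grad

# Composing the two gives (numerically) the same gradient as SoftmaxOutput:
x = np.random.randn(4, 3)
y = np.array([0, 2, 1, 1])
p = softmax(x)
g1 = softmax_output_backward(p, y)
g2 = softmax_backward(p, cross_entropy_backward(p, y))
assert np.allclose(g1, g2)
```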