
Differentiate cost layer with other layers #380

Closed
emailweixu opened this issue Nov 7, 2016 · 2 comments

@emailweixu (Collaborator)

Currently, there is no way to tell whether a layer is a cost (loss) layer or not. However, there is a crucial difference between cost layers and non-cost layers: during backpropagation, a cost layer does not need a gradient for its output; the gradient of its output is implicitly assumed to be all ones. There are several benefits to adding a mechanism that differentiates cost layers from other layers (see the sketch after this list):

  • Prevent the incorrect use of the output of a cost layer as the input of another layer.
  • When a model has multiple output layers, including both cost and non-cost layers, the trainer should sum only over the cost layers when calculating the cost (using Argument::sumCost), excluding the non-cost layers, so that it reports the correct cost during training.
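
A minimal Python sketch of the kind of mechanism being proposed, assuming a hypothetical `is_cost` flag on layer configs. The names `LayerConfig`, `connect`, and `sum_training_cost` are illustrative only and are not part of Paddle's actual API (the real cost summation lives in the C++ `Argument::sumCost` mentioned above):

```python
# Illustrative sketch only; not the actual Paddle API.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LayerConfig:
    name: str
    is_cost: bool = False          # hypothetical flag marking a cost (loss) layer
    inputs: List["LayerConfig"] = field(default_factory=list)


def connect(dst: LayerConfig, src: LayerConfig) -> None:
    """Wire src's output into dst, rejecting cost-layer outputs (first benefit)."""
    if src.is_cost:
        raise ValueError(
            f"'{src.name}' is a cost layer; its output cannot feed '{dst.name}'."
        )
    dst.inputs.append(src)


def sum_training_cost(outputs: List[LayerConfig], values: Dict[str, float]) -> float:
    """Sum only the cost-layer outputs when reporting training cost (second benefit)."""
    return sum(values[layer.name] for layer in outputs if layer.is_cost)


# Example: a model with one non-cost output and one cost output.
fc = LayerConfig("fc_out")
loss = LayerConfig("cross_entropy", is_cost=True)
print(sum_training_cost([fc, loss], {"fc_out": 3.7, "cross_entropy": 0.42}))  # 0.42
```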
@typhoonzero (Contributor)

Can we close this issue now that we are refactoring with the new "op" design?

@jacquesqiao (Member)

The future work will be done in Fluid, so closing this.

zhhsplendid pushed a commit to zhhsplendid/Paddle that referenced this issue Sep 25, 2019
thisjiang pushed a commit to thisjiang/Paddle that referenced this issue Oct 28, 2021
wangxicoding pushed a commit to wangxicoding/Paddle that referenced this issue Dec 9, 2021
* "add mlm params to dygraph ernie1.0"

* finish p-tuning v1.0

* mend

* delete unused coment

* add label_normalized

* P-tuning: support Chid task of FewCLUE

* 1. decouple evaluate and train

* 1.add FewCLUE datasets(9/9)
2.implement p-tuning strategy by transform_function
3.unify train_script beteween `chid` task and other 8 tasks of FewCLUE

* add README.md
gglin001 added a commit to graphcore/Paddle-fork that referenced this issue Mar 17, 2022