
Implementation of loss function #6

Open
CN-BiGLiu opened this issue Mar 13, 2019 · 10 comments
@CN-BiGLiu

Thanks to Long for the implementation. Two points are confusing me:

  1. The total loss is defined as classification loss + transfer loss, which differs from Equation (3), classification loss − transfer loss.
  2. The domain discriminator is updated based on the total loss instead of the transfer loss alone.

Hoping for your help.
@caozhangjie
Collaborator

We use a trick that reverses the gradient before it back-propagates from the discriminator to the feature extractor, so we do not need to multiply the discriminator loss by -1 to train the feature extractor.

There is no input from the discriminator to the classification loss. By PyTorch's autograd rules, there is therefore no gradient from the classification loss to the discriminator, even when we back-propagate from the sum of the classification loss and the transfer loss.
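That sign flip can be sketched as a small backward hook. The `grl_hook` name follows the repo, but the demo tensors below are hypothetical stand-ins, not the repo's variables:

```python
import torch

def grl_hook(coeff):
    """Backward hook: multiply the incoming gradient by -coeff (gradient reversal)."""
    def hook(grad):
        return -coeff * grad.clone()
    return hook

# Tiny demo: the forward pass is unchanged, the backward gradient is reversed.
x = torch.tensor([1.0, 2.0], requires_grad=True)  # stand-in for extractor output
y = x * 3.0                                       # stand-in for the feature map
y.register_hook(grl_hook(1.0))                    # the "GRL" between F and D
loss = y.sum()                                    # stand-in for discriminator loss
loss.backward()
print(x.grad)  # tensor([-3., -3.]): sign flipped, no explicit -1 in the loss
```

Because the reversal happens inside autograd, a single `loss.backward()` trains the discriminator normally while pushing the feature extractor in the opposite direction.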

@CN-BiGLiu
Author

Thanks for your answer. The trick is x.register_hook(grl_hook(coeff)), is that right?

@caozhangjie
Collaborator

Yes

@MeLonJ10

How about the TensorFlow version? Where is the trick of reversing the gradient?
Thanks a lot!

@sy565612345
Collaborator

> How about the TensorFlow version? Where is the trick of reversing the gradient?
> Thanks a lot!

The TensorFlow version is under implementation.
The gradient-reversal trick is in pytorch/network.py, line 388. grl_hook inserts a GRL layer between the ResNet CNN and the domain discriminator, which lets the two adversarial players be updated in a single forward and backward pass.
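A sketch of that single-pass adversarial update, with hypothetical miniature modules standing in for the ResNet features (F) and the domain discriminator (D):

```python
import torch
import torch.nn as nn

def grl_hook(coeff):
    """Backward hook: multiply the incoming gradient by -coeff (gradient reversal)."""
    def hook(grad):
        return -coeff * grad.clone()
    return hook

torch.manual_seed(0)
F = nn.Linear(4, 3)   # stand-in feature extractor
D = nn.Linear(3, 1)   # stand-in domain discriminator
opt = torch.optim.SGD(list(F.parameters()) + list(D.parameters()), lr=0.1)

x = torch.randn(8, 4)
domain_labels = torch.randint(0, 2, (8, 1)).float()  # 0 = source, 1 = target

feat = F(x)
feat.register_hook(grl_hook(1.0))   # GRL sits between F and D
logits = D(feat)
loss = nn.functional.binary_cross_entropy_with_logits(logits, domain_labels)

opt.zero_grad()
loss.backward()   # D is pushed to classify domains; F gets the reversed gradient
opt.step()        # both adversarial players updated in one forward/backward pass
```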

@sxwawa

sxwawa commented Jun 5, 2019

In the DANN model, after inserting a GRL between the generator and the discriminator, the gradient of the domain loss w.r.t. the feature extractor F is multiplied by -1. But in the CDAN model, the input of the discriminator is the tensor product of the feature vector and the predicted probability vector, so during backward propagation the domain loss has gradients with respect to both the feature extractor F and the classifier G. May I know how your algorithm computes the gradient of the domain loss w.r.t. the predicted probabilities output by classifier G? Will the grl_hook also reverse the gradient of the domain loss w.r.t. the classifier G? Thanks a lot!

@sy565612345
Collaborator

> In the DANN model, after inserting a GRL between the generator and the discriminator, the gradient of the domain loss w.r.t. the feature extractor F is multiplied by -1. But in the CDAN model, the input of the discriminator is the tensor product of the feature vector and the predicted probability vector, so during backward propagation the domain loss has gradients with respect to both the feature extractor F and the classifier G. May I know how your algorithm computes the gradient of the domain loss w.r.t. the predicted probabilities output by classifier G? Will the grl_hook also reverse the gradient of the domain loss w.r.t. the classifier G? Thanks a lot!

In pytorch/loss.py, line 22: softmax_output = input_list[1].detach()
This detaches G from the domain loss during back-propagation, so the domain loss is not used to update classifier G.
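The detach trick can be sketched in isolation. The tensor names here are hypothetical stand-ins (a feature tensor and classifier logits); only the detach idea itself mirrors the repo:

```python
import torch

feature = torch.randn(2, 3, requires_grad=True)  # stand-in for F's output
logits = torch.randn(2, 4, requires_grad=True)   # stand-in for G's output
softmax_output = torch.softmax(logits, dim=1).detach()  # cut the graph back to G

# CDAN's multilinear conditioning: outer product of softmax output and feature.
op_out = torch.bmm(softmax_output.unsqueeze(2), feature.unsqueeze(1))
domain_loss = op_out.sum()   # stand-in for the real discriminator loss
domain_loss.backward()

print(feature.grad is not None)  # True: the feature path still receives gradient
print(logits.grad)               # None: detach blocked the path back to G
```

The detached softmax acts as a constant weight on the features, so the domain loss shapes F (and D) but leaves G untouched.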

@xyqfountain

I cannot understand two things; I would appreciate it if you could explain.

  1. pytorch/loss.py, line 33: entropy.register_hook(grl_hook(coeff)). Why does the entropy need this *-1 hook? The gradients passed back from the domain discriminator to the feature extractor have already been inverted by x.register_hook(grl_hook(coeff)), so registering a *-1 hook for the entropy confuses me.
  2. I noticed that you use softmax_output = input_list[1].detach(), which blocks the gradients from the discriminator to the classifier, but the entropy is obtained by loss_func.Entropy(softmax_output), resulting in entropy.requires_grad=True. This means the gradients can be back-propagated to the classifier through the entropy (am I right?). What is this for?

@buerzlh

buerzlh commented Jul 2, 2020

> I cannot understand two things; I would appreciate it if you could explain. (1) pytorch/loss.py, line 33: entropy.register_hook(grl_hook(coeff)). Why does the entropy need this *-1 hook? The gradients passed back from the domain discriminator to the feature extractor have already been inverted by x.register_hook(grl_hook(coeff)), so registering a *-1 hook for the entropy confuses me. (2) I noticed that you use softmax_output = input_list[1].detach(), which blocks the gradients from the discriminator to the classifier, but the entropy is obtained by loss_func.Entropy(softmax_output), resulting in entropy.requires_grad=True. This means the gradients can be back-propagated to the classifier through the entropy (am I right?). What is this for?

I also find point (1) strange. Do you understand it now?

@buerzlh

buerzlh commented Jul 2, 2020

> In the DANN model, after inserting a GRL between the generator and the discriminator, the gradient of the domain loss w.r.t. the feature extractor F is multiplied by -1. But in the CDAN model, the input of the discriminator is the tensor product of the feature vector and the predicted probability vector, so during backward propagation the domain loss has gradients with respect to both the feature extractor F and the classifier G. May I know how your algorithm computes the gradient of the domain loss w.r.t. the predicted probabilities output by classifier G? Will the grl_hook also reverse the gradient of the domain loss w.r.t. the classifier G? Thanks a lot!
>
> In pytorch/loss.py, line 22: softmax_output = input_list[1].detach()
> This detaches G from the domain loss during back-propagation, so the domain loss is not used to update classifier G.

But generally speaking, the domain loss needs to optimize the feature extraction network G.
