Initialize Differentiable Optimizer with non-leaf tensor #133
Comments
Hi there! I am not sure this is exactly the problem I faced, but I will just drop what I did. My use case is that I have a model g that should be updated using another model f. After n update steps of g (using f), I want to update f with a meta-update. Like you said, if you only put the model g into the higher context, it will not track the dependence on f, and accordingly you can't backpropagate through the n updates to get the meta-gradient. It is therefore important that the model you put into the higher context contains the parameters of both g and f, so that everything is tracked throughout the updates. I have the following toy example below: the model x is your g model and the model y is your f model that is used to update x.

```python
import higher

x = Model(2, 1)
print("X parameters", list(x.parameters()))
in_ = torch.Tensor([1, 1])
# print("Y output", y(in_))

# z is a module holding both x (g) and y (f) as submodules; z.model1 refers to x.
for p in z_copy.model1.parameters():
    ...

z_optimizer = optim.SGD(z.model1.parameters(), lr=1.0)
with higher.innerloop_ctx(z, z_optimizer, copy_initial_weights=False) as (fnet, diffopt):
    ...
```
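The snippet above is only a fragment, so here is a self-contained sketch of the same pattern. The `Model` and `Wrapper` classes, the losses, and the hyperparameters are illustrative assumptions rather than code from this thread; the point is that the wrapper holding both g and f goes into `higher.innerloop_ctx`, while the inner optimizer only covers g's parameters:

```python
import torch
import torch.nn as nn
from torch import optim
import higher


class Model(nn.Module):
    """A tiny linear model (illustrative)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        return self.lin(x)


class Wrapper(nn.Module):
    """Holds both g (model1) and f (model2) so higher tracks both."""
    def __init__(self, g, f):
        super().__init__()
        self.model1 = g  # inner model, updated by the inner optimizer
        self.model2 = f  # meta model, updated only by the outer optimizer

    def forward(self, x):
        # The inner loss goes through f, so the inner updates depend on f's parameters.
        return self.model2(self.model1(x))


g = Model(2, 2)   # plays the role of "x" above
f = Model(2, 1)   # plays the role of "y" above
z = Wrapper(g, f)

in_ = torch.tensor([[1.0, 1.0]])
target = torch.tensor([[0.0]])

meta_optimizer = optim.SGD(z.model2.parameters(), lr=0.1)

# The inner optimizer only covers g's parameters, but the whole wrapper is
# passed to the higher context so gradients can flow back to f.
z_optimizer = optim.SGD(z.model1.parameters(), lr=1.0)
with higher.innerloop_ctx(z, z_optimizer, copy_initial_weights=False) as (fnet, diffopt):
    for _ in range(3):  # n inner update steps of g
        inner_loss = (fnet(in_) - target).pow(2).mean()
        diffopt.step(inner_loss)  # differentiable update of g's fast weights

    # Meta-update: backpropagate through the n inner steps into f.
    meta_loss = (fnet(in_) - target).pow(2).mean()
    meta_optimizer.zero_grad()
    meta_loss.backward()
    meta_optimizer.step()

print([p.grad is not None for p in f.parameters()])  # f received meta-gradients
```

With `copy_initial_weights=False` the fast weights stay connected to the original parameters, so the meta-loss backward reaches f (and, MAML-style, g's initial weights as well).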
Hello, I used higher a while ago and, if I remember correctly, you could create a differentiable optimizer starting from a normal one.
Now I need to optimize non-leaf tensors (my model g's weights are generated by another model f). The problem is that, apparently, I cannot optimize them because they are not leaf tensors.
Technically I could generate new leaf tensors starting from them but then I wouldn't be able to backpropagate back to the model f that generated them.
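For concreteness, a minimal illustrative sketch of this situation (the shapes and names are made up):

```python
import torch
from torch import nn, optim

f = nn.Linear(4, 3)                  # "f" produces the weights of "g"
generated_w = f(torch.randn(1, 4))   # stands in for g's generated weights: a non-leaf tensor

# Passing it to a regular optimizer fails:
try:
    optim.SGD([generated_w], lr=0.1)
except ValueError as err:
    print(err)  # can't optimize a non-leaf Tensor

# Detaching gives a leaf the optimizer accepts, but it cuts the graph,
# so gradients can no longer reach f:
leaf_w = generated_w.detach().requires_grad_()
print(leaf_w.is_leaf, leaf_w.grad_fn)  # True None
```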
Does anyone have a solution?
Thanks