
Start weight sharing API #325

Closed · seanmor5 wants to merge 1 commit from the sm-weight-sharing branch

Conversation

@seanmor5 (Contributor) commented Jul 21, 2022

Resolves #169

Still a few API considerations/questions:

  1. Should we offer an explicit :parameters option on layer creation?
  2. How should we handle partial parameter sharing? For example, if I just want to share the kernel of an embedding layer with a dense layer, how do I express that? (See the sketch below.)
  3. What validations do we need in place to ensure we don't pass invalid parameters to share between layers?
  4. How do we share parameters between layers where the sharing is valid, but where the param shape functions are incompatible? An example is sharing between a dense layer and a bilinear layer: the dense layer has a parameter shape function with arity 1 for its 1 input, while the bilinear layer has a shape function with arity 2 for its 2 inputs.
  5. What happens if we "cut off" shared portions of the graph? I don't think the compiler will build the cache correctly in that case.

These are just the few I can think of right now; I'm sure there are more things to figure out from there. But the good news is the compiler works and the change is really simple :)

@josevalim (Contributor) commented Jul 22, 2022

> Should we offer an explicit :parameters option on layer creation?

IMO set_parameters is enough if weight sharing is not very common, and we already have a get_parameters API.

> How should we handle partial parameter sharing? For example, if I just want to share the kernel of an embedding layer with a dense layer, how do I express that?

Can we allow a list of keys on get_parameters(layer, keys \\ :all)? Then set_parameters can see which keys are missing.

> What happens if we "cut off" shared portions of the graph? I don't think the compiler will build the cache correctly in that case.

I think we are able to see that the original part is no longer there, and therefore we need to look at the parameter in the "usual" position?

@seanmor5 closed this on Oct 26, 2022
@seanmor5 (Contributor, Author) commented

Too much has changed, and this needs to be revisited with considerations from the other library.

@seanmor5 deleted the sm-weight-sharing branch on January 21, 2023