added merging methods and scoring functions #165
@microsoft-github-policy-service agree company="Microsoft"
            return grads * keep_mask
        return hook


    def load_mask(f_name):
Is there a way to avoid needing the save_mask and load_mask functions?
weight_mask should be included in state_dict by now, so can't it be reloaded from checkpoints?
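A minimal sketch of the idea behind this suggestion, assuming the mask is registered as a module buffer. `MaskedLinear` and `roundtrip_mask` are hypothetical names for illustration, not classes from this PR; only `register_buffer`, `state_dict`, `load_state_dict`, `torch.save`, and `torch.load` are real PyTorch calls. If the mask lives in a buffer, it should travel with the checkpoint and no separate save/load helper is needed:

```python
import io

import torch
import torch.nn as nn


class MaskedLinear(nn.Module):
    """Hypothetical stand-in for the sparse layer discussed in this thread."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features))
        # A registered buffer is not trained, but it is saved in state_dict().
        self.register_buffer("weight_mask", torch.ones(out_features, in_features))

    def forward(self, x):
        # Masked-out entries contribute nothing to the output.
        return nn.functional.linear(x, self.weight * self.weight_mask)


def roundtrip_mask():
    """Save a model with a non-trivial mask and reload it into a fresh model."""
    torch.manual_seed(0)
    src = MaskedLinear(4, 3)
    src.weight_mask.copy_((torch.rand(3, 4) > 0.5).float())

    # Serialize the whole state_dict to an in-memory checkpoint.
    buffer = io.BytesIO()
    torch.save(src.state_dict(), buffer)
    buffer.seek(0)

    dst = MaskedLinear(4, 3)
    dst.load_state_dict(torch.load(buffer))
    return src, dst
```

After `roundtrip_mask()`, the reloaded model's `weight_mask` matches the original, which is the behavior the comment expects from a regular checkpoint.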
mttl/models/modifiers/sparse_mask.py
Outdated
            keep_masks = torch.zeros_like(m.sparse_layer.weight)
            m.revert_weight_grad_and_update_mask(keep_masks)
        # based on gradient-magnitude
        elif parameter_selection_procedure=='gradient_magnitude':
Can you add tests for the new permutations of these parameters in test_sparse_mask.py?
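One way such tests could look, as a rough sketch only. `make_keep_mask` is a hypothetical selector written for this example and is not the PR's implementation; of the procedure names used, only `gradient_magnitude` appears in the diff above, while `weight_magnitude` and `random` are assumed placeholders. The loop checks the same invariants for every procedure (shape, binary values, number of kept entries); in a pytest file the loop would typically become a parametrized test:

```python
import torch


def make_keep_mask(weight, grad, procedure, keep_ratio):
    """Hypothetical selector: 0/1 mask keeping int(keep_ratio * numel) entries."""
    k = max(1, int(keep_ratio * weight.numel()))
    if procedure == "gradient_magnitude":
        scores = grad.abs()
    elif procedure == "weight_magnitude":  # assumed name, not from the PR
        scores = weight.abs()
    elif procedure == "random":  # assumed name, not from the PR
        scores = torch.rand_like(weight)
    else:
        raise ValueError(f"unknown procedure: {procedure}")
    # Keep the k highest-scoring positions.
    top_idx = torch.topk(scores.flatten(), k).indices
    mask = torch.zeros(weight.numel(), dtype=weight.dtype)
    mask[top_idx] = 1.0
    return mask.view_as(weight)


def check_all_procedures(keep_ratio=0.25):
    """Run the same invariants over every selection procedure."""
    torch.manual_seed(0)
    weight = torch.randn(8, 8)
    grad = torch.randn(8, 8)
    results = {}
    for procedure in ("gradient_magnitude", "weight_magnitude", "random"):
        mask = make_keep_mask(weight, grad, procedure, keep_ratio)
        assert mask.shape == weight.shape
        assert set(mask.unique().tolist()) <= {0.0, 1.0}
        assert int(mask.sum()) == int(keep_ratio * weight.numel())
        results[procedure] = mask
    return results
```

Checking invariants rather than exact masks keeps the tests stable for the random procedure while still catching shape or sparsity-level regressions.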
            self.sparse_layer.forward = types.MethodType(mod_forward, self.sparse_layer)

        @torch.no_grad()
        def convert_sparse_weight_to_1D(self):
Is this ever used? Ctrl+F doesn't find anything for me.
        if m.sparse_cat == 'block_sparse':
            keep_masks = get_block_mask(m)
        elif m.sparse_cat == "regular_sparse":
            # check: sample noise-block-idx
Remove comments when ready. Optionally, add logging.
moved merging model to separate folder, updated essential function and arguments for sparse-adapter training
Will make a new PR.
Added merging methods:
Added different scoring functions