
Conversation

@SaminYeasar

• Added merging methods:

  • SparseMerge
  • SLERP (a minimal interpolation sketch follows this list)
  • LERP
  • TiesMerge: corrected the implementation
  • TaskArithmetic
  • ModelBreadcrumbs
  • UniformMerge

• Added different scoring functions:

  • grow and drop
  • layer drop + sparse
  • model-wise sparse
  • gradient-magnitude-based sparse
  • weight-magnitude-based sparse
  • added a backward hook that masks gradients during backprop (see the sketch after this list)
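For reference, SLERP interpolates along the great circle between the two flattened weight vectors, while LERP takes the straight chord between them. A minimal sketch, not this PR's implementation; the function signature and the LERP fallback threshold are assumptions:

```python
import torch

def slerp(w_a: torch.Tensor, w_b: torch.Tensor, t: float = 0.5) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors of equal shape."""
    a, b = w_a.flatten().float(), w_b.flatten().float()
    cos_omega = torch.clamp((a / a.norm()) @ (b / b.norm()), -1.0, 1.0)
    omega = torch.acos(cos_omega)
    if omega < 1e-4:
        # Near-parallel vectors: SLERP is numerically unstable, fall back to LERP.
        merged = (1 - t) * a + t * b
    else:
        sin_omega = torch.sin(omega)
        merged = (torch.sin((1 - t) * omega) * a + torch.sin(t * omega) * b) / sin_omega
    return merged.view_as(w_a)
```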
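The backward hook plausibly follows PyTorch's tensor-level `register_hook` pattern; the closure's tail matches the `return grads * keep_mask` fragment quoted later in this review, and everything beyond that fragment is an assumption:

```python
import torch

def get_mask_gradient_hook(keep_mask: torch.Tensor):
    # The hook runs during backprop and zeroes gradients outside the mask,
    # so masked-out weights are never updated by the optimizer.
    def hook(grads: torch.Tensor) -> torch.Tensor:
        return grads * keep_mask
    return hook

# Hypothetical usage on a sparse layer's weight:
weight = torch.nn.Parameter(torch.randn(4, 4))
keep_mask = (torch.rand_like(weight) > 0.5).float()
weight.register_hook(get_mask_gradient_hook(keep_mask))
```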

@SaminYeasar
Author

@microsoft-github-policy-service agree company="Microsoft"

```python
        return grads * keep_mask
    return hook

def load_mask(f_name):
```
Contributor

Is there a way we can avoid needing these functions `save_mask` and `load_mask`?

`weight_mask` should be included in `state_dict` by now, so it can be reloaded from checkpoints?
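For context, registering the mask as a buffer is the standard way to get it into `state_dict`; a minimal sketch with hypothetical module and attribute names, not the project's actual class:

```python
import torch
import torch.nn as nn

class SparseLayer(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features))
        # Buffers are written to state_dict and restored by load_state_dict,
        # so checkpointing covers the mask without save_mask / load_mask helpers.
        self.register_buffer("weight_mask", torch.ones_like(self.weight))
```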

```python
    keep_masks = torch.zeros_like(m.sparse_layer.weight)
    m.revert_weight_grad_and_update_mask(keep_masks)
# based on gradient-magnitude
elif parameter_selection_procedure == 'gradient_magnitude':
```
Contributor

Can you add tests for the new permutations of these parameters in `test_sparse_mask.py`?
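A sketch of how the new permutations could be parametrized in `test_sparse_mask.py`, assuming pytest; the factory below is a stand-in for however the project actually builds a sparse adapter:

```python
import pytest
import torch

def build_sparse_adapter(sparse_cat, parameter_selection_procedure):
    # Stand-in for the project's real constructor; the actual test would
    # build the sparse adapter with these options instead.
    class _Adapter:
        weight = torch.randn(8, 8)
        weight_mask = (torch.rand(8, 8) > 0.5).float()
    return _Adapter()

@pytest.mark.parametrize("sparse_cat", ["block_sparse", "regular_sparse"])
@pytest.mark.parametrize(
    "parameter_selection_procedure",
    ["gradient_magnitude", "weight_magnitude"],
)
def test_mask_shape_and_values(sparse_cat, parameter_selection_procedure):
    adapter = build_sparse_adapter(sparse_cat, parameter_selection_procedure)
    # Every combination should yield a binary mask matching the weight shape.
    assert adapter.weight_mask.shape == adapter.weight.shape
    assert ((adapter.weight_mask == 0) | (adapter.weight_mask == 1)).all()
```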

```python
        self.sparse_layer.forward = types.MethodType(mod_forward, self.sparse_layer)

    @torch.no_grad()
    def convert_sparse_weight_to_1D(self):
```
Contributor

Is this ever used? Ctrl+F doesn't find anything for me.

```python
if m.sparse_cat == 'block_sparse':
    keep_masks = get_block_mask(m)
elif m.sparse_cat == "regular_sparse":
    # check: sample noise-block-idx
```
Contributor

Remove the commented-out checks when ready; optionally add logging instead.
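If those checks are worth keeping visible, the standard-library route is enough for the optional logging; the message text here is assumed:

```python
import logging

logger = logging.getLogger(__name__)

# Instead of the inline "# check: sample noise-block-idx" comment:
logger.debug("regular_sparse: sampling noise-block indices")
```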

moved merging models to a separate folder; updated essential functions and arguments for sparse-adapter training
@SaminYeasar
Author

Will make a new PR.
