I see that the LoRA fine-tuning implementation in your code only tunes the parameters of the image encoder in SAM. Is it important to also adapt the downstream prompt encoder and mask decoder?
When I tried to extend the fine-tuning to those blocks, the results came out like this.
Should the training run be longer than when fine-tuning with LoRA on the encoder only?
You are right, my implementation adapts the encoder only. I thought it would be wise to adapt the feature extractor, which is the encoder. I would expect adapting the mask decoder to give similar results, so I don't understand why your results look like this.
For the prompt encoder, I am not sure adaptation is necessary.
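In case it helps you reproduce the comparison, here is a minimal sketch of how the same LoRA idea could be extended to the mask decoder. It assumes a PyTorch SAM model whose decoder attention blocks expose `q_proj`/`v_proj` `nn.Linear` projections; the names `LoRALinear`, `add_lora_to_mask_decoder`, and the `r`/`alpha` defaults are illustrative, not part of this repo's API.

```python
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear with a low-rank update: y = Wx + (alpha/r) * B(Ax)."""

    def __init__(self, base: nn.Linear, r: int = 4, alpha: int = 4):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_a = nn.Linear(base.in_features, r, bias=False)
        self.lora_b = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # start as a no-op on top of the base layer
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))


def add_lora_to_mask_decoder(sam, r: int = 4, alpha: int = 4):
    """Freeze the mask decoder and wrap its q/v projections with LoRA (sketch only)."""
    decoder = sam.mask_decoder
    for p in decoder.parameters():
        p.requires_grad_(False)

    # Collect targets first, then replace, to avoid mutating modules while iterating.
    targets = [
        (parent, name, child)
        for parent in decoder.modules()
        for name, child in parent.named_children()
        if isinstance(child, nn.Linear) and name in ("q_proj", "v_proj")
    ]
    for parent, name, child in targets:
        setattr(parent, name, LoRALinear(child, r=r, alpha=alpha))
    return sam
```

With the base weights frozen and `lora_b` initialized to zero, training should start from exactly the pretrained behavior; only the small A/B matrices (and whatever you leave unfrozen elsewhere) receive gradients, so the learning rate and schedule from encoder-only LoRA may still need retuning when the decoder is included.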