Add unload_textual_inversion() method #6013

Closed
apolinario opened this issue Dec 1, 2023 · 8 comments · Fixed by #6656

@apolinario
Collaborator

Is your feature request related to a problem? Please describe.
Currently you can add textual inversion embeddings with .load_textual_inversion(). But once the tokenizer and text encoder contain the new embeddings, it is complex to remove them and restore both to their original state.

Describe the solution you'd like.
An unload_textual_inversion() method that removes the alien/foreign tokens and restores the text encoder to its original state (maybe there could also be a way to pass specific tokens to be removed).
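
A rough sketch of how this could look from the user side (the unload_textual_inversion() name and the tokens argument are part of this proposal, not an existing diffusers API):

pipe.load_textual_inversion("sd-concepts-library/cat-toy", token="<cat-toy>")
# ... run inference with the added concept ...
pipe.unload_textual_inversion()                      # proposed: remove all added tokens
pipe.unload_textual_inversion(tokens=["<cat-toy>"])  # proposed: or only specific ones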

Describe alternatives you've considered.
As @patrickvonplaten described internally, it is possible to remove tokens currently by doing something like:

token_id = pipe.tokenizer.convert_tokens_to_ids("<token>")
del pipe.tokenizer._added_tokens_decoder[token_id]
pipe.tokenizer._update_trie()

and

text_embeddings = pipe.text_encoder.get_input_embeddings().weight
text_embeddings = text_embeddings[:len(pipe.tokenizer)]
pipe.text_encoder.set_input_embeddings(text_embeddings)

and it would probably be easier with a specific remove_token method from Transformers (huggingface/transformers#15032, huggingface/transformers#4827).
However, IMO that does not eliminate the usefulness of a diffusers-specific method that leverages these token-removal mechanisms to provide a simple API for removing tokens.
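
A minimal sketch of what such a method could do for a single tokenizer/text encoder pair, based on the workaround above (the function name and signature are hypothetical; it relies on private tokenizer attributes and assumes the removed tokens sit at the end of the embedding table):

from torch import nn

def unload_textual_inversion(tokenizer, text_encoder, tokens):
    # drop the given tokens from the tokenizer's added-tokens tables
    for token in tokens:
        token_id = tokenizer.convert_tokens_to_ids(token)
        del tokenizer._added_tokens_decoder[token_id]
        del tokenizer._added_tokens_encoder[token]
    tokenizer._update_trie()

    # shrink the embedding matrix back to the reduced vocabulary size;
    # this only works if the removed tokens were appended at the end
    old_embeddings = text_encoder.get_input_embeddings()
    new_embeddings = nn.Embedding(len(tokenizer), old_embeddings.embedding_dim)
    new_embeddings.weight.data = old_embeddings.weight.data[:len(tokenizer)]
    text_encoder.set_input_embeddings(new_embeddings)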

Additional context.
Hot-swapping (keeping a base model warm and swapping LoRAs on top of it) requires fusing/unfusing the LoRAs, but with pivotal tuning it additionally requires resetting the text encoder and tokenizer, which makes it more complex.

@sayakpaul
Member

@patrickvonplaten would you have time to implement this one? Otherwise, happy to look into it later.

@patrickvonplaten
Contributor

No bandwidth atm, also don't think it's super high prio tbh

@yiyixuxu
Collaborator

yiyixuxu commented Dec 1, 2023

@sayakpaul can open this issue to the community too:)

@kiluazen

kiluazen commented Dec 2, 2023

I want to work on this!
@sayakpaul

@sayakpaul
Member

All yours!

@apolinario
Collaborator Author

apolinario commented Dec 7, 2023

As a reference, I have implemented the following snippet for unloading a token from the tokenizer and CLIP text encoder of Stable Diffusion XL:

from torch import nn

token_to_remove = "<s0>"  # for example
# take the 2nd token from the tokenizer output, the first is BOS
token_id = tokenizer(token_to_remove)["input_ids"][1]

# delete from tokenizer
del tokenizer._added_tokens_decoder[token_id]
del tokenizer._added_tokens_encoder[token_to_remove]
tokenizer._update_trie()

# delete from text encoder
tokenizer_size = len(tokenizer)
text_embedding_dim = text_encoder.get_input_embeddings().embedding_dim
text_embedding_weights = text_encoder.get_input_embeddings().weight[:tokenizer_size]
text_embeddings_filtered = nn.Embedding(tokenizer_size, text_embedding_dim)
text_embeddings_filtered.weight.data = text_embedding_weights
text_encoder.set_input_embeddings(text_embeddings_filtered)
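
For SDXL the same steps need to be applied to both tokenizer/text encoder pairs. A minimal usage sketch, assuming the snippet above is wrapped into a hypothetical helper remove_token(tokenizer, text_encoder, token):

# remove_token is a hypothetical wrapper around the snippet above
for tok, enc in [(pipe.tokenizer, pipe.text_encoder), (pipe.tokenizer_2, pipe.text_encoder_2)]:
    remove_token(tok, enc, "<s0>")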

@kovtcharov

Is there an existing workaround or fix for this? The originally proposed solution above has not worked for me.

@fabiorigano
Contributor

hi, since I see no one has added a PR to solve this issue, I am pushing mine
