
Add unload_textual_inversion() method  #6013

Closed
@apolinario

Description


Is your feature request related to a problem? Please describe.
Currently you can add textual inversion with .load_textual_inversion(). But once the tokenizer and text encoder have the new embeddings, it's complex to remove the embeddings and restore the original state.

Describe the solution you'd like.
An unload_textual_inversion() method to remove the alien/foreign tokens and restore the tokenizer and text encoder to their original state (maybe there could also be a way to pass specific tokens to be removed).

Describe alternatives you've considered.
As @patrickvonplaten described internally, it is possible to remove tokens currently by doing something like:

token_id = pipe.tokenizer.convert_tokens_to_ids("<token>")
del pipe.tokenizer._added_tokens_decoder[token_id]
pipe.tokenizer._update_trie()

and

import torch

text_embeddings = pipe.text_encoder.get_input_embeddings().weight
text_embeddings = text_embeddings[: len(pipe.tokenizer)]
pipe.text_encoder.set_input_embeddings(torch.nn.Embedding.from_pretrained(text_embeddings))

and it would probably be easier with a specific remove_token method from Transformers (huggingface/transformers#15032, huggingface/transformers#4827).
However, imo this does not eliminate the usefulness of a diffusers-specific method that leverages these token-removal mechanisms to provide a simple API for folks to remove tokens.
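To illustrate the embedding-truncation half of the workaround in isolation, here is a minimal, self-contained sketch using plain torch (the truncate_embeddings helper name is hypothetical, not a diffusers or transformers API; the real method would also need the tokenizer-side cleanup shown above):

```python
import torch


def truncate_embeddings(embedding: torch.nn.Embedding, vocab_size: int) -> torch.nn.Embedding:
    # Keep only the first vocab_size rows, dropping the inversion
    # vectors that load_textual_inversion() appended at the end.
    return torch.nn.Embedding.from_pretrained(embedding.weight[:vocab_size].clone())


# Toy example: a 10-token vocab where the last 2 rows are "inversion" tokens.
emb = torch.nn.Embedding(10, 4)
trimmed = truncate_embeddings(emb, 8)
assert trimmed.weight.shape == (8, 4)
assert torch.equal(trimmed.weight, emb.weight[:8])
```

On a pipeline, the equivalent call would be pipe.text_encoder.set_input_embeddings(truncate_embeddings(pipe.text_encoder.get_input_embeddings(), len(pipe.tokenizer))), run after the added tokens have been deleted from the tokenizer.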

Additional context.
Hot-swapping (keeping a base model warm and swapping LoRAs on top of it) requires fusing/unfusing LoRAs; with pivotal tuning it additionally requires resetting the text encoder and tokenizer, which is more complex.
