
Add unload_textual_inversion method #6656

Merged (5 commits, Jan 29, 2024)

Conversation

fabiorigano (Contributor):

What does this PR do?

Fixes #6013

Who can review?

@apolinario @sayakpaul

@patrickvonplaten (Contributor):

@yiyixuxu could you try to give this a review?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yiyixuxu (Collaborator) left a comment:

thanks for working on this! I left some questions :)

src/diffusers/loaders/textual_inversion.py (resolved review thread)

# Fix token ids in tokenizer
key_id = 1
for token_id in tokenizer.added_tokens_decoder:
yiyixuxu (Collaborator):

can you explain why we need this block?

fabiorigano (Contributor, Author):

I added this block to make all token ids sequential after one of the added tokens is removed
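As a stand-alone illustration of that idea, the re-keying can be sketched with plain Python (toy data; the ids, token strings, and `first_added_id` are hypothetical placeholders, not the actual diffusers variables):

```python
# Toy stand-in for tokenizer.added_tokens_decoder after a removal left a gap:
added_tokens_decoder = {49408: "<cat-toy>", 49410: "<style-1>", 49411: "<style-2>"}

first_added_id = 49408  # first id after the base vocabulary (hypothetical)

# Re-key so the added-token ids are sequential again, preserving order.
fixed = {
    first_added_id + offset: token
    for offset, token in enumerate(added_tokens_decoder.values())
}
# fixed == {49408: "<cat-toy>", 49409: "<style-1>", 49410: "<style-2>"}
```

This relies on Python dicts preserving insertion order, which is guaranteed since Python 3.7.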

yiyixuxu (Collaborator):

thanks for explaining this! @fabiorigano

I'm not very familiar with the use case
cc @apolinario and @linoytsaban: can you take a look and see whether we need to reorder the remaining added tokens after we remove some?

linoytsaban (Collaborator):

Thanks @fabiorigano! I looked a bit but I'm actually not quite sure why it's necessary to reorder 🤔

fabiorigano (Contributor, Author), Jan 29, 2024:

hi @linoytsaban, the reordering block keeps the token ids aligned with the indices of the text embeddings in the encoder, so that multiple unload_textual_inversion(<token>) calls remove the correct text embeddings. Without the reordering, a subsequent unload_textual_inversion(<another-token>) call could fail in the last for loop, because it might remove a different text embedding than expected.
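A minimal runnable sketch of why the reordering matters, using plain Python stand-ins for the tokenizer's added-token table and the encoder's embedding rows (all names and data here are hypothetical, not the diffusers implementation):

```python
# Base vocabulary of size 3; added tokens occupy ids 3, 4, 5.
VOCAB_SIZE = 3
added_tokens = {3: "<cat>", 4: "<dog>", 5: "<bird>"}
# One embedding row per added token, stored after the base rows, in id order.
embedding_rows = ["cat_vec", "dog_vec", "bird_vec"]

def unload(token):
    """Remove a token and its embedding row, then re-key ids sequentially."""
    global added_tokens
    token_id = next(i for i, t in added_tokens.items() if t == token)
    del added_tokens[token_id]
    # The row index is derived from the token id, so ids must stay sequential:
    del embedding_rows[token_id - VOCAB_SIZE]
    added_tokens = {
        VOCAB_SIZE + k: t for k, t in enumerate(added_tokens.values())
    }

unload("<dog>")
# Without the re-keying, "<bird>" would still carry id 5, and this second call
# would try to delete row 5 - 3 = 2, which no longer exists.
unload("<bird>")
# added_tokens == {3: "<cat>"}; embedding_rows == ["cat_vec"]
```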

linoytsaban (Collaborator):

gotcha! that makes total sense, thanks for explaining! 🤗

@yiyixuxu (Collaborator) left a comment:

looks good to me! thanks for working on this :)
let's wait for a final review from @apolinario

src/diffusers/loaders/textual_inversion.py (five resolved review threads)


@linoytsaban (Collaborator) left a comment:

Thanks a lot for this PR @fabiorigano! LGTM :)

yiyixuxu merged commit 5d8b198 into huggingface:main on Jan 29, 2024 (14 checks passed).
@yiyixuxu (Collaborator):

@fabiorigano @linoytsaban thank you both!

great work as always @fabiorigano :)

@fabiorigano (Contributor, Author):

@yiyixuxu @linoytsaban thank you for your reviews! :)

AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
* Add unload_textual_inversion

* Fix dicts in tokenizer

* Fix quality

* Apply suggestions from code review

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Fix variable name after last update

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Development

Successfully merging this pull request may close these issues.

Add unload_textual_inversion() method
5 participants