Hello there - this is a really cool paper. I've been trying to reproduce the original paper's result of adding sunglasses to a cat, using both the officially released embeddings and each of the embedding generation methods in the repo. While I was able to get a working setup with a special embedding generation method I implemented myself, nothing I tried got the existing pipeline to perform this edit reliably.
My workflow (a rough code sketch follows the list):

1. Initialize the editing pipeline with a DDIM scheduler and standard SD 1.4 weights.
2. Compute the edit direction as the mean difference from `cat_sd14` to `cat-wearing-sunglasses_sd14` (or from a pair of embeddings I generated myself).
3. Run the edit with the default cross-attention guidance weight (I tried other values; none improved the output).
4. Compare the reconstructed and edited images to each other.
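For reference, here is roughly what I'm running. This is a minimal sketch, not my exact script: it assumes the `diffusers` port of the pipeline (`StableDiffusionPix2PixZeroPipeline` with its `source_embeds`/`target_embeds`/`cross_attention_guidance_amount` arguments), and the local `.pt` filenames are hypothetical stand-ins for the released embedding assets.

```python
import torch
from diffusers import DDIMScheduler, StableDiffusionPix2PixZeroPipeline

# Step 1: editing pipeline with a DDIM scheduler on standard SD 1.4 weights.
pipe = StableDiffusionPix2PixZeroPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

# Step 2: released source/target embeddings (filenames are placeholders).
source_embeds = torch.load("cat_sd14.pt").to("cuda", torch.float16)
target_embeds = torch.load("cat-wearing-sunglasses_sd14.pt").to("cuda", torch.float16)

# Steps 3-4: run the edit; the pipeline applies the mean-difference edit
# direction internally and steers it with cross-attention guidance.
edited = pipe(
    "a photo of a cat",
    source_embeds=source_embeds,
    target_embeds=target_embeds,
    num_inference_steps=50,
    cross_attention_guidance_amount=0.15,  # the diffusers example value; I also tried others
).images[0]
edited.save("cat_with_sunglasses.png")
```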
Since I was able to generate embeddings for which this works, my setup can't be too far off.
My results with the 'special' embeddings (proof that the workflow above can work, though my method for generating these embeddings is not the one the paper describes):
My results with the released embeddings (I see similar results with embeddings generated by each of the repo's caption generation methods):
Did I miss part of the paper? I get similar results from the official demo and the Gradio app, which makes this especially tricky to diagnose. My best guess, given what I've seen, is that it comes down to prompt engineering of the generated captions; a sketch of how I understand the direction construction follows.
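For what it's worth, here is how I understand the direction construction: embed a bank of generated captions per concept with SD 1.4's text encoder (CLIP ViT-L/14), mean-pool the token-level embeddings per concept, and take the difference of the means. A minimal sketch, assuming the caption banks already exist (the two lists below are placeholders, not the paper's actual captions):

```python
import torch
from transformers import CLIPTextModel, CLIPTokenizer

# SD 1.4 uses CLIP ViT-L/14 as its text encoder.
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14").eval()

@torch.no_grad()
def mean_embedding(captions):
    """Mean of the token-level (77, 768) CLIP embeddings over a caption bank."""
    tokens = tokenizer(
        captions,
        padding="max_length",
        max_length=tokenizer.model_max_length,
        truncation=True,
        return_tensors="pt",
    )
    embeds = text_encoder(tokens.input_ids).last_hidden_state  # (N, 77, 768)
    return embeds.mean(dim=0)

# Placeholder caption banks; in practice these come from a caption generator,
# which is exactly where I suspect the prompt engineering matters.
source_captions = ["a photo of a cat", "a cat sitting on a sofa"]
target_captions = ["a photo of a cat wearing sunglasses", "a cat with sunglasses sitting on a sofa"]

edit_direction = mean_embedding(target_captions) - mean_embedding(source_captions)
```

If the released `.pt` files store the stacked per-caption embeddings rather than the pooled means, the subtraction above would instead happen after loading them.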