Custom Edit Directions #3

Open
prometx11 opened this issue Feb 11, 2023 · 9 comments
Comments

@prometx11

In the description where you explain how to create new edit directions you mention to create 2 files with large number of sentences (~1000), but there is no explanation of what those sentences should be exactly.

I tried to find out more details before opening this issue, but I could not figure out what the 2 files should contain exactly.

Thanks for the help and this amazing work!

@pix2pixzero
Owner

Hi,

The src/make_edit_direction.py script expects each sentence file to have one sentence per line.
For instance, to make a "banana" embedding, a "banana.txt" file should look something like this:

A ripe yellow banana sits on a surface, its curves contrasting with the straight lines around it.
The banana is the standout feature of the picture, its bright color drawing the eye.
The fruit's skin is slightly speckled, indicating its natural sweetness.
Shadows play across the surface of the banana, adding depth and dimension to the image.
The bright yellow color of the banana is a burst of happiness in an otherwise neutral background.
.
.
.

To generate these sentences, you could prompt your choice of language model (GPT3, ChatGPT, etc) with sentences like:
"Generate many sentences that describe a picture containing a {word}."
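
A minimal sketch of preparing such a file, assuming the language model's raw output may contain blank lines, list numbering, or repeated sentences. The helper names (build_prompt, clean_sentences, write_sentence_file) are hypothetical and not part of the repository; only the prompt wording and the one-sentence-per-line format come from the description above.

```python
import re
from pathlib import Path
from typing import List

# Prompt wording taken from the comment above; {word} is the concept to describe.
PROMPT_TEMPLATE = "Generate many sentences that describe a picture containing a {word}."

# Matches list markers such as "1. ", "2) ", "- " or "* " at the start of a line.
LIST_MARKER = re.compile(r"^\s*(?:\d+[.)]|[-*•])\s+")


def build_prompt(word: str) -> str:
    """Fill the prompt template for one concept (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(word=word)


def clean_sentences(raw_text: str) -> List[str]:
    """Turn raw model output into a de-duplicated list, one sentence per entry."""
    sentences: List[str] = []
    seen = set()
    for line in raw_text.splitlines():
        sentence = LIST_MARKER.sub("", line).strip()
        # Skip empty lines and lines with no letters or digits (e.g. a lone ".").
        if not any(ch.isalnum() for ch in sentence):
            continue
        if sentence in seen:
            continue
        seen.add(sentence)
        sentences.append(sentence)
    return sentences


def write_sentence_file(path: str, sentences: List[str]) -> None:
    """Write the sentences to a text file, one sentence per line."""
    Path(path).write_text("\n".join(sentences) + "\n", encoding="utf-8")
```

For example, write_sentence_file("banana.txt", clean_sentences(model_output)) would produce a file in the format shown above, where model_output is whatever text your language model returned for build_prompt("banana").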

Please let us know if you run into any issues with generating custom directions!

Regards,
Authors

@HelenMao

@pix2pixzero Hi, thanks for your detailed instructions and for releasing the code of your inspiring work. I tried "cat2yawning cat" by constructing text files with ChatGPT, using the prompt "Generate many sentences that describe a picture containing a yawning cat." However, this editing direction leaves the provided cat image nearly unaltered, and I could not reproduce the results in the paper. Could you give more suggestions, or release more editing direction files so that the effectiveness of other editing directions can be verified?

@pix2pixzero
Owner

Yes, we plan to release a Hugging Face demo and a large list of directions soon.
Meanwhile, there are a couple of things you could try to improve the direction editing:

  • Scale the direction vector to make the edit more prominent. For example, in this file you can add a multiplier to the direction by changing construct_direction(args.task_name) to construct_direction(args.task_name)*1.5.

  • You can decrease the amount of cross-attention guidance with the flag --xa_guidance.
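
To illustrate what the multiplier does, here is a minimal sketch. It assumes the edit direction is the difference between the mean target-sentence embedding and the mean source-sentence embedding. Small toy vectors stand in for real text-encoder embeddings, and mean_embedding and edit_direction are hypothetical names, not the repository's construct_direction.

```python
from typing import List

Vector = List[float]


def mean_embedding(embeddings: List[Vector]) -> Vector:
    """Average a list of equal-length embedding vectors, dimension by dimension."""
    count = len(embeddings)
    dim = len(embeddings[0])
    return [sum(emb[i] for emb in embeddings) / count for i in range(dim)]


def edit_direction(source_embs: List[Vector],
                   target_embs: List[Vector],
                   scale: float = 1.0) -> Vector:
    """Return scale * (mean(target) - mean(source)).

    Assumption: the direction is a difference of mean sentence embeddings.
    A scale above 1.0 lengthens the vector without changing where it points,
    which corresponds to the suggested *1.5 multiplier.
    """
    src_mean = mean_embedding(source_embs)
    tgt_mean = mean_embedding(target_embs)
    return [scale * (t - s) for s, t in zip(src_mean, tgt_mean)]


# Toy 2-dimensional "embeddings" for two source and two target sentences.
source = [[0.0, 1.0], [2.0, 1.0]]
target = [[3.0, 1.0], [3.0, 3.0]]

default_direction = edit_direction(source, target)            # scale 1.0
stronger_direction = edit_direction(source, target, scale=1.5)
```

Because scaling only changes the length of the vector, it cannot compensate for source and target sentence sets whose mean embeddings are nearly identical; in that case the direction itself is close to zero.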

Please let us know if you still cannot get the results you want!

Regards,
Authors

@HelenMao

Hi, I still cannot reproduce the yawning cat results after following your suggestions.
With construct_direction(args.task_name)*1.5 and --xa_guidance=0.05, this is the editing result for cat_1.png from the test images folder; the image is nearly unaltered:
[image: cat_1]

With the default construct_direction(args.task_name) and --xa_guidance=0.1, this is the editing result:
[image: cat_1]

With construct_direction(args.task_name)*1.5 and --xa_guidance=0.1, this is the editing result:

[image: cat_1]

Moreover, different source images lead to different editing effects (they are also nearly unchanged).

I wonder whether the yawning cat sentences affect the editing direction. Below are sentences I used to derive the editing direction (I generated more than 1000 sentences in total):
A yawning cat stretches its limbs in a sunbeam.
The cat's mouth is wide open as it lets out a deep yawn.
The sleepy cat's eyes droop as it takes a long, satisfying stretch.
The yawning cat seems to be in a state of pure relaxation.
The cat's tongue can be seen as it yawns and stretches.
The lazy cat is not in a hurry to do anything, but to yawn and nap.
The cat's fur is rumpled from a long nap, and its mouth is wide in a yawn.
A sleepy cat stretches and yawns, enjoying a cozy nap.
The cat's pink tongue is visible as it lets out a big yawn.
The lazy cat yawns and curls up for another nap.
The cat's jaws open wide as it yawns, revealing sharp teeth.
The yawning cat looks like it's ready to go back to sleep.
The cat's eyes are half closed as it yawns, enjoying a lazy day.
The relaxed cat stretches out its paws and lets out a big yawn.
The sleepy cat is content to yawn and laze about in the sun.
The yawning cat looks like it just woke up from a deep nap.
The cat's mouth is wide open as it yawns, baring its teeth.
The lazy cat yawns and looks up at the camera, as if to say "don't disturb me."

@rahulvigneswaran

OpenAI's API keeps freezing, and on top of that, doing this for several directions seems expensive. Is there any other way to generate such sentences?

@rahulvigneswaran

I have generated edit directions for 100 classes in cifar100 - https://github.com/rahulvigneswaran/pix2pix-zero-directions

@Larerr

Larerr commented Mar 20, 2024

@rahulvigneswaran 404 Not Found. QAQ

@OrangeZz0331

Hi,

The src/make_edit_direction.py script expects each sentence file to have one sentence per line. For instance, to make a "banana" embedding, a "banana.txt" file should look something like this:

A ripe yellow banana sits on a surface, its curves contrasting with the straight lines around it.
The banana is the standout feature of the picture, its bright color drawing the eye.
The fruit's skin is slightly speckled, indicating its natural sweetness.
Shadows play across the surface of the banana, adding depth and dimension to the image.
The bright yellow color of the banana is a burst of happiness in an otherwise neutral background.
.
.
.

To generate these sentences, you could prompt your choice of language model (GPT3, ChatGPT, etc) with sentences like: "Generate many sentences that describe a picture containing a {word}."

Please let us know if you run into any issues with generating custom directions!

Regards, Authors

Thank you so much!
I would also like to know: if I want to try a style change, for example a "sketch" style, how should I use ChatGPT 3.5 to generate sentences describing this style?

@israrbacha

I have generated edit directions for 100 classes in cifar100 - https://github.com/rahulvigneswaran/pix2pix-zero-directions

This link does not exist
