Improving Realism for Graffiti on Road Signs Using Instruct-Pix2Pix – Suggestions #138

Open
davidemerolla opened this issue Oct 10, 2024 · 0 comments

Hi everyone,

I'm working on a project where I need to add realistic, spray-painted graffiti to a road sign (such as a STOP sign) so that it looks vandalized. The model I'm using is Stable Diffusion Instruct-Pix2Pix.

Here’s my process so far:

  • Input Image: Clean STOP sign in JPG format.

  • Model: timbrooks/instruct-pix2pix, loaded through the diffusers StableDiffusionInstructPix2PixPipeline.

  • Scheduler: Using the Euler Ancestral Discrete scheduler for better control over the generated output.

  • Prompt: "Add black spray-painted graffiti to the sign."

Here is the code snippet:

import PIL.Image  # import the submodule explicitly so PIL.Image is always available
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline, EulerAncestralDiscreteScheduler

# Load the Instruct-Pix2Pix model
model_id = "timbrooks/instruct-pix2pix"
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(model_id, torch_dtype=torch.float16, safety_checker=None)
pipe.to("cuda")

# Set the scheduler for better control of the image
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)

image = PIL.Image.open("STOP_sign.jpg").convert("RGB")

# Prompt to apply graffiti to the STOP sign
prompt = "add black spray-painted graffiti to the sign."

# Set a seed for reproducibility
generator = torch.manual_seed(42)

# Upscale the image resolution
image = image.resize((image.width * 2, image.height * 2), PIL.Image.LANCZOS)

# Run the model with the prompt and the image
images = pipe(prompt, image=image, num_inference_steps=50, image_guidance_scale=1, generator=generator).images

# Show and save the final image with graffiti
images[0].show()
images[0].save("stop_sign_with_graffiti_pix2pix.png")  # PNG is lossless, so no quality setting is needed

# Clear GPU cache after execution
torch.cuda.empty_cache()
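
One thing I'm unsure about is the fixed 2x upscale, since Stable Diffusion 1.x models are usually run near 512 px and the VAE works on sides that are multiples of 8. As an alternative I could try, here is a minimal sketch of a sizing helper. The name `fit_size` and the 512 px target are my own assumptions, not something from the model card:

```python
def fit_size(width, height, target_long_side=512, multiple=8):
    """Scale (width, height) so the longer side is about target_long_side,
    keeping the aspect ratio and rounding each side to a multiple of `multiple`."""
    scale = target_long_side / max(width, height)

    def snap(value):
        # Round the scaled side to the nearest multiple, never below one multiple.
        return max(multiple, int(round(value * scale / multiple)) * multiple)

    return snap(width), snap(height)
```

It would replace the resize call with `image = image.resize(fit_size(image.width, image.height), PIL.Image.LANCZOS)`.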

Below are my input image, the result I would like to obtain (reference image), and the current output.

INPUT IMAGE
[image: STOP_sign]

REFERENCE IMAGE
[image: s-l400]

OUTPUT
[image: stop_sign_with_graffiti_pix2pix]

HELP

  1. Parameter Tweaks: I’ve set num_inference_steps=50 and image_guidance_scale=1. Should I adjust these settings to improve the quality or realism of the output?

  2. Prompt Optimization: I’m using the prompt “add black spray-painted graffiti to the sign.” Would changing the wording improve the final image? If so, any recommendations?

  3. Texture/Weathering Effects: The graffiti appears somewhat unnatural. How can I make the graffiti look more realistic and blend with the worn texture of the sign? Should I include specific texture descriptions in the prompt?

  4. Any Other Techniques: Would experimenting with different schedulers, control models, or even applying additional post-processing steps help? I’m open to any suggestions, especially if you’ve worked with similar tasks!
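
To make question 1 concrete, this is roughly the sweep I have in mind: run the same prompt and seed over a small grid of `guidance_scale` (how strongly the text is followed) and `image_guidance_scale` (how closely the input image is preserved). It is only a sketch. The helper names `build_sweep` and `run_sweep` and the grid values are my own assumptions, and `pipe`, `prompt`, and `image` are the objects from the snippet above:

```python
import itertools


def build_sweep(text_scales=(5.0, 7.5, 10.0), image_scales=(1.0, 1.5, 2.0)):
    """Return one kwargs dict per (guidance_scale, image_guidance_scale) pair."""
    return [
        {"guidance_scale": g, "image_guidance_scale": ig}
        for g, ig in itertools.product(text_scales, image_scales)
    ]


def run_sweep(pipe, prompt, image, seed=42, steps=50):
    """Run the pipeline once per setting with the same seed; yield (settings, image)."""
    import torch  # imported lazily so build_sweep can be used without torch installed

    for settings in build_sweep():
        generator = torch.manual_seed(seed)  # reset the seed so only the scales change
        result = pipe(
            prompt,
            image=image,
            num_inference_steps=steps,
            generator=generator,
            **settings,
        ).images[0]
        yield settings, result
```

Saving each result under a filename that includes both scale values would make the outputs easy to compare side by side.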

Any feedback or examples would be greatly appreciated.
Thanks in advance for your help!
