Add SDTurbo pipeline #15

Draft: wants to merge 1 commit into main

cyrildiagne commented Jan 20, 2024

Adds the SD Turbo pipeline (as demoed here):

  • Adds a new pipeline SDTurboPipeline
  • Adds a new scheduler EulerDiscreteScheduler
  • Adds Tensor.sqrt & interp tensor utils
  • Adds the pipeline to the react example
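
For context on the interp utility listed above: Euler-style schedulers commonly resample a full sigma table onto the timesteps chosen for inference with 1-D linear interpolation, in the spirit of NumPy's interp. The following is a minimal sketch on plain number arrays. The standalone `interp` function and its signature are assumptions for illustration and are not taken from this PR, whose utility operates on Tensors.

```typescript
// Hypothetical helper (not the PR's Tensor-based utility):
// 1-D linear interpolation of the points (xp, fp), evaluated at each x.
// `xp` must be sorted ascending; x outside [xp[0], xp[last]] is clamped.
function interp(x: number[], xp: number[], fp: number[]): number[] {
  if (xp.length === 0 || xp.length !== fp.length) {
    throw new Error("xp and fp must be non-empty and of equal length")
  }
  const last = xp.length - 1
  return x.map((xi) => {
    if (xi <= xp[0]) return fp[0]
    if (xi >= xp[last]) return fp[last]
    // Find the first sample point that is not to the left of xi
    let hi = 1
    while (xp[hi] < xi) hi++
    const lo = hi - 1
    const t = (xi - xp[lo]) / (xp[hi] - xp[lo])
    return fp[lo] + t * (fp[hi] - fp[lo])
  })
}

// Resample a 4-entry table at three positions, one of them fractional
const resampled = interp([0, 1.5, 3], [0, 1, 2, 3], [0, 10, 20, 30])
```

In a scheduler, `xp` would be the training step indices, `fp` the per-step sigmas, and `x` the inference timesteps.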

TODO:

  • Find a better way to identify the SD Turbo model config

Note:
This PR doesn't use TAESD, which greatly increases speed at a small cost in quality (as applied in this demo). Since TAESD isn't part of the original pipeline, it makes more sense to add it in a separate PR.

jdp8 (Contributor) commented Feb 6, 2024

Hello @cyrildiagne,

Sorry for taking so long. I was trying to implement an Img2Img version of your SD-Turbo pipeline, but I'm not getting the expected output. The input image that I'm using is the following:

[image: images_input-2]

This is more or less what I'm getting:

[image]

I added the addNoise() function (taken from here) to the scheduler like so:

addNoise (originalSamples: Tensor, noise: Tensor, timestep: number) {
    const sigma = this.sigmas.data[timestep]
    return originalSamples.add(noise.mul(sigma))
}
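
One thing that may be worth checking in the function above is what `timestep` indexes. In the diffusers EulerDiscreteScheduler, add_noise looks up sigma by the position of the starting timestep within the scheduler's timestep list, not by a step count or by the raw timestep value. A minimal sketch of that lookup on plain arrays follows. The name `addNoiseAt` and its parameters are hypothetical and not part of this repository, and the sigma table is assumed to end with a trailing 0, as Euler schedulers usually append one.

```typescript
// Hypothetical standalone version of the noise step, on plain arrays.
// `stepIndex` is the position in the sigma table of the step where
// denoising will start (0 = noisiest step).
function addNoiseAt(
  originalSamples: Float32Array,
  noise: Float32Array,
  sigmas: ArrayLike<number>,
  stepIndex: number,
): Float32Array {
  if (originalSamples.length !== noise.length) {
    throw new Error("originalSamples and noise must have the same length")
  }
  if (!Number.isInteger(stepIndex) || stepIndex < 0 || stepIndex >= sigmas.length) {
    throw new RangeError("stepIndex is outside the sigma table")
  }
  const sigma = sigmas[stepIndex]
  const out = new Float32Array(originalSamples.length)
  for (let i = 0; i < out.length; i++) {
    // noisy = original + sigma * noise
    out[i] = originalSamples[i] + sigma * noise[i]
  }
  return out
}

// Assumed sigma table for two inference steps plus the trailing 0
const exampleSigmas = [4, 2, 0]
const noisy = addNoiseAt(new Float32Array([1, 2]), new Float32Array([0.5, -0.5]), exampleSigmas, 1)
```

If the index passed in points at the trailing 0, the function returns the input unchanged, meaning no noise is added at all.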

I also added code before the denoising loop, similar to the img2img path that already exists in the Stable Diffusion pipeline:

if (input.img2imgFlag) {
  const inputImage = input.inputImage || new Float32Array()
  const strength = input.strength || 0.8

  await dispatchProgress(input.progressCallback, {
    status: ProgressStatus.EncodingImg2Img,
  })

  // Encode image to latent space
  let imageLatent = await this.encodeImage(inputImage, input.width, input.height)
  imageLatent = imageLatent.mul(this.scheduler.initNoiseSigma)

  // Taken from https://towardsdatascience.com/stable-diffusion-using-hugging-face-variations-of-stable-diffusion-56fd2ab7a265#2d1d
  const initTimestep = Math.round(input.numInferenceSteps * strength)
  const timestep = initTimestep
  latents = this.scheduler.addNoise(imageLatent, latents, timestep)
  // Compute the timestep at which to start the diffusion loop
  const tStart = Math.max(input.numInferenceSteps - initTimestep, 0)
  timesteps = timesteps.slice(tStart)
}
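
For comparison, the diffusers img2img pipeline derives both the remaining timesteps and the noise level from the same start index: it computes `tStart` from the strength, slices the timesteps there, and adds noise at the first remaining step. The snippet above instead passes `initTimestep` as the noise index. A minimal sketch of the diffusers-style windowing on plain arrays follows. The `img2imgWindow` function and the `Img2ImgWindow` type are hypothetical names, and `sigmas` is assumed to hold one sigma per timestep, optionally followed by a trailing 0.

```typescript
// Hypothetical helper, not part of this repository.
interface Img2ImgWindow {
  tStart: number        // index of the first step that will be run
  timesteps: number[]   // remaining timesteps, starting at tStart
  startSigma: number    // sigma to use when noising the encoded image
}

function img2imgWindow(timesteps: number[], sigmas: number[], strength: number): Img2ImgWindow {
  const numInferenceSteps = timesteps.length
  // Number of denoising steps that will actually run
  const initTimestep = Math.min(Math.floor(numInferenceSteps * strength), numInferenceSteps)
  if (initTimestep < 1) {
    // With very few steps (as with SD Turbo), steps * strength must be at least 1
    throw new Error("numInferenceSteps * strength must be at least 1")
  }
  const tStart = Math.max(numInferenceSteps - initTimestep, 0)
  return {
    tStart,
    timesteps: timesteps.slice(tStart),
    // Same index as the slice, so the noise level matches the first step run
    startSigma: sigmas[tStart],
  }
}

// Assumed example schedule: 4 steps, sigma table with trailing 0
const exampleWindow = img2imgWindow([999, 749, 499, 249], [14.6, 5.0, 1.9, 0.7, 0], 0.5)
```

With four steps and a strength of 0.5, this runs the last two steps and noises the image with the sigma at index 2. Two other differences may be worth checking, both stated here as assumptions about this codebase: diffusers scales the encoded image latents by the VAE scaling factor rather than by the scheduler's initial noise sigma, and it passes unscaled Gaussian noise to add_noise.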

I've kept trying but I'm not completely sure what I'm doing wrong or what's missing to implement the SD-Turbo Img2Img pipeline. Do you have any idea what it could be?

I would appreciate any assistance with this. Thank you!
