diffusion

"Denoising Diffusion Probabilistic Models" (DDPM): https://arxiv.org/abs/2006.11239
- input: noised image + iteration (step) number
- model output: predicted noise -> subtract it from the input to denoise

stable diffusion

- text encoder: text -> vector
- generation model (diffusion): denoising U-Net
- decoder: intermediate result -> final image in pixel space
- the components can be trained separately, in parallel
- Midjourney: during the generation process, it shows the intermediate results by decoding the partially denoised images

text encoder:

GPT-style encoder / BERT
criteria: CLIP score / FID-10k
- FID (Fréchet Inception Distance): a standard pre-trained CNN classification model -> representation; the Fréchet distance between the representations of the generated images and the representations of the real images (assuming both follow Gaussian distributions)
  i.e. the distance between the two distributions; limitation: needs a large number of generated images
- CLIP score: uses an additional model, CLIP, with a text encoder and an image encoder; the score is the vector similarity between the encoded text and the encoded generated image
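
A minimal sketch of the two criteria above, assuming the representations have already been extracted (the CNN and the CLIP encoders are not included). The function names `fid_from_features` and `clip_style_score` are hypothetical, and cosine similarity is assumed as the "vector similarity".

```python
import numpy as np

def fid_from_features(feat_real, feat_gen):
    """Frechet distance between two Gaussians fitted to feature sets.

    feat_real, feat_gen: arrays of shape (num_images, feature_dim).
    """
    mu_r, mu_g = feat_real.mean(axis=0), feat_gen.mean(axis=0)
    cov_r = np.cov(feat_real, rowvar=False)
    cov_g = np.cov(feat_gen, rowvar=False)
    # Tr((cov_r cov_g)^(1/2)) from the eigenvalues of the product, which
    # are real and non-negative for positive semi-definite covariances.
    eig = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eig.real, 0.0, None)).sum()
    mean_term = np.sum((mu_r - mu_g) ** 2)
    return float(mean_term + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)

def clip_style_score(text_vec, image_vec):
    """Cosine similarity between an encoded text and an encoded image."""
    denom = np.linalg.norm(text_vec) * np.linalg.norm(image_vec)
    return float(text_vec @ image_vec / denom)
```

The covariance estimate is why FID needs many generated images: with few samples the fitted Gaussian is unreliable.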

decoder:

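feature: can be trained without knowing the correspondence between images and text (image data alone is enough)
intermediate (two options):

compressed image: take images and downsample them -> train the decoder to map the small image back to the original
latent representation: train an auto-encoder and use its decoder part
- input: H*W*3, latent: h*w*c (c may exceed 3, so the latent cannot be viewed directly as an image)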
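
For the compressed-image option, a rough sketch of how a (small, large) training pair could be made from one image. Average pooling is an assumption here (any downsampling method would do), and `downsample` is a hypothetical helper name.

```python
import numpy as np

def downsample(image, factor):
    """Average-pool an H*W*C image by an integer factor."""
    H, W, C = image.shape
    h, w = H // factor, W // factor
    # Crop so both sides are divisible by the factor, then pool each block.
    cropped = image[: h * factor, : w * factor]
    return cropped.reshape(h, factor, w, factor, C).mean(axis=(1, 3))

# One training pair for the decoder: small image in, original image out.
large = np.random.default_rng(0).random((64, 64, 3))
small = downsample(large, 4)  # shape (16, 16, 3)
```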

generation model:

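input: noised image + text
output: intermediate
- text (additional): the condition (can be ignored during inference)
- the noising process is applied to the intermediate (latent) representation instead of the image; the latent is obtained with the encoder part of the auto-encoder
- train a noise predictor
- denoising: initialized by sampling noise from a normal distribution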

algorithm

part 1 training

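loss function during training (step numbers follow the training algorithm in the DDPM paper):
- step 2: $x_0$ -> clean image
- step 4: $\epsilon$ is sampled from a normal distribution ($\mu = 0$, variance $= 1$)
- step 5: inside $\epsilon_\theta$: a weighted sum of $x_0$ and $\epsilon$, i.e. noising
the larger t is, the larger the proportion of noise added
$\epsilon_\theta$: noise predictor; input: noisy image + t (step/iteration); output: predicted noise image

the output is compared with the target noise sampled in step 4

difference from the original picture: there, noise is added and removed step by step; in actual DDPM training the noisy image is produced in one shot and the noise is predicted at once
why? composing Gaussian noising steps gives another Gaussian, so $x_t$ can be sampled directly from $x_0$ in closed form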
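
The training step above can be sketched as follows. This is a NumPy illustration, not a full training loop: `eps_theta` stands in for the noise predictor (any callable taking a noisy image and t), the linear schedule from 1e-4 to 0.02 over 1000 steps is an assumed setting, and t is 0-indexed here.

```python
import numpy as np

def make_alpha_bar(T=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear schedule."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def noise_image(x0, t, eps, alpha_bar):
    """One-shot noising: weighted sum of the clean image and the noise."""
    a = alpha_bar[t]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps

def training_loss(x0, eps_theta, alpha_bar, rng):
    """Loss for one clean image x0 (squared error on the noise)."""
    t = rng.integers(0, len(alpha_bar))   # pick a random step
    eps = rng.standard_normal(x0.shape)   # target noise, mean 0, variance 1
    x_t = noise_image(x0, t, eps, alpha_bar)
    # Compare the predicted noise with the noise that was actually sampled.
    return float(np.mean((eps - eps_theta(x_t, t)) ** 2))
```

Because `alpha_bar` shrinks as t grows, the weight on `x0` falls and the weight on `eps` rises, which is the "larger t, more noise" behaviour noted above.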

sample

generate image; oddity: eliminate the predicted noise, then add a newly sampled noise afterward (the plus term)
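
A sketch of the sampling loop, to make the extra noise term concrete. Assumptions: `eps_theta` is a stand-in for the trained noise predictor, t is 0-indexed, and the variance of the added noise is set to $\beta_t$, which is one of the choices discussed in the DDPM paper.

```python
import numpy as np

def sample(eps_theta, shape, betas, rng):
    """Generate one sample by iterative denoising, starting from pure noise."""
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)
    x = rng.standard_normal(shape)  # start from normal-distribution noise
    for t in range(len(betas) - 1, -1, -1):
        # Fresh noise is added at every step except the last one.
        z = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        coef = (1.0 - alphas[t]) / np.sqrt(1.0 - alpha_bar[t])
        # Remove the predicted noise ...
        mean = (x - coef * eps_theta(x, t)) / np.sqrt(alphas[t])
        # ... then add the new noise (the "plus" term).
        x = mean + np.sqrt(betas[t]) * z
    return x
```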

theory

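map the generated distribution to the real-world data distribution
Q: how to measure the similarity of the two? A: Maximum Likelihood Estimation (MLE)
$P_\theta(x) \leftrightarrow P_{data}(x)$
sample images from $P_{data}(x)$ and choose $\theta$ to maximize their likelihood under $P_\theta(x)$
this is the shared objective of image generation models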
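
A short derivation of why MLE measures the closeness of the two distributions, written with samples $x^1, \dots, x^m$ drawn from $P_{data}(x)$ (this notation is an assumption, not from the notes). It uses the KL divergence defined in the next section.

$$
\begin{aligned}
\theta^* &= \arg\max_\theta \prod_{i=1}^{m} P_\theta(x^i)
          = \arg\max_\theta \sum_{i=1}^{m} \log P_\theta(x^i) \\
         &\approx \arg\max_\theta \, \mathbb{E}_{x \sim P_{data}}\left[\log P_\theta(x)\right] \\
         &= \arg\max_\theta \int P_{data}(x) \log P_\theta(x)\, dx - \int P_{data}(x) \log P_{data}(x)\, dx \\
         &= \arg\min_\theta D_{KL}(P_{data} \| P_\theta)
\end{aligned}
$$

The subtracted term does not depend on $\theta$, so adding it does not change the maximizer; maximizing likelihood is then the same as minimizing the KL divergence from $P_{data}$ to $P_\theta$.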

KL divergence

KL divergence: measures how much two distributions differ
definition: $D_{KL}(P \| Q) = \int p(x) \log \left(\frac{p(x)}{q(x)}\right) dx$
asymmetric: $D_{KL}(P \| Q) \neq D_{KL}(Q \| P)$ in general
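
A small numerical check of the definition for discrete distributions, where the integral becomes a sum. The two example distributions are arbitrary choices for illustration.

```python
import numpy as np

def kl_divergence(p, q):
    """D_KL(P || Q) for discrete distributions given as probability vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p(x) = 0 contribute nothing to the sum
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p = [0.9, 0.1]
q = [0.5, 0.5]
forward = kl_divergence(p, q)  # D_KL(P || Q)
reverse = kl_divergence(q, p)  # D_KL(Q || P), a different value
```

Swapping the arguments changes the result, while `kl_divergence(p, p)` is zero.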

VAE encoder & diffusion model

q(z|x): distribution of z (mostly Gaussian) given the data x (x -> the image to imitate)
maximize the lower bound of $\log P(x)$
VAE: $\mathbb{E}_{q(z|x)}\left[\log \frac{p(x,z)}{q(z|x)}\right]$
DDPM (x -> $x_0$, z -> $x_1:x_T$): $\mathbb{E}_{q(x_1:x_T|x_0)}\left[\log \frac{p(x_0:x_T)}{q(x_1:x_T|x_0)}\right]$
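
For reference, the lower bound follows from Jensen's inequality. The VAE case is shown; the DDPM case is the same argument with $x$ replaced by $x_0$ and $z$ by $x_1:x_T$.

$$
\begin{aligned}
\log p(x) &= \log \int p(x, z)\, dz
           = \log \mathbb{E}_{q(z|x)}\left[\frac{p(x,z)}{q(z|x)}\right] \\
          &\geq \mathbb{E}_{q(z|x)}\left[\log \frac{p(x,z)}{q(z|x)}\right]
\end{aligned}
$$

The gap between the two sides is $D_{KL}(q(z|x) \| p(z|x))$, so the bound is tight when $q(z|x)$ matches the true posterior.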

course:C5 000

unlearning

related work

concept censorship

Prior approaches have focused on dataset filtering [30], post-generation filtering [29], or inference guiding [38]

removal or guidance post-hoc:
- using a classifier after training
- adding guidance to the inference process [38]; SOTA guidance-based approach
  [38] Patrick Schramowski, Manuel Brack, Björn Deiseroth, and Kristian Kersting. Safe latent diffusion: Mitigating inappropriate degeneration in diffusion models. arXiv preprint arXiv:2211.05105, 2022.
image cloaking: adding adversarial perturbations

model edit

GAN -> diffusion model: add a new subject through a token, trained using only a handful of images

unlearning

previous assumption: unintentional memorization; the undesired knowledge is identifiable on a set of training data points
ours: erase a high-level visual concept

inspiration and source

set-like composition? energy-based models (EBM)
"A and not B" as the difference between the log probability densities for A and B [10], [11], [37], [38]
score-based composition
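
Written out, the "A and not B" composition and its score-based counterpart look roughly as follows. This is the plain, unweighted form; the cited works may add weights or guidance scales, which are not shown here.

$$
\begin{aligned}
\log p(x \mid A \text{ and not } B) &= \log p(x \mid A) - \log p(x \mid B) + \text{const} \\
\nabla_x \log p(x \mid A \text{ and not } B) &= \nabla_x \log p(x \mid A) - \nabla_x \log p(x \mid B)
\end{aligned}
$$

Taking the gradient removes the constant, which is why the composition can be done directly on scores (and hence on predicted noise) without a normalized density.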
reference:

(source/Kimi.jpg) future: EBM, stable, practical
