
VAE Config Files #27

Open

JosephDiPalma opened this issue Dec 29, 2024 · 3 comments

@JosephDiPalma

I'm interested in fine-tuning a VAE and was wondering which config files you used for the final vq-f4 model in your paper.
I see the existing config files under the autoencoder directory, but they are missing some hyper-parameters, such as the number of training epochs.
Please advise.

@srikarym (Collaborator) commented Dec 30, 2024

The VAE used in our paper is a VQ-f4 VAE (VQGAN). The original LDM repo only supports KL-VAE training, so we used the taming-transformers codebase. We fine-tuned the VAE for 50k iterations on 3 GPUs, with a batch size of 12 per GPU.
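
For reference, here is a sketch of a taming-transformers-style config for a vq-f4 model. Apart from the batch size and step count stated above, the values are illustrative and follow the standard LDM vq-f4 layout, not necessarily our exact settings:

```yaml
# Sketch of a taming-transformers VQGAN config for a vq-f4 model.
# Values follow the standard LDM vq-f4 layout; adjust for your data.
model:
  base_learning_rate: 4.5e-6
  target: taming.models.vqgan.VQModel
  params:
    embed_dim: 3
    n_embed: 8192
    ddconfig:
      double_z: false
      z_channels: 3
      resolution: 256
      in_channels: 3
      out_ch: 3
      ch: 128
      ch_mult: [1, 2, 4]     # three stages -> downsampling factor f = 4
      num_res_blocks: 2
      attn_resolutions: []
      dropout: 0.0
data:
  target: main.DataModuleFromConfig
  params:
    batch_size: 12           # per GPU; 3 GPUs -> effective batch size 36
lightning:
  trainer:
    max_steps: 50000         # 50k fine-tuning iterations
```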

The Diffusers library now supports training a VQ-VAE: https://github.com/huggingface/diffusers/tree/main/examples/vqgan

You might want to track image quality metrics like PSNR / SSIM during training.
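
For example, a minimal sketch using scikit-image, assuming the images and reconstructions are HxWxC float arrays scaled to [0, 1]:

```python
# Minimal sketch: PSNR / SSIM between an image and its VAE reconstruction.
# Assumes HxWxC float arrays in [0, 1]; adapt to your evaluation loop.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def reconstruction_metrics(original: np.ndarray, reconstruction: np.ndarray):
    psnr = peak_signal_noise_ratio(original, reconstruction, data_range=1.0)
    ssim = structural_similarity(
        original, reconstruction, channel_axis=-1, data_range=1.0
    )
    return psnr, ssim
```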

@JosephDiPalma (Author)

What about the discriminator hyper-parameters and the codebook weight for the LPIPS loss?
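
For context, I mean the fields under the loss config; as I understand the taming-transformers layout, it looks roughly like this (my guess at reasonable defaults, not your actual values):

```yaml
# My guess at the loss section (taming-transformers
# VQLPIPSWithDiscriminator); the values are placeholders.
lossconfig:
  target: taming.modules.losses.vqperceptual.VQLPIPSWithDiscriminator
  params:
    disc_start: 0            # iteration at which the discriminator kicks in
    disc_weight: 0.75        # weight on the adversarial term
    disc_in_channels: 3
    disc_conditional: false
    codebook_weight: 1.0     # weight on the codebook / commitment loss
    perceptual_weight: 1.0   # weight on the LPIPS term
```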

@JosephDiPalma (Author)

On second thought, would it be possible to share the config file you used for this phase?
I have managed to make the code run, but I'm not sure if my results are correct.
They look strange: the total loss is increasing, and I'm not sure whether one of my settings is wrong.

I appreciate all the help!
