
How to reach FID 1.92 on Imagenet 64 #8

Open
Schwartz-Zha opened this issue Jul 10, 2024 · 8 comments
@Schwartz-Zha

I downloaded the pretrained Imagenet 64 checkpoint and used the provided sampling commands (with slight modifications to make them run on my machine):

export OMPI_COMM_WORLD_RANK=0
export OMPI_COMM_WORLD_LOCAL_RANK=0
export OMPI_COMM_WORLD_SIZE=8

MODEL_FLAGS="--data_name=imagenet64 --class_cond=True --eval_interval=1000 --save_interval=1000 --num_classes=1000 --eval_batch=250 --eval_fid=True --eval_similarity=False --check_dm_performance=False --log_interval=100"

# CUDA_VISIBLE_DEVICES=0 mpiexec -n 1 python ./code/image_sample.py $MODEL_FLAGS --class_cond=True --num_classes=1000 --out_dir ./ctm-sample-paths/ctm_bs_1440/ --model_path=./ctm-runs/ctm_bs_1440/ema_0.999_006000.pt --training_mode=ctm --class_cond=True --eval_num_samples=6400 --batch_size=800 --device_id=0 --stochastic_seed=True --save_format=npz --ind_1=36 --ind_2=20 --use_MPI=True --sampler=exact --sampling_steps=1

CUDA_VISIBLE_DEVICES=0 mpiexec -n 1 python ./code/image_sample.py $MODEL_FLAGS --class_cond=True --num_classes=1000 --out_dir ./ctm-sample-paths/ctm_bs_1440_author/ --model_path=./ckpts/ema_0.999_049000.pt --training_mode=ctm --class_cond=True --eval_num_samples=6400 --batch_size=800 --device_id=0 --stochastic_seed=True --save_format=npz --ind_1=36 --ind_2=20 --use_MPI=True --sampler=exact --sampling_steps=1

Then I evaluated the samples according to the provided instructions:

CUDA_VISIBLE_DEVICES=0 python code/evaluations/evaluator.py ref-statistics/VIRTUAL_imagenet64_labeled.npz ctm-sample-paths/ctm_bs_1440_author/ctm_exact_sampler_1_steps_049000_itrs_0.999_ema_/

But the measured performance is significantly worse:

Inception Score: 68.49456024169922
FID: 6.839029518808786
sFID: 22.465793478419187
Precision: 0.7965625
Recall: 0.6551

How exactly can I reproduce the FID of 1.92, even with the pretrained model provided directly by the authors?

@ehedlin

ehedlin commented Dec 2, 2024

I got similar results. I tried both QKVAttentionLegacy and XformersAttention, and suspect the issues raised here could be hurting results. Which form of attention did you use?

@ehedlin

ehedlin commented Dec 2, 2024

I also realised that the inference code doesn't seem to use rejection sampling as shown in the paper. This line suggests that rejection sampling was run at a ratio of 10,000, but the code it references doesn't seem to exist in this repo. There is also this file with no documentation; it seems the most promising lead. (A rough sketch of the idea is below.)
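For anyone unfamiliar with the technique, my rough understanding of classifier rejection sampling is: oversample, then keep only the candidates that a pretrained classifier assigns high probability to the conditioning class. This is only a sketch of that idea; classifier, samples, and labels are placeholder names, not code from this repo.

import torch

# Sketch of classifier rejection sampling (placeholder names, not repo code):
# score each generated sample with a pretrained classifier, then keep only the
# top-scoring fraction for its conditioning class.
def rejection_sample(classifier, samples, labels, keep_ratio=0.1):
    with torch.no_grad():
        probs = classifier(samples).softmax(dim=-1)           # (N, num_classes)
        scores = probs.gather(1, labels[:, None]).squeeze(1)  # p(label | sample)
    k = max(1, int(keep_ratio * len(samples)))                # number to keep
    keep = scores.topk(k).indices                             # best-scoring k
    return samples[keep], labels[keep]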

@Schwartz-Zha
Author

I am using a newer NVIDIA GPU (an L40S), so hardware support for attention and xformers is not a concern. Did you manage to merge rejection sampling into the evaluation pipeline?

@ehedlin

ehedlin commented Dec 2, 2024

I ran the classifier rejection code, but it seemed to produce similar results, so I emailed the authors to ask about the difference in performance.

@Schwartz-Zha
Author

Sorry, I am a newbie here. Would you mind pasting the command you used to run code/classifier_rejection.py? I have no idea how to do that. CTM claims to use this sampling strategy to achieve better results, and I am confused about how to reproduce it.

@ehedlin

ehedlin commented Dec 4, 2024

I was able to get the published performance by setting --eval_num_samples=50000 when generating samples (the default is 6400). I'm assuming that's what was intended, as that number seems to be hard-coded into the evaluation script. Put another way, the quality of the generated samples doesn't seem to be the problem; the FID is just inflated when it is computed against fewer samples.
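For reference, that is just the second sampling command from earlier in this thread with the sample count raised (paths are from that setup, so adjust to yours):

CUDA_VISIBLE_DEVICES=0 mpiexec -n 1 python ./code/image_sample.py $MODEL_FLAGS --out_dir ./ctm-sample-paths/ctm_bs_1440_author/ --model_path=./ckpts/ema_0.999_049000.pt --training_mode=ctm --class_cond=True --num_classes=1000 --eval_num_samples=50000 --batch_size=800 --device_id=0 --stochastic_seed=True --save_format=npz --ind_1=36 --ind_2=20 --use_MPI=True --sampler=exact --sampling_steps=1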

@Schwartz-Zha
Author

Ah, I see. That makes perfect sense.

By the way, do you still need classifier rejection sampling after changing to 50000 samples?

@ehedlin

ehedlin commented Dec 5, 2024

No, I just used code/image_sample.py and code/evaluations/evaluator.py.
