This repository provides the implementation of the paper "Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing" (WACV 2025).
Keywords: Diffusion Model, Image Inversion, Image Editing
CLICK for the full abstract
Text-guided image generation and editing using diffusion models have achieved remarkable advancements. Among these, tuning-free methods have gained attention for their ability to perform edits without extensive model adjustments, offering simplicity and efficiency. However, existing tuning-free approaches often struggle with balancing fidelity and editing precision, particularly due to the influence of cross-attention in DDIM inversion, which introduces reconstruction errors. To address this, we analyze reconstruction from a structural perspective and propose a novel approach that replaces traditional cross-attention with uniform attention maps, significantly enhancing image reconstruction fidelity. Our method effectively minimizes distortions caused by varying text conditions during noise prediction. To complement this improvement, we introduce an adaptive mask-guided editing technique that integrates seamlessly with our reconstruction approach, ensuring consistency and accuracy in editing tasks. Experimental results demonstrate that our approach not only excels in achieving high-fidelity image reconstruction but also performs robustly in real image composition and editing scenarios. This study underscores the potential of uniform attention maps to enhance the fidelity and versatility of diffusion-based image processing methods.
(a) Image reconstruction using DDIM with different prompts. (b) Our approach introduces Uniform Attention Maps. (c) The proposed tuning-free image editing framework.
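The core idea is compact enough to illustrate in a few lines. The sketch below shows, in plain PyTorch, how a cross-attention layer's probability map can be swapped for a uniform map during inversion so that the predicted noise no longer depends on the text prompt. The function and argument names are illustrative only and are not the repository's API.

```python
# Minimal sketch (PyTorch) of uniform attention maps for cross-attention.
# Names here are illustrative assumptions, not the repository's actual code.
import torch

def cross_attention(q, k, v, uniform: bool = False):
    """q: (batch, n_queries, d); k, v: (batch, n_text_tokens, d)."""
    scale = q.shape[-1] ** -0.5
    attn = torch.softmax(q @ k.transpose(-1, -2) * scale, dim=-1)  # (b, n_q, n_t)
    if uniform:
        # Uniform attention map: every query attends equally to every text token,
        # removing the prompt-dependent variation from the noise prediction.
        attn = torch.full_like(attn, 1.0 / attn.shape[-1])
    return attn @ v
```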
conda create -n masa python=3.9 -y
conda activate masa
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
pip install -r environment/masactrl_requirements.txt
You can download the benchmark PIE-Bench (Prompt-driven Image Editing Benchmark) here.
python run_editing_masactrl.py --edit_category_list 0 1 2 3 4 5 6 7 8 9 --output_path output --edit_method_list "ddim+masactrl" --guidance_list 7.5 --quantile_list 0.5 --recon_t_list 200
python evaluation/eval.py --metrics "structure_distance" "psnr_unedit_part" "lpips_unedit_part" "mse_unedit_part" "ssim_unedit_part" "clip_similarity_source_image" "clip_similarity_target_image" "clip_similarity_target_image_edit_part" --edit_category_list 0 1 2 3 4 5 6 7 8 9 --tgt_methods 1_ddim+masactrl --output_path output
As shown in the pseudocode, adding the single red-highlighted line from our algorithm significantly improves editing quality. The quantitative results are reported in the table below, where the red "+ ours" entries indicate the improvements.
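For reference, the sketch below shows one plausible form of such a mask-guided blending step: the edited latent is kept inside an attention-derived mask, while the faithfully reconstructed source latent is copied back outside it. The function name, the quantile threshold, and the timestep gate loosely mirror the script flags (--quantile_list, --recon_t_list) but are assumptions, not the repository's exact pseudocode.

```python
# Rough sketch (PyTorch) of an adaptive mask-guided blending step.
# All names and the exact gating schedule are assumptions for illustration.
import torch

def blend_latents(z_edit, z_src, cross_attn_map, t, quantile=0.5, recon_t=200):
    """z_edit, z_src: (b, c, h, w) latents; cross_attn_map: (b, h, w) map
    averaged over the edited token(s); t: current denoising timestep."""
    if t >= recon_t:
        # Outside the reconstruction window the edit branch runs unconstrained
        # (the exact schedule is an assumption).
        return z_edit
    # Adaptive mask: keep regions whose attention exceeds the chosen quantile.
    thresh = torch.quantile(cross_attn_map.flatten(1), quantile, dim=1)
    mask = (cross_attn_map > thresh.view(-1, 1, 1)).float().unsqueeze(1)
    # Inside the mask keep the edit; outside, restore the source reconstruction.
    return mask * z_edit + (1.0 - mask) * z_src
```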
Quantitative comparison of image editing on the PIE-Bench benchmark. All methods are compared using MasaCtrl attention control.
@misc{mo2024uniformattentionmapsboosting,
      title={Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing},
      author={Wenyi Mo and Tianyu Zhang and Yalong Bai and Bing Su and Ji-Rong Wen},
      year={2024},
      eprint={2411.19652},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2411.19652},
}
Our code is heavily based on prompt-to-prompt, PnPInversion, MasaCtrl, pix2pix-zero, and Plug-and-Play. Thanks to all the contributors!