Data Augmentation via Latent Diffusion for Saliency Prediction (ECCV 2024)

This is the GitHub repository for Data Augmentation via Latent Diffusion for Saliency Prediction paper in ECCV 2024, Milano, Italy.

📝 TL;DR We use latent diffusion to augment data for saliency prediction. We show our method improves the performance of existing saliency models. Moreover, we learn multilevel features that improve saliency prediction.

📄 Resources

📚 Paper
📦 Supplementary Material
🖼️ Poster
🎞️ Video
💻 Virtual Poster Session

[🤗 Demo is coming soon!]

Saliency prediction models are constrained by the limited diversity and quantity of labeled data. Standard data augmentation techniques such as rotating and cropping alter scene composition, affecting saliency. We propose a novel data augmentation method for deep saliency prediction that edits natural images while preserving the complexity and variability of real-world scenes. Since saliency depends on high-level and low-level features, our approach involves learning both by incorporating photometric and semantic attributes such as color, contrast, brightness, and class. To that end, we introduce a saliency-guided cross-attention mechanism that enables targeted edits on the photometric properties, thereby enhancing saliency within specific image regions. Experimental results show that our data augmentation method consistently improves the performance of various saliency models. Moreover, leveraging the augmentation features for saliency prediction yields superior performance on publicly available saliency benchmarks. Our predictions align closely with human visual attention patterns in the edited images, as validated by a user study.

🌐 Visualizations

Explore our visualizations and learn more about our project at https://augsal.github.io/.

📜 Citation

If you use this work in your research, please cite our paper as follows:

@InProceedings{aydemir2024augsal,
  title     = {Data Augmentation via Latent Diffusion forSaliency Prediction},
  author    = {Aydemir, Bahar and Bhattacharjee, Deblina and Zhang, Tong and Salzmann, Mathieu and S{"u}sstrunk, Sabine},
  booktitle = {18th European Conference on Computer Vision (ECCV), Proceedings},
  year      = {2024},
}

📜 License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
augmented_images		augmented_images
images		images
pdfs		pdfs
src		src
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Data Augmentation via Latent Diffusion for Saliency Prediction (ECCV 2024)

📄 Resources

🌐 Visualizations

📜 Citation

📜 License

About

Releases

Packages

Languages

IVRL/AugSal

Folders and files

Latest commit

History

Repository files navigation

Data Augmentation via Latent Diffusion for Saliency Prediction (ECCV 2024)

📄 Resources

🌐 Visualizations

📜 Citation

📜 License

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages