📝 TL;DR TempSAL introduces a novel approach to saliency prediction that leverages temporal information to model how human attention shifts over time while viewing an image. Unlike conventional models that predict a single static saliency map, TempSAL predicts attention at different time intervals, capturing how the salient regions of a scene evolve as viewers explore it. Using a temporal model trained on the SALICON dataset, TempSAL achieves state-of-the-art performance in temporal saliency prediction. Key applications include video analysis, human-computer interaction, and attention-based scene understanding. The method can predict both per-interval saliency maps and an aggregated saliency map for the entire observation period.
📢 New Release: TensorFlow Weights
TensorFlow weights for TempSAL are now available in this repo! 🎉
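A minimal, hypothetical sketch of loading the released weights in TensorFlow follows; the file name and format (SavedModel vs. a checkpoint or `.h5` file) are assumptions, so check the repo's files for the actual loading code.

```python
import tensorflow as tf

# Assumed SavedModel directory name; if the released weights are instead a
# checkpoint or .h5 file, the loading call will differ.
model = tf.keras.models.load_model("tempsal_tf_weights")
```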
Example of Evolving Human Attention Over Time:
The top row shows temporal (orange) and image (pink) saliency ground truth from the SALICON dataset. The bottom row displays our predictions. Each temporal saliency map $\mathcal{T}_i$, where $i \in \{1,\ldots,5\}$, represents one second of observation time. Notably, in $\mathcal{T}_1$, the chef is the salient focus, while in $\mathcal{T}_2$ and $\mathcal{T}_3$, the food on the barbecue becomes the most salient region. Temporal saliency maps can be predicted for each interval individually or combined to produce a refined saliency map for the entire observation period.
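To make "combined" concrete, here is a minimal sketch of a naive aggregation baseline: averaging the five per-second slices into a single map. TempSAL itself learns this combination, so this is only illustrative, and the variable names and placeholder data are assumptions.

```python
import numpy as np

# temporal_maps: array of shape (5, H, W) holding the per-second saliency
# slices (hypothetical; in practice they come from the model or ground truth).
temporal_maps = np.random.rand(5, 288, 384).astype(np.float32)  # placeholder data

# A naive aggregate over the full observation period is the mean of the slices;
# TempSAL instead learns how to fuse them with the image saliency prediction.
aggregate = temporal_maps.mean(axis=0)
aggregate /= aggregate.max()  # normalize to [0, 1] for visualization
```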
- Visit the TempSAL Project Page for more resources and supplementary materials.
- 📹 Video on YouTube: Watch an overview of TempSAL.
- 📑 Slides (PDF): Download the presentation slides.
- 🖼️ Poster (PDF): View the poster displayed at CVPR 2023, summarizing our model, key experiments, and results.
- 💻 Virtual Poster Session: Access the virtual poster session for additional context.
Install all necessary packages by running the following command in the `src/` folder:

```bash
pip install -r requirements.txt
```
- Download Model Checkpoint: Download the pre-trained model from Google Drive.
- Run Inference: Follow the instructions in `inference.ipynb` to generate predictions for both temporal and image saliency; a minimal sketch of the workflow is shown below.
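As a companion to `inference.ipynb`, here is a minimal, hypothetical sketch of the inference loop. The checkpoint filename, input resolution, and the model's output signature are assumptions; treat the notebook as the authoritative reference.

```python
import torch
from PIL import Image
from torchvision import transforms

# Load the pre-trained checkpoint downloaded from Google Drive (assumed
# filename; if the file holds a state_dict rather than a full model, the
# notebook shows the exact loading code).
model = torch.load("tempsal_checkpoint.pt", map_location="cpu")
model.eval()

# Assumed preprocessing; SALICON images are commonly resized to 288x384.
preprocess = transforms.Compose([
    transforms.Resize((288, 384)),
    transforms.ToTensor(),
])
image = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)

with torch.no_grad():
    # Assumed output signature: five temporal slices plus one image saliency map.
    temporal_maps, image_map = model(image)
```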
- Download Ground-Truth Data: Temporal saliency ground-truth maps and fixation data from the SALICON dataset are available here.
- Generate Custom Saliency Volumes: Alternatively, use `generate_volumes.py` to create temporal saliency slices with customizable intervals; see the sketch after this list for the underlying idea.
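For intuition, here is a minimal sketch of how temporal saliency slices can be built from fixation data: bin each fixation into its time interval, then blur the resulting count maps. The function name, fixation format, and blur parameters are assumptions; `generate_volumes.py` is the actual implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def temporal_slices(fixations, height, width, n_slices=5, total_ms=5000, sigma=19):
    """Bin (x, y, t_ms) fixations into equal time intervals and blur each bin."""
    interval = total_ms / n_slices
    volume = np.zeros((n_slices, height, width), dtype=np.float32)
    for x, y, t in fixations:
        idx = min(int(t // interval), n_slices - 1)  # clamp t == total_ms into the last slice
        volume[idx, int(y), int(x)] += 1.0
    for i in range(n_slices):
        volume[i] = gaussian_filter(volume[i], sigma=sigma)  # fixation counts -> smooth saliency
        if volume[i].max() > 0:
            volume[i] /= volume[i].max()  # normalize each slice to [0, 1]
    return volume
```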
For temporal saliency training and prediction, please refer to TemporalSaliencyPrediction by Ludo Hoffstetter.
If you use this work in your research, please cite our paper as follows:
@InProceedings{aydemir2023tempsal,
title = {TempSAL - Uncovering Temporal Information for Deep Saliency Prediction},
author = {Aydemir, Bahar and Hoffstetter, Ludo and Zhang, Tong and Salzmann, Mathieu and S{\"u}sstrunk, Sabine},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2023},
}
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.