[ICCV 2019] ViSiL: Fine-grained Spatio-Temporal Video Similarity Learning #7

uhhyunjoo · 2022-04-13T16:02:55Z

| | link |
|---|---|
| paper | ViSiL: Fine-grained Spatio-Temporal Video Similarity Learning |
| code | papers with code |

uhhyunjoo · 2022-04-14T07:48:23Z

Abstract

Proposes ViSiL, a Video Similarity Learning architecture that takes into account fine-grained spatio-temporal relations between pairs of videos
- Previous video retrieval approaches did not consider these relations
- "Previous approaches" here means embedding a whole frame or a whole video into a single vector descriptor, and only then estimating similarity
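To make concrete what a vector-descriptor baseline loses, here is a minimal sketch. Mean pooling and the names `global_descriptor` and `descriptor_similarity` are assumptions chosen for illustration; the notes above do not say which aggregation earlier methods used.

```python
import numpy as np

def global_descriptor(frame_features):
    """Collapse (T, C) per-frame features into one L2-normalized (C,) vector.

    Mean pooling is an assumed aggregation, used here only for illustration.
    """
    pooled = frame_features.mean(axis=0)
    return pooled / np.linalg.norm(pooled)

def descriptor_similarity(video_a, video_b):
    """Cosine similarity between the two video-level descriptors."""
    return float(global_descriptor(video_a) @ global_descriptor(video_b))

# Example: a video and its time-reversed copy get the same descriptor,
# so the temporal structure is invisible to this kind of comparison.
rng = np.random.default_rng(0)
video = rng.normal(size=(8, 16))        # 8 frames, 16-dim features
reversed_video = video[::-1]
similarity = descriptor_similarity(video, reversed_video)
```

Because pooling happens before any comparison, frame order and region layout no longer affect the score, which is the limitation the notes attribute to earlier approaches.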
ViSiL is a CNN-based model that learns to compute video-to-video similarity from refined frame-to-frame similarity matrices
This lets it account for both intra-frame and inter-frame relations
Proposed method
1. Apply Tensor Dot (TD) and then Chamfer Similarity (CS) to regional CNN features to estimate pairwise frame similarity
  - This avoids aggregating features before the similarity between frames is computed
2. The similarity matrix between video frames is fed into a four-layer CNN, and CS is then used to reduce it to a video-to-video similarity
  - This avoids aggregating features before the similarity between videos is computed
  - This captures temporal similarity patterns between matching frame sequences
3. Training: a triplet loss scheme
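The three steps above can be sketched roughly as follows in NumPy. This is an illustration under stated assumptions, not the authors' implementation: the four-layer CNN is omitted entirely, region features are assumed to be L2-normalized, and the function names, tensor shapes, and `margin=0.5` are hypothetical.

```python
import numpy as np

def frame_to_frame_similarity(video_a, video_b):
    """Step 1: Tensor Dot + Chamfer Similarity over regions.

    video_a: (X, R, C) and video_b: (Y, R, C) region features, assumed
    L2-normalized along the channel axis. Returns an (X, Y) matrix.
    """
    # Tensor Dot over channels: (X, R, Y, R) region-to-region similarities.
    region_sim = np.tensordot(video_a, video_b, axes=([2], [2]))
    # Chamfer Similarity: best match in frame b for each region of frame a,
    # then the mean over the regions of frame a.
    return region_sim.max(axis=3).mean(axis=1)

def video_to_video_similarity(frame_sim):
    """Step 2: Chamfer Similarity over frames.

    In the paper a four-layer CNN refines frame_sim first; that network is
    left out of this sketch, so the raw matrix is used directly.
    """
    return float(frame_sim.max(axis=1).mean())

def triplet_loss(sim_pos, sim_neg, margin=0.5):
    """Step 3: hinge-style triplet loss on similarities (margin is assumed)."""
    return max(0.0, sim_neg - sim_pos + margin)

def random_video(rng, frames, regions=9, channels=16):
    """Hypothetical helper: random unit-norm region features for the demo."""
    feats = rng.normal(size=(frames, regions, channels))
    return feats / np.linalg.norm(feats, axis=2, keepdims=True)

rng = np.random.default_rng(0)
anchor = random_video(rng, frames=6)
positive = anchor.copy()                 # identical video as a trivial positive
negative = random_video(rng, frames=5)   # unrelated video, different length

sim_pos = video_to_video_similarity(frame_to_frame_similarity(anchor, positive))
sim_neg = video_to_video_similarity(frame_to_frame_similarity(anchor, negative))
loss = triplet_loss(sim_pos, sim_neg)
```

Note that no pooling happens before a similarity is computed at either level: regions are compared before being reduced to a frame score, and frames are compared before being reduced to a video score.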
Evaluated on four retrieval problems using five benchmark datasets, achieving state-of-the-art results

uhhyunjoo added the ICCV (International Conference on Computer Vision) and V2V (Video to Video Retrieval) labels Apr 14, 2022
