Paper Reading

ICLR2021

In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning: This paper studies Semi-Supervised Learning (SSL). It argues that consistency regularization, a popular approach in SSL, has limitations such as requiring domain-specific data augmentation. Pseudo-labeling, on the other hand, does not have these limitations but underperforms relative to consistency regularization. The authors attribute this to the large amount of noise in the pseudo-labels, which results from over-confident models. They connect network calibration to uncertainty estimation and, by including model uncertainty in the pseudo-label selection process, reduce the noise level and improve overall performance. They experiment with multiple uncertainty estimation methods and show that all of these methods achieve similar results.
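The selection step can be illustrated with a minimal sketch. It assumes uncertainty is measured as the standard deviation of the predicted class probability over several stochastic forward passes (for example MC dropout). The function name, the thresholds, and the toy probabilities are hypothetical and not taken from the paper.

```python
import numpy as np

def select_pseudo_labels(mc_probs, conf_threshold=0.7, unc_threshold=0.05):
    """Keep pseudo-labels that are both confident and low-uncertainty.

    mc_probs: array of shape (passes, samples, classes) with softmax outputs
    from several stochastic forward passes. Thresholds are illustrative.
    """
    mean_probs = mc_probs.mean(axis=0)          # (samples, classes)
    labels = mean_probs.argmax(axis=1)          # pseudo-label per sample
    idx = np.arange(mean_probs.shape[0])
    confidence = mean_probs[idx, labels]
    # Uncertainty: spread of the predicted class probability across passes.
    uncertainty = mc_probs.std(axis=0)[idx, labels]
    keep = (confidence >= conf_threshold) & (uncertainty <= unc_threshold)
    return labels, keep

# Toy data: 3 passes, 3 unlabeled samples, 2 classes.
mc_probs = np.array([
    [[0.90, 0.10], [0.95, 0.05], [0.60, 0.40]],
    [[0.92, 0.08], [0.55, 0.45], [0.58, 0.42]],
    [[0.91, 0.09], [0.90, 0.10], [0.62, 0.38]],
])
labels, keep = select_pseudo_labels(mc_probs)
print(labels, keep)
```

In this toy example only the first sample survives: the second is confident on average but unstable across passes, and the third is stable but not confident enough.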

ICCV2021

CVPR2021

Learning Position and Target Consistency for Memory-based Video Object Segmentation: Matching-based methods do not use any prior about the sequential order of the frames or about how the pixels of an object move together. This paper addresses this problem by introducing 1) a global retrieval module, 2) a position guidance module, and 3) an object relation module. The global retrieval module mainly follows the architecture in STM. For the position guidance module, additional local keys are extracted from the query embedding and the previous adjacent memory embedding. This module adds positional encoding to both of these embeddings, making the local keys position-sensitive. Finally, the object relation module brings object-level information from the first frame to improve target consistency. This way, the method pays special attention to the first frame, unlike previous methods that treat the first frame the same as the other frames stored in the memory bank.

Similar ideas for temporal consistency: Blazingly Fast Video Object Segmentation with Pixel-Wise Metric Learning, One-Shot Object Detection with Co-Attention and Co-Excitation.
Why is the position embedding added only to the previous frame?
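To make the idea of position-sensitive keys concrete, here is a rough sketch that adds a sinusoidal positional encoding to flattened key features and then performs a softmax memory read in the style of matching-based methods. The choice of sinusoidal encoding, the shapes, and all function names are assumptions for illustration; the notes above do not specify how the paper builds its encoding or its local keys.

```python
import numpy as np

def sinusoidal_encoding(num_positions, dim):
    """Standard sinusoidal encoding: sin on even channels, cos on odd ones."""
    pos = np.arange(num_positions)[:, None]
    i = np.arange(dim)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / dim)
    return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))

def memory_read(query_keys, memory_keys, memory_values):
    """Soft lookup: each query location attends over all memory locations."""
    scores = query_keys @ memory_keys.T                  # (Nq, Nm)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ memory_values, weights

rng = np.random.default_rng(0)
num_pixels, key_dim, val_dim = 6, 8, 4           # hypothetical sizes
pe = sinusoidal_encoding(num_pixels, key_dim)
query_keys = rng.normal(size=(num_pixels, key_dim)) + pe   # current frame
prev_keys = rng.normal(size=(num_pixels, key_dim)) + pe    # previous frame
prev_values = rng.normal(size=(num_pixels, val_dim))
read, weights = memory_read(query_keys, prev_keys, prev_values)
```

Because the same encoding is added to both sets of keys, locations that are spatially close get a higher matching score than they would from appearance alone.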

Delving Deep into Many-to-many Attention for Few-shot Video Object Segmentation: In few-shot VOS, a support set showing multiple appearances of a target object is provided; given query frames containing an object instance from the same class, the model should segment the objects of that category. There are two main approaches: computing a prototype feature vector from the support set and detecting the object in the query by comparison to the prototype, or performing many-to-many attention between the support set and the query frames, which is computationally expensive. This paper takes the latter approach and proposes a way to reduce the cost of the many-to-many attention operation from quadratic to linear without performance loss.

A limitation is the naive way of choosing the agent frame from the video (middle frame), which could be the subject of future work.
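The cost argument can be sketched with a generic two-step attention that routes information through a single agent frame (here the middle query frame, as mentioned above). This is only an illustration of why the number of compared pairs drops; the function names and shapes are hypothetical and the paper's actual architecture may differ.

```python
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def attend(queries, keys, values):
    """Scaled dot-product attention."""
    return softmax(queries @ keys.T / np.sqrt(keys.shape[1])) @ values

def agent_attention(query_frames, support_feats, support_vals):
    """query_frames: (T, N, C); support_feats: (M, C); support_vals: (M, D)."""
    T, N, C = query_frames.shape
    agent = query_frames[T // 2]                              # middle frame
    agent_vals = attend(agent, support_feats, support_vals)   # N x M pairs
    flat_q = query_frames.reshape(T * N, C)
    out = attend(flat_q, agent, agent_vals)                   # T*N x N pairs
    return out.reshape(T, N, -1)

rng = np.random.default_rng(0)
T, N, C, D = 5, 16, 8, 4                 # hypothetical sizes
M = 5 * N                                # 5 support frames, N pixels each
query_frames = rng.normal(size=(T, N, C))
support_feats = rng.normal(size=(M, C))
support_vals = rng.normal(size=(M, D))
out = agent_attention(query_frames, support_feats, support_vals)

full_pairs = T * N * M                   # direct many-to-many attention
agent_pairs = N * M + T * N * N          # two smaller attentions via the agent
print(full_pairs, agent_pairs)
```

With these toy sizes the direct attention compares 6400 pairs while the agent route compares 2560, and the gap widens as the number of frames grows.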

Group Collaborative Learning for Co-Salient Object Detection:

Task: Co-salient object detection aims to detect the common salient objects that share the same attributes in a group of relevant images.
Why: Instead of only using images from the same group (similar things), teach the network dissimilar things using images from other groups. The goal is therefore to increase intra-group compactness and inter-group distinctiveness.
How: The Group Affinity module brings the embeddings of objects from the same category closer together by computing a general group consensus from a group of images containing the same object (using correlation operations). The Group Collaborative Learning Network improves inter-group separability with similar operations, adding only a cross-group correlation. The consensus computed from this cross-group operation should not be able to detect the common object.
Reference: SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks
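A simplified view of the consensus idea: average the normalized features of a group into one consensus vector, then correlate it with per-pixel features. The sketch below assumes cosine similarity as the correlation and uses synthetic features; all names and sizes are hypothetical, and the paper's modules are more elaborate than this.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def group_consensus(features):
    """features: (images, pixels, channels) -> one unit consensus vector."""
    flat = l2_normalize(features.reshape(-1, features.shape[-1]))
    return l2_normalize(flat.mean(axis=0))

def correlation_map(features, consensus):
    """Cosine similarity of every pixel feature with the consensus."""
    return l2_normalize(features) @ consensus        # (images, pixels)

rng = np.random.default_rng(0)
# Two synthetic groups whose features cluster around different directions.
group_a = np.array([1.0, 0, 0, 0]) + 0.1 * rng.normal(size=(3, 10, 4))
group_b = np.array([0, 1.0, 0, 0]) + 0.1 * rng.normal(size=(3, 10, 4))

consensus_a = group_consensus(group_a)
intra = correlation_map(group_a, consensus_a).mean()   # same group
cross = correlation_map(group_b, consensus_a).mean()   # other group
print(intra, cross)
```

The intra-group response is high and the cross-group response is low, which is the behaviour the training objective is meant to encourage.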

Inception Convolution with Efficient Dilation Search:

Task: Any architecture that uses convolutional layers.
Why: To obtain an adaptive receptive field by searching for the optimal dilation rates across spatial and channel dimensions instead of using a fixed, manually chosen dilation.
How: Using a search algorithm referred to as EDO (efficient dilation optimization). The statistical optimization minimizes the L1 error between the expected output of the pre-trained weights (from the so-called supernet) and the expected output of the sampled dilation weights. For more information about the role of the pre-trained weights, refer to the DARTS method.
Question: Why should the dilation pattern give us the same expected value as the pre-trained supernet? Does this optimization happen together with the actual training of the backbone weights?
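As a toy illustration of the selection criterion, the sketch below picks the dilation whose 3x3 sampling pattern best matches a larger supernet kernel. It assumes, as a simplification not stated in the notes, that the output error can be approximated by the L1 difference between the supernet weights and the weights kept by the dilated pattern. The function names and the toy kernel are hypothetical.

```python
import numpy as np

def dilation_mask(kernel_size, dilation):
    """Boolean mask of the 3x3 positions sampled by a given dilation."""
    center = kernel_size // 2
    offsets = [center - dilation, center, center + dilation]
    mask = np.zeros((kernel_size, kernel_size), dtype=bool)
    mask[np.ix_(offsets, offsets)] = True
    return mask

def best_dilation(supernet_weight, max_dilation):
    """Choose the dilation that discards the least weight mass (L1)."""
    k = supernet_weight.shape[0]
    errors = {}
    for d in range(1, max_dilation + 1):
        mask = dilation_mask(k, d)
        # L1 gap between the full kernel and the kernel restricted to the mask.
        errors[d] = float(np.abs(supernet_weight[~mask]).sum())
    return min(errors, key=errors.get), errors

# Toy 5x5 supernet kernel whose mass sits on the dilation-2 pattern.
weight = np.full((5, 5), 0.01)
weight[np.ix_([0, 2, 4], [0, 2, 4])] = 1.0
chosen, errors = best_dilation(weight, max_dilation=2)
print(chosen, errors)
```

In the actual method the search runs independently per axis and per channel group, so the same comparison would be repeated for each of those slices.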

Improving Multiple Object Tracking with Single Object Tracking: This paper proposes the SOTMOT architecture for multiple object tracking (MOT), bringing advances in single object tracking (SOT) to the MOT setup. The training pipeline consists of offline and online phases. During offline training, the SOT branch (which is based on CenterNet) is trained by minimizing a ridge regression loss. CenterNet produces a heatmap of the objects as well as an offset value for each object center and the bounding box sizes. In online inference, an association algorithm (DeepSORT) is used to find the optimal trajectory for each object.

Additional references: Learning Feature Embeddings for Discriminant Model based Tracking, Simple Online and Realtime Tracking with a Deep Association Metric
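For reference, the ridge regression objective mentioned above has a closed-form solution, w = (XᵀX + λI)⁻¹Xᵀy. The sketch below shows only this generic solver on synthetic data; how SOTMOT builds the features and regression targets for each object is not covered in these notes, so the inputs here are hypothetical.

```python
import numpy as np

def ridge_fit(features, targets, lam=1.0):
    """Closed-form ridge regression: solve (X^T X + lam * I) w = X^T y."""
    d = features.shape[1]
    gram = features.T @ features + lam * np.eye(d)
    return np.linalg.solve(gram, features.T @ targets)

rng = np.random.default_rng(0)
features = rng.normal(size=(50, 3))        # hypothetical sample features
w_true = np.array([1.0, -2.0, 0.5])
targets = features @ w_true                # noiseless labels for the demo

w_small = ridge_fit(features, targets, lam=1e-8)   # nearly unregularized
w_large = ridge_fit(features, targets, lam=10.0)   # shrunk toward zero
print(w_small, w_large)
```

A larger λ shrinks the weights toward zero, trading some fit for stability when only a few samples of a target are available.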

NeurIPS2020

Dual-Resolution Correspondence Networks:

Additional references: Neighbourhood Consensus Networks

ECCV2020, highlights

Embedding Propagation: Smoother Manifold for Few-Shot Classification:

Additional references: Learning from labeled and unlabeled data with label propagation

fatemehazimi990/Paper-Reading
