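- 6D pose estimation
- Object detection
- Image segmentation
- Face recognition
- Object tracking
- 3D point cloud & reconstruction
- Image processing
- Image classification
- Action recognition
- Video analysis
- OCR & GAN
- Few-shot / zero-shot / weakly supervised / unsupervised / self-supervised learning
- Pedestrian tracking / pedestrian detection / ReID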
Paper List | Link | Keywords |
---|---|---|
G2L-Net: Global to Local Network for Real-Time 6D Pose Estimation With Embedding Vector Features - CVPR 2020 | arXiv GitHub | 6D Object Pose Estimation, RGB-D image, real-time |
LatentFusion: End-to-End Differentiable Reconstruction and Rendering for Unseen Object Pose Estimation - CVPR 2020 | arXiv GitHub | 6D Object Pose Estimation, Unseen objects, Latent 3D representation |
Single-Stage 6D Object Pose Estimation - CVPR 2020 | arXiv GitHub | 6D Object Pose Estimation, Single-stage, both accuracy and speed |
PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation - CVPR 2020 | arXiv GitHub | 6D Object Pose Estimation, single RGBD image, 3D-keypoint detection |
HybridPose: 6D Object Pose Estimation under Hybrid Representations - CVPR 2020 | arXiv GitHub | 6D Object Pose Estimation, RGB data, Real-time |
MoreFusion: Multi-object Reasoning for 6D Pose Estimation from Volumetric Fusion - CVPR 2020 | arXiv GitHub | 6D Object Pose Estimation, RGB-D data, Real-time |
EPOS: Estimating 6D Pose of Objects with Symmetries - CVPR 2020 | arXiv HomePage | 6D Object Pose Estimation, Single RGB input, Surface Fragments |
Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection
Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector
AugFPN: Improving Multi-scale Feature Learning for Object Detection
Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection
Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
CentripetalNet: Pursuing High-quality Keypoint Pairs for Object Detection
Semi-Supervised Semantic Image Segmentation with Self-correcting Networks
Deep Snake for Real-Time Instance Segmentation
CenterMask: Real-Time Anchor-Free Instance Segmentation
SketchGCN: Semantic Sketch Segmentation with Graph Convolutional Networks
PolarMask: Single Shot Instance Segmentation with Polar Representation
xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation
BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation
Enhancing Generic Segmentation with Learned Region Representations
Towards Universal Representation Learning for Deep Face Recognition
Suppressing Uncertainties for Large-Scale Facial Expression Recognition
Face X-ray for More General Face Forgery Detection
Pose Agnostic Cross-spectral Hallucination via Disentangling Independent Factors
Deep Spatial Gradient and Temporal Depth Learning for Face Anti-spoofing
Learning Meta Face Recognition in Unseen Domains
ROAM: Recurrently Optimizing Tracking Model
PF-Net: Point Fractal Network for 3D Point Cloud Completion
PointAugment: an Auto-Augmentation Framework for Point Cloud Classification
Learning multiview 3D point cloud registration
C-Flow: Conditional Generative Flow Models for Images and 3D Point Clouds
RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds
Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image
Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion
In Perfect Shape: Certifiably Optimal 3D Shape Reconstruction from 2D Landmarks
Attentive Context Normalization for Robust Permutation-Equivariant Learning
- Paper: https://arxiv.org/abs/1907.02545 (authors: Weiwei Sun, Wei Jiang, Eduard Trulls, Andrea Tagliasacchi, Kwang Moo Yi)
PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes
SG-NN: Sparse Generative Neural Networks for Self-Supervised Scene Completion of RGB-D Scans
Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching
Unsupervised Learning of Intrinsic Structural Representation Points
Learning to Shade Hand-drawn Sketches
Single Image Reflection Removal through Cascaded Refinement
Generalized ODIN: Detecting Out-of-distribution Image without Learning from Out-of-distribution Data
Deep Image Harmonization via Domain Verification
RoutedFusion: Learning Real-time Depth Map Fusion
Neural Contours: Learning to Draw Lines from 3D Shapes
Towards Photo-Realistic Virtual Try-On by Adaptively Generating↔Preserving Image Content
Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task (image processing: image feature matching)
Correspondence Networks with Adaptive Neighbourhood Consensus (image processing: image feature matching)
Normalized and Geometry-Aware Self-Attention Network for Image Captioning (image processing: image captioning)
Self-training with Noisy Student improves ImageNet classification
Image Matching across Wide Baselines: From Paper to Practice
Towards Robust Image Classification Using Sequential Attention Models
Learning in the Frequency Domain
Learning from Web Data with Memory Module
Making Better Mistakes: Leveraging Class Hierarchies with Deep Networks
- Paper: https://arxiv.org/abs/1912.09393
VIBE: Video Inference for Human Body Pose and Shape Estimation
Distribution-Aware Coordinate Representation for Human Pose Estimation
4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras
Optimal least-squares solution to the hand-eye calibration problem
D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry
Multi-Modal Domain Adaptation for Fine-Grained Action Recognition
The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation
PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation
Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation
G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with Embedding Vector Features
Deep Image Spatial Transformation for Person Image Generation
- Code: https://github.com/RenYurui/Global-Flow-Local-Attention
Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications
Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs
Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning
Object Relational Graph with Teacher-Recommended Learning for Video Captioning
Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution
Blurry Video Frame Interpolation
Hierarchical Conditional Relation Networks for Video Question Answering
Action Modifiers: Learning from Adverbs in Instructional Video
Visual Grounding in Video for Unsupervised Word Translation
MaskFlownet: Asymmetric Feature Matching with Learnable Occlusion Mask (video analysis: optical flow estimation)
Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects (video prediction)
ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network
Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA
Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models
MSG-GAN: Multi-Scale Gradient GAN for Stable Image Synthesis
Robust Design of Deep Neural Networks against Adversarial Attacks based on Lyapunov Theory
PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer
Improved Few-Shot Visual Classification
Meta-Transfer Learning for Zero-Shot Super-Resolution
Instance Credibility Inference for Few-Shot Learning
Rethinking the Route Towards Weakly Supervised Object Localization
NestedVAE: Isolating Common Factors via Weak Supervision
Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation
Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction
ClusterFit: Improving Generalization of Visual Representations
Auto-Encoding Twin-Bottleneck Hashing
Learning Representations by Predicting Bags of Visual Words
A Characteristic Function Approach to Deep Implicit Generative Modeling
Unsupervised Learning of Intrinsic Structural Representation Points
Cross-modality Person re-identification with Shared-Specific Feature Transfer
Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction
The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction
GhostNet: More Features from Cheap Operations
Watch your Up-Convolution: CNN Based Generative Deep Neural Networks are Failing to Reproduce Spectral Distributions
GPU-Accelerated Mobile Multi-view Style Transfer
Bundle Adjustment on a Graph Processor
Holistically-Attracted Wireframe Parsing
AdderNet: Do We Really Need Multiplications in Deep Learning?
CARS: Continuous Evolution for Efficient Neural Architecture Search
Π-nets: Deep Polynomial Neural Networks
Explaining Knowledge Distillation by Quantifying the Knowledge
Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution
Closed-loop Matters: Dual Regression Networks for Single Image Super-Resolution
Visual Commonsense R-CNN
Scalable Uncertainty for Computer Vision with Functional Variational Inference
Deep Representation Learning on Long-tailed Data: A Learnable Embedding Augmentation Perspective
Representations, Metrics and Statistics For Shape Analysis of Elastic Graphs
Filter Grafting for Deep Neural Networks
12-in-1: Multi-Task Vision and Language Representation Learning
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training
Unbiased Scene Graph Generation from Biased Training
Towards Visually Explaining Variational Autoencoders
BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition
High Frequency Component Helps Explain the Generalization of Convolutional Neural Networks
SAM: The Sensitivity of Attribution Methods to Hyperparameters
Π-nets: Deep Polynomial Neural Networks
Towards Backward-Compatible Representation Learning
On Translation Invariance in CNNs: Convolutional Layers can Exploit Absolute Spatial Location
KeypointNet: A Large-scale 3D Keypoint Dataset Aggregated from Numerous Human Annotations (dataset)