For a categorized collection of survey papers from previous years, see here ↘️ CV-Surveys (under construction)
- Fast and Efficient Restoration of Extremely Dark Light Fields
- Camera calibration
- Camera Pose Estimation
- Meta Approach to Data Augmentation Optimization
- Improving Model Generalization by Agreement of Learned Representations From Data Augmentation
- Multi-Head Deep Metric Learning Using Global and Local Representations
- Hierarchical Proxy-Based Loss for Deep Metric Learning
- Joint Classification and Trajectory Regression of Online Handwriting Using a Multi-Task Learning Approach
- Semi-Supervised Multi-Task Learning for Semantics and Depth
- MUGL: Large Scale Multi Person Conditional Action Generation with Locomotion
⭐code🏠project - Pose-guided action synthesis
- CFLOW-AD: Real-Time Unsupervised Anomaly Detection With Localization via Conditional Normalizing Flows
⭐code - A Semi-Supervised Generalized VAE Framework for Abnormality Detection Using One-Class Classification
- novelty detection
- Beyond Mono to Binaural: Generating Binaural Audio From Mono Audio With Depth and Cross Modal Attention
🏠project - Sound source localization
- Sound source separation
- Periocular recognition
- InfographicVQA
⭐code - Efficient Counterfactual Debiasing for Visual Question Answering
⭐code - Audio-visual scene-aware dialog
- Try-On
- Robots
- Revealing Disocclusions in Temporal View Synthesis Through Infilling Vector Prediction
⭐code🏠project📺video - Fast and Explicit Neural View Synthesis
- Novel-View Synthesis of Human Tourist Photos
- Evaluating and Mitigating Bias in Image Classifiers: A Causal Perspective Using Counterfactuals
- Class-Balanced Active Learning for Image Classification
⭐code - Learnable Adaptive Cosine Estimator (LACE) for Image Classification
⭐code - Enhancing Few-Shot Image Classification With Unlabelled Examples
⭐code - Zero-shot classification
- Few-shot classification
- Fine-grained recognition
- Object pose estimation
- Object Pose Refinement
- Animal pose
- Fully Convolutional Cross-Scale-Flows for Image-Based Defect Detection
⭐code - Automated Defect Inspection in Reverse Engineering of Integrated Circuits
- Sewer defect classification
- MovingFashion: A Benchmark for the Video-To-Shop Challenge
🌻dataset - Challenges in Procedural Multimodal Machine Comprehension: A Novel Way To Benchmark
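- For detecting and tracking humans at sea
- Image recognition
- Autonomous driving
- For detecting and tracking pedestrians from overhead fisheye cameras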
- Is an Image Worth Five Sentences? A New Look Into Semantics for Image-Text Matching
⭐code⭐code - Let There Be a Clock on the Beach: Reducing Object Hallucination in Image Captioning
⭐code - Improve Image Captioning by Estimating the Gazing Patterns From the Caption
- All the Attention You Need: Global-Local, Spatial-Channel Attention for Image Retrieval
- Learning With Label Noise for Image Retrieval by Selecting Interactions
- SAC: Semantic Attention Composition for Text-Conditioned Image Retrieval
- Image-Text retrieval
- Image search
- Video-text matching
- Drawing retrieval
- Video retrieval
- Autonomous driving
- Vehicle localization
- Vehicle Detection
- Lane Detection
- NUTA: Non-Uniform Temporal Aggregation for Action Recognition
- MM-ViT: Multi-Modal Video Transformer for Compressed Video Action Recognition
- Dual-Head Contrastive Domain Adaptation for Video Action Recognition
⭐code - Skeleton-DML: Deep Metric Learning for Skeleton-Based One-Shot Action Recognition
⭐code - SWAG-V: Explanations for Video Using Superpixels Weighted by Average Gradients
- Pose and Joint-Aware Action Recognition
⭐code - Domain Generalization Through Audio-Visual Relative Norm Alignment in First Person Action Recognition
- 3D action recognition
- Action localization
- Temporal action segmentation
- Surrogate Model-Based Explainability Methods for Point Cloud NNs
⭐code - StickyLocalization: Robust End-to-End Relocalization on Point Clouds Using Graph Neural Networks
- 3D point clouds
- Classification and segmentation
- Billion-Scale Pretraining with Vision Transformers for Multi-Task Visual Representations
- Visualizing Paired Image Similarity in Transformer Networks
⭐code - S2-MLP: Spatial-Shift MLP Architecture for Vision
- Image classification
- Image super-completion
- Model compression
- Knowledge distillation
- Pruning
- Approximate Neural Architecture Search via Operation Distribution Learning
- Neural Architecture Search for Efficient Uncalibrated Deep Photometric Stereo
- Towards a Robust Differentiable Architecture Search Under Label Noise
- Lightweight Monocular Depth With a Novel Neural Architecture Search Method
- Progressive Automatic Design of Search Space for One-Shot Neural Architecture Search
- Post-OCR Paragraph Recognition by Graph Convolutional Networks
- Irregular scene text recognition
- Logo recognition
- Handwritten text recognition
- Table structure recognition
- Normalizing Flow as a Flexible Fidelity Objective for Photo-Realistic Super-Resolution
⭐code - Multi-Dimensional Dynamic Model Compression for Efficient Image Super-Resolution
- edge-SR: Super-Resolution for the Masses
- DAQ: Channel-Wise Distribution-Aware Quantization for Deep Image Super-Resolution Networks
- Hyperspectral Image Super-Resolution With RGB Image Super-Resolution as an Auxiliary Task
⭐code - VSR
- Image generation
- sketch-to-photo
- Image-to-Image Translation
- Semi-supervised
- Self-supervised
- Unsupervised
- Semantically Stealthy Adversarial Attacks Against Segmentation Models
- Video segmentation
- VOS (video object segmentation)
- Action segmentation
- Semantic segmentation
- Plugging Self-Supervised Monocular Depth Into Unsupervised Domain Adaptation for Semantic Segmentation
⭐code - Adversarial Semantic Hallucination for Domain Generalized Semantic Segmentation
⭐code - Shallow Features Guide Unsupervised Domain Adaptation for Semantic Segmentation at Class Boundaries
⭐code - Evaluating the Robustness of Semantic Segmentation for Autonomous Driving Against Real-World Adversarial Patch Attacks
- Multi-Domain Incremental Learning for Semantic Segmentation
⭐code - Active Learning for Improved Semi-Supervised Semantic Segmentation in Satellite Images
⭐code - Multi-Domain Semantic Segmentation With Overlapping Labels
- Mixed-Dual-Head Meets Box Priors: A Robust Framework for Semi-Supervised Segmentation
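- Video semantic segmentation
- Weakly supervised semantic segmentation
- Unsupervised semantic segmentation
- Semi-supervised semantic segmentation
- Few-shot semantic segmentation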
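- Instance segmentation
- Panoptic segmentation
- Foreground-background segmentation
- Superpixel segmentation
- Road segmentation
- Image matting
- Domain adaptation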
- Unsupervised Robust Domain Adaptation Without Source Data
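- Semi-supervised domain adaptation
- Unsupervised domain adaptation
- Open-set domain adaptation
- Multi-source domain adaptation
- Multi-target domain adaptation
- Domain generalization
- Few-shot learning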
- Contextual Gradient Scaling for Few-Shot Learning
⭐code - Calibrating CNNs for Few-Shot Meta Learning
- SEGA: Semantic Guided Attention on Visual Prototype for Few-Shot Learning
⭐code - Tensor Feature Hallucination for Few-Shot Learning
⭐code - Ortho-Shot: Low Displacement Rank Regularization With Data Augmentation for Few-Shot Learning
- Domain Shift
- One-shot learning
- 3D Facial
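- Wrinkle-based person recognition
- Face liveness detection
- Facial expression
- Face detection
- PAD (face presentation attack detection)
- Age prediction
- Face verification
- Face deblurring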
- facial forgery detection
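- Face image quality assessment
- Face completion
- Makeup transfer
- Face restoration
- Face recognition
- Black-box attacks
- Adversarial examples
- Adversarial attacks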
- Lane-Level Street Map Extraction From Aerial Imagery
- An Experimental Comparison of Multi-View Stereo Approaches on Satellite Images
- Few-shot open-set recognition
- Detection
- Tracking
- Extracting Vignetting and Grain Filter Effects From Photos
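- Denoising
- Deraining
- Deblurring
- Demosaicing
- Image colorization
- Image cropping
- Image restoration
- Image inpainting
- Image degradation
- Image enhancement
- Image quality assessment
- Image reenactment
- Image decomposition
- Auto white balance
- Human motion synthesis
- 3D human body
- Human pose estimation
- 3D human pose estimation
- 3D hand pose estimation
- Head pose estimation
- 3D human body models
- Human body shape
- Unsupervised video domain adaptation
- Partial Video Copy Detection
- Anomaly detection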
- Discrete Neural Representations for Explainable Anomaly Detection
🏠project📺video - Rethinking Video Anomaly Detection - A Continual Learning Approach
- A Modular and Unified Framework for Detecting and Localizing Video Anomalies
- FastAno: Fast Anomaly Detection via Spatio-Temporal Patch Transformation
- Multi-Branch Neural Networks for Video Anomaly Detection in Adverse Lighting and Weather Conditions
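- sarcasm and humor detection
- Video representation learning
- Video captioning
- Person localization in video
- Video stabilization
- Video understanding
- Video classification
- Video summarization
- Video synthesis with audio
- Video frame interpolation
- Video moment localization
- Temporal Video Segmentation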
- Meta-UDA: Unsupervised Domain Adaptive Thermal Object Detection Using Meta-Learning
- ADC: Adversarial Attacks Against Object Detection That Evade Context Consistency Checks
- TricubeNet: 2D Kernel-Based Object Representation for Weakly-Occluded Oriented Object Detection
⭐code - Detecting Tear Gas Canisters With Limited Training Data
- Learned Event-Based Visual Perception for Improved Space Object Detection
- Densely-Packed Object Detection via Hard Negative-Aware Anchor Attention
- PICA: Point-Wise Instance and Centroid Alignment Based Few-Shot Domain Adaptive Object Detection With Loose Annotations
- Improving Object Detection by Label Assignment Distillation
⭐code - Fusion Point Pruning for Optimized 2D Object Detection With Radar-Camera Fusion
- YOLO-ReT: Towards High Accuracy Real-Time Object Detection on Edge GPUs
⭐code - SC-UDA: Style and Content Gaps Aware Unsupervised Domain Adaptation for Object Detection
- To Miss-Attend Is to Misalign! Residual Self-Attentive Feature Alignment for Adapting Object Detectors
⭐code - Object localization
- MOD (moving object detection)
- Road sign detection
- Zero-shot detection
- Few-shot object detection
- Image anomaly detection
- Weakly supervised object detection
- Few-Shot Weakly-Supervised Object Detection via Directional Statistics
- Maritime obstacle detection
- Artificial satellite recognition
- Object Anti-Spoofing
- 3D object detection
- Salient object detection
- Camouflaged object detection
- Player detection
- Wireframe Detection
- GraN-GAN: Piecewise Gradient Normalization for Generative Adversarial Networks
- Latent to Latent: A Learned Mapper for Identity Preserving Editing of Multiple Face Attributes in StyleGAN-Generated Images
- AE-StyleGAN: Improved Training of Style-Based Auto-Encoders
⭐code - GANs Spatial Control via Inference-Time Adaptive Normalization
- Latent Reweighting, an Almost Free Improvement for GANs
- PPCD-GAN: Progressive Pruning and Class-Aware Distillation for Large-Scale Conditional GANs Compression
- Controlled GAN-Based Creature Synthesis via a Challenging Game Art Dataset - Addressing the Noise-Latent Trade-Off
- Data InStance Prior (DISP) in Generative Adversarial Networks
- Sketch-To-Face image translation
- Keypoint-based re-synthesis of novel poses
- MRI reconstruction
- Depth estimation
- stereo images
- 3D reconstruction
- Single-Shot Dense Active Stereo With Pixel-Wise Phase Estimation Based on Grid-Structure Using CNN and Correspondence Estimation Using GCN
- Style Agnostic 3D Reconstruction via Adversarial Style Transfer
⭐code - 3D Modeling Beneath Ground: Plant Root Detection and Reconstruction Based on Ground-Penetrating Radar
- Mending Neural Implicit Modeling for 3D Vehicle Reconstruction in the Wild
- Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image
⭐code - Tensor-Based Non-Rigid Structure From Motion
- stereo vision
- Mesh reconstruction
- Segmentation
- UNETR: Transformers for 3D Medical Image Segmentation
⭐code - Self-Supervised Generative Style Transfer for One-Shot Medical Image Segmentation
⭐code - AFTer-UNet: Axial Fusion Transformer UNet for Medical Image Segmentation
- Co-Net: A Collaborative Region-Contour-Driven Network for Fine-to-Finer Medical Image Segmentation
- T-Net: A Resource-Constrained Tiny Convolutional Neural Network for Medical Image Segmentation
- Hyper-Convolution Networks for Biomedical Image Segmentation
⭐code - Vessel segmentation
- Gland segmentation
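- Retrieval
- Registration
- Classification
- Automatic medical report generation
- Surgical instrument localization
- Abnormality classification and localization in chest X-rays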
- Does Data Repair Lead to Fair Models? Curating Contextually Fair Data To Reduce Model Bias
⭐code - The Untapped Potential of Off-the-Shelf Convolutional Neural Networks
- Unveiling Real-Life Effects of Online Photo Sharing
- Shadow Art Revisited: A Differentiable Rendering Based Approach
- Towards Class-Oriented Poisoning Attacks Against Neural Networks
- Predicting Levels of Household Electricity Consumption in Low-Access Settings
- Neural Radiance Fields Approach to Deep Multi-View Photometric Stereo
- PRECODE - A Generic Model Extension To Prevent Deep Gradient Leakage
- Discovering Underground Maps From Fashion
- On the Maximum Radius of Polynomial Lens Distortion
⭐code - The Hitchhiker's Guide to Prior-Shift Adaptation
⭐code - FalCon: Fine-Grained Feature Map Sparsity Computing With Decomposed Convolutions for Inference Optimization
- METGAN: Generative Tumour Inpainting and Modality Synthesis in Light Sheet Microscopy
- Agree To Disagree: When Deep Learning Models With Identical Architectures Produce Distinct Explanations
⭐code - REFICS: A Step Towards Linking Vision With Hardware Assurance
- Deep Optimization Prior for THz Model Parameter Estimation
- Sharing Decoders: Network Fission for Multi-Task Pixel Prediction
- Fair Visual Recognition in Limited Data Regime using Self-Supervision and Self-Distillation
- Low-Cost Multispectral Scene Analysis With Modality Distillation
- Self-Supervised Pretraining Improves Self-Supervised Pretraining
- PROVES: Establishing Image Provenance Using Semantic Signatures
- Addressing Out-of-Distribution Label Noise in Webly-Labelled Data
⭐code - Towards Durability Estimation of Bioprosthetic Heart Valves via Motion Symmetry Analysis
- Network Generalization Prediction for Safety Critical Tasks in Novel Operating Domains
- Generalized Clustering and Multi-Manifold Learning With Geometric Structure Preservation
⭐code - Batch Normalization Tells You Which Filter Is Important
- Sandwich Batch Normalization: A Drop-In Replacement for Feature Distribution Heterogeneity
⭐code - Parsing Line Chart Images Using Linear Programming
- CrossLocate: Cross-Modal Large-Scale Visual Geo-Localization in Natural Environments Using Rendered Modalities
🏠project - Symmetric-Light Photometric Stereo
- REGroup: Rank-Aggregating Ensemble of Generative Classifiers for Robust Predictions
🏠project⭐code - Leveraging Test-Time Consensus Prediction for Robustness Against Unseen Noise
- Supervised Compression for Resource-Constrained Edge Computing Systems
⭐code - Action Anticipation Using Latent Goal Learning
⭐code - Non-Semantic Evaluation of Image Forensics Tools: Methodology and Database
- Inpaint2Learn: A Self-Supervised Framework for Affordance Learning
- RGL-NET: A Recurrent Graph Learning Framework for Progressive Part Assembly
- Self-Supervised Knowledge Transfer via Loosely Supervised Auxiliary Tasks
⭐code - Novel Ensemble Diversification Methods for Open-Set Scenarios
- Contrast To Divide: Self-Supervised Pre-Training for Learning With Noisy Labels
⭐code - Typenet: Towards Camera Enabled Touch Typing on Flat Surfaces Through Self-Refinement
⭐code - Nonnegative Low-Rank Tensor Completion via Dual Formulation With Applications to Image and Video Completion
- MisConv: Convolutional Neural Networks for Missing Data
- MAPS: Multimodal Attention for Product Similarity
- Global Assists Local: Effective Aerial Representations for Field of View Constrained Image Geo-Localization
- Self-Supervised Test-Time Adaptation on Video Data
- FT-DeepNets: Fault-Tolerant Convolutional Neural Networks With Kernel-Based Duplication
- Short-Term Solar Irradiance Prediction From Sky Images With a Clear Sky Model
- Reconstructing Training Data From Diverse ML Models by Ensemble Inversion
- How Good Is Your Explanation? Algorithmic Stability Measures To Assess the Quality of Explanations for Deep Neural Networks
- Seeing Implicit Neural Representations As Fourier Series
- Human-Aided Saliency Maps Improve Generalization of Deep Learning
- Cross-Modal Adversarial Reprogramming
- Learning From the CNN-Based Compressed Domain
- Spatiotemporal Initialization for 3D CNNs With Generated Motion Patterns
🏠project - DAD: Data-Free Adversarial Defense at Test Time
- Geometry-Inspired Top-K Adversarial Perturbations
- Shape-Coded ArUco: Fiducial Marker for Bridging 2D and 3D Modalities
- Interpretable Semantic Photo Geolocation
⭐code - Mesh Convolutional Autoencoder for Semi-Regular Meshes of Different Sizes
⭐code - Geometry-Aware Hierarchical Bayesian Learning on Manifolds
- Transferable 3D Adversarial Textures Using End-to-End Optimization
- Improving Fractal Pre-Training