CVPR 2025 Papers and Open-Source Projects (Papers with Code)

CVPR 2025 decisions are now available on OpenReview! Acceptance rate: 22.1% = 2878 / 13008

Note 1: Issues sharing CVPR 2025 papers and open-source projects are welcome!

Note 2: For papers from previous top CV conferences, other high-quality CV papers, and roundups, see: https://github.com/amusi/daily-paper-computer-vision

ECCV 2024

CVPR 2024

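Scan the QR code to join the 【CVer Academic Exchange Group】 and get the latest work, including CVPR 2025! This is the largest computer vision AI Knowledge Planet (知识星球) community. It is updated daily and shares the newest learning materials on computer vision, AIGC, diffusion models, multimodal learning, deep learning, autonomous driving, medical imaging, remote sensing, and more. Join and start learning!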

【CVPR 2025 Papers with Open-Source Code: Table of Contents】

3DGS(Gaussian Splatting)
Avatars
Backbone
CLIP
Mamba
Embodied AI
GAN
GNN
Multimodal Large Language Models (MLLM)
Large Language Models (LLM)
NAS
OCR
NeRF
DETR
Diffusion Models
ReID (Re-Identification)
Long-Tail
Vision Transformer
Vision-Language
Self-supervised Learning
Data Augmentation
Object Detection
Anomaly Detection
Visual Tracking
Semantic Segmentation
Instance Segmentation
Panoptic Segmentation
Medical Image
Medical Image Segmentation
Video Object Segmentation
Video Instance Segmentation
Referring Image Segmentation
Image Matting
Image Editing
Low-level Vision
Super-Resolution
Denoising
Deblur
Autonomous Driving
3D Point Cloud
3D Object Detection
3D Semantic Segmentation
3D Object Tracking
3D Semantic Scene Completion
3D Registration
3D Human Pose Estimation
3D Human Mesh Estimation
Image Generation
Video Generation
3D Generation
Video Understanding
Action Detection
Text Detection
Knowledge Distillation
Model Pruning
Image Compression
3D Reconstruction
Depth Estimation
Trajectory Prediction
Lane Detection
Image Captioning
Visual Question Answering
Sign Language Recognition
Video Prediction
Novel View Synthesis
Zero-Shot Learning
Stereo Matching
Feature Matching
Scene Graph Generation
Implicit Neural Representations
Image Quality Assessment
Video Quality Assessment
Datasets
New Tasks
Others

3DGS(Gaussian Splatting)

Avatars

Backbone

CLIP

Mamba

MambaVision: A Hybrid Mamba-Transformer Vision Backbone

Paper: https://arxiv.org/abs/2407.08083
Code: https://github.com/NVlabs/MambaVision

MobileMamba: Lightweight Multi-Receptive Visual Mamba Network

Embodied AI

GAN

OCR

NeRF

DETR

Prompt

Multimodal Large Language Models (MLLM)

Large Language Models (LLM)

NAS

ReID (Re-Identification)

Diffusion Models

TinyFusion: Diffusion Transformers Learned Shallow

Paper: https://arxiv.org/abs/2412.01199
Code: https://github.com/VainF/TinyFusion

Vision Transformer

Vision-Language

Object Detection

LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models

Paper: https://arxiv.org/abs/2501.18954
Code: https://github.com/iSEE-Laboratory/LLMDet

Anomaly Detection

Object Tracking

Multiple Object Tracking as ID Prediction

Paper: https://arxiv.org/abs/2403.16848
Code: https://github.com/MCG-NJU/MOTIP

Medical Image

Medical Image Segmentation

Autonomous Driving

3D Point Cloud

3D Object Detection

3D Semantic Segmentation

Image Editing

Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing

Paper: https://arxiv.org/abs/2411.16832
Code: https://github.com/taco-group/FaceLock

Video Editing

Low-level Vision

Super-Resolution

AESOP: Auto-Encoded Supervision for Perceptual Image Super-Resolution

Denoising

Image Denoising

3D Human Pose Estimation

Image Generation

Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Paper: https://arxiv.org/abs/2501.01423
Code: https://github.com/hustvl/LightningDiT

SleeperMark: Towards Robust Watermark against Fine-Tuning Text-to-image Diffusion Models

TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation

Homepage: https://byteflow-ai.github.io/TokenFlow/
Code: https://github.com/ByteFlow-AI/TokenFlow
Paper: https://arxiv.org/abs/2412.03069

Video Generation

Identity-Preserving Text-to-Video Generation by Frequency Decomposition

Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models


amusi/CVPR2025-Papers-with-Code
