[UPDATED!] 2024-02-04 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-04 | DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing | DiffEditor:提高基于扩散的图像编辑的准确性和灵活性 | Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, Jian Zhang | http://arxiv.org/pdf/2402.02583v1 | null |
| 2024-02-04 | AI Art Neural Constellation: Revealing the Collective and Contrastive State of AI-Generated and Human Art | 人工智能艺术神经星座:揭示人工智能生成艺术和人类艺术的集体和对比状态 | Faizan Farooq Khan, Diana Kim, Divyansh Jha, Youssef Mohamed, Hanna H Chang, Ahmed Elgammal, Luba Elliott, Mohamed Elhoseiny | http://arxiv.org/pdf/2402.02453v1 | null |
| 2024-02-04 | PromptRR: Diffusion Models as Prompt Generators for Single Image Reflection Removal | PromptRR:扩散模型作为单图像反射去除的提示生成器 | Tao Wang, Wanglong Lu, Kaihao Zhang, Wenhan Luo, Tae-Kyun Kim, Tong Lu, Hongdong Li, Ming-Hsuan Yang | http://arxiv.org/pdf/2402.02374v1 | null |
| 2024-02-04 | Closed-Loop Unsupervised Representation Disentanglement with $β$-VAE Distillation and Diffusion Probabilistic Feedback | 使用 $β$-VAE 蒸馏和扩散概率反馈进行闭环无监督表示解耦 | Xin Jin, Bohan Li, BAAO Xie, Wenyao Zhang, Jinming Liu, Ziqiang Li, Tao Yang, Wenjun Zeng | http://arxiv.org/pdf/2402.02346v1 | null |
| 2024-02-04 | Your Diffusion Model is Secretly a Certifiably Robust Classifier | 您的扩散模型实际上是一个经过验证的稳健分类器 | Huanran Chen, Yinpeng Dong, Shitong Shao, Zhongkai Hao, Xiao Yang, Hang Su, Jun Zhu | http://arxiv.org/pdf/2402.02316v1 | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-04 | Generalizable Entity Grounding via Assistance of Large Language Model | 通过大语言模型的辅助进行泛化实体基础 | Lu Qi, Yi-Wen Chen, Lehan Yang, Tiancheng Shen, Xiangtai Li, Weidong Guo, Yu Xu, Ming-Hsuan Yang | http://arxiv.org/pdf/2402.02555v1 | null |
| 2024-02-04 | LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model | LHRS-Bot:利用 VGI 增强型大型多模态语言模型增强遥感能力 | Dilxat Muhtar, Zhenshi Li, Feng Gu, Xueliang Zhang, Pengfeng Xiao | http://arxiv.org/pdf/2402.02544v1 | link |
| 2024-02-04 | Knowledge Generation for Zero-shot Knowledge-based VQA | 零样本基于知识的 VQA 的知识生成 | Rui Cao, Jing Jiang | http://arxiv.org/pdf/2402.02541v1 | null |
| 2024-02-04 | GeReA: Question-Aware Prompt Captions for Knowledge-based Visual Question Answering | GeReA:基于知识的视觉问答的问题感知提示字幕 | Ziyu Ma, Shutao Li, Bin Sun, Jianfei Cai, Zuxiang Long, Fuyan Ma | http://arxiv.org/pdf/2402.02503v1 | null |
| 2024-02-04 | M$^3$Face: A Unified Multi-Modal Multilingual Framework for Human Face Generation and Editing | M$^3$Face:用于人脸生成和编辑的统一多模态多语言框架 | Mohammadreza Mofayezi, Reza Alipour, Mohammad Ali Kakavand, Ehsaneddin Asgari | http://arxiv.org/pdf/2402.02369v1 | null |
| 2024-02-04 | Vision Transformer-based Multimodal Feature Fusion Network for Lymphoma Segmentation on PET/CT Images | 基于 Vision Transformer 的多模态特征融合网络,用于 PET/CT 图像上的淋巴瘤分割 | Huan Huang, Liheng Qiu, Shenmiao Yang, Longxi Li, Jiaofen Nan, Yanting Li, Chuang Han, Fubao Zhu, Chen Zhao, Weihua Zhou | http://arxiv.org/pdf/2402.02349v1 | null |
| 2024-02-04 | Bootstrapping Audio-Visual Segmentation by Strengthening Audio Cues | 通过加强音频提示引导视听分割 | Tianxiang Chen, Zhentao Tan, Tao Gong, Qi Chu, Yue Wu, Bin Liu, Le Lu, Jieping Ye, Nenghai Yu | http://arxiv.org/pdf/2402.02327v1 | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-04 | Spatio-temporal Prompting Network for Robust Video Feature Extraction | 用于鲁棒视频特征提取的时空提示网络 | Guanxiong Sun, Chi Wang, Zhaoyu Zhang, Jiankang Deng, Stefanos Zafeiriou, Yang Hua | http://arxiv.org/pdf/2402.02574v1 | null |
| 2024-02-04 | DeSparsify: Adversarial Attack Against Token Sparsification Mechanisms in Vision Transformers | DeSparsify:针对 Vision Transformer 中代币稀疏化机制的对抗性攻击 | Oryan Yehezkel, Alon Zolfi, Amit Baras, Yuval Elovici, Asaf Shabtai | http://arxiv.org/pdf/2402.02554v1 | null |
| 2024-02-04 | Classification of Tennis Actions Using Deep Learning | 使用深度学习对网球动作进行分类 | Emil Hovad, Therese Hougaard-Jensen, Line Katrine Harder Clemmensen | http://arxiv.org/pdf/2402.02545v1 | null |
| 2024-02-04 | Embedding Non-Distortive Cancelable Face Template Generation | 嵌入非扭曲可取消面部模板生成 | Dmytro Zakharov, Oleksandr Kuznetsov, Emanuele Frontoni, Natalia Kryvinska | http://arxiv.org/pdf/2402.02540v1 | null |
| 2024-02-04 | Deep Supervision by Gaussian Pseudo-label-based Morphological Attention for Abdominal Aorta Segmentation in Non-Contrast CTs | 基于高斯伪标签的形态学关注对非造影 CT 腹主动脉分割的深度监督 | Qixiang Ma, Antoine Lucas, Adrien Kaladji, Pascal Haigron | http://arxiv.org/pdf/2402.02514v1 | null |
| 2024-02-04 | VM-UNet: Vision Mamba UNet for Medical Image Segmentation | VM-UNet:用于医学图像分割的 Vision Mamba UNet | Jiacheng Ruan, Suncheng Xiang | http://arxiv.org/pdf/2402.02491v1 | null |
| 2024-02-04 | Deep Spectral Improvement for Unsupervised Image Instance Segmentation | 无监督图像实例分割的深度光谱改进 | Farnoosh Arefi, Amir M. Mansourian, Shohreh Kasaei | http://arxiv.org/pdf/2402.02474v1 | null |
| 2024-02-04 | Learning Mutual Excitation for Hand-to-Hand and Human-to-Human Interaction Recognition | 学习手拉手和人与人交互识别的互激励 | Mengyuan Liu, Chen Chen, Songtao Wu, Fanyang Meng, Hong Liu | http://arxiv.org/pdf/2402.02431v1 | null |
| 2024-02-04 | Exploiting Low-level Representations for Ultra-Fast Road Segmentation | 利用低级表示进行超快速道路分段 | Huan Zhou, Feng Xue, Yucong Li, Shi Gong, Yiqun Li, Yu Zhou | http://arxiv.org/pdf/2402.02430v1 | null |
| 2024-02-04 | NOAH: Learning Pairwise Object Category Attentions for Image Classification | NOAH:学习图像分类的成对对象类别注意力 | Chao Li, Aojun Zhou, Anbang Yao | http://arxiv.org/pdf/2402.02377v1 | null |
| 2024-02-04 | Exploring Intrinsic Properties of Medical Images for Self-Supervised Binary Semantic Segmentation | 探索医学图像的内在属性以进行自监督二元语义分割 | Pranav Singh, Jacopo Cirrone | http://arxiv.org/pdf/2402.02367v1 | null |
| 2024-02-04 | Region-Based Representations Revisited | 重新审视基于区域的表示 | Michal Shlapentokh-Rothman, Ansel Blume, Yao Xiao, Yuqun Wu, Sethuraman T V, Heyi Tao, Jae Yong Lee, Wilfredo Torres, Yu-Xiong Wang, Derek Hoiem | http://arxiv.org/pdf/2402.02352v1 | null |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-04 | Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning | 点云很重要:重新思考不同观察空间对机器人学习的影响 | Haoyi Zhu, Yating Wang, Di Huang, Weicai Ye, Wanli Ouyang, Tong He | http://arxiv.org/pdf/2402.02500v1 | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-04 | Key-Graph Transformer for Image Restoration | 用于图像恢复的关键图转换器 | Bin Ren, Yawei Li, Jingyun Liang, Rakesh Ranjan, Mengyuan Liu, Rita Cucchiara, Luc Van Gool, Nicu Sebe | http://arxiv.org/pdf/2402.02634v1 | null |
| 2024-02-04 | Fully Differentiable Correlation-driven 2D/3D Registration for X-ray to CT Image Fusion | 用于 X 射线到 CT 图像融合的完全可微相关驱动的 2D/3D 配准 | Minheng Chen, Zhirun Zhang, Shuheng Gu, Zhangyang Ge, Youyong Kong | http://arxiv.org/pdf/2402.02498v1 | null |
| 2024-02-04 | Angle Robustness Unmanned Aerial Vehicle Navigation in GNSS-Denied Scenarios | GNSS 拒绝场景中的角度鲁棒性无人机导航 | Yuxin Wang, Zunlei Feng, Haofei Zhang, Yang Gao, Jie Lei, Li Sun, Mingli Song | http://arxiv.org/pdf/2402.02405v1 | null |
| 2024-02-04 | Learning Semantic Proxies from Visual Prompts for Parameter-Efficient Fine-Tuning in Deep Metric Learning | 从视觉提示中学习语义代理,以在深度度量学习中进行参数高效的微调 | Li Ren, Chen Chen, Liqiang Wang, Kien Hua | http://arxiv.org/pdf/2402.02340v1 | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-04 | Multiplexed all-optical permutation operations using a reconfigurable diffractive optical network | 使用可重构衍射光网络的多路复用全光排列运算 | Guangdong Ma, Xilin Yang, Bijie Bai, Jingxi Li, Yuhang Li, Tianyi Gan, Che-Yung Shen, Yijie Zhang, Yuzhu Li, Mona Jarrahi, et.al. | http://arxiv.org/pdf/2402.02397v1 | null |
| 2024-02-04 | Uncertainty-Aware Testing-Time Optimization for 3D Human Pose Estimation | 3D 人体姿势估计的不确定性感知测试时间优化 | Ti Wang, Mengyuan Liu, Hong Liu, Bin Ren, Yingxuan You, Wenhao Li, Nicu Sebe, Xia Li | http://arxiv.org/pdf/2402.02339v1 | null |
| 2024-02-04 | CNS-Edit: 3D Shape Editing via Coupled Neural Shape Optimization | CNS-Edit:通过耦合神经形状优化进行 3D 形状编辑 | Jingyu Hu, Ka-Hei Hui, Zhengzhe Liu, Hao Zhang, Chi-Wing Fu | http://arxiv.org/pdf/2402.02313v1 | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-04 | BECLR: Batch Enhanced Contrastive Few-Shot Learning | BECLR:批量增强对比小样本学习 | Stylianos Poulakakis-Daktylidis, Hadi Jamali-Rad | http://arxiv.org/pdf/2402.02444v1 | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-04 | SIMPL: A Simple and Efficient Multi-agent Motion Prediction Baseline for Autonomous Driving | SIMPL:用于自动驾驶的简单高效的多智能体运动预测基准 | Lu Zhang, Peiliang Li, Sikang Liu, Shaojie Shen | http://arxiv.org/pdf/2402.02519v1 | null |
| 2024-02-04 | Uncertainty-Aware Perceiver | 不确定性感知感知器 | EuiYul Song | http://arxiv.org/pdf/2402.02433v1 | null |
| 2024-02-04 | Physics-Inspired Degradation Models for Hyperspectral Image Fusion | 用于高光谱图像融合的物理启发退化模型 | Jie Lian, Lizhi Wang, Lin Zhu, Renwei Dian, Zhiwei Xiong, Hua Huang | http://arxiv.org/pdf/2402.02411v1 | null |
| 2024-02-04 | AI-Generated Content Enhanced Computer-Aided Diagnosis Model for Thyroid Nodules: A ChatGPT-Style Assistant | AI 生成的内容增强型甲状腺结节计算机辅助诊断模型:ChatGPT 式助手 | Jincao Yao, Yunpeng Wang, Zhikai Lei, Kai Wang, Xiaoxian Li, Jianhua Zhou, Xiang Hao, Jiafei Shen, Zhenping Wang, Rongrong Ru, et.al. | http://arxiv.org/pdf/2402.02401v1 | null |
| 2024-02-04 | Revisiting the Power of Prompt for Visual Tuning | 重新审视视觉调整提示的力量 | Yuzhu Wang, Lechao Cheng, Chaowei Fang, Dingwen Zhang, Manni Duan, Meng Wang | http://arxiv.org/pdf/2402.02382v1 | null |
| 2024-02-04 | Stereographic Spherical Sliced Wasserstein Distances | 立体球面切片 Wasserstein 距离 | Huy Tran, Yikun Bai, Abihith Kothapalli, Ashkan Shahbazi, Xinran Liu, Rocio Diaz Martin, Soheil Kolouri | http://arxiv.org/pdf/2402.02345v1 | null |
| 2024-02-04 | Video Editing for Video Retrieval | 用于视频检索的视频编辑 | Bin Zhu, Kevin Flanagan, Adriano Fragomeni, Michael Wray, Dima Damen | http://arxiv.org/pdf/2402.02335v1 | null |