Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-04 | DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing | DiffEditor:提高基于扩散的图像编辑的准确性和灵活性 | Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, Jian Zhang | http://arxiv.org/pdf/2402.02583v1 | null |
2024-02-04 | AI Art Neural Constellation: Revealing the Collective and Contrastive State of AI-Generated and Human Art | 人工智能艺术神经星座:揭示人工智能生成艺术和人类艺术的集体和对比状态 | Faizan Farooq Khan, Diana Kim, Divyansh Jha, Youssef Mohamed, Hanna H Chang, Ahmed Elgammal, Luba Elliott, Mohamed Elhoseiny | http://arxiv.org/pdf/2402.02453v1 | null |
2024-02-04 | PromptRR: Diffusion Models as Prompt Generators for Single Image Reflection Removal | PromptRR:扩散模型作为单图像反射去除的提示生成器 | Tao Wang, Wanglong Lu, Kaihao Zhang, Wenhan Luo, Tae-Kyun Kim, Tong Lu, Hongdong Li, Ming-Hsuan Yang | http://arxiv.org/pdf/2402.02374v1 | null |
2024-02-04 | Closed-Loop Unsupervised Representation Disentanglement with | 使用 | Xin Jin, Bohan Li, BAAO Xie, Wenyao Zhang, Jinming Liu, Ziqiang Li, Tao Yang, Wenjun Zeng | http://arxiv.org/pdf/2402.02346v1 | null |
2024-02-04 | Your Diffusion Model is Secretly a Certifiably Robust Classifier | 您的扩散模型实际上是一个经过验证的稳健分类器 | Huanran Chen, Yinpeng Dong, Shitong Shao, Zhongkai Hao, Xiao Yang, Hang Su, Jun Zhu | http://arxiv.org/pdf/2402.02316v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-04 | Generalizable Entity Grounding via Assistance of Large Language Model | 通过大语言模型的辅助进行泛化实体基础 | Lu Qi, Yi-Wen Chen, Lehan Yang, Tiancheng Shen, Xiangtai Li, Weidong Guo, Yu Xu, Ming-Hsuan Yang | http://arxiv.org/pdf/2402.02555v1 | null |
2024-02-04 | LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model | LHRS-Bot:利用 VGI 增强型大型多模态语言模型增强遥感能力 | Dilxat Muhtar, Zhenshi Li, Feng Gu, Xueliang Zhang, Pengfeng Xiao | http://arxiv.org/pdf/2402.02544v1 | link |
2024-02-04 | Knowledge Generation for Zero-shot Knowledge-based VQA | 零样本基于知识的 VQA 的知识生成 | Rui Cao, Jing Jiang | http://arxiv.org/pdf/2402.02541v1 | null |
2024-02-04 | GeReA: Question-Aware Prompt Captions for Knowledge-based Visual Question Answering | GeReA:基于知识的视觉问答的问题感知提示字幕 | Ziyu Ma, Shutao Li, Bin Sun, Jianfei Cai, Zuxiang Long, Fuyan Ma | http://arxiv.org/pdf/2402.02503v1 | null |
2024-02-04 | M$^3$Face: A Unified Multi-Modal Multilingual Framework for Human Face Generation and Editing | M$^3$Face:用于人脸生成和编辑的统一多模态多语言框架 | Mohammadreza Mofayezi, Reza Alipour, Mohammad Ali Kakavand, Ehsaneddin Asgari | http://arxiv.org/pdf/2402.02369v1 | null |
2024-02-04 | Vision Transformer-based Multimodal Feature Fusion Network for Lymphoma Segmentation on PET/CT Images | 基于 Vision Transformer 的多模态特征融合网络,用于 PET/CT 图像上的淋巴瘤分割 | Huan Huang, Liheng Qiu, Shenmiao Yang, Longxi Li, Jiaofen Nan, Yanting Li, Chuang Han, Fubao Zhu, Chen Zhao, Weihua Zhou | http://arxiv.org/pdf/2402.02349v1 | null |
2024-02-04 | Bootstrapping Audio-Visual Segmentation by Strengthening Audio Cues | 通过加强音频提示引导视听分割 | Tianxiang Chen, Zhentao Tan, Tao Gong, Qi Chu, Yue Wu, Bin Liu, Le Lu, Jieping Ye, Nenghai Yu | http://arxiv.org/pdf/2402.02327v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-04 | Spatio-temporal Prompting Network for Robust Video Feature Extraction | 用于鲁棒视频特征提取的时空提示网络 | Guanxiong Sun, Chi Wang, Zhaoyu Zhang, Jiankang Deng, Stefanos Zafeiriou, Yang Hua | http://arxiv.org/pdf/2402.02574v1 | null |
2024-02-04 | DeSparsify: Adversarial Attack Against Token Sparsification Mechanisms in Vision Transformers | DeSparsify:针对 Vision Transformer 中代币稀疏化机制的对抗性攻击 | Oryan Yehezkel, Alon Zolfi, Amit Baras, Yuval Elovici, Asaf Shabtai | http://arxiv.org/pdf/2402.02554v1 | null |
2024-02-04 | Classification of Tennis Actions Using Deep Learning | 使用深度学习对网球动作进行分类 | Emil Hovad, Therese Hougaard-Jensen, Line Katrine Harder Clemmensen | http://arxiv.org/pdf/2402.02545v1 | null |
2024-02-04 | Embedding Non-Distortive Cancelable Face Template Generation | 嵌入非扭曲可取消面部模板生成 | Dmytro Zakharov, Oleksandr Kuznetsov, Emanuele Frontoni, Natalia Kryvinska | http://arxiv.org/pdf/2402.02540v1 | null |
2024-02-04 | Deep Supervision by Gaussian Pseudo-label-based Morphological Attention for Abdominal Aorta Segmentation in Non-Contrast CTs | 基于高斯伪标签的形态学关注对非造影 CT 腹主动脉分割的深度监督 | Qixiang Ma, Antoine Lucas, Adrien Kaladji, Pascal Haigron | http://arxiv.org/pdf/2402.02514v1 | null |
2024-02-04 | VM-UNet: Vision Mamba UNet for Medical Image Segmentation | VM-UNet:用于医学图像分割的 Vision Mamba UNet | Jiacheng Ruan, Suncheng Xiang | http://arxiv.org/pdf/2402.02491v1 | null |
2024-02-04 | Deep Spectral Improvement for Unsupervised Image Instance Segmentation | 无监督图像实例分割的深度光谱改进 | Farnoosh Arefi, Amir M. Mansourian, Shohreh Kasaei | http://arxiv.org/pdf/2402.02474v1 | null |
2024-02-04 | Learning Mutual Excitation for Hand-to-Hand and Human-to-Human Interaction Recognition | 学习手拉手和人与人交互识别的互激励 | Mengyuan Liu, Chen Chen, Songtao Wu, Fanyang Meng, Hong Liu | http://arxiv.org/pdf/2402.02431v1 | null |
2024-02-04 | Exploiting Low-level Representations for Ultra-Fast Road Segmentation | 利用低级表示进行超快速道路分段 | Huan Zhou, Feng Xue, Yucong Li, Shi Gong, Yiqun Li, Yu Zhou | http://arxiv.org/pdf/2402.02430v1 | null |
2024-02-04 | NOAH: Learning Pairwise Object Category Attentions for Image Classification | NOAH:学习图像分类的成对对象类别注意力 | Chao Li, Aojun Zhou, Anbang Yao | http://arxiv.org/pdf/2402.02377v1 | null |
2024-02-04 | Exploring Intrinsic Properties of Medical Images for Self-Supervised Binary Semantic Segmentation | 探索医学图像的内在属性以进行自监督二元语义分割 | Pranav Singh, Jacopo Cirrone | http://arxiv.org/pdf/2402.02367v1 | null |
2024-02-04 | Region-Based Representations Revisited | 重新审视基于区域的表示 | Michal Shlapentokh-Rothman, Ansel Blume, Yao Xiao, Yuqun Wu, Sethuraman T V, Heyi Tao, Jae Yong Lee, Wilfredo Torres, Yu-Xiong Wang, Derek Hoiem | http://arxiv.org/pdf/2402.02352v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-04 | Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning | 点云很重要:重新思考不同观察空间对机器人学习的影响 | Haoyi Zhu, Yating Wang, Di Huang, Weicai Ye, Wanli Ouyang, Tong He | http://arxiv.org/pdf/2402.02500v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-04 | Key-Graph Transformer for Image Restoration | 用于图像恢复的关键图转换器 | Bin Ren, Yawei Li, Jingyun Liang, Rakesh Ranjan, Mengyuan Liu, Rita Cucchiara, Luc Van Gool, Nicu Sebe | http://arxiv.org/pdf/2402.02634v1 | null |
2024-02-04 | Fully Differentiable Correlation-driven 2D/3D Registration for X-ray to CT Image Fusion | 用于 X 射线到 CT 图像融合的完全可微相关驱动的 2D/3D 配准 | Minheng Chen, Zhirun Zhang, Shuheng Gu, Zhangyang Ge, Youyong Kong | http://arxiv.org/pdf/2402.02498v1 | null |
2024-02-04 | Angle Robustness Unmanned Aerial Vehicle Navigation in GNSS-Denied Scenarios | GNSS 拒绝场景中的角度鲁棒性无人机导航 | Yuxin Wang, Zunlei Feng, Haofei Zhang, Yang Gao, Jie Lei, Li Sun, Mingli Song | http://arxiv.org/pdf/2402.02405v1 | null |
2024-02-04 | Learning Semantic Proxies from Visual Prompts for Parameter-Efficient Fine-Tuning in Deep Metric Learning | 从视觉提示中学习语义代理,以在深度度量学习中进行参数高效的微调 | Li Ren, Chen Chen, Liqiang Wang, Kien Hua | http://arxiv.org/pdf/2402.02340v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-04 | Multiplexed all-optical permutation operations using a reconfigurable diffractive optical network | 使用可重构衍射光网络的多路复用全光排列运算 | Guangdong Ma, Xilin Yang, Bijie Bai, Jingxi Li, Yuhang Li, Tianyi Gan, Che-Yung Shen, Yijie Zhang, Yuzhu Li, Mona Jarrahi, et al. | http://arxiv.org/pdf/2402.02397v1 | null |
2024-02-04 | Uncertainty-Aware Testing-Time Optimization for 3D Human Pose Estimation | 3D 人体姿势估计的不确定性感知测试时间优化 | Ti Wang, Mengyuan Liu, Hong Liu, Bin Ren, Yingxuan You, Wenhao Li, Nicu Sebe, Xia Li | http://arxiv.org/pdf/2402.02339v1 | null |
2024-02-04 | CNS-Edit: 3D Shape Editing via Coupled Neural Shape Optimization | CNS-Edit:通过耦合神经形状优化进行 3D 形状编辑 | Jingyu Hu, Ka-Hei Hui, Zhengzhe Liu, Hao Zhang, Chi-Wing Fu | http://arxiv.org/pdf/2402.02313v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-04 | BECLR: Batch Enhanced Contrastive Few-Shot Learning | BECLR:批量增强对比小样本学习 | Stylianos Poulakakis-Daktylidis, Hadi Jamali-Rad | http://arxiv.org/pdf/2402.02444v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-04 | SIMPL: A Simple and Efficient Multi-agent Motion Prediction Baseline for Autonomous Driving | SIMPL:用于自动驾驶的简单高效的多智能体运动预测基准 | Lu Zhang, Peiliang Li, Sikang Liu, Shaojie Shen | http://arxiv.org/pdf/2402.02519v1 | null |
2024-02-04 | Uncertainty-Aware Perceiver | 不确定性感知感知器 | EuiYul Song | http://arxiv.org/pdf/2402.02433v1 | null |
2024-02-04 | Physics-Inspired Degradation Models for Hyperspectral Image Fusion | 用于高光谱图像融合的物理启发退化模型 | Jie Lian, Lizhi Wang, Lin Zhu, Renwei Dian, Zhiwei Xiong, Hua Huang | http://arxiv.org/pdf/2402.02411v1 | null |
2024-02-04 | AI-Generated Content Enhanced Computer-Aided Diagnosis Model for Thyroid Nodules: A ChatGPT-Style Assistant | AI 生成的内容增强型甲状腺结节计算机辅助诊断模型:ChatGPT 式助手 | Jincao Yao, Yunpeng Wang, Zhikai Lei, Kai Wang, Xiaoxian Li, Jianhua Zhou, Xiang Hao, Jiafei Shen, Zhenping Wang, Rongrong Ru, et al. | http://arxiv.org/pdf/2402.02401v1 | null |
2024-02-04 | Revisiting the Power of Prompt for Visual Tuning | 重新审视视觉调整提示的力量 | Yuzhu Wang, Lechao Cheng, Chaowei Fang, Dingwen Zhang, Manni Duan, Meng Wang | http://arxiv.org/pdf/2402.02382v1 | null |
2024-02-04 | Stereographic Spherical Sliced Wasserstein Distances | 立体球面切片 Wasserstein 距离 | Huy Tran, Yikun Bai, Abihith Kothapalli, Ashkan Shahbazi, Xinran Liu, Rocio Diaz Martin, Soheil Kolouri | http://arxiv.org/pdf/2402.02345v1 | null |
2024-02-04 | Video Editing for Video Retrieval | 用于视频检索的视频编辑 | Bin Zhu, Kevin Flanagan, Adriano Fragomeni, Michael Wray, Dima Damen | http://arxiv.org/pdf/2402.02335v1 | null |