[UPDATED!] 2024-02-18 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-18 | SDiT: Spiking Diffusion Model with Transformer | SDiT:带变压器的尖峰扩散模型 | Shu Yang, Hanzhi Ma, Chengting Yu, Aili Wang, Er-Ping Li | http://arxiv.org/pdf/2402.11588v1 | null |
| 2024-02-18 | GenAD: Generative End-to-End Autonomous Driving | GenAD:生成式端到端自动驾驶 | Wenzhao Zheng, Ruiqi Song, Xianda Guo, Long Chen | http://arxiv.org/pdf/2402.11502v1 | null |
| 2024-02-18 | IRFundusSet: An Integrated Retinal Fundus Dataset with a Harmonized Healthy Label | IRFundusSet:具有统一健康标签的综合视网膜 Fundus 数据集 | P. Bilha Githinji, Keming Zhao, Jiantao Wang, Peiwu Qin | http://arxiv.org/pdf/2402.11488v1 | null |
| 2024-02-18 | Visual Concept-driven Image Generation with Text-to-Image Diffusion Model | 使用文本到图像扩散模型的视觉概念驱动的图像生成 | Tanzila Rahman, Shweta Mahajan, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Leonid Sigal | http://arxiv.org/pdf/2402.11487v1 | null |
| 2024-02-18 | Data Distribution Distilled Generative Model for Generalized Zero-Shot Recognition | 用于广义零样本识别的数据分布蒸馏生成模型 | Yijie Wang, Mingjian Hong, Luwen Huangfu, Sheng Huang | http://arxiv.org/pdf/2402.11424v1 | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-18 | MultiCorrupt: A Multi-Modal Robustness Dataset and Benchmark of LiDAR-Camera Fusion for 3D Object Detection | MultiCorrupt:用于 3D 物体检测的多模态鲁棒性数据集和 LiDAR-相机融合的基准 | Till Beemelmanns, Quan Zhang, Lutz Eckstein | http://arxiv.org/pdf/2402.11677v1 | null |
| 2024-02-18 | Efficient Multimodal Learning from Data-centric Perspective | 以数据为中心的高效多模态学习 | Muyang He, Yexin Liu, Boya Wu, Jianhao Yuan, Yueze Wang, Tiejun Huang, Bo Zhao | http://arxiv.org/pdf/2402.11530v1 | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-18 | MAL: Motion-Aware Loss with Temporal and Distillation Hints for Self-Supervised Depth Estimation | MAL:具有时间和蒸馏提示的运动感知损失,用于自监督深度估计 | Yup-Jiang Dong, Fang-Lue Zhang, Song-Hai Zhang | http://arxiv.org/pdf/2402.11507v1 | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-18 | LiRaFusion: Deep Adaptive LiDAR-Radar Fusion for 3D Object Detection | LiRaFusion:用于 3D 物体检测的深度自适应 LiDAR-雷达融合 | Jingyu Song, Lingjun Zhao, Katherine A. Skinner | http://arxiv.org/pdf/2402.11735v1 | null |
| 2024-02-18 | Challenging the Black Box: A Comprehensive Evaluation of Attribution Maps of CNN Applications in Agriculture and Forestry | 挑战黑匣子:CNN农林应用归因图综合评价 | Lars Nieradzik, Henrike Stephani, Jördis Sieburg-Rockel, Stephanie Helmling, Andrea Olbrich, Janis Keuper | http://arxiv.org/pdf/2402.11670v1 | null |
| 2024-02-18 | Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models | 逻辑闭环:揭示大型视觉语言模型中的物体幻觉 | Junfei Wu, Qiang Liu, Ding Wang, Jinghao Zhang, Shu Wu, Liang Wang, Tieniu Tan | http://arxiv.org/pdf/2402.11622v1 | null |
| 2024-02-18 | PolypNextLSTM: A lightweight and fast polyp video segmentation network using ConvNext and ConvLSTM | PolypNextLSTM:使用 ConvNext 和 ConvLSTM 的轻量级快速息肉视频分割网络 | Debayan Bhattacharya, Konrad Reuter, Finn Behrendnt, Lennart Maack, Sarah Grube, Alexander Schlaefer | http://arxiv.org/pdf/2402.11585v1 | null |
| 2024-02-18 | A novel Fourier neural operator framework for classification of multi-sized images: Application to 3D digital porous media | 用于多尺寸图像分类的新型傅立叶神经算子框架:在 3D 数字多孔介质中的应用 | Ali Kashefi, Tapan Mukerji | http://arxiv.org/pdf/2402.11568v1 | null |
| 2024-02-18 | CPN: Complementary Proposal Network for Unconstrained Text Detection | CPN:用于无约束文本检测的补充提案网络 | Longhuang Wu, Shangxuan Tian, Youxin Wang, Pengfei Xiong | http://arxiv.org/pdf/2402.11540v1 | null |
| 2024-02-18 | Cross-Attention Fusion of Visual and Geometric Features for Large Vocabulary Arabic Lipreading | 大词汇量阿拉伯语唇读的视觉和几何特征的交叉注意融合 | Samar Daou, Ahmed Rekik, Achraf Ben-Hamadou, Abdelaziz Kallel | http://arxiv.org/pdf/2402.11520v1 | null |
| 2024-02-18 | Underestimation of lung regions on chest X-ray segmentation masks assessed by comparison with total lung volume evaluated on computed tomography | 通过与计算机断层扫描评估的总肺体积进行比较来评估胸部 X 射线分割掩模上的肺部区域低估 | Przemysław Bombiński, Patryk Szatkowski, Bartłomiej Sobieski, Tymoteusz Kwieciński, Szymon Płotka, Mariusz Adamek, Marcin Banasiuk, Mariusz I. Furmanek, Przemysław Biecek | http://arxiv.org/pdf/2402.11510v1 | null |
| 2024-02-18 | Thyroid ultrasound diagnosis improvement via multi-view self-supervised learning and two-stage pre-training | 通过多视角自监督学习和两阶段预训练提高甲状腺超声诊断 | Jian Wang, Xin Yang, Xiaohong Jia, Wufeng Xue, Rusi Chen, Yanlin Chen, Xiliang Zhu, Lian Liu, Yan Cao, Jianqiao Zhou, et al. | http://arxiv.org/pdf/2402.11497v1 | null |
| 2024-02-18 | EndoOOD: Uncertainty-aware Out-of-distribution Detection in Capsule Endoscopy Diagnosis | EndoOOD:胶囊内窥镜诊断中的不确定性分布外检测 | Qiaozhi Tan, Long Bai, Guankun Wang, Mobarakol Islam, Hongliang Ren | http://arxiv.org/pdf/2402.11476v1 | null |
| 2024-02-18 | Poisoned Forgery Face: Towards Backdoor Attacks on Face Forgery Detection | 有毒的伪造人脸:针对人脸伪造检测的后门攻击 | Jiawei Liang, Siyuan Liang, Aishan Liu, Xiaojun Jia, Junhao Kuang, Xiaochun Cao | http://arxiv.org/pdf/2402.11473v1 | null |
| 2024-02-18 | Key Patch Proposer: Key Patches Contain Rich Information | 关键补丁提议者:关键补丁包含丰富信息 | Jing Xu, Beiwen Tian, Hao Zhao | http://arxiv.org/pdf/2402.11458v1 | null |
| 2024-02-18 | Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning | Momentor:利用细粒度时序推理推进视频大语言模型 | Long Qian, Juncheng Li, Yu Wu, Yaobo Ye, Hao Fei, Tat-Seng Chua, Yueting Zhuang, Siliang Tang | http://arxiv.org/pdf/2402.11435v1 | null |
| 2024-02-18 | A Multispectral Automated Transfer Technique (MATT) for machine-driven image labeling utilizing the Segment Anything Model (SAM) | 利用分段任意模型 (SAM) 进行机器驱动图像标记的多光谱自动传输技术 (MATT) | James E. Gallagher, Aryav Gogia, Edward J. Oughton | http://arxiv.org/pdf/2402.11413v1 | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-18 | Aligning Modalities in Vision Large Language Models via Preference Fine-tuning | 通过偏好微调来调整视觉大语言模型中的模态 | Yiyang Zhou, Chenhang Cui, Rafael Rafailov, Chelsea Finn, Huaxiu Yao | http://arxiv.org/pdf/2402.11411v1 | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-18 | 3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods | 使用递归神经网络和图像压缩方法进行 3D 点云压缩 | Till Beemelmanns, Yuchen Tao, Bastian Lampe, Lennart Reiher, Raphael van Kempen, Timo Woopen, Lutz Eckstein | http://arxiv.org/pdf/2402.11680v1 | null |
| 2024-02-18 | Neuromorphic Face Analysis: a Survey | 神经形态面部分析:一项调查 | Federico Becattini, Lorenzo Berlincioni, Luca Cultrera, Alberto Del Bimbo | http://arxiv.org/pdf/2402.11631v1 | null |
| 2024-02-18 | A Robust Error-Resistant View Selection Method for 3D Reconstruction | 一种鲁棒、抗错的 3D 重建视图选择方法 | Shaojie Zhang, Yinghui Wang, Bin Nan, Jinlong Yang, Tao Yan, Liangyi Huang, Mingfeng Wang | http://arxiv.org/pdf/2402.11431v1 | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-18 | Boosting Semi-Supervised 2D Human Pose Estimation by Revisiting Data Augmentation and Consistency Training | 通过重新审视数据增强和一致性训练来促进半监督二维人体姿势估计 | Huayi Zhou, Mukun Luo, Fei Jiang, Yue Ding, Hongtao Lu | http://arxiv.org/pdf/2402.11566v1 | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-18 | The Effectiveness of Random Forgetting for Robust Generalization | 随机遗忘对鲁棒泛化的有效性 | Vijaya Raghavan T Ramkumar, Bahram Zonooz, Elahe Arani | http://arxiv.org/pdf/2402.11733v1 | null |
| 2024-02-18 | Learning Conditional Invariances through Non-Commutativity | 通过非交换性学习条件不变性 | Abhra Chaudhuri, Serban Georgescu, Anjan Dutta | http://arxiv.org/pdf/2402.11682v1 | null |
| 2024-02-18 | Interactive Garment Recommendation with User in the Loop | 与用户互动的服装推荐 | Federico Becattini, Xiaolin Chen, Andrea Puccia, Haokun Wen, Xuemeng Song, Liqiang Nie, Alberto Del Bimbo | http://arxiv.org/pdf/2402.11627v1 | null |
| 2024-02-18 | Visual In-Context Learning for Large Vision-Language Models | 大型视觉语言模型的视觉上下文学习 | Yucheng Zhou, Xiang Li, Qianning Wang, Jianbing Shen | http://arxiv.org/pdf/2402.11574v1 | null |
| 2024-02-18 | Evaluating Adversarial Robustness of Low dose CT Recovery | 评估低剂量 CT 恢复的对抗鲁棒性 | Kanchana Vaishnavi Gandikota, Paramanand Chandramouli, Hannah Droege, Michael Moeller | http://arxiv.org/pdf/2402.11557v1 | null |
| 2024-02-18 | To use or not to use proprietary street view images in (health and place) research? That is the question | 在(健康和场所)研究中使用或不使用专有街景图像?就是那个问题 | Marco Helbich, Matthew Danish, SM Labib, Britta Ricker | http://arxiv.org/pdf/2402.11504v1 | null |