Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-18 | SDiT: Spiking Diffusion Model with Transformer | SDiT:带变压器的尖峰扩散模型 | Shu Yang, Hanzhi Ma, Chengting Yu, Aili Wang, Er-Ping Li | http://arxiv.org/pdf/2402.11588v1 | null |
2024-02-18 | GenAD: Generative End-to-End Autonomous Driving | GenAD:生成式端到端自动驾驶 | Wenzhao Zheng, Ruiqi Song, Xianda Guo, Long Chen | http://arxiv.org/pdf/2402.11502v1 | null |
2024-02-18 | IRFundusSet: An Integrated Retinal Fundus Dataset with a Harmonized Healthy Label | IRFundusSet:具有统一健康标签的综合视网膜 Fundus 数据集 | P. Bilha Githinji, Keming Zhao, Jiantao Wang, Peiwu Qin | http://arxiv.org/pdf/2402.11488v1 | null |
2024-02-18 | Visual Concept-driven Image Generation with Text-to-Image Diffusion Model | 使用文本到图像扩散模型的视觉概念驱动的图像生成 | Tanzila Rahman, Shweta Mahajan, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Leonid Sigal | http://arxiv.org/pdf/2402.11487v1 | null |
2024-02-18 | Data Distribution Distilled Generative Model for Generalized Zero-Shot Recognition | 用于广义零样本识别的数据分布蒸馏生成模型 | Yijie Wang, Mingjian Hong, Luwen Huangfu, Sheng Huang | http://arxiv.org/pdf/2402.11424v1 | null |
2024-02-18 | MultiCorrupt: A Multi-Modal Robustness Dataset and Benchmark of LiDAR-Camera Fusion for 3D Object Detection | MultiCorrupt:用于 3D 物体检测的多模态鲁棒性数据集和 LiDAR-相机融合的基准 | Till Beemelmanns, Quan Zhang, Lutz Eckstein | http://arxiv.org/pdf/2402.11677v1 | null |
2024-02-18 | Efficient Multimodal Learning from Data-centric Perspective | 以数据为中心的高效多模态学习 | Muyang He, Yexin Liu, Boya Wu, Jianhao Yuan, Yueze Wang, Tiejun Huang, Bo Zhao | http://arxiv.org/pdf/2402.11530v1 | null |
2024-02-18 | MAL: Motion-Aware Loss with Temporal and Distillation Hints for Self-Supervised Depth Estimation | MAL:具有时间和蒸馏提示的运动感知损失,用于自监督深度估计 | Yup-Jiang Dong, Fang-Lue Zhang, Song-Hai Zhang | http://arxiv.org/pdf/2402.11507v1 | null |
2024-02-18 | LiRaFusion: Deep Adaptive LiDAR-Radar Fusion for 3D Object Detection | LiRaFusion:用于 3D 物体检测的深度自适应 LiDAR-雷达融合 | Jingyu Song, Lingjun Zhao, Katherine A. Skinner | http://arxiv.org/pdf/2402.11735v1 | null |
2024-02-18 | Challenging the Black Box: A Comprehensive Evaluation of Attribution Maps of CNN Applications in Agriculture and Forestry | 挑战黑匣子:CNN农林应用归因图综合评价 | Lars Nieradzik, Henrike Stephani, Jördis Sieburg-Rockel, Stephanie Helmling, Andrea Olbrich, Janis Keuper | http://arxiv.org/pdf/2402.11670v1 | null |
2024-02-18 | Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models | 逻辑闭环:揭示大型视觉语言模型中的物体幻觉 | Junfei Wu, Qiang Liu, Ding Wang, Jinghao Zhang, Shu Wu, Liang Wang, Tieniu Tan | http://arxiv.org/pdf/2402.11622v1 | null |
2024-02-18 | PolypNextLSTM: A lightweight and fast polyp video segmentation network using ConvNext and ConvLSTM | PolypNextLSTM:使用 ConvNext 和 ConvLSTM 的轻量级快速息肉视频分割网络 | Debayan Bhattacharya, Konrad Reuter, Finn Behrendnt, Lennart Maack, Sarah Grube, Alexander Schlaefer | http://arxiv.org/pdf/2402.11585v1 | null |
2024-02-18 | A novel Fourier neural operator framework for classification of multi-sized images: Application to 3D digital porous media | 用于多尺寸图像分类的新型傅立叶神经算子框架:在 3D 数字多孔介质中的应用 | Ali Kashefi, Tapan Mukerji | http://arxiv.org/pdf/2402.11568v1 | null |
2024-02-18 | CPN: Complementary Proposal Network for Unconstrained Text Detection | CPN:用于无约束文本检测的补充提案网络 | Longhuang Wu, Shangxuan Tian, Youxin Wang, Pengfei Xiong | http://arxiv.org/pdf/2402.11540v1 | null |
2024-02-18 | Cross-Attention Fusion of Visual and Geometric Features for Large Vocabulary Arabic Lipreading | 大词汇量阿拉伯语唇读的视觉和几何特征的交叉注意融合 | Samar Daou, Ahmed Rekik, Achraf Ben-Hamadou, Abdelaziz Kallel | http://arxiv.org/pdf/2402.11520v1 | null |
2024-02-18 | Underestimation of lung regions on chest X-ray segmentation masks assessed by comparison with total lung volume evaluated on computed tomography | 通过与计算机断层扫描评估的总肺体积进行比较来评估胸部 X 射线分割掩模上的肺部区域低估 | Przemysław Bombiński, Patryk Szatkowski, Bartłomiej Sobieski, Tymoteusz Kwieciński, Szymon Płotka, Mariusz Adamek, Marcin Banasiuk, Mariusz I. Furmanek, Przemysław Biecek | http://arxiv.org/pdf/2402.11510v1 | null |
2024-02-18 | Thyroid ultrasound diagnosis improvement via multi-view self-supervised learning and two-stage pre-training | 通过多视角自监督学习和两阶段预训练提高甲状腺超声诊断 | Jian Wang, Xin Yang, Xiaohong Jia, Wufeng Xue, Rusi Chen, Yanlin Chen, Xiliang Zhu, Lian Liu, Yan Cao, Jianqiao Zhou, et al. | http://arxiv.org/pdf/2402.11497v1 | null |
2024-02-18 | EndoOOD: Uncertainty-aware Out-of-distribution Detection in Capsule Endoscopy Diagnosis | EndoOOD:胶囊内窥镜诊断中的不确定性分布外检测 | Qiaozhi Tan, Long Bai, Guankun Wang, Mobarakol Islam, Hongliang Ren | http://arxiv.org/pdf/2402.11476v1 | null |
2024-02-18 | Poisoned Forgery Face: Towards Backdoor Attacks on Face Forgery Detection | 有毒的伪造人脸:针对人脸伪造检测的后门攻击 | Jiawei Liang, Siyuan Liang, Aishan Liu, Xiaojun Jia, Junhao Kuang, Xiaochun Cao | http://arxiv.org/pdf/2402.11473v1 | null |
2024-02-18 | Key Patch Proposer: Key Patches Contain Rich Information | 关键补丁提议者:关键补丁包含丰富信息 | Jing Xu, Beiwen Tian, Hao Zhao | http://arxiv.org/pdf/2402.11458v1 | null |
2024-02-18 | Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning | Momentor:利用细粒度时序推理推进视频大语言模型 | Long Qian, Juncheng Li, Yu Wu, Yaobo Ye, Hao Fei, Tat-Seng Chua, Yueting Zhuang, Siliang Tang | http://arxiv.org/pdf/2402.11435v1 | null |
2024-02-18 | A Multispectral Automated Transfer Technique (MATT) for machine-driven image labeling utilizing the Segment Anything Model (SAM) | 利用分段任意模型 (SAM) 进行机器驱动图像标记的多光谱自动传输技术 (MATT) | James E. Gallagher, Aryav Gogia, Edward J. Oughton | http://arxiv.org/pdf/2402.11413v1 | null |
2024-02-18 | Aligning Modalities in Vision Large Language Models via Preference Fine-tuning | 通过偏好微调来调整视觉大语言模型中的模态 | Yiyang Zhou, Chenhang Cui, Rafael Rafailov, Chelsea Finn, Huaxiu Yao | http://arxiv.org/pdf/2402.11411v1 | null |
2024-02-18 | 3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods | 使用递归神经网络和图像压缩方法进行 3D 点云压缩 | Till Beemelmanns, Yuchen Tao, Bastian Lampe, Lennart Reiher, Raphael van Kempen, Timo Woopen, Lutz Eckstein | http://arxiv.org/pdf/2402.11680v1 | null |
2024-02-18 | Neuromorphic Face Analysis: a Survey | 神经形态面部分析:一项调查 | Federico Becattini, Lorenzo Berlincioni, Luca Cultrera, Alberto Del Bimbo | http://arxiv.org/pdf/2402.11631v1 | null |
2024-02-18 | A Robust Error-Resistant View Selection Method for 3D Reconstruction | 一种鲁棒、抗错的 3D 重建视图选择方法 | Shaojie Zhang, Yinghui Wang, Bin Nan, Jinlong Yang, Tao Yan, Liangyi Huang, Mingfeng Wang | http://arxiv.org/pdf/2402.11431v1 | null |
2024-02-18 | Boosting Semi-Supervised 2D Human Pose Estimation by Revisiting Data Augmentation and Consistency Training | 通过重新审视数据增强和一致性训练来促进半监督二维人体姿势估计 | Huayi Zhou, Mukun Luo, Fei Jiang, Yue Ding, Hongtao Lu | http://arxiv.org/pdf/2402.11566v1 | null |
2024-02-18 | The Effectiveness of Random Forgetting for Robust Generalization | 随机遗忘对鲁棒泛化的有效性 | Vijaya Raghavan T Ramkumar, Bahram Zonooz, Elahe Arani | http://arxiv.org/pdf/2402.11733v1 | null |
2024-02-18 | Learning Conditional Invariances through Non-Commutativity | 通过非交换性学习条件不变性 | Abhra Chaudhuri, Serban Georgescu, Anjan Dutta | http://arxiv.org/pdf/2402.11682v1 | null |
2024-02-18 | Interactive Garment Recommendation with User in the Loop | 与用户互动的服装推荐 | Federico Becattini, Xiaolin Chen, Andrea Puccia, Haokun Wen, Xuemeng Song, Liqiang Nie, Alberto Del Bimbo | http://arxiv.org/pdf/2402.11627v1 | null |
2024-02-18 | Visual In-Context Learning for Large Vision-Language Models | 大型视觉语言模型的视觉上下文学习 | Yucheng Zhou, Xiang Li, Qianning Wang, Jianbing Shen | http://arxiv.org/pdf/2402.11574v1 | null |
2024-02-18 | Evaluating Adversarial Robustness of Low dose CT Recovery | 评估低剂量 CT 恢复的对抗鲁棒性 | Kanchana Vaishnavi Gandikota, Paramanand Chandramouli, Hannah Droege, Michael Moeller | http://arxiv.org/pdf/2402.11557v1 | null |
2024-02-18 | To use or not to use proprietary street view images in (health and place) research? That is the question | 在(健康和场所)研究中使用或不使用专有街景图像?就是那个问题 | Marco Helbich, Matthew Danish, SM Labib, Britta Ricker | http://arxiv.org/pdf/2402.11504v1 | null |