[UPDATED!] 2024-01-24 (Publish Time)

Classification/Detection/Recognition/Segmentation

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-24 | Algebraic methods for solving recognition problems with non-crossing classes | 解决非交叉类识别问题的代数方法 | Anvar Kabulov, Alimdzhan Babadzhanov, Islambek Saymanov | http://arxiv.org/pdf/2401.13666v1 | null |
| 2024-01-24 | Tyche: Stochastic In-Context Learning for Medical Image Segmentation | Tyche:医学图像分割的随机上下文学习 | Marianne Rakic, Hallee E. Wong, Jose Javier Gonzalez Ortiz, Beth Cimini, John Guttag, Adrian V. Dalca | http://arxiv.org/pdf/2401.13650v1 | null |
| 2024-01-24 | How Good is ChatGPT at Face Biometrics? A First Look into Recognition, Soft Biometrics, and Explainability | ChatGPT 在人脸生物识别方面有多出色?初步探讨识别、软生物识别技术和可解释性 | Ivan DeAndres-Tame, Ruben Tolosana, Ruben Vera-Rodriguez, Aythami Morales, Julian Fierrez, Javier Ortega-Garcia | http://arxiv.org/pdf/2401.13641v1 | null |
| 2024-01-24 | Enhancing Image Retrieval : A Comprehensive Study on Photo Search using the CLIP Mode | 增强图像检索:使用CLIP模式进行照片搜索的综合研究 | Naresh Kumar Lahajal, Harini S | http://arxiv.org/pdf/2401.13613v1 | null |
| 2024-01-24 | PLATE: A perception-latency aware estimator | PLATE:感知延迟感知估计器 | Rodrigo Aldana-López, Rosario Aragüés, Carlos Sagüés | http://arxiv.org/pdf/2401.13596v1 | null |
| 2024-01-24 | SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation | SegMamba:用于 3D 医学图像分割的远程顺序建模 Mamba | Zhaohu Xing, Tian Ye, Yijun Yang, Guang Liu, Lei Zhu | http://arxiv.org/pdf/2401.13560v1 | link |
| 2024-01-24 | PanAf20K: A Large Video Dataset for Wild Ape Detection and Behaviour Recognition | PanAf20K:用于野生猿检测和行为识别的大型视频数据集 | Otto Brookes, Majid Mirmehdi, Colleen Stephens, Samuel Angedakin, Katherine Corogenes, Dervla Dowd, Paula Dieguez, Thurston C. Hicks, Sorrel Jones, Kevin Lee, et al. | http://arxiv.org/pdf/2401.13554v1 | null |
| 2024-01-24 | Interleaving One-Class and Weakly-Supervised Models with Adaptive Thresholding for Unsupervised Video Anomaly Detection | 将一类模型和弱监督模型与自适应阈值交错进行无监督视频异常检测 | Yongwei Nie, Hao Huang, Chengjiang Long, Qing Zhang, Pradipta Maji, Hongmin Cai | http://arxiv.org/pdf/2401.13551v1 | null |
| 2024-01-24 | QAGait: Revisit Gait Recognition from a Quality Perspective | QAGait:从质量角度重新审视步态识别 | Zengbin Wang, Saihui Hou, Man Zhang, Xu Liu, Chunshui Cao, Yongzhen Huang, Peipei Li, Shibiao Xu | http://arxiv.org/pdf/2401.13531v1 | link |
| 2024-01-24 | Delocate: Detection and Localization for Deepfake Videos with Randomly-Located Tampered Traces | Delocate:具有随机位置的篡改痕迹的 Deepfake 视频的检测和定位 | Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Qian Wang, Zheng Qin, Mike Zheng Shou | http://arxiv.org/pdf/2401.13516v1 | null |
| 2024-01-24 | Tissue Cross-Section and Pen Marking Segmentation in Whole Slide Images | 整个幻灯片图像中的组织横截面和笔标记分割 | Ruben T. Lucassen, Willeke A. M. Blokx, Mitko Veta | http://arxiv.org/pdf/2401.13511v1 | null |
| 2024-01-24 | Research about the Ability of LLM in the Tamper-Detection Area | 大语言模型在篡改检测领域的能力研究 | Xinyu Yang, Jizhe Zhou | http://arxiv.org/pdf/2401.13504v1 | null |
| 2024-01-24 | LDCA: Local Descriptors with Contextual Augmentation for Few-Shot Learning | LDCA:具有上下文增强的局部描述符,用于少样本学习 | Maofa Wang, Bingchen Yan | http://arxiv.org/pdf/2401.13499v1 | null |
| 2024-01-24 | Segmenting Cardiac Muscle Z-disks with Deep Neural Networks | 使用深度神经网络分割心肌 Z 盘 | Mihaela Croitor Ibrahim, Nishant Ravikumar, Alistair Curd, Joanna Leng, Oliver Umney, Michelle Peckham | http://arxiv.org/pdf/2401.13472v1 | null |
| 2024-01-24 | GTAutoAct: An Automatic Datasets Generation Framework Based on Game Engine Redevelopment for Action Recognition | GTAutoAct:基于游戏引擎重新开发的动作识别自动数据集生成框架 | Xingyu Song, Zhan Li, Shi Chen, Kazuyuki Demachi | http://arxiv.org/pdf/2401.13414v1 | null |
| 2024-01-24 | Synthetic data enables faster annotation and robust segmentation for multi-object grasping in clutter | 合成数据可以实现更快的注释和强大的分割,以实现杂乱中的多对象抓取 | Dongmyoung Lee, Wei Chen, Nicolas Rojas | http://arxiv.org/pdf/2401.13405v1 | null |
| 2024-01-24 | SEDNet: Shallow Encoder-Decoder Network for Brain Tumor Segmentation | SEDNet:用于脑肿瘤分割的浅层编码器-解码器网络 | Chollette C. Olisah | http://arxiv.org/pdf/2401.13403v1 | null |
| 2024-01-24 | UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion | UNIMO-G:通过多模态条件扩散生成统一图像 | Wei Li, Xue Xu, Jiachen Liu, Xinyan Xiao | http://arxiv.org/pdf/2401.13388v1 | null |
| 2024-01-24 | Privacy-Preserving Face Recognition in Hybrid Frequency-Color Domain | 混合频色域中的隐私保护人脸识别 | Dong Han, Yong Li, Joachim Denzler | http://arxiv.org/pdf/2401.13386v1 | null |
| 2024-01-24 | NACHOS: Neural Architecture Search for Hardware Constrained Early Exit Neural Networks | NACHOS:硬件约束提前退出神经网络的神经架构搜索 | Matteo Gambella, Jary Pomponi, Simone Scardapane, Manuel Roveri | http://arxiv.org/pdf/2401.13330v1 | null |
| 2024-01-24 | Memory Consistency Guided Divide-and-Conquer Learning for Generalized Category Discovery | 用于广义类别发现的内存一致性引导分而治之学习 | Yuanpeng Tu, Zhun Zhong, Yuxi Li, Hengshuang Zhao | http://arxiv.org/pdf/2401.13325v1 | null |
| 2024-01-24 | Deep Learning for Improved Polyp Detection from Synthetic Narrow-Band Imaging | 通过深度学习改进合成窄带成像息肉检测 | Mathias Ramm Haugland, Hemin Ali Qadir, Ilangko Balasingham | http://arxiv.org/pdf/2401.13315v1 | null |
| 2024-01-24 | Small Object Tracking in LiDAR Point Cloud: Learning the Target-awareness Prototype and Fine-grained Search Region | LiDAR 点云中的小物体跟踪:学习目标感知原型和细粒度搜索区域 | Shengjing Tian, Yinan Han, Xiuping Liu, Xiantong Zhao | http://arxiv.org/pdf/2401.13285v1 | null |
| 2024-01-24 | DDI-CoCo: A Dataset For Understanding The Effect Of Color Contrast In Machine-Assisted Skin Disease Detection | DDI-CoCo:用于了解机器辅助皮肤病检测中颜色对比度效果的数据集 | Ming-Chang Chiu, Yingfei Wang, Yen-Ju Kuo, Pin-Yu Chen | http://arxiv.org/pdf/2401.13280v1 | link |
| 2024-01-24 | Enhancing cross-domain detection: adaptive class-aware contrastive transformer | 增强跨域检测:自适应类感知对比变压器 | Ziru Zeng, Yue Ding, Hongtao Lu | http://arxiv.org/pdf/2401.13264v1 | null |
| 2024-01-24 | Segment Any Cell: A SAM-based Auto-prompting Fine-tuning Framework for Nuclei Segmentation | 分割任意细胞:基于 SAM 的细胞核分割自动提示微调框架 | Saiyang Na, Yuzhi Guo, Feng Jiang, Hehuan Ma, Junzhou Huang | http://arxiv.org/pdf/2401.13220v1 | null |
| 2024-01-24 | AMANet: Advancing SAR Ship Detection with Adaptive Multi-Hierarchical Attention Network | AMANet:利用自适应多层次注意力网络推进 SAR 船舶检测 | Xiaolin Ma, Junkai Cheng, Aihua Li, Yuhua Zhang, Zhilong Lin | http://arxiv.org/pdf/2401.13214v1 | null |
| 2024-01-24 | Common-Sense Bias Discovery and Mitigation for Classification Tasks | 分类任务的常识性偏差发现和缓解 | Miao Zhang, Zee Fryer, Ben Colman, Ali Shahriyari, Gaurav Bharaj | http://arxiv.org/pdf/2401.13213v1 | null |
| 2024-01-24 | AdCorDA: Classifier Refinement via Adversarial Correction and Domain Adaptation | AdCorDA:通过对抗性校正和域适应进行分类器细化 | Lulan Shen, Ali Edalati, Brett Meyer, Warren Gross, James J. Clark | http://arxiv.org/pdf/2401.13212v1 | null |
| 2024-01-24 | Boosting the Transferability of Adversarial Examples via Local Mixup and Adaptive Step Size | 通过局部混合和自适应步长提高对抗性示例的可迁移性 | Junlin Liu, Xinchen Lyu | http://arxiv.org/pdf/2401.13205v1 | null |
| 2024-01-24 | Catch-Up Mix: Catch-Up Class for Struggling Filters in CNN | Catch-Up Mix:CNN 中陷入困境的过滤器的 Catch-Up 类 | Minsoo Kang, Minkoo Kang, Suhyun Kim | http://arxiv.org/pdf/2401.13193v1 | null |
| 2024-01-24 | Towards Multi-domain Face Landmark Detection with Synthetic Data from Diffusion model | 利用扩散模型的合成数据进行多域人脸特征点检测 | Yuanming Li, Gwantae Kim, Jeong-gi Kwak, Bon-hwa Ku, Hanseok Ko | http://arxiv.org/pdf/2401.13191v1 | null |
| 2024-01-24 | Boundary and Relation Distillation for Semantic Segmentation | 语义分割的边界和关系蒸馏 | Dong Zhang, Pingcheng Dong, Xinting Hu, Long Chen, Kwang-Ting Cheng | http://arxiv.org/pdf/2401.13174v1 | null |

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-24 | Towards Efficient and Effective Deep Clustering with Dynamic Grouping and Prototype Aggregation | 通过动态分组和原型聚合实现高效且有效的深度聚类 | Haixin Zhang, Dong Huang | http://arxiv.org/pdf/2401.13581v1 | null |
| 2024-01-24 | Benchmarking the Fairness of Image Upsampling Methods | 图像上采样方法的公平性基准测试 | Mike Laszkiewicz, Imant Daunhawer, Julia E. Vogt, Asja Fischer, Johannes Lederer | http://arxiv.org/pdf/2401.13555v1 | null |
| 2024-01-24 | Generative Human Motion Stylization in Latent Space | 潜在空间中的生成人体运动风格化 | Chuan Guo, Yuxuan Mu, Xinxin Zuo, Peng Dai, Youliang Yan, Juwei Lu, Li Cheng | http://arxiv.org/pdf/2401.13505v1 | null |
| 2024-01-24 | Learning Representations for Clustering via Partial Information Discrimination and Cross-Level Interaction | 通过部分信息辨别和跨级交互学习聚类表示 | Hai-Xin Zhang, Dong Huang, Hua-Bao Ling, Guang-Yu Zhang, Wei-jun Sun, Zi-hao Wen | http://arxiv.org/pdf/2401.13503v1 | link |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-24 | VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks | VisualWebArena:在实际视觉 Web 任务上评估多模式代理 | Jing Yu Koh, Robert Lo, Lawrence Jang, Vikram Duvvur, Ming Chong Lim, Po-Yu Huang, Graham Neubig, Shuyan Zhou, Ruslan Salakhutdinov, Daniel Fried | http://arxiv.org/pdf/2401.13649v1 | null |
| 2024-01-24 | Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild | 扩展至卓越:实践模型扩展以在野外恢复照片般真实的图像 | Fanghua Yu, Jinjin Gu, Zheyuan Li, Jinfan Hu, Xiangtao Kong, Xintao Wang, Jingwen He, Yu Qiao, Chao Dong | http://arxiv.org/pdf/2401.13627v1 | null |
| 2024-01-24 | SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval | SciMMIR:科学多模态信息检索基准测试 | Siwei Wu, Yizhi Li, Kang Zhu, Ge Zhang, Yiming Liang, Kaijing Ma, Chenghao Xiao, Haoran Zhang, Bohao Yang, Wenhu Chen, et al. | http://arxiv.org/pdf/2401.13478v1 | null |
| 2024-01-24 | Serial fusion of multi-modal biometric systems | 多模态生物识别系统的串行融合 | Gian Luca Marcialis, Paolo Mastinu, Fabio Roli | http://arxiv.org/pdf/2401.13418v1 | null |
| 2024-01-24 | Generative Video Diffusion for Unseen Cross-Domain Video Moment Retrieval | 用于看不见的跨域视频时刻检索的生成视频扩散 | Dezhao Luo, Jiabo Huang, Shaogang Gong, Hailin Jin, Yang Liu | http://arxiv.org/pdf/2401.13329v1 | null |
| 2024-01-24 | InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions | InstructDoc:带有指令的视觉文档理解零样本泛化数据集 | Ryota Tanaka, Taichi Iki, Kyosuke Nishida, Kuniko Saito, Jun Suzuki | http://arxiv.org/pdf/2401.13313v1 | link |
| 2024-01-24 | ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models | ConTextual:评估大型多模态模型中的上下文敏感文本丰富的视觉推理 | Rohan Wadhawan, Hritik Bansal, Kai-Wei Chang, Nanyun Peng | http://arxiv.org/pdf/2401.13311v1 | null |
| 2024-01-24 | ChatterBox: Multi-round Multimodal Referring and Grounding | ChatterBox:多轮多模态参考和接地 | Yunjie Tian, Tianren Ma, Lingxi Xie, Jihao Qiu, Xi Tang, Yuan Zhang, Jianbin Jiao, Qi Tian, Qixiang Ye | http://arxiv.org/pdf/2401.13307v1 | link |
| 2024-01-24 | MLLMReID: Multimodal Large Language Model-based Person Re-identification | MLLMReID:基于多模态大语言模型的行人重新识别 | Shan Yang, Yongfei Zhang | http://arxiv.org/pdf/2401.13201v1 | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-24 | Unified-Width Adaptive Dynamic Network for All-In-One Image Restoration | 用于一体化图像恢复的统一宽度自适应动态网络 | Yimin Xu, Nanxi Gao, Zhongyun Shan, Fei Chao, Rongrong Ji | http://arxiv.org/pdf/2401.13221v1 | link |
| 2024-01-24 | ADMap: Anti-disturbance framework for reconstructing online vectorized HD map | ADMap:重建在线矢量化高精地图的抗干扰框架 | Haotian Hu, Fanyi Wang, Yaonong Wang, Laifeng Hu, Jingwei Xu, Zhiwang Zhang | http://arxiv.org/pdf/2401.13172v1 | link |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-24 | EndoGaussians: Single View Dynamic Gaussian Splatting for Deformable Endoscopic Tissues Reconstruction | EndoGaussians:用于变形内窥镜组织重建的单视图动态高斯溅射 | Yangsen Chen, Hao Wang | http://arxiv.org/pdf/2401.13352v1 | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-24 | Style-Consistent 3D Indoor Scene Synthesis with Decoupled Objects | 具有解耦对象的风格一致的 3D 室内场景合成 | Yunfan Zhang, Hong Huang, Zhiwei Xiong, Zhiqi Shen, Guosheng Lin, Hao Wang, Nicholas Vun | http://arxiv.org/pdf/2401.13203v1 | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-24 | Semi-Supervised Coupled Thin-Plate Spline Model for Rotation Correction and Beyond | 用于旋转校正及其他的半监督耦合薄板样条模型 | Lang Nie, Chunyu Lin, Kang Liao, Shuaicheng Liu, Yao Zhao | http://arxiv.org/pdf/2401.13432v1 | link |
| 2024-01-24 | Do You Guys Want to Dance: Zero-Shot Compositional Human Dance Generation with Multiple Persons | 你们想跳舞吗:多人零镜头组合人类舞蹈生成 | Zhe Xu, Kun Wei, Xu Yang, Cheng Deng | http://arxiv.org/pdf/2401.13363v1 | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-24 | FLLIC: Functionally Lossless Image Compression | FLLIC:功能无损图像压缩 | Xi Zhang, Xiaolin Wu | http://arxiv.org/pdf/2401.13616v1 | null |
| 2024-01-24 | Linear Relative Pose Estimation Founded on Pose-only Imaging Geometry | 基于仅位姿成像几何的线性相对位姿估计 | Qi Cai, Xinrui Li, Yuanxin Wu | http://arxiv.org/pdf/2401.13357v1 | null |
| 2024-01-24 | Visual Objectification in Films: Towards a New AI Task for Video Interpretation | 电影中的视觉对象化:迈向视频解读的新人工智能任务 | Julie Tores, Lucile Sassatelli, Hui-Yin Wu, Clement Bergman, Lea Andolfi, Victor Ecrement, Frederic Precioso, Thierry Devars, Magali Guaresi, Virginie Julliard, et al. | http://arxiv.org/pdf/2401.13296v1 | null |
| 2024-01-24 | Audio-Infused Automatic Image Colorization by Exploiting Audio Scene Semantics | 利用音频场景语义注入音频的自动图像着色 | Pengcheng Zhao, Yanxiang Chen, Yang Zhao, Wei Jia, Zhao Zhang, Ronggang Wang, Richang Hong | http://arxiv.org/pdf/2401.13270v1 | null |
| 2024-01-24 | Dual-modal Dynamic Traceback Learning for Medical Report Generation | 用于生成医疗报告的双模态动态回溯学习 | Shuchang Ye, Mingyuan Meng, Mingjian Li, Dagan Feng, Jinman Kim | http://arxiv.org/pdf/2401.13267v1 | null |
| 2024-01-24 | Predicting Mitral Valve mTEER Surgery Outcomes Using Machine Learning and Deep Learning Techniques | 使用机器学习和深度学习技术预测二尖瓣 mTEER 手术结果 | Tejas Vyas, Mohsena Chowdhury, Xiaojiao Xiao, Mathias Claeys, Géraldine Ong, Guanghui Wang | http://arxiv.org/pdf/2401.13197v1 | null |
| 2024-01-24 | A Generalized Multiscale Bundle-Based Hyperspectral Sparse Unmixing Algorithm | 一种广义的基于多尺度束的高光谱稀疏解混算法 | Luciano Carvalho Ayres, Ricardo Augusto Borsoi, José Carlos Moreira Bermudez, Sérgio José Melo de Almeida | http://arxiv.org/pdf/2401.13161v1 | link |