[UPDATED!] 2024-01-23 (Publish Time)

Classification/Detection/Recognition/Segmentation

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-01-23 | GALA: Generating Animatable Layered Assets from a Single Scan | Taeksoo Kim, Byungjun Kim, Shunsuke Saito, Hanbyul Joo | http://arxiv.org/pdf/2401.12979v1 | null |
| 2024-01-23 | SegmentAnyBone: A Universal Model that Segments Any Bone at Any Location on MRI | Hanxue Gu, Roy Colglazier, Haoyu Dong, Jikai Zhang, Yaqian Chen, Zafer Yildiz, Yuwen Chen, Lin Li, Jichen Yang, Jay Willhite, et al. | http://arxiv.org/pdf/2401.12974v1 | null |
| 2024-01-23 | Neural deformation fields for template-based reconstruction of cortical surfaces from MRI | Fabian Bongratz, Anne-Marie Rickmann, Christian Wachinger | http://arxiv.org/pdf/2401.12938v1 | null |
| 2024-01-23 | Segmentation of tibiofemoral joint tissues from knee MRI using MtRA-Unet and incorporating shape information: Data from the Osteoarthritis Initiative | Akshay Daydar, Alik Pramanick, Arijit Sur, Subramani Kanagaraj | http://arxiv.org/pdf/2401.12932v1 | null |
| 2024-01-23 | Facing the Elephant in the Room: Visual Prompt Tuning or Full Finetuning? | Cheng Han, Qifan Wang, Yiming Cui, Wenguan Wang, Lifu Huang, Siyuan Qi, Dongfang Liu | http://arxiv.org/pdf/2401.12902v1 | null |
| 2024-01-23 | Unlocking the Potential: Multi-task Deep Learning for Spaceborne Quantitative Monitoring of Fugitive Methane Plumes | Guoxin Si, Shiliang Fu, Wei Yao | http://arxiv.org/pdf/2401.12870v1 | null |
| 2024-01-23 | Classification of grapevine varieties using UAV hyperspectral imaging | Alfonso López, Carlos Javier Ogayar, Francisco Ramón Feito, Joaquim João Sousa | http://arxiv.org/pdf/2401.12851v1 | null |
| 2024-01-23 | DatUS^2: Data-driven Unsupervised Semantic Segmentation with Pre-trained Self-supervised Vision Transformer | Sonal Kumar, Arijit Sur, Rashmi Dutta Baruah | http://arxiv.org/pdf/2401.12820v1 | null |
| 2024-01-23 | MUSES: The Multi-Sensor Semantic Perception Dataset for Driving under Uncertainty | Tim Brödermann, David Bruggemann, Christos Sakaridis, Kevin Ta, Odysseas Liagouris, Jason Corkill, Luc Van Gool | http://arxiv.org/pdf/2401.12761v1 | null |
| 2024-01-23 | Correlation-Embedded Transformer Tracking: A Single-Branch Framework | Fei Xie, Wankou Yang, Chunyu Wang, Lei Chu, Yue Cao, Chao Ma, Wenjun Zeng | http://arxiv.org/pdf/2401.12743v1 | null |
| 2024-01-23 | Enhancing Object Detection Performance for Small Objects through Synthetic Data Generation and Proportional Class-Balancing Technique: A Comparative Study in Industrial Scenarios | Jibinraj Antony, Vinit Hegiste, Ali Nazeri, Hooman Tavakoli, Snehal Walunj, Christiane Plociennik, Martin Ruskowski | http://arxiv.org/pdf/2401.12729v1 | null |
| 2024-01-23 | Two-View Topogram-Based Anatomy-Guided CT Reconstruction for Prospective Risk Minimization | Chang Liu, Laura Klein, Yixing Huang, Edith Baader, Michael Lell, Marc Kachelrieß, Andreas Maier | http://arxiv.org/pdf/2401.12725v1 | null |
| 2024-01-23 | Pragmatic Communication in Multi-Agent Collaborative Perception | Yue Hu, Xianghe Pang, Xiaoqi Qin, Yonina C. Eldar, Siheng Chen, Ping Zhang, Wenjun Zhang | http://arxiv.org/pdf/2401.12694v1 | null |
| 2024-01-23 | Energy-based Automated Model Evaluation | Ru Peng, Heming Zou, Haobo Wang, Yawen Zeng, Zenan Huang, Junbo Zhao | http://arxiv.org/pdf/2401.12689v1 | link |
| 2024-01-23 | ClipSAM: CLIP and SAM Collaboration for Zero-Shot Anomaly Segmentation | Shengze Li, Jianjian Cao, Peng Ye, Yuhan Ding, Chongjun Tu, Tao Chen | http://arxiv.org/pdf/2401.12665v1 | null |
| 2024-01-23 | Self-Supervised Vision Transformers Are Efficient Segmentation Learners for Imperfect Labels | Seungho Lee, Seoungyoon Kang, Hyunjung Shim | http://arxiv.org/pdf/2401.12535v1 | null |
| 2024-01-23 | Detecting and recognizing characters in Greek papyri with YOLOv8, DeiT and SimCLR | Robert Turnbull, Evelyn Mannix | http://arxiv.org/pdf/2401.12513v1 | null |
| 2024-01-23 | Open-Set Facial Expression Recognition | Yuhang Zhang, Yue Yao, Xuannan Liu, Lixiong Qin, Wenjing Wang, Weihong Deng | http://arxiv.org/pdf/2401.12507v1 | null |
| 2024-01-23 | Small Language Model Meets with Reinforced Vision Vocabulary | Haoran Wei, Lingyu Kong, Jinyue Chen, Liang Zhao, Zheng Ge, En Yu, Jianjian Sun, Chunrui Han, Xiangyu Zhang | http://arxiv.org/pdf/2401.12503v1 | null |
| 2024-01-23 | An Automated Real-Time Approach for Image Processing and Segmentation of Fluoroscopic Images and Videos Using a Single Deep Learning Network | Viet Dung Nguyen, Michael T. LaCour, Richard D. Komistek | http://arxiv.org/pdf/2401.12488v1 | null |
| 2024-01-23 | Explore Synergistic Interaction Across Frames for Interactive Video Object Segmentation | Kexin Li, Tao Jiang, Zongxin Yang, Yi Yang, Yueting Zhuang, Jun Xiao | http://arxiv.org/pdf/2401.12480v1 | null |
| 2024-01-23 | TD^2-Net: Toward Denoising and Debiasing for Dynamic Scene Graph Generation | Xin Lin, Chong Shi, Yibing Zhan, Zuopeng Yang, Yaqi Wu, Dacheng Tao | http://arxiv.org/pdf/2401.12479v1 | null |
| 2024-01-23 | Zero Shot Open-ended Video Inference | Ee Yeo Keat, Zhang Hao, Alexander Matyasko, Basura Fernando | http://arxiv.org/pdf/2401.12471v1 | null |
| 2024-01-23 | Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration | Yifan Zhang, Siyu Ren, Junhui Hou, Jinjian Wu, Guangming Shi | http://arxiv.org/pdf/2401.12452v1 | null |
| 2024-01-23 | NIV-SSD: Neighbor IoU-Voting Single-Stage Object Detector From Point Cloud | Shuai Liu, Di Wang, Quan Wang, Kai Huang | http://arxiv.org/pdf/2401.12447v1 | link |
| 2024-01-23 | MAST: Video Polyp Segmentation with a Mixture-Attention Siamese Transformer | Geng Chen, Junqing Yang, Xiaozhou Pu, Ge-Peng Ji, Huan Xiong, Yongsheng Pan, Hengfei Cui, Yong Xia | http://arxiv.org/pdf/2401.12439v1 | link |
| 2024-01-23 | The Neglected Tails of Vision-Language Models | Shubham Parashar, Zhiqiu Lin, Tian Liu, Xiangjue Dong, Yanan Li, Deva Ramanan, James Caverlee, Shu Kong | http://arxiv.org/pdf/2401.12425v1 | null |

Model Compression/Optimization

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-01-23 | A Novel Garment Transfer Method Supervised by Distilled Knowledge of Virtual Try-on Model | Naiyu Fang, Lemiao Qiu, Shuyou Zhang, Zili Wang, Kerui Hu, Jianrong Tan | http://arxiv.org/pdf/2401.12433v1 | null |
| 2024-01-23 | Icy Moon Surface Simulation and Stereo Depth Estimation for Sampling Autonomy | Ramchander Bhaskara, Georgios Georgakis, Jeremy Nash, Marissa Cameron, Joseph Bowkett, Adnan Ansar, Manoranjan Majji, Paul Backes | http://arxiv.org/pdf/2401.12414v1 | link |

Generative Models

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-01-23 | Zero-Shot Learning for the Primitives of 3D Affordance in General Objects | Hyeonwoo Kim, Sookwan Han, Patrick Kwon, Hanbyul Joo | http://arxiv.org/pdf/2401.12978v1 | null |
| 2024-01-23 | Lumiere: A Space-Time Diffusion Model for Video Generation | Omer Bar-Tal, Hila Chefer, Omer Tov, Charles Herrmann, Roni Paiss, Shiran Zada, Ariel Ephrat, Junhwa Hur, Yuanzhen Li, Tomer Michaeli, et al. | http://arxiv.org/pdf/2401.12945v1 | null |
| 2024-01-23 | UniHDA: Towards Universal Hybrid Domain Adaptation of Image Generators | Hengjia Li, Yang Liu, Yuqi Lin, Zhanwei Zhang, Yibo Zhao, Weihang Pan, Tu Zheng, Zheng Yang, Yuchun Jiang, Boxi Wu, et al. | http://arxiv.org/pdf/2401.12596v1 | null |
| 2024-01-23 | Exploration and Improvement of Nerf-based 3D Scene Editing Techniques | Shun Fang, Ming Cui, Xing Feng, Yanan Zhang | http://arxiv.org/pdf/2401.12456v1 | null |

Multimodal

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-01-23 | On the Efficacy of Text-Based Input Modalities for Action Anticipation | Apoorva Beedu, Karan Samel, Irfan Essa | http://arxiv.org/pdf/2401.12972v1 | null |
| 2024-01-23 | Red Teaming Visual Language Models | Mukai Li, Lei Li, Yuwei Yin, Masood Ahmed, Zhenguang Liu, Qi Liu | http://arxiv.org/pdf/2401.12915v1 | null |
| 2024-01-23 | FedRSU: Federated Learning for Scene Flow Estimation on Roadside Units | Shaoheng Fang, Rui Ye, Wenhao Wang, Zuhong Liu, Yuxiao Wang, Yafei Wang, Siheng Chen, Yanfeng Wang | http://arxiv.org/pdf/2401.12862v1 | null |
| 2024-01-23 | NeRF-AD: Neural Radiance Field with Attention-based Disentanglement for Talking Face Synthesis | Chongke Bi, Xiaoxing Liu, Zhilei Liu | http://arxiv.org/pdf/2401.12568v1 | null |
| 2024-01-23 | Multi-modal News Understanding with Professionally Labelled Videos (ReutersViLNews) | Shih-Han Chou, Matthew Kowal, Yasmin Niknam, Diana Moyano, Shayaan Mehdi, Richard Pito, Cheng Zhang, Ian Knopke, Sedef Akinli Kocak, Leonid Sigal, et al. | http://arxiv.org/pdf/2401.12419v1 | null |

LLM

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-01-23 | HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments | Qinhong Zhou, Sunli Chen, Yisong Wang, Haozhe Xu, Weihua Du, Hongxin Zhang, Yilun Du, Joshua B. Tenenbaum, Chuang Gan | http://arxiv.org/pdf/2401.12975v1 | link |
| 2024-01-23 | AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents | Michael Ahn, Debidatta Dwibedi, Chelsea Finn, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Karol Hausman, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, et al. | http://arxiv.org/pdf/2401.12963v1 | null |

Transformer

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-01-23 | SGTR+: End-to-end Scene Graph Generation with Transformer | Rongjie Li, Songyang Zhang, Xuming He | http://arxiv.org/pdf/2401.12835v1 | link |
| 2024-01-23 | Shift-ConvNets: Small Convolutional Kernel with Large Kernel Effects | Dachong Li, Li Li, Zhuangzhuang Chen, Jianqiang Li | http://arxiv.org/pdf/2401.12736v1 | link |
| 2024-01-23 | Convolutional Initialization for Data-Efficient Vision Transformers | Jianqiao Zheng, Xueqian Li, Simon Lucey | http://arxiv.org/pdf/2401.12511v1 | link |
| 2024-01-23 | Methods and strategies for improving the novel view synthesis quality of neural radiation field | Shun Fang, Ming Cui, Xing Feng, Yanna Lv | http://arxiv.org/pdf/2401.12451v1 | null |
| 2024-01-23 | InverseMatrixVT3D: An Efficient Projection Matrix-Based Approach for 3D Occupancy Prediction | Zhenxing Ming, Julie Stephany Berrio, Mao Shan, Stewart Worrall | http://arxiv.org/pdf/2401.12422v1 | null |

3DGS

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-01-23 | PSAvatar: A Point-based Morphable Shape Model for Real-Time Head Avatar Creation with 3D Gaussian Splatting | Zhongyuan Zhao, Zhenyu Bao, Qing Li, Guoping Qiu, Kanglin Liu | http://arxiv.org/pdf/2401.12900v1 | null |
| 2024-01-23 | EndoGaussian: Gaussian Splatting for Deformable Surgical Scene Reconstruction | Yifan Liu, Chenxin Li, Chen Yang, Yixuan Yuan | http://arxiv.org/pdf/2401.12561v1 | null |

3D/CG

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-01-23 | IRIS: Inverse Rendering of Indoor Scenes from Low Dynamic Range Images | Zhi-Hao Lin, Jia-Bin Huang, Zhengqin Li, Zhao Dong, Christian Richardt, Tuotuo Li, Michael Zollhöfer, Johannes Kopf, Shenlong Wang, Changil Kim | http://arxiv.org/pdf/2401.12977v1 | null |
| 2024-01-23 | Coverage Axis++: Efficient Inner Point Selection for 3D Shape Skeletonization | Zimeng Wang, Zhiyang Dou, Rui Xu, Cheng Lin, Yuan Liu, Xiaoxiao Long, Shiqing Xin, Lingjie Liu, Taku Komura, Xiaoming Yuan, et al. | http://arxiv.org/pdf/2401.12946v1 | null |
| 2024-01-23 | PSDF: Prior-Driven Neural Implicit Surface Learning for Multi-view Reconstruction | Wanjuan Su, Chen Zhang, Qingshan Xu, Wenbing Tao | http://arxiv.org/pdf/2401.12751v1 | null |
| 2024-01-23 | RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos | Hongchi Xia, Yang Fu, Sifei Liu, Xiaolong Wang | http://arxiv.org/pdf/2401.12592v1 | null |

Learning Paradigms

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-01-23 | Consistency Enhancement-Based Deep Multiview Clustering via Contrastive Learning | Hao Yang, Hua Mao, Wai Lok Woo, Jie Chen, Xi Peng | http://arxiv.org/pdf/2401.12648v1 | null |
| 2024-01-23 | Fast Semi-supervised Unmixing using Non-convex Optimization | Behnood Rasti, Alexandre Zouaoui, Julien Mairal, Jocelyn Chanussot | http://arxiv.org/pdf/2401.12609v1 | null |
| 2024-01-23 | AdaEmbed: Semi-supervised Domain Adaptation in the Embedding Space | Ali Mottaghi, Mohammad Abdullah Jamal, Serena Yeung, Omid Mohareri | http://arxiv.org/pdf/2401.12421v1 | null |

Other

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-01-23 | Data-Centric Evolution in Autonomous Driving: A Comprehensive Survey of Big Data System, Data Mining, and Closed-Loop Technologies | Lincan Li, Wei Shao, Wei Dong, Yijun Tian, Kaixiang Yang, Wenjie Zhang | http://arxiv.org/pdf/2401.12888v1 | null |
| 2024-01-23 | Fast Implicit Neural Representation Image Codec in Resource-limited Devices | Xiang Liu, Jiahong Chen, Bin Chen, Zimo Liu, Baoyi An, Shu-Tao Xia | http://arxiv.org/pdf/2401.12587v1 | null |
| 2024-01-23 | Secure Federated Learning Approaches to Diagnosing COVID-19 | Rittika Adhikari, Christopher Settles | http://arxiv.org/pdf/2401.12438v1 | null |