Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-23 | GALA: Generating Animatable Layered Assets from a Single Scan | GALA:通过单次扫描生成可动画化的分层资源 | Taeksoo Kim, Byungjun Kim, Shunsuke Saito, Hanbyul Joo | http://arxiv.org/pdf/2401.12979v1 | null |
2024-01-23 | SegmentAnyBone: A Universal Model that Segments Any Bone at Any Location on MRI | SegmentAnyBone:一种通用模型,可在 MRI 上的任何位置分割任何骨骼 | Hanxue Gu, Roy Colglazier, Haoyu Dong, Jikai Zhang, Yaqian Chen, Zafer Yildiz, Yuwen Chen, Lin Li, Jichen Yang, Jay Willhite, et al. | http://arxiv.org/pdf/2401.12974v1 | null |
2024-01-23 | Neural deformation fields for template-based reconstruction of cortical surfaces from MRI | 用于基于 MRI 皮质表面模板重建的神经变形场 | Fabian Bongratz, Anne-Marie Rickmann, Christian Wachinger | http://arxiv.org/pdf/2401.12938v1 | null |
2024-01-23 | Segmentation of tibiofemoral joint tissues from knee MRI using MtRA-Unet and incorporating shape information: Data from the Osteoarthritis Initiative | 使用 MtRA-Unet 对膝 MRI 中的胫股关节组织进行分割并结合形状信息:来自骨关节炎倡议的数据 | Akshay Daydar, Alik Pramanick, Arijit Sur, Subramani Kanagaraj | http://arxiv.org/pdf/2401.12932v1 | null |
2024-01-23 | Facing the Elephant in the Room: Visual Prompt Tuning or Full Finetuning? | 面对房间里的大象:视觉提示调整还是全面微调? | Cheng Han, Qifan Wang, Yiming Cui, Wenguan Wang, Lifu Huang, Siyuan Qi, Dongfang Liu | http://arxiv.org/pdf/2401.12902v1 | null |
2024-01-23 | Unlocking the Potential: Multi-task Deep Learning for Spaceborne Quantitative Monitoring of Fugitive Methane Plumes | 释放潜力:用于星载逃逸甲烷羽流定量监测的多任务深度学习 | Guoxin Si, Shiliang Fu, Wei Yao | http://arxiv.org/pdf/2401.12870v1 | null |
2024-01-23 | Classification of grapevine varieties using UAV hyperspectral imaging | 利用无人机高光谱成像对葡萄品种进行分类 | Alfonso López, Carlos Javier Ogayar, Francisco Ramón Feito, Joaquim João Sousa | http://arxiv.org/pdf/2401.12851v1 | null |
2024-01-23 | DatUS^2: Data-driven Unsupervised Semantic Segmentation with Pre-trained Self-supervised Vision Transformer | DatUS^2:数据驱动的无监督语义分割与预训练的自监督视觉 Transformer | Sonal Kumar, Arijit Sur, Rashmi Dutta Baruah | http://arxiv.org/pdf/2401.12820v1 | null |
2024-01-23 | MUSES: The Multi-Sensor Semantic Perception Dataset for Driving under Uncertainty | MUSES:用于不确定性驾驶的多传感器语义感知数据集 | Tim Brödermann, David Bruggemann, Christos Sakaridis, Kevin Ta, Odysseas Liagouris, Jason Corkill, Luc Van Gool | http://arxiv.org/pdf/2401.12761v1 | null |
2024-01-23 | Correlation-Embedded Transformer Tracking: A Single-Branch Framework | 相关嵌入式 Transformer 跟踪:单分支框架 | Fei Xie, Wankou Yang, Chunyu Wang, Lei Chu, Yue Cao, Chao Ma, Wenjun Zeng | http://arxiv.org/pdf/2401.12743v1 | null |
2024-01-23 | Enhancing Object Detection Performance for Small Objects through Synthetic Data Generation and Proportional Class-Balancing Technique: A Comparative Study in Industrial Scenarios | 通过合成数据生成和比例类平衡技术增强小物体的物体检测性能:工业场景的比较研究 | Jibinraj Antony, Vinit Hegiste, Ali Nazeri, Hooman Tavakoli, Snehal Walunj, Christiane Plociennik, Martin Ruskowski | http://arxiv.org/pdf/2401.12729v1 | null |
2024-01-23 | Two-View Topogram-Based Anatomy-Guided CT Reconstruction for Prospective Risk Minimization | 基于双视图拓扑图的解剖引导 CT 重建,实现前瞻性风险最小化 | Chang Liu, Laura Klein, Yixing Huang, Edith Baader, Michael Lell, Marc Kachelrieß, Andreas Maier | http://arxiv.org/pdf/2401.12725v1 | null |
2024-01-23 | Pragmatic Communication in Multi-Agent Collaborative Perception | 多智能体协作感知中的语用沟通 | Yue Hu, Xianghe Pang, Xiaoqi Qin, Yonina C. Eldar, Siheng Chen, Ping Zhang, Wenjun Zhang | http://arxiv.org/pdf/2401.12694v1 | null |
2024-01-23 | Energy-based Automated Model Evaluation | 基于能量的自动化模型评估 | Ru Peng, Heming Zou, Haobo Wang, Yawen Zeng, Zenan Huang, Junbo Zhao | http://arxiv.org/pdf/2401.12689v1 | link |
2024-01-23 | ClipSAM: CLIP and SAM Collaboration for Zero-Shot Anomaly Segmentation | ClipSAM:CLIP 和 SAM 协作进行零样本异常分割 | Shengze Li, Jianjian Cao, Peng Ye, Yuhan Ding, Chongjun Tu, Tao Chen | http://arxiv.org/pdf/2401.12665v1 | null |
2024-01-23 | Self-Supervised Vision Transformers Are Efficient Segmentation Learners for Imperfect Labels | 自监督视觉 Transformer 是不完美标签的高效分割学习器 | Seungho Lee, Seoungyoon Kang, Hyunjung Shim | http://arxiv.org/pdf/2401.12535v1 | null |
2024-01-23 | Detecting and recognizing characters in Greek papyri with YOLOv8, DeiT and SimCLR | 使用 YOLOv8、DeiT 和 SimCLR 检测和识别希腊纸莎草中的字符 | Robert Turnbull, Evelyn Mannix | http://arxiv.org/pdf/2401.12513v1 | null |
2024-01-23 | Open-Set Facial Expression Recognition | 开放集面部表情识别 | Yuhang Zhang, Yue Yao, Xuannan Liu, Lixiong Qin, Wenjing Wang, Weihong Deng | http://arxiv.org/pdf/2401.12507v1 | null |
2024-01-23 | Small Language Model Meets with Reinforced Vision Vocabulary | 小语言模型与强化视觉词汇的结合 | Haoran Wei, Lingyu Kong, Jinyue Chen, Liang Zhao, Zheng Ge, En Yu, Jianjian Sun, Chunrui Han, Xiangyu Zhang | http://arxiv.org/pdf/2401.12503v1 | null |
2024-01-23 | An Automated Real-Time Approach for Image Processing and Segmentation of Fluoroscopic Images and Videos Using a Single Deep Learning Network | 使用单个深度学习网络对荧光图像和视频进行图像处理和分割的自动化实时方法 | Viet Dung Nguyen, Michael T. LaCour, Richard D. Komistek | http://arxiv.org/pdf/2401.12488v1 | null |
2024-01-23 | Explore Synergistic Interaction Across Frames for Interactive Video Object Segmentation | 探索交互式视频对象分割的跨帧协同交互 | Kexin Li, Tao Jiang, Zongxin Yang, Yi Yang, Yueting Zhuang, Jun Xiao | http://arxiv.org/pdf/2401.12480v1 | null |
2024-01-23 | TD^2-Net: Toward Denoising and Debiasing for Dynamic Scene Graph Generation | TD^2-Net:动态场景图生成的去噪和去偏 | Xin Lin, Chong Shi, Yibing Zhan, Zuopeng Yang, Yaqi Wu, Dacheng Tao | http://arxiv.org/pdf/2401.12479v1 | null |
2024-01-23 | Zero Shot Open-ended Video Inference | 零样本开放式视频推理 | Ee Yeo Keat, Zhang Hao, Alexander Matyasko, Basura Fernando | http://arxiv.org/pdf/2401.12471v1 | null |
2024-01-23 | Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration | 通过 2D-3D 神经校准进行 LiDAR 3D 点云的自监督学习 | Yifan Zhang, Siyu Ren, Junhui Hou, Jinjian Wu, Guangming Shi | http://arxiv.org/pdf/2401.12452v1 | null |
2024-01-23 | NIV-SSD: Neighbor IoU-Voting Single-Stage Object Detector From Point Cloud | NIV-SSD:来自点云的邻居 IoU 投票单级物体检测器 | Shuai Liu, Di Wang, Quan Wang, Kai Huang | http://arxiv.org/pdf/2401.12447v1 | link |
2024-01-23 | MAST: Video Polyp Segmentation with a Mixture-Attention Siamese Transformer | MAST:使用混合注意力 Siamese Transformer 进行视频息肉分割 | Geng Chen, Junqing Yang, Xiaozhou Pu, Ge-Peng Ji, Huan Xiong, Yongsheng Pan, Hengfei Cui, Yong Xia | http://arxiv.org/pdf/2401.12439v1 | link |
2024-01-23 | The Neglected Tails of Vision-Language Models | 视觉语言模型被忽视的尾巴 | Shubham Parashar, Zhiqiu Lin, Tian Liu, Xiangjue Dong, Yanan Li, Deva Ramanan, James Caverlee, Shu Kong | http://arxiv.org/pdf/2401.12425v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-23 | A Novel Garment Transfer Method Supervised by Distilled Knowledge of Virtual Try-on Model | 虚拟试穿模型蒸馏知识监督下的新型服装传输方法 | Naiyu Fang, Lemiao Qiu, Shuyou Zhang, Zili Wang, Kerui Hu, Jianrong Tan | http://arxiv.org/pdf/2401.12433v1 | null |
2024-01-23 | Icy Moon Surface Simulation and Stereo Depth Estimation for Sampling Autonomy | 用于采样自主性的冰月表面模拟和立体深度估计 | Ramchander Bhaskara, Georgios Georgakis, Jeremy Nash, Marissa Cameron, Joseph Bowkett, Adnan Ansar, Manoranjan Majji, Paul Backes | http://arxiv.org/pdf/2401.12414v1 | link |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-23 | Zero-Shot Learning for the Primitives of 3D Affordance in General Objects | 一般对象中 3D 可供性基元的零样本学习 | Hyeonwoo Kim, Sookwan Han, Patrick Kwon, Hanbyul Joo | http://arxiv.org/pdf/2401.12978v1 | null |
2024-01-23 | Lumiere: A Space-Time Diffusion Model for Video Generation | Lumiere:用于视频生成的时空扩散模型 | Omer Bar-Tal, Hila Chefer, Omer Tov, Charles Herrmann, Roni Paiss, Shiran Zada, Ariel Ephrat, Junhwa Hur, Yuanzhen Li, Tomer Michaeli, et al. | http://arxiv.org/pdf/2401.12945v1 | null |
2024-01-23 | UniHDA: Towards Universal Hybrid Domain Adaptation of Image Generators | UniHDA:迈向图像生成器的通用混合域适应 | Hengjia Li, Yang Liu, Yuqi Lin, Zhanwei Zhang, Yibo Zhao, Weihang Pan, Tu Zheng, Zheng Yang, Yuchun Jiang, Boxi Wu, et al. | http://arxiv.org/pdf/2401.12596v1 | null |
2024-01-23 | Exploration and Improvement of Nerf-based 3D Scene Editing Techniques | 基于Nerf的3D场景编辑技术的探索与改进 | Shun Fang, Ming Cui, Xing Feng, Yanan Zhang | http://arxiv.org/pdf/2401.12456v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-23 | On the Efficacy of Text-Based Input Modalities for Action Anticipation | 基于文本的输入方式对动作预期的功效 | Apoorva Beedu, Karan Samel, Irfan Essa | http://arxiv.org/pdf/2401.12972v1 | null |
2024-01-23 | Red Teaming Visual Language Models | 红队视觉语言模型 | Mukai Li, Lei Li, Yuwei Yin, Masood Ahmed, Zhenguang Liu, Qi Liu | http://arxiv.org/pdf/2401.12915v1 | null |
2024-01-23 | FedRSU: Federated Learning for Scene Flow Estimation on Roadside Units | FedRSU:路边场景流估计的联邦学习 | Shaoheng Fang, Rui Ye, Wenhao Wang, Zuhong Liu, Yuxiao Wang, Yafei Wang, Siheng Chen, Yanfeng Wang | http://arxiv.org/pdf/2401.12862v1 | null |
2024-01-23 | NeRF-AD: Neural Radiance Field with Attention-based Disentanglement for Talking Face Synthesis | NeRF-AD:具有基于注意力解耦的神经辐射场,用于说话人脸合成 | Chongke Bi, Xiaoxing Liu, Zhilei Liu | http://arxiv.org/pdf/2401.12568v1 | null |
2024-01-23 | Multi-modal News Understanding with Professionally Labelled Videos (ReutersViLNews) | 通过专业标记的视频进行多模式新闻理解 (ReutersViLNews) | Shih-Han Chou, Matthew Kowal, Yasmin Niknam, Diana Moyano, Shayaan Mehdi, Richard Pito, Cheng Zhang, Ian Knopke, Sedef Akinli Kocak, Leonid Sigal, et al. | http://arxiv.org/pdf/2401.12419v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-23 | HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments | HAZARD 挑战:动态变化环境中的具身决策 | Qinhong Zhou, Sunli Chen, Yisong Wang, Haozhe Xu, Weihua Du, Hongxin Zhang, Yilun Du, Joshua B. Tenenbaum, Chuang Gan | http://arxiv.org/pdf/2401.12975v1 | link |
2024-01-23 | AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents | AutoRT:机器人代理大规模编排的具身基础模型 | Michael Ahn, Debidatta Dwibedi, Chelsea Finn, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Karol Hausman, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, et al. | http://arxiv.org/pdf/2401.12963v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-23 | SGTR+: End-to-end Scene Graph Generation with Transformer | SGTR+:使用 Transformer 生成端到端场景图 | Rongjie Li, Songyang Zhang, Xuming He | http://arxiv.org/pdf/2401.12835v1 | link |
2024-01-23 | Shift-ConvNets: Small Convolutional Kernel with Large Kernel Effects | Shift-ConvNets:具有大核效应的小卷积核 | Dachong Li, Li Li, Zhuangzhuang Chen, Jianqiang Li | http://arxiv.org/pdf/2401.12736v1 | link |
2024-01-23 | Convolutional Initialization for Data-Efficient Vision Transformers | 数据高效视觉 Transformer 的卷积初始化 | Jianqiao Zheng, Xueqian Li, Simon Lucey | http://arxiv.org/pdf/2401.12511v1 | link |
2024-01-23 | Methods and strategies for improving the novel view synthesis quality of neural radiation field | 提高神经辐射场新视合成质量的方法与策略 | Shun Fang, Ming Cui, Xing Feng, Yanna Lv | http://arxiv.org/pdf/2401.12451v1 | null |
2024-01-23 | InverseMatrixVT3D: An Efficient Projection Matrix-Based Approach for 3D Occupancy Prediction | InverseMatrixVT3D:一种基于投影矩阵的高效 3D 占用预测方法 | Zhenxing Ming, Julie Stephany Berrio, Mao Shan, Stewart Worrall | http://arxiv.org/pdf/2401.12422v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-23 | PSAvatar: A Point-based Morphable Shape Model for Real-Time Head Avatar Creation with 3D Gaussian Splatting | PSAvatar:基于点的可变形形状模型,用于通过 3D 高斯泼溅创建实时头部头像 | Zhongyuan Zhao, Zhenyu Bao, Qing Li, Guoping Qiu, Kanglin Liu | http://arxiv.org/pdf/2401.12900v1 | null |
2024-01-23 | EndoGaussian: Gaussian Splatting for Deformable Surgical Scene Reconstruction | EndoGaussian:用于可变形手术场景重建的高斯泼溅 | Yifan Liu, Chenxin Li, Chen Yang, Yixuan Yuan | http://arxiv.org/pdf/2401.12561v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-23 | IRIS: Inverse Rendering of Indoor Scenes from Low Dynamic Range Images | IRIS:低动态范围图像的室内场景逆渲染 | Zhi-Hao Lin, Jia-Bin Huang, Zhengqin Li, Zhao Dong, Christian Richardt, Tuotuo Li, Michael Zollhöfer, Johannes Kopf, Shenlong Wang, Changil Kim | http://arxiv.org/pdf/2401.12977v1 | null |
2024-01-23 | Coverage Axis++: Efficient Inner Point Selection for 3D Shape Skeletonization | Coverage Axis++:3D 形状骨架化的高效内点选择 | Zimeng Wang, Zhiyang Dou, Rui Xu, Cheng Lin, Yuan Liu, Xiaoxiao Long, Shiqing Xin, Lingjie Liu, Taku Komura, Xiaoming Yuan, et al. | http://arxiv.org/pdf/2401.12946v1 | null |
2024-01-23 | PSDF: Prior-Driven Neural Implicit Surface Learning for Multi-view Reconstruction | PSDF:用于多视图重建的先验驱动神经隐式表面学习 | Wanjuan Su, Chen Zhang, Qingshan Xu, Wenbing Tao | http://arxiv.org/pdf/2401.12751v1 | null |
2024-01-23 | RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos | 野外 RGBD 对象:通过 RGB-D 视频缩放真实世界 3D 对象学习 | Hongchi Xia, Yang Fu, Sifei Liu, Xiaolong Wang | http://arxiv.org/pdf/2401.12592v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-23 | Consistency Enhancement-Based Deep Multiview Clustering via Contrastive Learning | 通过对比学习进行基于一致性增强的深度多视图聚类 | Hao Yang, Hua Mao, Wai Lok Woo, Jie Chen, Xi Peng | http://arxiv.org/pdf/2401.12648v1 | null |
2024-01-23 | Fast Semi-supervised Unmixing using Non-convex Optimization | 使用非凸优化的快速半监督分解 | Behnood Rasti, Alexandre Zouaoui, Julien Mairal, Jocelyn Chanussot | http://arxiv.org/pdf/2401.12609v1 | null |
2024-01-23 | AdaEmbed: Semi-supervised Domain Adaptation in the Embedding Space | AdaEmbed:嵌入空间中的半监督域适应 | Ali Mottaghi, Mohammad Abdullah Jamal, Serena Yeung, Omid Mohareri | http://arxiv.org/pdf/2401.12421v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-23 | Data-Centric Evolution in Autonomous Driving: A Comprehensive Survey of Big Data System, Data Mining, and Closed-Loop Technologies | 自动驾驶中以数据为中心的演进:大数据系统、数据挖掘和闭环技术的全面综述 | Lincan Li, Wei Shao, Wei Dong, Yijun Tian, Kaixiang Yang, Wenjie Zhang | http://arxiv.org/pdf/2401.12888v1 | null |
2024-01-23 | Fast Implicit Neural Representation Image Codec in Resource-limited Devices | 资源有限设备中的快速隐式神经表示图像编解码器 | Xiang Liu, Jiahong Chen, Bin Chen, Zimo Liu, Baoyi An, Shu-Tao Xia | http://arxiv.org/pdf/2401.12587v1 | null |
2024-01-23 | Secure Federated Learning Approaches to Diagnosing COVID-19 | 用于诊断 COVID-19 的安全联合学习方法 | Rittika Adhikari, Christopher Settles | http://arxiv.org/pdf/2401.12438v1 | null |