[UPDATED!] 2024-01-30 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-30 | You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation | 您只需一步:通过刻度蒸馏实现快速超分辨率和稳定扩散 | Mehdi Noroozi, Isma Hadji, Brais Martinez, Adrian Bulat, Georgios Tzimiropoulos | http://arxiv.org/pdf/2401.17258v1 | null |
| 2024-01-30 | ContactGen: Contact-Guided Interactive 3D Human Generation for Partners | ContactGen:为合作伙伴提供接触引导的交互式 3D 人类生成 | Dongjun Gu, Jaehyeok Shim, Jaehoon Jang, Changwoo Kang, Kyungdon Joo | http://arxiv.org/pdf/2401.17212v1 | null |
| 2024-01-30 | Self-Supervised Representation Learning for Nerve Fiber Distribution Patterns in 3D-PLI | 3D-PLI 中神经纤维分布模式的自监督表示学习 | Alexander Oberstrass, Sascha E. A. Muenzing, Meiqi Niu, Nicola Palomero-Gallagher, Christian Schiffer, Markus Axer, Katrin Amunts, Timo Dickscheid | http://arxiv.org/pdf/2401.17207v1 | null |
| 2024-01-30 | An Open Software Suite for Event-Based Video | 用于基于事件的视频的开放软件套件 | Andrew C. Freeman | http://arxiv.org/pdf/2401.17151v1 | null |
| 2024-01-30 | Repositioning the Subject within Image | 重新定位图像中的主体 | Yikai Wang, Chenjie Cao, Qiaole Dong, Yifan Li, Yanwei Fu | http://arxiv.org/pdf/2401.16861v1 | null |
| 2024-01-30 | BoostDream: Efficient Refining for High-Quality Text-to-3D Generation from Multi-View Diffusion | BoostDream:通过多视图扩散高效细化高质量文本到 3D 生成 | Yonghao Yu, Shunan Zhu, Huai Qin, Haorui Li | http://arxiv.org/pdf/2401.16764v1 | null |
| 2024-01-30 | Pick-and-Draw: Training-free Semantic Guidance for Text-to-Image Personalization | Pick-and-Draw:用于文本到图像个性化的免训练语义指导 | Henglei Lv, Jiayu Xiao, Liang Li, Qingming Huang | http://arxiv.org/pdf/2401.16762v1 | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-30 | GazeGPT: Augmenting Human Capabilities using Gaze-contingent Contextual AI for Smart Eyewear | GazeGPT:使用智能眼镜的注视相关情境 AI 增强人类能力 | Robert Konrad, Nitish Padmanaban, J. Gabriel Buckmaster, Kevin C. Boyle, Gordon Wetzstein | http://arxiv.org/pdf/2401.17217v1 | null |
| 2024-01-30 | Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning | 通过持续语言学习,拥抱 CLIP 中的语言包容性和多样性 | Bang Yang, Yong Dai, Xuxin Cheng, Yaowei Li, Asif Raza, Yuexian Zou | http://arxiv.org/pdf/2401.17186v1 | null |
| 2024-01-30 | M2CURL: Sample-Efficient Multimodal Reinforcement Learning via Self-Supervised Representation Learning for Robotic Manipulation | M2CURL:通过用于机器人操作的自监督表示学习实现样本高效的多模态强化学习 | Fotios Lygerakis, Vedant Dave, Elmar Rueckert | http://arxiv.org/pdf/2401.17032v1 | null |
| 2024-01-30 | Multi-modal Representation Learning for Cross-modal Prediction of Continuous Weather Patterns from Discrete Low-Dimensional Data | 基于离散低维数据的连续天气模式跨模态预测的多模态表示学习 | Alif Bin Abdul Qayyum, Xihaier Luo, Nathan M. Urban, Xiaoning Qian, Byung-Jun Yoon | http://arxiv.org/pdf/2401.16936v1 | null |
| 2024-01-30 | Fourier Prompt Tuning for Modality-Incomplete Scene Segmentation | 模态不完整场景分割的傅立叶快速调整 | Ruiping Liu, Jiaming Zhang, Kunyu Peng, Yufan Chen, Ke Cao, Junwei Zheng, M. Saquib Sarfraz, Kailun Yang, Rainer Stiefelhagen | http://arxiv.org/pdf/2401.16923v1 | null |
| 2024-01-30 | EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain | EarthGPT:遥感领域多传感器图像理解的通用多模态大语言模型 | Wei Zhang, Miaoxin Cai, Tong Zhang, Yin Zhuang, Xuerui Mao | http://arxiv.org/pdf/2401.16822v1 | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-30 | MouSi: Poly-Visual-Expert Vision-Language Models | MouSi:多视觉专家视觉语言模型 | Xiaoran Fan, Tao Ji, Changhao Jiang, Shuo Li, Senjie Jin, Sirui Song, Junke Wang, Boyang Hong, Lu Chen, Guodong Zheng, et al. | http://arxiv.org/pdf/2401.17221v1 | null |
| 2024-01-30 | StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis | StrokeNUWA:用于矢量图形合成的笔画标记化 | Zecheng Tang, Chenfei Wu, Zekai Zhang, Mingheng Ni, Shengming Yin, Yu Liu, Zhengyuan Yang, Lijuan Wang, Zicheng Liu, Juntao Li, et al. | http://arxiv.org/pdf/2401.17093v1 | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-30 | CPR++: Object Localization via Single Coarse Point Supervision | CPR++:通过单粗点监督进行对象定位 | Xuehui Yu, Pengfei Chen, Kuiran Wang, Xumeng Han, Guorong Li, Zhenjun Han, Qixiang Ye, Jianbin Jiao | http://arxiv.org/pdf/2401.17203v1 | null |
| 2024-01-30 | OmniSCV: An Omnidirectional Synthetic Image Generator for Computer Vision | OmniSCV:用于计算机视觉的全方位合成图像生成器 | Bruno Berenguel-Baeta, Jesus Bermudez-Cameo, Jose J. Guerrero | http://arxiv.org/pdf/2401.17061v1 | null |
| 2024-01-30 | ViTree: Single-path Neural Tree for Step-wise Interpretable Fine-grained Visual Categorization | ViTree:用于逐步可解释的细粒度视觉分类的单路径神经树 | Danning Lao, Qi Liu, Jiazi Bu, Junchi Yan, Wei Shen | http://arxiv.org/pdf/2401.17050v1 | null |
| 2024-01-30 | Deep 3D World Models for Multi-Image Super-Resolution Beyond Optical Flow | 超越光流的多图像超分辨率深度 3D 世界模型 | Luca Savant Aira, Diego Valsesia, Andrea Bordone Molini, Giulia Fracastoro, Enrico Magli, Andrea Mirabile | http://arxiv.org/pdf/2401.16972v1 | null |
| 2024-01-30 | CAFCT: Contextual and Attentional Feature Fusions of Convolutional Neural Networks and Transformer for Liver Tumor Segmentation | CAFCT:用于肝脏肿瘤分割的卷积神经网络和 Transformer 的上下文和注意力特征融合 | Ming Kang, Chee-Ming Ting, Fung Fung Ting, Raphaël Phan | http://arxiv.org/pdf/2401.16886v1 | null |
| 2024-01-30 | SmartFRZ: An Efficient Training Framework using Attention-Based Layer Freezing | SmartFRZ:使用基于注意力的层冻结的高效训练框架 | Sheng Li, Geng Yuan, Yue Dai, Youtao Zhang, Yanzhi Wang, Xulong Tang | http://arxiv.org/pdf/2401.16720v1 | null |
| 2024-01-30 | Towards Precise 3D Human Pose Estimation with Multi-Perspective Spatial-Temporal Relational Transformers | 利用多视角时空关系变换器实现精确的 3D 人体姿势估计 | Jianbin Jiao, Xina Cheng, Weijie Chen, Xiaoting Yin, Hao Shi, Kailun Yang | http://arxiv.org/pdf/2401.16700v1 | null |

3DGS

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-30 | VR-GS: A Physical Dynamics-Aware Interactive Gaussian Splatting System in Virtual Reality | VR-GS:虚拟现实中的物理动力学感知交互式高斯溅射系统 | Ying Jiang, Chang Yu, Tianyi Xie, Xuan Li, Yutao Feng, Huamin Wang, Minchen Li, Henry Lau, Feng Gao, Yin Yang, et al. | http://arxiv.org/pdf/2401.16663v1 | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-30 | YOLO-World: Real-Time Open-Vocabulary Object Detection | YOLO-World:实时开放词汇目标检测 | Tianheng Cheng, Lin Song, Yixiao Ge, Wenyu Liu, Xinggang Wang, Ying Shan | http://arxiv.org/pdf/2401.17270v1 | null |
| 2024-01-30 | Multi-Camera Asynchronous Ball Localization and Trajectory Prediction with Factor Graphs and Human Poses | 使用因子图和人体姿势进行多摄像机异步球定位和轨迹预测 | Qingyu Xiao, Zulfiqar Zaidi, Matthew Gombolay | http://arxiv.org/pdf/2401.17185v1 | null |
| 2024-01-30 | Non-central panorama indoor dataset | 非中心全景室内数据集 | Bruno Berenguel-Baeta, Jesus Bermudez-Cameo, Jose J. Guerrero | http://arxiv.org/pdf/2401.17075v1 | null |
| 2024-01-30 | Atlanta Scaled layouts from non-central panoramas | 亚特兰大 非中心全景的比例布局 | Bruno Berenguel-Baeta, Jesus Bermudez-Cameo, Jose J. Guerrero | http://arxiv.org/pdf/2401.17058v1 | null |
| 2024-01-30 | BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation | BlockFusion:使用潜在三平面外推法生成可扩展的 3D 场景 | Zhennan Wu, Yang Li, Han Yan, Taizhang Shang, Weixuan Sun, Senbo Wang, Ruikai Cui, Weizhe Liu, Hiroyuki Sato, Hongdong Li, et al. | http://arxiv.org/pdf/2401.17053v1 | null |
| 2024-01-30 | An Embeddable Implicit IUVD Representation for Part-based 3D Human Surface Reconstruction | 基于部件的 3D 人体表面重建的可嵌入隐式 IUVD 表示 | Baoxing Li, Yong Deng, Yehui Yang, Xu Zhao | http://arxiv.org/pdf/2401.16810v1 | null |
| 2024-01-30 | All-optical complex field imaging using diffractive processors | 使用衍射处理器的全光学复杂场成像 | Jingxi Li, Yuhang Li, Tianyi Gan, Che-Yung Shen, Mona Jarrahi, Aydogan Ozcan | http://arxiv.org/pdf/2401.16779v1 | null |
| 2024-01-30 | Multi-granularity Correspondence Learning from Long-term Noisy Videos | 从长期噪声视频中进行多粒度对应学习 | Yijie Lin, Jie Zhang, Zhenyu Huang, Jia Liu, Zujie Wen, Xi Peng | http://arxiv.org/pdf/2401.16702v1 | null |
| 2024-01-30 | The Why, When, and How to Use Active Learning in Large-Data-Driven 3D Object Detection for Safe Autonomous Driving: An Empirical Exploration | 为什么、何时以及如何在大数据驱动的 3D 物体检测中使用主动学习来实现安全自动驾驶:实证探索 | Ross Greer, Bjørk Antoniussen, Mathias V. Andersen, Andreas Møgelmose, Mohan M. Trivedi | http://arxiv.org/pdf/2401.16634v1 | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-30 | Zero-shot Classification using Hyperdimensional Computing | 使用超维计算的零样本分类 | Samuele Ruffino, Geethan Karunaratne, Michael Hersche, Luca Benini, Abu Sebastian, Abbas Rahimi | http://arxiv.org/pdf/2401.16876v1 | null |
| 2024-01-30 | Reviving Undersampling for Long-Tailed Learning | 恢复欠采样以实现长尾学习 | Hao Yu, Yingxiao Du, Jianxin Wu | http://arxiv.org/pdf/2401.16811v1 | null |
| 2024-01-30 | Detection and Recovery Against Deep Neural Network Fault Injection Attacks Based on Contrastive Learning | 基于对比学习的深度神经网络故障注入攻击检测与恢复 | Chenan Wang, Pu Zhao, Siyue Wang, Xue Lin | http://arxiv.org/pdf/2401.16766v1 | null |
| 2024-01-30 | MuSc: Zero-Shot Industrial Anomaly Classification and Segmentation with Mutual Scoring of the Unlabeled Images | MuSc:零样本工业异常分类和分割以及未标记图像的相互评分 | Xurui Li, Ziming Huang, Feng Xue, Yu Zhou | http://arxiv.org/pdf/2401.16753v1 | link |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-30 | A simple, strong baseline for building damage detection on the xBD dataset | 用于在 xBD 数据集上构建损伤检测的简单而强大的基线 | Sebastian Gerard, Paul Borne-Pons, Josephine Sullivan | http://arxiv.org/pdf/2401.17271v1 | null |
| 2024-01-30 | Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks | 防御语言模型免受越狱攻击的稳健提示优化 | Andy Zhou, Bo Li, Haohan Wang | http://arxiv.org/pdf/2401.17263v1 | null |
| 2024-01-30 | SLIC: A Learned Image Codec Using Structure and Color | SLIC:使用结构和颜色的学习图像编解码器 | Srivatsa Prativadibhayankaram, Mahadev Prasad Panda, Thomas Richter, Heiko Sparenberg, Siegfried Fößel, André Kaup | http://arxiv.org/pdf/2401.17246v1 | null |
| 2024-01-30 | ReAlnet: Achieving More Human Brain-Like Vision via Human Neural Representational Alignment | ReAlnet:通过人类神经表征对齐实现更像人脑的视觉 | Zitong Lu, Yile Wang, Julie D. Golomb | http://arxiv.org/pdf/2401.17231v1 | null |
| 2024-01-30 | NormEnsembleXAI: Unveiling the Strengths and Weaknesses of XAI Ensemble Techniques | NormEnsembleXAI:揭示 XAI 集成技术的优点和缺点 | Weronika Hryniewska-Guzik, Bartosz Sawicki, Przemysław Biecek | http://arxiv.org/pdf/2401.17200v1 | null |
| 2024-01-30 | Evaluation in Neural Style Transfer: A Review | 神经风格迁移评估:回顾 | Eleftherios Ioannou, Steve Maddock | http://arxiv.org/pdf/2401.17109v1 | null |
| 2024-01-30 | H-SynEx: Using synthetic images and ultra-high resolution ex vivo MRI for hypothalamus subregion segmentation | H-SynEx:使用合成图像和超高分辨率离体 MRI 进行下丘脑分区分割 | Livia Rodrigues, Martina Bocchetta, Oula Puonti, Douglas Greve, Ana Carolina Londe, Marcondes França, Simone Appenzeller, Juan Eugenio Iglesias, Leticia Rittner | http://arxiv.org/pdf/2401.17104v1 | null |
| 2024-01-30 | CharNet: Generalized Approach for High-Complexity Character Classification | CharNet:高复杂性字符分类的通用方法 | Boris Kriuk | http://arxiv.org/pdf/2401.17098v1 | null |
| 2024-01-30 | Active Generation Network of Human Skeleton for Action Recognition | 用于动作识别的人体骨骼主动生成网络 | Long Liu, Xin Wang, Fangming Li, Jiayu Chen | http://arxiv.org/pdf/2401.17086v1 | null |
| 2024-01-30 | Efficient Gesture Recognition on Spiking Convolutional Networks Through Sensor Fusion of Event-Based and Depth Data | 通过基于事件和深度数据的传感器融合在尖峰卷积网络上进行高效手势识别 | Lea Steffen, Thomas Trapp, Arne Roennau, Rüdiger Dillmann | http://arxiv.org/pdf/2401.17064v1 | null |
| 2024-01-30 | Floor extraction and door detection for visually impaired guidance | 楼层提取和门检测,为视障人士提供引导 | Bruno Berenguel-Baeta, Manuel Guerrero-Viu, Alejandro de Nova, Jesus Bermudez-Cameo, Alejandro Perez-Yus, Jose J. Guerrero | http://arxiv.org/pdf/2401.17056v1 | null |
| 2024-01-30 | Towards Assessing the Synthetic-to-Measured Adversarial Vulnerability of SAR ATR | 评估 SAR ATR 的综合测量对抗漏洞 | Bowen Peng, Bo Peng, Jingyuan Xia, Tianpeng Liu, Yongxiang Liu, Li Liu | http://arxiv.org/pdf/2401.17038v1 | null |
| 2024-01-30 | Multilayer Graph Approach to Deep Subspace Clustering | 深层子空间聚类的多层图方法 | Lovro Sindičić, Ivica Kopriva | http://arxiv.org/pdf/2401.17033v1 | null |
| 2024-01-30 | Static and Dynamic Synthesis of Bengali and Devanagari Signatures | 孟加拉语和梵文签名的静态和动态合成 | Miguel A. Ferrer, Sukalpa Chanda, Moises Diaz, Chayan Kr. Banerjee, Anirban Majumdar, Cristina Carmona-Duarte, Parikshit Acharya, Umapada Pal | http://arxiv.org/pdf/2401.17026v1 | null |
| 2024-01-30 | MF-MOS: A Motion-Focused Model for Moving Object Segmentation | MF-MOS:用于运动物体分割的运动聚焦模型 | Jintao Cheng, Kang Zeng, Zhuoxu Huang, Xiaoyu Tang, Jin Wu, Chengxi Zhang, Xieyuanli Chen, Rui Fan | http://arxiv.org/pdf/2401.17023v1 | null |
| 2024-01-30 | Evaluation of Out-of-Distribution Detection Performance on Autonomous Driving Datasets | 自动驾驶数据集上的分布外检测性能评估 | Jens Henriksson, Christian Berger, Stig Ursing, Markus Borg | http://arxiv.org/pdf/2401.17013v1 | null |
| 2024-01-30 | Category-wise Fine-Tuning: Resisting Incorrect Pseudo-Labels in Multi-Label Image Classification with Partial Labels | 按类别微调:在具有部分标签的多标签图像分类中抵抗不正确的伪标签 | Chak Fong Chong, Xinyi Fang, Jielong Guo, Yapeng Wang, Wei Ke, Chan-Tong Lam, Sio-Kei Im | http://arxiv.org/pdf/2401.16991v1 | null |
| 2024-01-30 | Segmentation and Characterization of Macerated Fibers and Vessels Using Deep Learning | 使用深度学习对浸渍纤维和血管进行分割和表征 | Saqib Qamar, Abu Imran Baba, Stéphane Verger, Magnus Andersson | http://arxiv.org/pdf/2401.16937v1 | null |
| 2024-01-30 | Dynamic MRI reconstruction using low-rank plus sparse decomposition with smoothness regularization | 使用低秩加稀疏分解和平滑正则化进行动态 MRI 重建 | Chee-Ming Ting, Fuad Noman, Raphaël C. -W. Phan, Hernando Ombao | http://arxiv.org/pdf/2401.16928v1 | null |
| 2024-01-30 | A Tournament of Transformation Models: B-Spline-based vs. Mesh-based Multi-Objective Deformable Image Registration | 变换模型锦标赛:基于 B 样条与基于网格的多目标可变形图像配准 | Georgios Andreadis, Joas I. Mulder, Anton Bouter, Peter A. N. Bosman, Tanja Alderliesten | http://arxiv.org/pdf/2401.16867v1 | null |
| 2024-01-30 | MESA: Matching Everything by Segmenting Anything | MESA:通过分割任何内容来匹配所有内容 | Yesheng Zhang, Xu Zhao | http://arxiv.org/pdf/2401.16741v1 | null |
| 2024-01-30 | Optimal-Landmark-Guided Image Blending for Face Morphing Attacks | 用于面部变形攻击的最佳地标引导图像混合 | Qiaoyun He, Zongyong Deng, Zuyuan He, Qijun Zhao | http://arxiv.org/pdf/2401.16722v1 | null |
| 2024-01-30 | LF Tracy: A Unified Single-Pipeline Approach for Salient Object Detection in Light Field Cameras | LF Tracy:用于光场相机中显着物体检测的统一单管道方法 | Fei Teng, Jiaming Zhang, Jiawei Liu, Kunyu Peng, Xina Cheng, Zhiyong Li, Kailun Yang | http://arxiv.org/pdf/2401.16712v1 | null |
| 2024-01-30 | EdgeOL: Efficient in-situ Online Learning on Edge Devices | EdgeOL:边缘设备上的高效原位在线学习 | Sheng Li, Geng Yuan, Yawen Wu, Yue Dai, Chao Wu, Alex K. Jones, Jingtong Hu, Yanzhi Wang, Xulong Tang | http://arxiv.org/pdf/2401.16694v1 | null |
| 2024-01-30 | Characterization of Magnetic Labyrinthine Structures through Junctions and Terminals Detection using Template Matching and CNN | 使用模板匹配和 CNN 通过连接和终端检测来表征磁性迷宫结构 | Vinícius Yu Okubo, Kotaro Shimizu, B. S. Shivaram, Hae Yong Kim | http://arxiv.org/pdf/2401.16688v1 | null |