Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-30 | You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation | 只需一步:通过尺度蒸馏实现基于 Stable Diffusion 的快速超分辨率 | Mehdi Noroozi, Isma Hadji, Brais Martinez, Adrian Bulat, Georgios Tzimiropoulos | http://arxiv.org/pdf/2401.17258v1 | null |
2024-01-30 | ContactGen: Contact-Guided Interactive 3D Human Generation for Partners | ContactGen:为合作伙伴提供接触引导的交互式 3D 人类生成 | Dongjun Gu, Jaehyeok Shim, Jaehoon Jang, Changwoo Kang, Kyungdon Joo | http://arxiv.org/pdf/2401.17212v1 | null |
2024-01-30 | Self-Supervised Representation Learning for Nerve Fiber Distribution Patterns in 3D-PLI | 3D-PLI 中神经纤维分布模式的自监督表示学习 | Alexander Oberstrass, Sascha E. A. Muenzing, Meiqi Niu, Nicola Palomero-Gallagher, Christian Schiffer, Markus Axer, Katrin Amunts, Timo Dickscheid | http://arxiv.org/pdf/2401.17207v1 | null |
2024-01-30 | An Open Software Suite for Event-Based Video | 用于基于事件的视频的开放软件套件 | Andrew C. Freeman | http://arxiv.org/pdf/2401.17151v1 | null |
2024-01-30 | Repositioning the Subject within Image | 重新定位图像中的主体 | Yikai Wang, Chenjie Cao, Qiaole Dong, Yifan Li, Yanwei Fu | http://arxiv.org/pdf/2401.16861v1 | null |
2024-01-30 | BoostDream: Efficient Refining for High-Quality Text-to-3D Generation from Multi-View Diffusion | BoostDream:通过多视图扩散高效细化高质量文本到 3D 生成 | Yonghao Yu, Shunan Zhu, Huai Qin, Haorui Li | http://arxiv.org/pdf/2401.16764v1 | null |
2024-01-30 | Pick-and-Draw: Training-free Semantic Guidance for Text-to-Image Personalization | Pick-and-Draw:用于文本到图像个性化的免训练语义指导 | Henglei Lv, Jiayu Xiao, Liang Li, Qingming Huang | http://arxiv.org/pdf/2401.16762v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-30 | GazeGPT: Augmenting Human Capabilities using Gaze-contingent Contextual AI for Smart Eyewear | GazeGPT:使用智能眼镜的注视相关情境 AI 增强人类能力 | Robert Konrad, Nitish Padmanaban, J. Gabriel Buckmaster, Kevin C. Boyle, Gordon Wetzstein | http://arxiv.org/pdf/2401.17217v1 | null |
2024-01-30 | Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning | 通过持续语言学习,拥抱 CLIP 中的语言包容性和多样性 | Bang Yang, Yong Dai, Xuxin Cheng, Yaowei Li, Asif Raza, Yuexian Zou | http://arxiv.org/pdf/2401.17186v1 | null |
2024-01-30 | M2CURL: Sample-Efficient Multimodal Reinforcement Learning via Self-Supervised Representation Learning for Robotic Manipulation | M2CURL:通过用于机器人操作的自监督表示学习实现样本高效的多模态强化学习 | Fotios Lygerakis, Vedant Dave, Elmar Rueckert | http://arxiv.org/pdf/2401.17032v1 | null |
2024-01-30 | Multi-modal Representation Learning for Cross-modal Prediction of Continuous Weather Patterns from Discrete Low-Dimensional Data | 基于离散低维数据的连续天气模式跨模态预测的多模态表示学习 | Alif Bin Abdul Qayyum, Xihaier Luo, Nathan M. Urban, Xiaoning Qian, Byung-Jun Yoon | http://arxiv.org/pdf/2401.16936v1 | null |
2024-01-30 | Fourier Prompt Tuning for Modality-Incomplete Scene Segmentation | 面向模态不完整场景分割的傅里叶提示调优 | Ruiping Liu, Jiaming Zhang, Kunyu Peng, Yufan Chen, Ke Cao, Junwei Zheng, M. Saquib Sarfraz, Kailun Yang, Rainer Stiefelhagen | http://arxiv.org/pdf/2401.16923v1 | null |
2024-01-30 | EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain | EarthGPT:遥感领域多传感器图像理解的通用多模态大语言模型 | Wei Zhang, Miaoxin Cai, Tong Zhang, Yin Zhuang, Xuerui Mao | http://arxiv.org/pdf/2401.16822v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-30 | MouSi: Poly-Visual-Expert Vision-Language Models | MouSi:多视觉专家视觉语言模型 | Xiaoran Fan, Tao Ji, Changhao Jiang, Shuo Li, Senjie Jin, Sirui Song, Junke Wang, Boyang Hong, Lu Chen, Guodong Zheng, et al. | http://arxiv.org/pdf/2401.17221v1 | null |
2024-01-30 | StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis | StrokeNUWA:用于矢量图形合成的笔画标记化 | Zecheng Tang, Chenfei Wu, Zekai Zhang, Mingheng Ni, Shengming Yin, Yu Liu, Zhengyuan Yang, Lijuan Wang, Zicheng Liu, Juntao Li, et al. | http://arxiv.org/pdf/2401.17093v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-30 | CPR++: Object Localization via Single Coarse Point Supervision | CPR++:通过单粗点监督进行对象定位 | Xuehui Yu, Pengfei Chen, Kuiran Wang, Xumeng Han, Guorong Li, Zhenjun Han, Qixiang Ye, Jianbin Jiao | http://arxiv.org/pdf/2401.17203v1 | null |
2024-01-30 | OmniSCV: An Omnidirectional Synthetic Image Generator for Computer Vision | OmniSCV:用于计算机视觉的全方位合成图像生成器 | Bruno Berenguel-Baeta, Jesus Bermudez-Cameo, Jose J. Guerrero | http://arxiv.org/pdf/2401.17061v1 | null |
2024-01-30 | ViTree: Single-path Neural Tree for Step-wise Interpretable Fine-grained Visual Categorization | ViTree:用于逐步可解释的细粒度视觉分类的单路径神经树 | Danning Lao, Qi Liu, Jiazi Bu, Junchi Yan, Wei Shen | http://arxiv.org/pdf/2401.17050v1 | null |
2024-01-30 | Deep 3D World Models for Multi-Image Super-Resolution Beyond Optical Flow | 超越光流的多图像超分辨率深度 3D 世界模型 | Luca Savant Aira, Diego Valsesia, Andrea Bordone Molini, Giulia Fracastoro, Enrico Magli, Andrea Mirabile | http://arxiv.org/pdf/2401.16972v1 | null |
2024-01-30 | CAFCT: Contextual and Attentional Feature Fusions of Convolutional Neural Networks and Transformer for Liver Tumor Segmentation | CAFCT:用于肝脏肿瘤分割的卷积神经网络和 Transformer 的上下文和注意力特征融合 | Ming Kang, Chee-Ming Ting, Fung Fung Ting, Raphaël Phan | http://arxiv.org/pdf/2401.16886v1 | null |
2024-01-30 | SmartFRZ: An Efficient Training Framework using Attention-Based Layer Freezing | SmartFRZ:使用基于注意力的层冻结的高效训练框架 | Sheng Li, Geng Yuan, Yue Dai, Youtao Zhang, Yanzhi Wang, Xulong Tang | http://arxiv.org/pdf/2401.16720v1 | null |
2024-01-30 | Towards Precise 3D Human Pose Estimation with Multi-Perspective Spatial-Temporal Relational Transformers | 利用多视角时空关系变换器实现精确的 3D 人体姿势估计 | Jianbin Jiao, Xina Cheng, Weijie Chen, Xiaoting Yin, Hao Shi, Kailun Yang | http://arxiv.org/pdf/2401.16700v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-30 | VR-GS: A Physical Dynamics-Aware Interactive Gaussian Splatting System in Virtual Reality | VR-GS:虚拟现实中的物理动力学感知交互式高斯溅射系统 | Ying Jiang, Chang Yu, Tianyi Xie, Xuan Li, Yutao Feng, Huamin Wang, Minchen Li, Henry Lau, Feng Gao, Yin Yang, et al. | http://arxiv.org/pdf/2401.16663v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-30 | YOLO-World: Real-Time Open-Vocabulary Object Detection | YOLO-World:实时开放词汇目标检测 | Tianheng Cheng, Lin Song, Yixiao Ge, Wenyu Liu, Xinggang Wang, Ying Shan | http://arxiv.org/pdf/2401.17270v1 | null |
2024-01-30 | Multi-Camera Asynchronous Ball Localization and Trajectory Prediction with Factor Graphs and Human Poses | 使用因子图和人体姿势进行多摄像机异步球定位和轨迹预测 | Qingyu Xiao, Zulfiqar Zaidi, Matthew Gombolay | http://arxiv.org/pdf/2401.17185v1 | null |
2024-01-30 | Non-central panorama indoor dataset | 非中心全景室内数据集 | Bruno Berenguel-Baeta, Jesus Bermudez-Cameo, Jose J. Guerrero | http://arxiv.org/pdf/2401.17075v1 | null |
2024-01-30 | Atlanta Scaled layouts from non-central panoramas | 基于非中心全景图的带尺度亚特兰大世界布局 | Bruno Berenguel-Baeta, Jesus Bermudez-Cameo, Jose J. Guerrero | http://arxiv.org/pdf/2401.17058v1 | null |
2024-01-30 | BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation | BlockFusion:使用潜在三平面外推法生成可扩展的 3D 场景 | Zhennan Wu, Yang Li, Han Yan, Taizhang Shang, Weixuan Sun, Senbo Wang, Ruikai Cui, Weizhe Liu, Hiroyuki Sato, Hongdong Li, et al. | http://arxiv.org/pdf/2401.17053v1 | null |
2024-01-30 | An Embeddable Implicit IUVD Representation for Part-based 3D Human Surface Reconstruction | 基于部件的 3D 人体表面重建的可嵌入隐式 IUVD 表示 | Baoxing Li, Yong Deng, Yehui Yang, Xu Zhao | http://arxiv.org/pdf/2401.16810v1 | null |
2024-01-30 | All-optical complex field imaging using diffractive processors | 使用衍射处理器的全光学复杂场成像 | Jingxi Li, Yuhang Li, Tianyi Gan, Che-Yung Shen, Mona Jarrahi, Aydogan Ozcan | http://arxiv.org/pdf/2401.16779v1 | null |
2024-01-30 | Multi-granularity Correspondence Learning from Long-term Noisy Videos | 从长期噪声视频中进行多粒度对应学习 | Yijie Lin, Jie Zhang, Zhenyu Huang, Jia Liu, Zujie Wen, Xi Peng | http://arxiv.org/pdf/2401.16702v1 | null |
2024-01-30 | The Why, When, and How to Use Active Learning in Large-Data-Driven 3D Object Detection for Safe Autonomous Driving: An Empirical Exploration | 为什么、何时以及如何在大数据驱动的 3D 物体检测中使用主动学习来实现安全自动驾驶:实证探索 | Ross Greer, Bjørk Antoniussen, Mathias V. Andersen, Andreas Møgelmose, Mohan M. Trivedi | http://arxiv.org/pdf/2401.16634v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-30 | Zero-shot Classification using Hyperdimensional Computing | 使用超维计算的零样本分类 | Samuele Ruffino, Geethan Karunaratne, Michael Hersche, Luca Benini, Abu Sebastian, Abbas Rahimi | http://arxiv.org/pdf/2401.16876v1 | null |
2024-01-30 | Reviving Undersampling for Long-Tailed Learning | 恢复欠采样以实现长尾学习 | Hao Yu, Yingxiao Du, Jianxin Wu | http://arxiv.org/pdf/2401.16811v1 | null |
2024-01-30 | Detection and Recovery Against Deep Neural Network Fault Injection Attacks Based on Contrastive Learning | 基于对比学习的深度神经网络故障注入攻击检测与恢复 | Chenan Wang, Pu Zhao, Siyue Wang, Xue Lin | http://arxiv.org/pdf/2401.16766v1 | null |
2024-01-30 | MuSc: Zero-Shot Industrial Anomaly Classification and Segmentation with Mutual Scoring of the Unlabeled Images | MuSc:利用未标记图像的相互评分实现零样本工业异常分类和分割 | Xurui Li, Ziming Huang, Feng Xue, Yu Zhou | http://arxiv.org/pdf/2401.16753v1 | link |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-30 | A simple, strong baseline for building damage detection on the xBD dataset | xBD 数据集上建筑物损毁检测的简单而强大的基线 | Sebastian Gerard, Paul Borne-Pons, Josephine Sullivan | http://arxiv.org/pdf/2401.17271v1 | null |
2024-01-30 | Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks | 防御语言模型免受越狱攻击的稳健提示优化 | Andy Zhou, Bo Li, Haohan Wang | http://arxiv.org/pdf/2401.17263v1 | null |
2024-01-30 | SLIC: A Learned Image Codec Using Structure and Color | SLIC:使用结构和颜色的学习图像编解码器 | Srivatsa Prativadibhayankaram, Mahadev Prasad Panda, Thomas Richter, Heiko Sparenberg, Siegfried Fößel, André Kaup | http://arxiv.org/pdf/2401.17246v1 | null |
2024-01-30 | ReAlnet: Achieving More Human Brain-Like Vision via Human Neural Representational Alignment | ReAlnet:通过人类神经表征对齐实现更像人脑的视觉 | Zitong Lu, Yile Wang, Julie D. Golomb | http://arxiv.org/pdf/2401.17231v1 | null |
2024-01-30 | NormEnsembleXAI: Unveiling the Strengths and Weaknesses of XAI Ensemble Techniques | NormEnsembleXAI:揭示 XAI 集成技术的优点和缺点 | Weronika Hryniewska-Guzik, Bartosz Sawicki, Przemysław Biecek | http://arxiv.org/pdf/2401.17200v1 | null |
2024-01-30 | Evaluation in Neural Style Transfer: A Review | 神经风格迁移评估:回顾 | Eleftherios Ioannou, Steve Maddock | http://arxiv.org/pdf/2401.17109v1 | null |
2024-01-30 | H-SynEx: Using synthetic images and ultra-high resolution ex vivo MRI for hypothalamus subregion segmentation | H-SynEx:使用合成图像和超高分辨率离体 MRI 进行下丘脑分区分割 | Livia Rodrigues, Martina Bocchetta, Oula Puonti, Douglas Greve, Ana Carolina Londe, Marcondes França, Simone Appenzeller, Juan Eugenio Iglesias, Leticia Rittner | http://arxiv.org/pdf/2401.17104v1 | null |
2024-01-30 | CharNet: Generalized Approach for High-Complexity Character Classification | CharNet:高复杂性字符分类的通用方法 | Boris Kriuk | http://arxiv.org/pdf/2401.17098v1 | null |
2024-01-30 | Active Generation Network of Human Skeleton for Action Recognition | 用于动作识别的人体骨骼主动生成网络 | Long Liu, Xin Wang, Fangming Li, Jiayu Chen | http://arxiv.org/pdf/2401.17086v1 | null |
2024-01-30 | Efficient Gesture Recognition on Spiking Convolutional Networks Through Sensor Fusion of Event-Based and Depth Data | 通过基于事件和深度数据的传感器融合在脉冲卷积网络上进行高效手势识别 | Lea Steffen, Thomas Trapp, Arne Roennau, Rüdiger Dillmann | http://arxiv.org/pdf/2401.17064v1 | null |
2024-01-30 | Floor extraction and door detection for visually impaired guidance | 地面提取和门检测,为视障人士提供引导 | Bruno Berenguel-Baeta, Manuel Guerrero-Viu, Alejandro de Nova, Jesus Bermudez-Cameo, Alejandro Perez-Yus, Jose J. Guerrero | http://arxiv.org/pdf/2401.17056v1 | null |
2024-01-30 | Towards Assessing the Synthetic-to-Measured Adversarial Vulnerability of SAR ATR | 评估 SAR ATR 从合成数据到实测数据的对抗脆弱性 | Bowen Peng, Bo Peng, Jingyuan Xia, Tianpeng Liu, Yongxiang Liu, Li Liu | http://arxiv.org/pdf/2401.17038v1 | null |
2024-01-30 | Multilayer Graph Approach to Deep Subspace Clustering | 深层子空间聚类的多层图方法 | Lovro Sindičić, Ivica Kopriva | http://arxiv.org/pdf/2401.17033v1 | null |
2024-01-30 | Static and Dynamic Synthesis of Bengali and Devanagari Signatures | 孟加拉文和天城文签名的静态和动态合成 | Miguel A. Ferrer, Sukalpa Chanda, Moises Diaz, Chayan Kr. Banerjee, Anirban Majumdar, Cristina Carmona-Duarte, Parikshit Acharya, Umapada Pal | http://arxiv.org/pdf/2401.17026v1 | null |
2024-01-30 | MF-MOS: A Motion-Focused Model for Moving Object Segmentation | MF-MOS:用于运动物体分割的运动聚焦模型 | Jintao Cheng, Kang Zeng, Zhuoxu Huang, Xiaoyu Tang, Jin Wu, Chengxi Zhang, Xieyuanli Chen, Rui Fan | http://arxiv.org/pdf/2401.17023v1 | null |
2024-01-30 | Evaluation of Out-of-Distribution Detection Performance on Autonomous Driving Datasets | 自动驾驶数据集上的分布外检测性能评估 | Jens Henriksson, Christian Berger, Stig Ursing, Markus Borg | http://arxiv.org/pdf/2401.17013v1 | null |
2024-01-30 | Category-wise Fine-Tuning: Resisting Incorrect Pseudo-Labels in Multi-Label Image Classification with Partial Labels | 按类别微调:在具有部分标签的多标签图像分类中抵抗不正确的伪标签 | Chak Fong Chong, Xinyi Fang, Jielong Guo, Yapeng Wang, Wei Ke, Chan-Tong Lam, Sio-Kei Im | http://arxiv.org/pdf/2401.16991v1 | null |
2024-01-30 | Segmentation and Characterization of Macerated Fibers and Vessels Using Deep Learning | 使用深度学习对离析纤维和导管进行分割和表征 | Saqib Qamar, Abu Imran Baba, Stéphane Verger, Magnus Andersson | http://arxiv.org/pdf/2401.16937v1 | null |
2024-01-30 | Dynamic MRI reconstruction using low-rank plus sparse decomposition with smoothness regularization | 使用低秩加稀疏分解和平滑正则化进行动态 MRI 重建 | Chee-Ming Ting, Fuad Noman, Raphaël C.-W. Phan, Hernando Ombao | http://arxiv.org/pdf/2401.16928v1 | null |
2024-01-30 | A Tournament of Transformation Models: B-Spline-based vs. Mesh-based Multi-Objective Deformable Image Registration | 变换模型锦标赛:基于 B 样条与基于网格的多目标可变形图像配准 | Georgios Andreadis, Joas I. Mulder, Anton Bouter, Peter A. N. Bosman, Tanja Alderliesten | http://arxiv.org/pdf/2401.16867v1 | null |
2024-01-30 | MESA: Matching Everything by Segmenting Anything | MESA:通过分割任何内容来匹配所有内容 | Yesheng Zhang, Xu Zhao | http://arxiv.org/pdf/2401.16741v1 | null |
2024-01-30 | Optimal-Landmark-Guided Image Blending for Face Morphing Attacks | 用于人脸变形攻击的最优关键点引导图像融合 | Qiaoyun He, Zongyong Deng, Zuyuan He, Qijun Zhao | http://arxiv.org/pdf/2401.16722v1 | null |
2024-01-30 | LF Tracy: A Unified Single-Pipeline Approach for Salient Object Detection in Light Field Cameras | LF Tracy:用于光场相机中显著物体检测的统一单管道方法 | Fei Teng, Jiaming Zhang, Jiawei Liu, Kunyu Peng, Xina Cheng, Zhiyong Li, Kailun Yang | http://arxiv.org/pdf/2401.16712v1 | null |
2024-01-30 | EdgeOL: Efficient in-situ Online Learning on Edge Devices | EdgeOL:边缘设备上的高效原位在线学习 | Sheng Li, Geng Yuan, Yawen Wu, Yue Dai, Chao Wu, Alex K. Jones, Jingtong Hu, Yanzhi Wang, Xulong Tang | http://arxiv.org/pdf/2401.16694v1 | null |
2024-01-30 | Characterization of Magnetic Labyrinthine Structures through Junctions and Terminals Detection using Template Matching and CNN | 使用模板匹配和 CNN 通过连接和终端检测来表征磁性迷宫结构 | Vinícius Yu Okubo, Kotaro Shimizu, B. S. Shivaram, Hae Yong Kim | http://arxiv.org/pdf/2401.16688v1 | null |