2024-04-26.md

[UPDATED!] 2024-04-26 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-26 | MaPa: Text-driven Photorealistic Material Painting for 3D Shapes | MaPa:文本驱动的 3D 形状真实感材质绘画 | Shangzhan Zhang, Sida Peng, Tao Xu, Yuanbo Yang, Tianrun Chen, Nan Xue, Yujun Shen, Hujun Bao, Ruizhen Hu, Xiaowei Zhou | http://arxiv.org/pdf/2404.17569v1 | null |
| 2024-04-26 | Multi-view Image Prompted Multi-view Diffusion for Improved 3D Generation | 多视图图像提示多视图扩散以改进 3D 生成 | Seungwook Kim, Yichun Shi, Kejie Li, Minsu Cho, Peng Wang | http://arxiv.org/pdf/2404.17419v1 | null |
| 2024-04-26 | MV-VTON: Multi-View Virtual Try-On with Diffusion Models | MV-VTON:使用扩散模型的多视图虚拟试戴 | Haoyu Wang, Zhilu Zhang, Donglin Di, Shiliang Zhang, Wangmeng Zuo | http://arxiv.org/pdf/2404.17364v1 | null |
| 2024-04-26 | Simultaneous Tri-Modal Medical Image Fusion and Super-Resolution using Conditional Diffusion Model | 使用条件扩散模型的同时三模态医学图像融合和超分辨率 | Yushen Xu, Xiaosong Li, Yuchan Jie, Haishu Tan | http://arxiv.org/pdf/2404.17357v1 | null |
| 2024-04-26 | On the Road to Clarity: Exploring Explainable AI for World Models in a Driver Assistance System | 走向清晰之路:在驾驶员辅助系统中探索可解释的人工智能世界模型 | Mohamed Roshdi, Julian Petzold, Mostafa Wahby, Hussein Ebrahim, Mladen Berekovic, Heiko Hamann | http://arxiv.org/pdf/2404.17350v1 | null |
| 2024-04-26 | Trinity Detector:text-assisted and attention mechanisms based spectral fusion for diffusion generation image detection | Trinity Detector:用于扩散生成图像检测的基于文本辅助和注意机制的光谱融合 | Jiawei Song, Dengpan Ye, Yunming Zhang | http://arxiv.org/pdf/2404.17254v1 | null |
| 2024-04-26 | Few-shot Calligraphy Style Learning | 少笔书法风格学习 | Fangda Chen, Jiacheng Nie, Lichuan Jiang, Zhuoer Zeng | http://arxiv.org/pdf/2404.17199v1 | link |
| 2024-04-26 | Synthesizing Iris Images using Generative Adversarial Networks: Survey and Comparative Analysis | 使用生成对抗网络合成虹膜图像:调查与比较分析 | Shivangi Yadav, Arun Ross | http://arxiv.org/pdf/2404.17105v1 | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-26 | Exploring the Distinctiveness and Fidelity of the Descriptions Generated by Large Vision-Language Models | 探索大型视觉语言模型生成的描述的独特性和保真度 | Yuhang Huang, Zihan Wu, Chongyang Gao, Jiawei Peng, Xu Yang | http://arxiv.org/pdf/2404.17534v1 | null |
| 2024-04-26 | UniRGB-IR: A Unified Framework for Visible-Infrared Downstream Tasks via Adapter Tuning | UniRGB-IR:通过适配器调整实现可见红外下游任务的统一框架 | Maoxun Yuan, Bo Cui, Tianyi Zhao, Xingxing Wei | http://arxiv.org/pdf/2404.17360v1 | null |
| 2024-04-26 | Dense Road Surface Grip Map Prediction from Multimodal Image Data | 根据多模态图像数据预测密集路面抓地力地图 | Jyri Maanpää, Julius Pesonen, Heikki Hyyti, Iaroslav Melekhov, Juho Kannala, Petri Manninen, Antero Kukko, Juha Hyyppä | http://arxiv.org/pdf/2404.17324v1 | null |
| 2024-04-26 | MovieChat+: Question-aware Sparse Memory for Long Video Question Answering | MovieChat+:用于长视频问答的问题感知稀疏内存 | Enxin Song, Wenhao Chai, Tian Ye, Jenq-Neng Hwang, Xi Li, Gaoang Wang | http://arxiv.org/pdf/2404.17176v1 | null |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-26 | Geometry-aware Reconstruction and Fusion-refined Rendering for Generalizable Neural Radiance Fields | 可泛化神经辐射场的几何感知重建和融合细化渲染 | Tianqi Liu, Xinyi Ye, Min Shi, Zihao Huang, Zhiyu Pan, Zhan Peng, Zhiguo Cao | http://arxiv.org/pdf/2404.17528v1 | null |

3DGS

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-26 | SLAM for Indoor Mapping of Wide Area Construction Environments | 用于广域施工环境室内测绘的 SLAM | Vincent Ress, Wei Zhang, David Skuddis, Norbert Haala, Uwe Soergel | http://arxiv.org/pdf/2404.17215v1 | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-26 | A Novel Spike Transformer Network for Depth Estimation from Event Cameras via Cross-modality Knowledge Distillation | 一种通过跨模态知识提炼从事件相机进行深度估计的新型尖峰变换器网络 | Xin Zhang, Liangxiu Han, Tam Sobeih, Lianghao Han, Darren Dancey | http://arxiv.org/pdf/2404.17335v1 | null |
| 2024-04-26 | CSCO: Connectivity Search of Convolutional Operators | CSCO:卷积算子的连通性搜索 | Tunhou Zhang, Shiyu Li, Hsin-Pai Cheng, Feng Yan, Hai Li, Yiran Chen | http://arxiv.org/pdf/2404.17152v1 | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-26 | ChangeBind: A Hybrid Change Encoder for Remote Sensing Change Detection | ChangeBind:用于遥感变化检测的混合变化编码器 | Mubashir Noman, Mustansar Fiaz, Hisham Cholakkal | http://arxiv.org/pdf/2404.17565v1 | null |
| 2024-04-26 | Inhomogeneous illuminated image enhancement under extremely low visibility condition | 极低能见度条件下的不均匀照明图像增强 | Libang Chen, Yikun Liu, Jianying Zhou | http://arxiv.org/pdf/2404.17503v1 | null |
| 2024-04-26 | Low Cost Machine Vision for Insect Classification | 用于昆虫分类的低成本机器视觉 | Danja Brandt, Martin Tschaikner, Teodor Chiaburu, Henning Schmidt, Ilona Schrimpf, Alexandra Stadel, Ingeborg E. Beckers, Frank Haußer | http://arxiv.org/pdf/2404.17488v1 | null |
| 2024-04-26 | TextGaze: Gaze-Controllable Face Generation with Natural Language | TextGaze:使用自然语言进行注视控制的面部生成 | Hengfei Wang, Zhongqun Zhang, Yihua Cheng, Hyung Jin Chang | http://arxiv.org/pdf/2404.17486v1 | null |
| 2024-04-26 | Cost-Sensitive Uncertainty-Based Failure Recognition for Object Detection | 用于目标检测的成本敏感的基于不确定性的故障识别 | Moussa Kassem Sbeyti, Michelle Karg, Christian Wirth, Nadja Klein, Sahin Albayrak | http://arxiv.org/pdf/2404.17427v1 | null |
| 2024-04-26 | Frequency-Guided Multi-Level Human Action Anomaly Detection with Normalizing Flows | 具有标准化流程的频率引导多级人体行为异常检测 | Shun Maeda, Chunzhi Gu, Jun Yu, Shogo Tokai, Shangce Gao, Chao Zhang | http://arxiv.org/pdf/2404.17381v1 | null |
| 2024-04-26 | Masked Two-channel Decoupling Framework for Incomplete Multi-view Weak Multi-label Learning | 用于不完全多视图弱多标签学习的屏蔽两通道解耦框架 | Chengliang Liu, Jie Wen, Yabo Liu, Chao Huang, Zhihao Wu, Xiaoling Luo, Yong Xu | http://arxiv.org/pdf/2404.17340v1 | null |
| 2024-04-26 | Image Copy-Move Forgery Detection via Deep PatchMatch and Pairwise Ranking Learning | 通过深度补丁匹配和成对排名学习进行图像复制移动伪造检测 | Yuanman Li, Yingjie He, Changsheng Chen, Li Dong, Bin Li, Jiantao Zhou, Xia Li | http://arxiv.org/pdf/2404.17310v1 | null |
| 2024-04-26 | Part-Guided 3D RL for Sim2Real Articulated Object Manipulation | 用于 Sim2Real 铰接式物体操作的部分引导 3D RL | Pengwei Xie, Rui Chen, Siang Chen, Yuzhe Qin, Fanbo Xiang, Tianyu Sun, Jing Xu, Guijin Wang, Hao Su | http://arxiv.org/pdf/2404.17302v1 | null |
| 2024-04-26 | Adversarial Reweighting with $α$-Power Maximization for Domain Adaptation | 通过 $α$ 功率最大化进行对抗性重新加权以实现域适应 | Xiang Gu, Xi Yu, Yan Yang, Jian Sun, Zongben Xu | http://arxiv.org/pdf/2404.17275v1 | null |
| 2024-04-26 | 3SHNet: Boosting Image-Sentence Retrieval via Visual Semantic-Spatial Self-Highlighting | 3SHNet:通过视觉语义空间自我突出增强图像句子检索 | Xuri Ge, Songpei Xu, Fuhai Chen, Jie Wang, Guoxin Wang, Shan An, Joemon M. Jose | http://arxiv.org/pdf/2404.17273v1 | null |
| 2024-04-26 | SDFD: Building a Versatile Synthetic Face Image Dataset with Diverse Attributes | SDFD:构建具有多种属性的多功能合成人脸图像数据集 | Georgia Baltsou, Ioannis Sarridis, Christos Koutlis, Symeon Papadopoulos | http://arxiv.org/pdf/2404.17255v1 | null |
| 2024-04-26 | Comparison of self-supervised in-domain and supervised out-domain transfer learning for bird species recognition | 用于鸟类物种识别的自监督域内和监督域外迁移学习的比较 | Houtan Ghaffari, Paul Devos | http://arxiv.org/pdf/2404.17252v1 | null |
| 2024-04-26 | Optimizing Universal Lesion Segmentation: State Space Model-Guided Hierarchical Networks with Feature Importance Adjustment | 优化通用病变分割:状态空间模型引导的具有特征重要性调整的分层网络 | Kazi Shahriar Sanjid, Md. Tanzim Hossain, Md. Shakib Shahariar Junayed, M. Monir Uddin | http://arxiv.org/pdf/2404.17235v1 | null |
| 2024-04-26 | SAGHOG: Self-Supervised Autoencoder for Generating HOG Features for Writer Retrieval | SAGHOG:用于生成用于作家检索的 HOG 特征的自监督自动编码器 | Marco Peer, Florian Kleber, Robert Sablatnig | http://arxiv.org/pdf/2404.17221v1 | null |
| 2024-04-26 | Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer | 二合一:具有解耦主题上下文转换器的单级情感识别 | Xinpeng Li, Teng Wang, Jian Zhao, Shuyi Mao, Jinbao Wang, Feng Zheng, Xiaojiang Peng, Xuelong Li | http://arxiv.org/pdf/2404.17205v1 | null |
| 2024-04-26 | MCSDNet: Mesoscale Convective System Detection Network via Multi-scale Spatiotemporal Information | MCSDNet:基于多尺度时空信息的中尺度对流系统检测网络 | Jiajun Liang, Baoquan Zhang, Yunming Ye, Xutao Li, Chuyao Luo, Xukai Fu | http://arxiv.org/pdf/2404.17186v1 | null |
| 2024-04-26 | Exploring Beyond Logits: Hierarchical Dynamic Labeling Based on Embeddings for Semi-Supervised Classification | 探索 Logits 之外:基于半监督分类嵌入的分层动态标记 | Yanbiao Ma, Licheng Jiao, Fang Liu, Lingling Li, Shuyuan Yang, Xu Liu | http://arxiv.org/pdf/2404.17173v1 | null |
| 2024-04-26 | MorphText: Deep Morphology Regularized Arbitrary-shape Scene Text Detection | MorphText:深度形态正则化任意形状场景文本检测 | Chengpei Xu, Wenjing Jia, Ruomei Wang, Xiaonan Luo, Xiangjian He | http://arxiv.org/pdf/2404.17151v1 | null |
| 2024-04-26 | Pose-Specific 3D Fingerprint Unfolding | 特定姿势的 3D 指纹展开 | Xiongjun Guan, Jianjiang Feng, Jie Zhou | http://arxiv.org/pdf/2404.17149v1 | null |
| 2024-04-26 | Direct Regression of Distortion Field from a Single Fingerprint Image | 单指纹图像畸变场的直接回归 | Xiongjun Guan, Yongjie Duan, Jianjiang Feng, Jie Zhou | http://arxiv.org/pdf/2404.17148v1 | null |
| 2024-04-26 | Localization of Pallets on Shelves Using Horizontal Plane Projection of a 360-degree Image | 使用 360 度图像的水平面投影定位货架上的托盘 | Yasuyo Kita, Yudai Fujieda, Ichiro Matsuda, Nobuyuki Kita | http://arxiv.org/pdf/2404.17118v1 | null |
| 2024-04-26 | Open-Set Video-based Facial Expression Recognition with Human Expression-sensitive Prompting | 具有人类表情敏感提示的基于开放集视频的面部表情识别 | Yuanyuan Liu, Yuxuan Huang, Shuyang Liu, Yibing Zhan, Zijing Chen, Zhe Chen | http://arxiv.org/pdf/2404.17100v1 | null |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-26 | Weakly Supervised Training for Hologram Verification in Identity Documents | 身份证件全息图验证的弱监督训练 | Glen Pouliquen, Guillaume Chiron, Joseph Chazalon, Thierry Géraud, Ahmad Montaser Awal | http://arxiv.org/pdf/2404.17253v1 | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-26 | Tunnel Try-on: Excavating Spatial-temporal Tunnels for High-quality Virtual Try-on in Videos | 隧道试穿:挖掘时空隧道,实现高质量视频虚拟试穿 | Zhengze Xu, Mengting Chen, Zhao Wang, Linyu Xing, Zhonghua Zhai, Nong Sang, Jinsong Lan, Shuai Xiao, Changxin Gao | http://arxiv.org/pdf/2404.17571v1 | null |
| 2024-04-26 | PromptCIR: Blind Compressed Image Restoration with Prompt Learning | PromptCIR:快速学习的盲压缩图像恢复 | Bingchen Li, Xin Li, Yiting Lu, Ruoyu Feng, Mengxi Guo, Shijie Zhao, Li Zhang, Zhibo Chen | http://arxiv.org/pdf/2404.17433v1 | null |
| 2024-04-26 | Spatial-frequency Dual-Domain Feature Fusion Network for Low-Light Remote Sensing Image Enhancement | 用于弱光遥感图像增强的空频双域特征融合网络 | Zishu Yao, Guodong Fan, Jinfu Fan, Min Gan, C. L. Philip Chen | http://arxiv.org/pdf/2404.17400v1 | null |
| 2024-04-26 | Parameter Efficient Fine-tuning of Self-supervised ViTs without Catastrophic Forgetting | 自监督 ViT 的参数高效微调,不会出现灾难性遗忘 | Reza Akbarian Bafghi, Nidhin Harilal, Claire Monteleoni, Maziar Raissi | http://arxiv.org/pdf/2404.17245v1 | null |
| 2024-04-26 | Binarizing Documents by Leveraging both Space and Frequency | 利用空间和频率对文档进行二值化 | Fabio Quattrini, Vittorio Pippi, Silvia Cascianelli, Rita Cucchiara | http://arxiv.org/pdf/2404.17243v1 | null |
| 2024-04-26 | ObjectAdd: Adding Objects into Image via a Training-Free Diffusion Modification Fashion | ObjectAdd:通过免训练的扩散修改方式将对象添加到图像中 | Ziyue Zhang, Mingbao Lin, Rongrong Ji | http://arxiv.org/pdf/2404.17230v1 | null |
| 2024-04-26 | S-IQA Image Quality Assessment With Compressive Sampling | 通过压缩采样进行 S-IQA 图像质量评估 | Ronghua Liao, Chen Hui, Lang Yuan, Feng Jiang | http://arxiv.org/pdf/2404.17170v1 | null |
| 2024-04-26 | On the Federated Learning Framework for Cooperative Perception | 合作感知的联邦学习框架 | Zhenrong Zhang, Jianan Liu, Xi Zhou, Tao Huang, Qing-Long Han, Jingxin Liu, Hongbin Liu | http://arxiv.org/pdf/2404.17147v1 | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-26 | Camera Motion Estimation from RGB-D-Inertial Scene Flow | 根据 RGB-D-惯性场景流进行相机运动估计 | Samuel Cerezo, Javier Civera | http://arxiv.org/pdf/2404.17251v1 | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-26 | Learning text-to-video retrieval from image captioning | 从图像字幕中学习文本到视频的检索 | Lucas Ventura, Cordelia Schmid, Gül Varol | http://arxiv.org/pdf/2404.17498v1 | null |
| 2024-04-26 | Self-supervised visual learning in the low-data regime: a comparative evaluation | 低数据条件下的自监督视觉学习:比较评估 | Sotirios Konstantakos, Despina Ioanna Chalkiadaki, Ioannis Mademlis, Yuki M. Asano, Efstratios Gavves, Georgios Th. Papadopoulos | http://arxiv.org/pdf/2404.17202v1 | null |

Others

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-26 | Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations | Ag2Manip:通过与代理无关的视觉和动作表示来学习新颖的操作技能 | Puhao Li, Tengyu Liu, Yuyang Li, Muzhi Han, Haoran Geng, Shu Wang, Yixin Zhu, Song-Chun Zhu, Siyuan Huang | http://arxiv.org/pdf/2404.17521v1 | null |
| 2024-04-26 | HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts | HYPE:针对未指定图像和文本的双曲蕴涵过滤 | Wonjae Kim, Sanghyuk Chun, Taekyung Kim, Dongyoon Han, Sangdoo Yun | http://arxiv.org/pdf/2404.17507v1 | null |
| 2024-04-26 | Sparse Reconstruction of Optical Doppler Tomography Based on State Space Model | 基于状态空间模型的光学多普勒层析成像稀疏重建 | Zhenghong Li, Jiaxiang Ren, Wensheng Cheng, Congwu Du, Yingtian Pan, Haibin Ling | http://arxiv.org/pdf/2404.17484v1 | null |
| 2024-04-26 | One-Shot Image Restoration | 一次图像恢复 | Deborah Pereg | http://arxiv.org/pdf/2404.17426v1 | null |
| 2024-04-26 | Estimating the Robustness Radius for Randomized Smoothing with 100$\times$ Sample Efficiency | 估计 100$\times$ 样本效率的随机平滑的鲁棒半径 | Emmanouil Seferis, Stefanos Kollias, Chih-Hong Cheng | http://arxiv.org/pdf/2404.17371v1 | null |
| 2024-04-26 | Scrutinizing Data from Sky: An Examination of Its Veracity in Area Based Traffic Contexts | 仔细检查来自天空的数据:检查其在基于区域的交通环境中的准确性 | Yawar Ali, Krishnan K N, Debashis Ray Sarkar, K. Ramachandra Rao, Niladri Chatterjee, Ashish Bhaskar | http://arxiv.org/pdf/2404.17212v1 | null |
| 2024-04-26 | Low-Rank Knowledge Decomposition for Medical Foundation Models | 医学基础模型的低阶知识分解 | Yuhang Zhou, Haolin Li, Siyuan Du, Jiangchao Yao, Ya Zhang, Yanfeng Wang | http://arxiv.org/pdf/2404.17184v1 | null |
| 2024-04-26 | Phase-aggregated Dual-branch Network for Efficient Fingerprint Dense Registration | 用于高效指纹密集注册的相位聚合双分支网络 | Xiongjun Guan, Jianjiang Feng, Jie Zhou | http://arxiv.org/pdf/2404.17159v1 | null |
| 2024-04-26 | Don't Look at the Camera: Achieving Perceived Eye Contact | 不要看镜头:实现感知的眼神接触 | Alice Gao, Samyukta Jayakumar, Marcello Maniglia, Brian Curless, Ira Kemelmacher-Shlizerman, Aaron R. Seitz, Steven M. Seitz | http://arxiv.org/pdf/2404.17104v1 | null |
| 2024-04-26 | Defending Spiking Neural Networks against Adversarial Attacks through Image Purification | 通过图像净化保护尖峰神经网络免受对抗性攻击 | Weiran Chen, Qi Sun, Qi Xu | http://arxiv.org/pdf/2404.17092v1 | null |