Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-04-26 | MaPa: Text-driven Photorealistic Material Painting for 3D Shapes | MaPa:文本驱动的 3D 形状真实感材质绘画 | Shangzhan Zhang, Sida Peng, Tao Xu, Yuanbo Yang, Tianrun Chen, Nan Xue, Yujun Shen, Hujun Bao, Ruizhen Hu, Xiaowei Zhou | http://arxiv.org/pdf/2404.17569v1 | null |
2024-04-26 | Multi-view Image Prompted Multi-view Diffusion for Improved 3D Generation | 多视图图像提示多视图扩散以改进 3D 生成 | Seungwook Kim, Yichun Shi, Kejie Li, Minsu Cho, Peng Wang | http://arxiv.org/pdf/2404.17419v1 | null |
2024-04-26 | MV-VTON: Multi-View Virtual Try-On with Diffusion Models | MV-VTON:使用扩散模型的多视图虚拟试戴 | Haoyu Wang, Zhilu Zhang, Donglin Di, Shiliang Zhang, Wangmeng Zuo | http://arxiv.org/pdf/2404.17364v1 | null |
2024-04-26 | Simultaneous Tri-Modal Medical Image Fusion and Super-Resolution using Conditional Diffusion Model | 使用条件扩散模型的同时三模态医学图像融合和超分辨率 | Yushen Xu, Xiaosong Li, Yuchan Jie, Haishu Tan | http://arxiv.org/pdf/2404.17357v1 | null |
2024-04-26 | On the Road to Clarity: Exploring Explainable AI for World Models in a Driver Assistance System | 走向清晰之路:在驾驶员辅助系统中探索可解释的人工智能世界模型 | Mohamed Roshdi, Julian Petzold, Mostafa Wahby, Hussein Ebrahim, Mladen Berekovic, Heiko Hamann | http://arxiv.org/pdf/2404.17350v1 | null |
2024-04-26 | Trinity Detector: text-assisted and attention mechanisms based spectral fusion for diffusion generation image detection | Trinity Detector:用于扩散生成图像检测的基于文本辅助和注意机制的光谱融合 | Jiawei Song, Dengpan Ye, Yunming Zhang | http://arxiv.org/pdf/2404.17254v1 | null |
2024-04-26 | Few-shot Calligraphy Style Learning | 少笔书法风格学习 | Fangda Chen, Jiacheng Nie, Lichuan Jiang, Zhuoer Zeng | http://arxiv.org/pdf/2404.17199v1 | link |
2024-04-26 | Synthesizing Iris Images using Generative Adversarial Networks: Survey and Comparative Analysis | 使用生成对抗网络合成虹膜图像:调查与比较分析 | Shivangi Yadav, Arun Ross | http://arxiv.org/pdf/2404.17105v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-04-26 | Exploring the Distinctiveness and Fidelity of the Descriptions Generated by Large Vision-Language Models | 探索大型视觉语言模型生成的描述的独特性和保真度 | Yuhang Huang, Zihan Wu, Chongyang Gao, Jiawei Peng, Xu Yang | http://arxiv.org/pdf/2404.17534v1 | null |
2024-04-26 | UniRGB-IR: A Unified Framework for Visible-Infrared Downstream Tasks via Adapter Tuning | UniRGB-IR:通过适配器调整实现可见红外下游任务的统一框架 | Maoxun Yuan, Bo Cui, Tianyi Zhao, Xingxing Wei | http://arxiv.org/pdf/2404.17360v1 | null |
2024-04-26 | Dense Road Surface Grip Map Prediction from Multimodal Image Data | 根据多模态图像数据预测密集路面抓地力地图 | Jyri Maanpää, Julius Pesonen, Heikki Hyyti, Iaroslav Melekhov, Juho Kannala, Petri Manninen, Antero Kukko, Juha Hyyppä | http://arxiv.org/pdf/2404.17324v1 | null |
2024-04-26 | MovieChat+: Question-aware Sparse Memory for Long Video Question Answering | MovieChat+:用于长视频问答的问题感知稀疏内存 | Enxin Song, Wenhao Chai, Tian Ye, Jenq-Neng Hwang, Xi Li, Gaoang Wang | http://arxiv.org/pdf/2404.17176v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-04-26 | Geometry-aware Reconstruction and Fusion-refined Rendering for Generalizable Neural Radiance Fields | 可泛化神经辐射场的几何感知重建和融合细化渲染 | Tianqi Liu, Xinyi Ye, Min Shi, Zihao Huang, Zhiyu Pan, Zhan Peng, Zhiguo Cao | http://arxiv.org/pdf/2404.17528v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-04-26 | SLAM for Indoor Mapping of Wide Area Construction Environments | 用于广域施工环境室内测绘的 SLAM | Vincent Ress, Wei Zhang, David Skuddis, Norbert Haala, Uwe Soergel | http://arxiv.org/pdf/2404.17215v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-04-26 | A Novel Spike Transformer Network for Depth Estimation from Event Cameras via Cross-modality Knowledge Distillation | 一种通过跨模态知识提炼从事件相机进行深度估计的新型尖峰变换器网络 | Xin Zhang, Liangxiu Han, Tam Sobeih, Lianghao Han, Darren Dancey | http://arxiv.org/pdf/2404.17335v1 | null |
2024-04-26 | CSCO: Connectivity Search of Convolutional Operators | CSCO:卷积算子的连通性搜索 | Tunhou Zhang, Shiyu Li, Hsin-Pai Cheng, Feng Yan, Hai Li, Yiran Chen | http://arxiv.org/pdf/2404.17152v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-04-26 | ChangeBind: A Hybrid Change Encoder for Remote Sensing Change Detection | ChangeBind:用于遥感变化检测的混合变化编码器 | Mubashir Noman, Mustansar Fiaz, Hisham Cholakkal | http://arxiv.org/pdf/2404.17565v1 | null |
2024-04-26 | Inhomogeneous illuminated image enhancement under extremely low visibility condition | 极低能见度条件下的不均匀照明图像增强 | Libang Chen, Yikun Liu, Jianying Zhou | http://arxiv.org/pdf/2404.17503v1 | null |
2024-04-26 | Low Cost Machine Vision for Insect Classification | 用于昆虫分类的低成本机器视觉 | Danja Brandt, Martin Tschaikner, Teodor Chiaburu, Henning Schmidt, Ilona Schrimpf, Alexandra Stadel, Ingeborg E. Beckers, Frank Haußer | http://arxiv.org/pdf/2404.17488v1 | null |
2024-04-26 | TextGaze: Gaze-Controllable Face Generation with Natural Language | TextGaze:使用自然语言进行注视控制的面部生成 | Hengfei Wang, Zhongqun Zhang, Yihua Cheng, Hyung Jin Chang | http://arxiv.org/pdf/2404.17486v1 | null |
2024-04-26 | Cost-Sensitive Uncertainty-Based Failure Recognition for Object Detection | 用于目标检测的成本敏感的基于不确定性的故障识别 | Moussa Kassem Sbeyti, Michelle Karg, Christian Wirth, Nadja Klein, Sahin Albayrak | http://arxiv.org/pdf/2404.17427v1 | null |
2024-04-26 | Frequency-Guided Multi-Level Human Action Anomaly Detection with Normalizing Flows | 具有标准化流程的频率引导多级人体行为异常检测 | Shun Maeda, Chunzhi Gu, Jun Yu, Shogo Tokai, Shangce Gao, Chao Zhang | http://arxiv.org/pdf/2404.17381v1 | null |
2024-04-26 | Masked Two-channel Decoupling Framework for Incomplete Multi-view Weak Multi-label Learning | 用于不完全多视图弱多标签学习的屏蔽两通道解耦框架 | Chengliang Liu, Jie Wen, Yabo Liu, Chao Huang, Zhihao Wu, Xiaoling Luo, Yong Xu | http://arxiv.org/pdf/2404.17340v1 | null |
2024-04-26 | Image Copy-Move Forgery Detection via Deep PatchMatch and Pairwise Ranking Learning | 通过深度补丁匹配和成对排名学习进行图像复制移动伪造检测 | Yuanman Li, Yingjie He, Changsheng Chen, Li Dong, Bin Li, Jiantao Zhou, Xia Li | http://arxiv.org/pdf/2404.17310v1 | null |
2024-04-26 | Part-Guided 3D RL for Sim2Real Articulated Object Manipulation | 用于 Sim2Real 铰接式物体操作的部分引导 3D RL | Pengwei Xie, Rui Chen, Siang Chen, Yuzhe Qin, Fanbo Xiang, Tianyu Sun, Jing Xu, Guijin Wang, Hao Su | http://arxiv.org/pdf/2404.17302v1 | null |
2024-04-26 | Adversarial Reweighting with | 通过 | Xiang Gu, Xi Yu, Yan Yang, Jian Sun, Zongben Xu | http://arxiv.org/pdf/2404.17275v1 | null |
2024-04-26 | 3SHNet: Boosting Image-Sentence Retrieval via Visual Semantic-Spatial Self-Highlighting | 3SHNet:通过视觉语义空间自我突出增强图像句子检索 | Xuri Ge, Songpei Xu, Fuhai Chen, Jie Wang, Guoxin Wang, Shan An, Joemon M. Jose | http://arxiv.org/pdf/2404.17273v1 | null |
2024-04-26 | SDFD: Building a Versatile Synthetic Face Image Dataset with Diverse Attributes | SDFD:构建具有多种属性的多功能合成人脸图像数据集 | Georgia Baltsou, Ioannis Sarridis, Christos Koutlis, Symeon Papadopoulos | http://arxiv.org/pdf/2404.17255v1 | null |
2024-04-26 | Comparison of self-supervised in-domain and supervised out-domain transfer learning for bird species recognition | 用于鸟类物种识别的自监督域内和监督域外迁移学习的比较 | Houtan Ghaffari, Paul Devos | http://arxiv.org/pdf/2404.17252v1 | null |
2024-04-26 | Optimizing Universal Lesion Segmentation: State Space Model-Guided Hierarchical Networks with Feature Importance Adjustment | 优化通用病变分割:状态空间模型引导的具有特征重要性调整的分层网络 | Kazi Shahriar Sanjid, Md. Tanzim Hossain, Md. Shakib Shahariar Junayed, M. Monir Uddin | http://arxiv.org/pdf/2404.17235v1 | null |
2024-04-26 | SAGHOG: Self-Supervised Autoencoder for Generating HOG Features for Writer Retrieval | SAGHOG:用于生成用于作家检索的 HOG 特征的自监督自动编码器 | Marco Peer, Florian Kleber, Robert Sablatnig | http://arxiv.org/pdf/2404.17221v1 | null |
2024-04-26 | Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer | 二合一:具有解耦主题上下文转换器的单级情感识别 | Xinpeng Li, Teng Wang, Jian Zhao, Shuyi Mao, Jinbao Wang, Feng Zheng, Xiaojiang Peng, Xuelong Li | http://arxiv.org/pdf/2404.17205v1 | null |
2024-04-26 | MCSDNet: Mesoscale Convective System Detection Network via Multi-scale Spatiotemporal Information | MCSDNet:基于多尺度时空信息的中尺度对流系统检测网络 | Jiajun Liang, Baoquan Zhang, Yunming Ye, Xutao Li, Chuyao Luo, Xukai Fu | http://arxiv.org/pdf/2404.17186v1 | null |
2024-04-26 | Exploring Beyond Logits: Hierarchical Dynamic Labeling Based on Embeddings for Semi-Supervised Classification | 探索 Logits 之外:基于半监督分类嵌入的分层动态标记 | Yanbiao Ma, Licheng Jiao, Fang Liu, Lingling Li, Shuyuan Yang, Xu Liu | http://arxiv.org/pdf/2404.17173v1 | null |
2024-04-26 | MorphText: Deep Morphology Regularized Arbitrary-shape Scene Text Detection | MorphText:深度形态正则化任意形状场景文本检测 | Chengpei Xu, Wenjing Jia, Ruomei Wang, Xiaonan Luo, Xiangjian He | http://arxiv.org/pdf/2404.17151v1 | null |
2024-04-26 | Pose-Specific 3D Fingerprint Unfolding | 特定姿势的 3D 指纹展开 | Xiongjun Guan, Jianjiang Feng, Jie Zhou | http://arxiv.org/pdf/2404.17149v1 | null |
2024-04-26 | Direct Regression of Distortion Field from a Single Fingerprint Image | 单指纹图像畸变场的直接回归 | Xiongjun Guan, Yongjie Duan, Jianjiang Feng, Jie Zhou | http://arxiv.org/pdf/2404.17148v1 | null |
2024-04-26 | Localization of Pallets on Shelves Using Horizontal Plane Projection of a 360-degree Image | 使用 360 度图像的水平面投影定位货架上的托盘 | Yasuyo Kita, Yudai Fujieda, Ichiro Matsuda, Nobuyuki Kita | http://arxiv.org/pdf/2404.17118v1 | null |
2024-04-26 | Open-Set Video-based Facial Expression Recognition with Human Expression-sensitive Prompting | 具有人类表情敏感提示的基于开放集视频的面部表情识别 | Yuanyuan Liu, Yuxuan Huang, Shuyang Liu, Yibing Zhan, Zijing Chen, Zhe Chen | http://arxiv.org/pdf/2404.17100v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-04-26 | Weakly Supervised Training for Hologram Verification in Identity Documents | 身份证件全息图验证的弱监督训练 | Glen Pouliquen, Guillaume Chiron, Joseph Chazalon, Thierry Géraud, Ahmad Montaser Awal | http://arxiv.org/pdf/2404.17253v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-04-26 | Tunnel Try-on: Excavating Spatial-temporal Tunnels for High-quality Virtual Try-on in Videos | 隧道试穿:挖掘时空隧道,实现高质量视频虚拟试穿 | Zhengze Xu, Mengting Chen, Zhao Wang, Linyu Xing, Zhonghua Zhai, Nong Sang, Jinsong Lan, Shuai Xiao, Changxin Gao | http://arxiv.org/pdf/2404.17571v1 | null |
2024-04-26 | PromptCIR: Blind Compressed Image Restoration with Prompt Learning | PromptCIR:快速学习的盲压缩图像恢复 | Bingchen Li, Xin Li, Yiting Lu, Ruoyu Feng, Mengxi Guo, Shijie Zhao, Li Zhang, Zhibo Chen | http://arxiv.org/pdf/2404.17433v1 | null |
2024-04-26 | Spatial-frequency Dual-Domain Feature Fusion Network for Low-Light Remote Sensing Image Enhancement | 用于弱光遥感图像增强的空频双域特征融合网络 | Zishu Yao, Guodong Fan, Jinfu Fan, Min Gan, C. L. Philip Chen | http://arxiv.org/pdf/2404.17400v1 | null |
2024-04-26 | Parameter Efficient Fine-tuning of Self-supervised ViTs without Catastrophic Forgetting | 自监督 ViT 的参数高效微调,不会出现灾难性遗忘 | Reza Akbarian Bafghi, Nidhin Harilal, Claire Monteleoni, Maziar Raissi | http://arxiv.org/pdf/2404.17245v1 | null |
2024-04-26 | Binarizing Documents by Leveraging both Space and Frequency | 利用空间和频率对文档进行二值化 | Fabio Quattrini, Vittorio Pippi, Silvia Cascianelli, Rita Cucchiara | http://arxiv.org/pdf/2404.17243v1 | null |
2024-04-26 | ObjectAdd: Adding Objects into Image via a Training-Free Diffusion Modification Fashion | ObjectAdd:通过免训练的扩散修改方式将对象添加到图像中 | Ziyue Zhang, Mingbao Lin, Rongrong Ji | http://arxiv.org/pdf/2404.17230v1 | null |
2024-04-26 | S-IQA Image Quality Assessment With Compressive Sampling | 通过压缩采样进行 S-IQA 图像质量评估 | Ronghua Liao, Chen Hui, Lang Yuan, Feng Jiang | http://arxiv.org/pdf/2404.17170v1 | null |
2024-04-26 | On the Federated Learning Framework for Cooperative Perception | 合作感知的联邦学习框架 | Zhenrong Zhang, Jianan Liu, Xi Zhou, Tao Huang, Qing-Long Han, Jingxin Liu, Hongbin Liu | http://arxiv.org/pdf/2404.17147v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-04-26 | Camera Motion Estimation from RGB-D-Inertial Scene Flow | 根据 RGB-D-惯性场景流进行相机运动估计 | Samuel Cerezo, Javier Civera | http://arxiv.org/pdf/2404.17251v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-04-26 | Learning text-to-video retrieval from image captioning | 从图像字幕中学习文本到视频的检索 | Lucas Ventura, Cordelia Schmid, Gül Varol | http://arxiv.org/pdf/2404.17498v1 | null |
2024-04-26 | Self-supervised visual learning in the low-data regime: a comparative evaluation | 低数据条件下的自监督视觉学习:比较评估 | Sotirios Konstantakos, Despina Ioanna Chalkiadaki, Ioannis Mademlis, Yuki M. Asano, Efstratios Gavves, Georgios Th. Papadopoulos | http://arxiv.org/pdf/2404.17202v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-04-26 | Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations | Ag2Manip:通过与代理无关的视觉和动作表示来学习新颖的操作技能 | Puhao Li, Tengyu Liu, Yuyang Li, Muzhi Han, Haoran Geng, Shu Wang, Yixin Zhu, Song-Chun Zhu, Siyuan Huang | http://arxiv.org/pdf/2404.17521v1 | null |
2024-04-26 | HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts | HYPE:针对未指定图像和文本的双曲蕴涵过滤 | Wonjae Kim, Sanghyuk Chun, Taekyung Kim, Dongyoon Han, Sangdoo Yun | http://arxiv.org/pdf/2404.17507v1 | null |
2024-04-26 | Sparse Reconstruction of Optical Doppler Tomography Based on State Space Model | 基于状态空间模型的光学多普勒层析成像稀疏重建 | Zhenghong Li, Jiaxiang Ren, Wensheng Cheng, Congwu Du, Yingtian Pan, Haibin Ling | http://arxiv.org/pdf/2404.17484v1 | null |
2024-04-26 | One-Shot Image Restoration | 一次图像恢复 | Deborah Pereg | http://arxiv.org/pdf/2404.17426v1 | null |
2024-04-26 | Estimating the Robustness Radius for Randomized Smoothing with 100$\times$ Sample Efficiency | 估计 100$\times$ 样本效率的随机平滑的鲁棒半径 | Emmanouil Seferis, Stefanos Kollias, Chih-Hong Cheng | http://arxiv.org/pdf/2404.17371v1 | null |
2024-04-26 | Scrutinizing Data from Sky: An Examination of Its Veracity in Area Based Traffic Contexts | 仔细检查来自天空的数据:检查其在基于区域的交通环境中的准确性 | Yawar Ali, Krishnan K N, Debashis Ray Sarkar, K. Ramachandra Rao, Niladri Chatterjee, Ashish Bhaskar | http://arxiv.org/pdf/2404.17212v1 | null |
2024-04-26 | Low-Rank Knowledge Decomposition for Medical Foundation Models | 医学基础模型的低阶知识分解 | Yuhang Zhou, Haolin Li, Siyuan Du, Jiangchao Yao, Ya Zhang, Yanfeng Wang | http://arxiv.org/pdf/2404.17184v1 | null |
2024-04-26 | Phase-aggregated Dual-branch Network for Efficient Fingerprint Dense Registration | 用于高效指纹密集注册的相位聚合双分支网络 | Xiongjun Guan, Jianjiang Feng, Jie Zhou | http://arxiv.org/pdf/2404.17159v1 | null |
2024-04-26 | Don't Look at the Camera: Achieving Perceived Eye Contact | 不要看镜头:实现感知的眼神接触 | Alice Gao, Samyukta Jayakumar, Marcello Maniglia, Brian Curless, Ira Kemelmacher-Shlizerman, Aaron R. Seitz, Steven M. Seitz | http://arxiv.org/pdf/2404.17104v1 | null |
2024-04-26 | Defending Spiking Neural Networks against Adversarial Attacks through Image Purification | 通过图像净化保护尖峰神经网络免受对抗性攻击 | Weiran Chen, Qi Sun, Qi Xu | http://arxiv.org/pdf/2404.17092v1 | null |