Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-08-11 | LaWa: Using Latent Space for In-Generation Image Watermarking | LaWa:利用潜在空间进行生成图像水印 | Ahmad Rezaei, Mohammad Akbari, Saeed Ranjbar Alvar, Arezou Fatemi, Yong Zhang | http://arxiv.org/pdf/2408.05868v1 | null |
2024-08-11 | Egocentric Vision Language Planning | 自我中心视觉语言规划 | Zhirui Fang, Ming Yang, Weishuai Zeng, Boyu Li, Junpeng Yue, Ziluo Ding, Xiu Li, Zongqing Lu | http://arxiv.org/pdf/2408.05802v1 | null |
2024-08-11 | Seg-CycleGAN : SAR-to-optical image translation guided by a downstream task | Seg-CycleGAN:由下游任务引导的 SAR 到光学图像转换 | Hannuo Zhang, Huihui Li, Jiarui Lin, Yujie Zhang, Jianghua Fan, Hang Liu | http://arxiv.org/pdf/2408.05777v1 | null |
2024-08-11 | SSL: A Self-similarity Loss for Improving Generative Image Super-resolution | SSL:一种用于改进生成图像超分辨率的自相似性损失 | Du Chen, Zhengqiang Zhang, Jie Liang, Lei Zhang | http://arxiv.org/pdf/2408.05713v1 | null |
2024-08-11 | TC-KANRecon: High-Quality and Accelerated MRI Reconstruction via Adaptive KAN Mechanisms and Intelligent Feature Scaling | TC-KANRecon:通过自适应 KAN 机制和智能特征缩放实现高质量、快速的 MRI 重建 | Ruiquan Ge, Xiao Yu, Yifei Chen, Fan Jia, Shenghao Zhu, Guanyu Zhou, Yiyu Huang, Chenyan Zhang, Dong Zeng, Changmiao Wang, et al. | http://arxiv.org/pdf/2408.05705v1 | link |
2024-08-11 | A Novel Momentum-Based Deep Learning Techniques for Medical Image Classification and Segmentation | 一种基于动量的新型深度学习医学图像分类和分割技术 | Koushik Biswas, Ridal Pal, Shaswat Patel, Debesh Jha, Meghana Karri, Amit Reza, Gorkem Durak, Alpay Medetalibeyoglu, Matthew Antalek, Yury Velichko, et al. | http://arxiv.org/pdf/2408.05692v1 | null |
2024-08-11 | StealthDiffusion: Towards Evading Diffusion Forensic Detection through Diffusion Model | StealthDiffusion:通过扩散模型逃避扩散取证检测 | Ziyin Zhou, Ke Sun, Zhongxi Chen, Huafeng Kuang, Xiaoshuai Sun, Rongrong Ji | http://arxiv.org/pdf/2408.05669v1 | link |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-08-11 | Robust Domain Generalization for Multi-modal Object Recognition | 多模态物体识别的鲁棒领域泛化 | Yuxin Qiao, Keqin Li, Junhong Lin, Rong Wei, Chufeng Jiang, Yang Luo, Haoyu Yang | http://arxiv.org/pdf/2408.05831v1 | null |
2024-08-11 | An analysis of HOI: using a training-free method with multimodal visual foundation models when only the test set is available, without the training set | HOI 分析:在仅有测试集而没有训练集的情况下,使用多模态视觉基础模型的免训练方法 | Chaoyi Ai | http://arxiv.org/pdf/2408.05772v1 | null |
2024-08-11 | Advancing Re-Ranking with Multimodal Fusion and Target-Oriented Auxiliary Tasks in E-Commerce Search | 利用多模态融合和面向目标的辅助任务推进电子商务搜索中的重新排序 | Enqiang Xu, Xinhui Li, Zhigong Zhou, Jiahao Ji, Jinyuan Zhao, Dadong Miao, Songlin Wang, Lin Liu, Sulong Xu | http://arxiv.org/pdf/2408.05751v1 | null |
2024-08-11 | A Training-Free Framework for Video License Plate Tracking and Recognition with Only One-Shot | 无需训练的一次性视频车牌跟踪和识别框架 | Haoxuan Ding, Qi Wang, Junyu Gao, Qiang Li | http://arxiv.org/pdf/2408.05729v1 | link |
2024-08-11 | Contrastive masked auto-encoders based self-supervised hashing for 2D image and 3D point cloud cross-modal retrieval | 基于对比掩蔽自动编码器的自监督散列,用于二维图像和三维点云跨模态检索 | Rukai Wei, Heng Cui, Yu Liu, Yufeng Hou, Yanzhao Xie, Ke Zhou | http://arxiv.org/pdf/2408.05711v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-08-11 | RTF-Q: Unsupervised domain adaptation based retraining-free quantization network | RTF-Q:基于无监督域自适应的无再训练量化网络 | Nanyang Du, Chen Tang, Yuan Meng, Zhi Wang | http://arxiv.org/pdf/2408.05752v1 | null |
2024-08-11 | Neural Architecture Search based Global-local Vision Mamba for Palm-Vein Recognition | 基于神经架构搜索的全局局部视觉 Mamba 用于手掌静脉识别 | Huafeng Qin, Yuming Fu, Jing Chen, Mounim A. El-Yacoubi, Xinbo Gao, Jun Wang | http://arxiv.org/pdf/2408.05743v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-08-11 | Real-Time Drowsiness Detection Using Eye Aspect Ratio and Facial Landmark Detection | 使用眼部纵横比和面部特征点检测进行实时困倦检测 | Varun Shiva Krishna Rupani, Velpooru Venkata Sai Thushar, Kondadi Tejith | http://arxiv.org/pdf/2408.05836v1 | null |
2024-08-11 | Prototype Learning Guided Hybrid Network for Breast Tumor Segmentation in DCE-MRI | 用于 DCE-MRI 乳腺肿瘤分割的原型学习引导混合网络 | Lei Zhou, Yuzhong Zhang, Jiadong Zhang, Xuejun Qian, Chen Gong, Kun Sun, Zhongxiang Ding, Xing Wang, Zhenhui Li, Zaiyi Liu, et al. | http://arxiv.org/pdf/2408.05803v1 | link |
2024-08-11 | U-DECN: End-to-End Underwater Object Detection ConvNet with Improved DeNoising Training | U-DECN:具有改进的去噪训练的端到端水下物体检测 ConvNet | Zhuoyan Liu, Bo Wang, Ye Li | http://arxiv.org/pdf/2408.05780v1 | link |
2024-08-11 | Efficient Test-Time Prompt Tuning for Vision-Language Models | 视觉语言模型的高效测试时提示调优 | Yuhan Zhu, Guozhen Zhang, Chen Xu, Haocheng Shen, Xiaoxin Chen, Gangshan Wu, Limin Wang | http://arxiv.org/pdf/2408.05775v1 | null |
2024-08-11 | PRECISe : Prototype-Reservation for Explainable Classification under Imbalanced and Scarce-Data Settings | PRECISe:不平衡和稀缺数据环境下可解释分类的原型保留 | Vaibhav Ganatra, Drishti Goel | http://arxiv.org/pdf/2408.05754v1 | null |
2024-08-11 | FADE: A Dataset for Detecting Falling Objects around Buildings in Video | FADE:用于检测视频中建筑物周围坠落物体的数据集 | Zhigang Tu, Zitao Gao, Zhengbo Zhang, Chunluan Zhou, Junsong Yuan, Bo Du | http://arxiv.org/pdf/2408.05750v1 | null |
2024-08-11 | Efficient and Versatile Robust Fine-Tuning of Zero-shot Models | 高效、多功能的零样本模型鲁棒微调 | Sungyeon Kim, Boseung Jeong, Donghyun Kim, Suha Kwak | http://arxiv.org/pdf/2408.05749v1 | null |
2024-08-11 | Decoder Pre-Training with only Text for Scene Text Recognition | 仅使用文本进行解码器预训练以进行场景文本识别 | Shuai Zhao, Yongkun Du, Zhineng Chen, Yu-Gang Jiang | http://arxiv.org/pdf/2408.05706v1 | link |
2024-08-11 | MacFormer: Semantic Segmentation with Fine Object Boundaries | MacFormer:具有精细对象边界的语义分割 | Guoan Xu, Wenfeng Huang, Tao Wu, Ligeng Chen, Wenjing Jia, Guangwei Gao, Xiatian Zhu, Stuart Perry | http://arxiv.org/pdf/2408.05699v1 | null |
2024-08-11 | Evaluating BM3D and NBNet: A Comprehensive Study of Image Denoising Across Multiple Datasets | 评估 BM3D 和 NBNet:跨多个数据集的图像去噪综合研究 | Ghazal Kaviani, Reza Marzban, Ghassan AlRegib | http://arxiv.org/pdf/2408.05697v1 | null |
2024-08-11 | PS-TTL: Prototype-based Soft-labels and Test-Time Learning for Few-shot Object Detection | PS-TTL:基于原型的软标签和测试时间学习,用于小样本物体检测 | Yingjie Gao, Yanan Zhang, Ziyue Huang, Nanqing Liu, Di Huang | http://arxiv.org/pdf/2408.05674v1 | link |
2024-08-11 | Performance Evaluation of YOLOv8 Model Configurations, for Instance Segmentation of Strawberry Fruit Development Stages in an Open Field Environment | YOLOv8 模型配置在露天环境中对草莓果实发育阶段进行实例分割的性能评估 | Abdul-Razak Alhassan Gamani, Ibrahim Arhin, Adrena Kyeremateng Asamoah | http://arxiv.org/pdf/2408.05661v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-08-11 | Sampling Foundational Transformer: A Theoretical Perspective | 基础 Transformer 采样:理论视角 | Viet Anh Nguyen, Minh Lenhat, Khoa Nguyen, Duong Duc Hieu, Dao Huu Hung, Truong Son Hy | http://arxiv.org/pdf/2408.05822v1 | null |
2024-08-11 | HySparK: Hybrid Sparse Masking for Large Scale Medical Image Pre-Training | HySparK:用于大规模医学图像预训练的混合稀疏掩模 | Fenghe Tang, Ronghao Xu, Qingsong Yao, Xueming Fu, Quan Quan, Heqin Zhu, Zaiyi Liu, S. Kevin Zhou | http://arxiv.org/pdf/2408.05815v1 | link |
2024-08-11 | Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators | 具有逐步动态注意力中介的高效扩散 Transformer | Yifan Pu, Zhuofan Xia, Jiayi Guo, Dongchen Han, Qixiu Li, Duo Li, Yuhui Yuan, Ji Li, Yizeng Han, Shiji Song, et al. | http://arxiv.org/pdf/2408.05710v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-08-11 | CURLing the Dream: Contrastive Representations for World Modeling in Reinforcement Learning | CURLing the Dream:强化学习中世界建模的对比表征 | Victor Augusto Kich, Jair Augusto Bottega, Raul Steinmetz, Ricardo Bedin Grando, Ayano Yorozu, Akihisa Ohya | http://arxiv.org/pdf/2408.05781v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-08-11 | Deformable Image Registration with Multi-scale Feature Fusion from Shared Encoder, Auxiliary and Pyramid Decoders | 通过共享编码器、辅助解码器和金字塔解码器进行多尺度特征融合的可变形图像配准 | Hongchao Zhou, Shunbo Hu | http://arxiv.org/pdf/2408.05717v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-08-11 | SABER-6D: Shape Representation Based Implicit Object Pose Estimation | SABER-6D:基于形状表示的隐式物体姿态估计 | Shishir Reddy Vutukur, Mengkejiergeli Ba, Benjamin Busam, Matthias Kayser, Gurprit Singh | http://arxiv.org/pdf/2408.05867v1 | null |
2024-08-11 | Deep Learning in Medical Image Registration: Magic or Mirage? | 深度学习在医学图像配准中的应用:魔法还是海市蜃楼? | Rohit Jena, Deeksha Sethi, Pratik Chaudhari, James C. Gee | http://arxiv.org/pdf/2408.05839v1 | null |
2024-08-11 | Improving Adversarial Transferability with Neighbourhood Gradient Information | 利用邻域梯度信息提高对抗性可转移性 | Haijing Guo, Jiafeng Wang, Zhaoyu Chen, Kaixun Jiang, Lingyi Hong, Pinxue Guo, Jinglun Li, Wenqiang Zhang | http://arxiv.org/pdf/2408.05745v1 | null |
2024-08-11 | Deep Learning with Data Privacy via Residual Perturbation | 通过残余扰动实现数据隐私的深度学习 | Wenqi Tao, Huaming Ling, Zuoqiang Shi, Bao Wang | http://arxiv.org/pdf/2408.05723v1 | null |
2024-08-11 | Single Image Dehazing Using Scene Depth Ordering | 使用场景深度排序对单幅图像进行去雾 | Pengyang Ling, Huaian Chen, Xiao Tan, Yimeng Shan, Yi Jin | http://arxiv.org/pdf/2408.05683v1 | null |