Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-22 | PUDD: Towards Robust Multi-modal Prototype-based Deepfake Detection | PUDD:面向稳健的多模态原型 Deepfake 检测 | Alvaro Lopez Pellicer, Yi Li, Plamen Angelov | http://arxiv.org/pdf/2406.15921v1 | null |
2024-06-22 | Soft Masked Mamba Diffusion Model for CT to MRI Conversion | 用于 CT 到 MRI 转换的软蒙版 Mamba 扩散模型 | Zhenbin Wang, Lei Zhang, Lituan Wang, Zhenwei Zhang | http://arxiv.org/pdf/2406.15910v1 | link |
2024-06-22 | EmoAttack: Emotion-to-Image Diffusion Models for Emotional Backdoor Generation | EmoAttack:用于生成情感后门的情感到图像扩散模型 | Tianyu Wei, Shanmin Pang, Qi Guo, Yizhuo Ma, Qing Guo | http://arxiv.org/pdf/2406.15863v1 | null |
2024-06-22 | MVOC: a training-free multiple video object composition method with diffusion models | MVOC:一种无需训练、具有扩散模型的多视频对象合成方法 | Wei Wang, Yaosen Chen, Yuegen Liu, Qi Yuan, Shubin Yang, Yanru Zhang | http://arxiv.org/pdf/2406.15829v1 | null |
2024-06-22 | PointDreamer: Zero-shot 3D Textured Mesh Reconstruction from Colored Point Cloud by 2D Inpainting | PointDreamer:通过 2D 修复从彩色点云进行零样本 3D 纹理网格重建 | Qiao Yu, Xianzhi Li, Yuan Tang, Jinfeng Xu, Long Hu, Yixue Hao, Min Chen | http://arxiv.org/pdf/2406.15811v1 | link |
2024-06-22 | Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model | 图像到视频扩散模型中的条件图像泄漏识别与解决 | Min Zhao, Hongzhou Zhu, Chendong Xiang, Kaiwen Zheng, Chongxuan Li, Jun Zhu | http://arxiv.org/pdf/2406.15735v1 | null |
2024-06-22 | How to Learn More? Exploring Kolmogorov-Arnold Networks for Hyperspectral Image Classification | 如何了解更多?探索 Kolmogorov-Arnold 网络用于高光谱图像分类 | Ali Jamali, Swalpa Kumar Roy, Danfeng Hong, Bing Lu, Pedram Ghamisi | http://arxiv.org/pdf/2406.15719v1 | link |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-22 | Intrinsic Dimension Correlation: uncovering nonlinear connections in multimodal representations | 内在维度相关性:揭示多模态表示中的非线性连接 | Lorenzo Basile, Santiago Acevedo, Luca Bortolussi, Fabio Anselmi, Alex Rodriguez | http://arxiv.org/pdf/2406.15812v1 | null |
2024-06-22 | MR-MLLM: Mutual Reinforcement of Multimodal Comprehension and Vision Perception | MR-MLLM:多模态理解与视觉感知的相互强化 | Guanqun Wang, Xinyu Wei, Jiaming Liu, Ray Zhang, Yichi Zhang, Kevin Zhang, Maurice Chong, Shanghang Zhang | http://arxiv.org/pdf/2406.15768v1 | null |
2024-06-22 | TP-DRSeg: Improving Diabetic Retinopathy Lesion Segmentation with Explicit Text-Prompts Assisted SAM | TP-DRSeg:使用显式文本提示辅助 SAM 改善糖尿病视网膜病变分割 | Wenxue Li, Xinyu Xiong, Peng Xia, Lie Ju, Zongyuan Ge | http://arxiv.org/pdf/2406.15764v1 | null |
2024-06-22 | Multimodal Segmentation for Vocal Tract Modeling | 用于声道建模的多模态分割 | Rishi Jain, Bohan Yu, Peter Wu, Tejas Prabhune, Gopala Anumanchipalli | http://arxiv.org/pdf/2406.15754v1 | null |
2024-06-22 | psPRF:Pansharpening Planar Neural Radiance Field for Generalized 3D Reconstruction Satellite Imagery | psPRF:用于广义 3D 重建卫星图像的全色锐化平面神经辐射场 | Tongtong Zhang, Yuanxiang Li | http://arxiv.org/pdf/2406.15707v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-22 | DISHA: Low-Energy Sparse Transformer at Edge for Outdoor Navigation for the Visually Impaired Individuals | DISHA:用于视障人士户外导航的边缘低能耗稀疏 Transformer | Praveen Nagil, Sumit K. Mandal | http://arxiv.org/pdf/2406.15864v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-22 | Bone Fracture Classification using Transfer Learning | 使用迁移学习进行骨折分类 | Shyam Gupta, Dhanisha Sharma | http://arxiv.org/pdf/2406.15958v1 | link |
2024-06-22 | Federated Adversarial Learning for Robust Autonomous Landing Runway Detection | 联合对抗学习实现稳健的自主着陆跑道检测 | Yi Li, Plamen Angelov, Zhengxin Yu, Alvaro Lopez Pellicer, Neeraj Suri | http://arxiv.org/pdf/2406.15925v1 | null |
2024-06-22 | SEDMamba: Enhancing Selective State Space Modelling with Bottleneck Mechanism and Fine-to-Coarse Temporal Fusion for Efficient Error Detection in Robot-Assisted Surgery | SEDMamba:利用瓶颈机制和细到粗时间融合增强选择性状态空间建模,以实现机器人辅助手术中的有效错误检测 | Jialang Xu, Nazir Sirajudeen, Matthew Boal, Nader Francis, Danail Stoyanov, Evangelos Mazomenos | http://arxiv.org/pdf/2406.15920v1 | null |
2024-06-22 | DISentangled Counterfactual Visual interpretER (DISCOVER) generalizes to natural images | DISentangled Counterfactual Visual interpretER (DISCOVER) 推广到自然图像 | Oded Rotem, Assaf Zaritsky | http://arxiv.org/pdf/2406.15918v1 | null |
2024-06-22 | Reading Is Believing: Revisiting Language Bottleneck Models for Image Classification | 阅读即信仰:重新审视图像分类的语言瓶颈模型 | Honori Udo, Takafumi Koshinaka | http://arxiv.org/pdf/2406.15816v1 | null |
2024-06-22 | Smart Feature is What You Need | 智能特征正是您所需要的 | Zhaoxin Hu, Keyan Ren | http://arxiv.org/pdf/2406.15805v1 | link |
2024-06-22 | Fine-grained Background Representation for Weakly Supervised Semantic Segmentation | 弱监督语义分割的细粒度背景表示 | Xu Yin, Woobin Im, Dongbo Min, Yuchi Huo, Fei Pan, Sung-Eui Yoon | http://arxiv.org/pdf/2406.15755v1 | null |
2024-06-22 | Semi-supervised variational autoencoder for cell feature extraction in multiplexed immunofluorescence images | 半监督变分自动编码器用于多路复用免疫荧光图像中的细胞特征提取 | Piumi Sandarenu, Julia Chen, Iveta Slapetova, Lois Browne, Peter H. Graham, Alexander Swarbrick, Ewan K. A. Millar, Yang Song, Erik Meijering | http://arxiv.org/pdf/2406.15727v1 | null |
2024-06-22 | Self-Supervised Alignment Learning for Medical Image Segmentation | 用于医学图像分割的自监督对齐学习 | Haofeng Li, Yiming Ouyang, Xiang Wan | http://arxiv.org/pdf/2406.15699v1 | null |
2024-06-22 | Single-Temporal Supervised Learning for Universal Remote Sensing Change Detection | 用于通用遥感变化检测的单时间监督学习 | Zhuo Zheng, Yanfei Zhong, Ailong Ma, Liangpei Zhang | http://arxiv.org/pdf/2406.15694v1 | link |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-22 | Beyond the Doors of Perception: Vision Transformers Represent Relations Between Objects | 超越感知之门:Vision Transformer 表征物体之间的关系 | Michael A. Lepori, Alexa R. Tartaglini, Wai Keen Vong, Thomas Serre, Brenden M. Lake, Ellie Pavlick | http://arxiv.org/pdf/2406.15955v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-22 | Shape2.5D: A Dataset of Texture-less Surfaces for Depth and Normals Estimation | Shape2.5D:用于深度和法线估计的无纹理表面数据集 | Muhammad Saif Ullah Khan, Muhammad Zeshan Afzal, Didier Stricker | http://arxiv.org/pdf/2406.15831v1 | link |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-22 | video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models | 视频-SALMONN:语音增强视听大型语言模型 | Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Yuxuan Wang, Chao Zhang | http://arxiv.org/pdf/2406.15704v1 | link |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-22 | LaneSegNet Design Study | LaneSegNet 设计研究 | William Stevens, Vishal Urs, Karthik Selvaraj, Gabriel Torres, Gaurish Lakhanpal | http://arxiv.org/pdf/2406.15946v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-22 | Quality-guided Skin Tone Enhancement for Portrait Photography | 肖像摄影的品质引导肤色增强 | Shiqi Gao, Huiyu Duan, Xinyue Li, Kang Fu, Yicong Peng, Qihang Xu, Yuanyuan Chang, Jia Wang, Xiongkuo Min, Guangtao Zhai | http://arxiv.org/pdf/2406.15848v1 | null |
2024-06-22 | ObjectNLQ @ Ego4D Episodic Memory Challenge 2024 | ObjectNLQ @ Ego4D 情景记忆挑战 2024 | Yisen Feng, Haoyu Zhang, Yuquan Xie, Zaijing Li, Meng Liu, Liqiang Nie | http://arxiv.org/pdf/2406.15778v1 | null |
2024-06-22 | HCQA @ Ego4D EgoSchema Challenge 2024 | HCQA @ Ego4D EgoSchema 挑战赛 2024 | Haoyu Zhang, Yuquan Xie, Yisen Feng, Zaijing Li, Meng Liu, Liqiang Nie | http://arxiv.org/pdf/2406.15771v1 | link |
2024-06-22 | Evaluating Large Vision-and-Language Models on Children's Mathematical Olympiads | 评估儿童数学奥林匹克竞赛的大型视觉和语言模型 | Anoop Cherian, Kuan-Chuan Peng, Suhas Lohit, Joanna Matthiesen, Kevin Smith, Joshua B. Tenenbaum | http://arxiv.org/pdf/2406.15736v1 | null |
2024-06-22 | Predicting fluorescent labels in label-free microscopy images with pix2pix and adaptive loss in Light My Cells challenge | 在 Light My Cells 挑战赛中使用 pix2pix 和自适应损失预测无标签显微镜图像中的荧光标签 | Han Liu, Hao Li, Jiacheng Wang, Yubo Fan, Zhoubing Xu, Ipek Oguz | http://arxiv.org/pdf/2406.15716v1 | link |