[UPDATED!] 2024-06-22 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-06-22 | PUDD: Towards Robust Multi-modal Prototype-based Deepfake Detection | PUDD:面向稳健的多模态原型 Deepfake 检测 | Alvaro Lopez Pellicer, Yi Li, Plamen Angelov | http://arxiv.org/pdf/2406.15921v1 | null |
| 2024-06-22 | Soft Masked Mamba Diffusion Model for CT to MRI Conversion | 用于 CT 到 MRI 转换的软蒙版 Mamba 扩散模型 | Zhenbin Wang, Lei Zhang, Lituan Wang, Zhenwei Zhang | http://arxiv.org/pdf/2406.15910v1 | link |
| 2024-06-22 | EmoAttack: Emotion-to-Image Diffusion Models for Emotional Backdoor Generation | EmoAttack:用于生成情感后门的情感到图像扩散模型 | Tianyu Wei, Shanmin Pang, Qi Guo, Yizhuo Ma, Qing Guo | http://arxiv.org/pdf/2406.15863v1 | null |
| 2024-06-22 | MVOC: a training-free multiple video object composition method with diffusion models | MVOC:一种无需训练、具有扩散模型的多视频对象合成方法 | Wei Wang, Yaosen Chen, Yuegen Liu, Qi Yuan, Shubin Yang, Yanru Zhang | http://arxiv.org/pdf/2406.15829v1 | null |
| 2024-06-22 | PointDreamer: Zero-shot 3D Textured Mesh Reconstruction from Colored Point Cloud by 2D Inpainting | PointDreamer:通过 2D 修复从彩色点云进行零样本 3D 纹理网格重建 | Qiao Yu, Xianzhi Li, Yuan Tang, Jinfeng Xu, Long Hu, Yixue Hao, Min Chen | http://arxiv.org/pdf/2406.15811v1 | link |
| 2024-06-22 | Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model | 图像到视频扩散模型中的条件图像泄漏识别与解决 | Min Zhao, Hongzhou Zhu, Chendong Xiang, Kaiwen Zheng, Chongxuan Li, Jun Zhu | http://arxiv.org/pdf/2406.15735v1 | null |
| 2024-06-22 | How to Learn More? Exploring Kolmogorov-Arnold Networks for Hyperspectral Image Classification | 如何了解更多?探索 Kolmogorov-Arnold 网络用于高光谱图像分类 | Ali Jamali, Swalpa Kumar Roy, Danfeng Hong, Bing Lu, Pedram Ghamisi | http://arxiv.org/pdf/2406.15719v1 | link |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-06-22 | Intrinsic Dimension Correlation: uncovering nonlinear connections in multimodal representations | 内在维度相关性:揭示多模态表示中的非线性连接 | Lorenzo Basile, Santiago Acevedo, Luca Bortolussi, Fabio Anselmi, Alex Rodriguez | http://arxiv.org/pdf/2406.15812v1 | null |
| 2024-06-22 | MR-MLLM: Mutual Reinforcement of Multimodal Comprehension and Vision Perception | MR-MLLM:多模态理解与视觉感知的相互强化 | Guanqun Wang, Xinyu Wei, Jiaming Liu, Ray Zhang, Yichi Zhang, Kevin Zhang, Maurice Chong, Shanghang Zhang | http://arxiv.org/pdf/2406.15768v1 | null |
| 2024-06-22 | TP-DRSeg: Improving Diabetic Retinopathy Lesion Segmentation with Explicit Text-Prompts Assisted SAM | TP-DRSeg:使用显式文本提示辅助 SAM 改善糖尿病视网膜病变分割 | Wenxue Li, Xinyu Xiong, Peng Xia, Lie Ju, Zongyuan Ge | http://arxiv.org/pdf/2406.15764v1 | null |
| 2024-06-22 | Multimodal Segmentation for Vocal Tract Modeling | 用于声道建模的多模态分割 | Rishi Jain, Bohan Yu, Peter Wu, Tejas Prabhune, Gopala Anumanchipalli | http://arxiv.org/pdf/2406.15754v1 | null |
| 2024-06-22 | psPRF:Pansharpening Planar Neural Radiance Field for Generalized 3D Reconstruction Satellite Imagery | psPRF:用于广义 3D 重建卫星图像的全色锐化平面神经辐射场 | Tongtong Zhang, Yuanxiang Li | http://arxiv.org/pdf/2406.15707v1 | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-06-22 | DISHA: Low-Energy Sparse Transformer at Edge for Outdoor Navigation for the Visually Impaired Individuals | DISHA:用于视障人士户外导航的边缘低能量稀疏变压器 | Praveen Nagil, Sumit K. Mandal | http://arxiv.org/pdf/2406.15864v1 | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-06-22 | Bone Fracture Classification using Transfer Learning | 使用迁移学习进行骨折分类 | Shyam Gupta, Dhanisha Sharma | http://arxiv.org/pdf/2406.15958v1 | link |
| 2024-06-22 | Federated Adversarial Learning for Robust Autonomous Landing Runway Detection | 联合对抗学习实现稳健的自主着陆跑道检测 | Yi Li, Plamen Angelov, Zhengxin Yu, Alvaro Lopez Pellicer, Neeraj Suri | http://arxiv.org/pdf/2406.15925v1 | null |
| 2024-06-22 | SEDMamba: Enhancing Selective State Space Modelling with Bottleneck Mechanism and Fine-to-Coarse Temporal Fusion for Efficient Error Detection in Robot-Assisted Surgery | SEDMamba:利用瓶颈机制和细到粗时间融合增强选择性状态空间建模,以实现机器人辅助手术中的有效错误检测 | Jialang Xu, Nazir Sirajudeen, Matthew Boal, Nader Francis, Danail Stoyanov, Evangelos Mazomenos | http://arxiv.org/pdf/2406.15920v1 | null |
| 2024-06-22 | DISentangled Counterfactual Visual interpretER (DISCOVER) generalizes to natural images | DISentangled Counterfactual Visual explainER (DISCOVER) 推广到自然图像 | Oded Rotem, Assaf Zaritsky | http://arxiv.org/pdf/2406.15918v1 | null |
| 2024-06-22 | Reading Is Believing: Revisiting Language Bottleneck Models for Image Classification | 阅读即信仰:重新审视图像分类的语言瓶颈模型 | Honori Udo, Takafumi Koshinaka | http://arxiv.org/pdf/2406.15816v1 | null |
| 2024-06-22 | Smart Feature is What You Need | 智能功能正是您所需要的 | Zhaoxin Hu, Keyan Ren | http://arxiv.org/pdf/2406.15805v1 | link |
| 2024-06-22 | Fine-grained Background Representation for Weakly Supervised Semantic Segmentation | 弱监督语义分割的细粒度背景表示 | Xu Yin, Woobin Im, Dongbo Min, Yuchi Huo, Fei Pan, Sung-Eui Yoon | http://arxiv.org/pdf/2406.15755v1 | null |
| 2024-06-22 | Semi-supervised variational autoencoder for cell feature extraction in multiplexed immunofluorescence images | 半监督变分自动编码器用于多路复用免疫荧光图像中的细胞特征提取 | Piumi Sandarenu, Julia Chen, Iveta Slapetova, Lois Browne, Peter H. Graham, Alexander Swarbrick, Ewan K. A. Millar, Yang Song, Erik Meijering | http://arxiv.org/pdf/2406.15727v1 | null |
| 2024-06-22 | Self-Supervised Alignment Learning for Medical Image Segmentation | 用于医学图像分割的自监督对齐学习 | Haofeng Li, Yiming Ouyang, Xiang Wan | http://arxiv.org/pdf/2406.15699v1 | null |
| 2024-06-22 | Single-Temporal Supervised Learning for Universal Remote Sensing Change Detection | 用于通用遥感变化检测的单时间监督学习 | Zhuo Zheng, Yanfei Zhong, Ailong Ma, Liangpei Zhang | http://arxiv.org/pdf/2406.15694v1 | link |

GNN

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-06-22 | Beyond the Doors of Perception: Vision Transformers Represent Relations Between Objects | 超越感知之门:视觉转换器代表物体之间的关系 | Michael A. Lepori, Alexa R. Tartaglini, Wai Keen Vong, Thomas Serre, Brenden M. Lake, Ellie Pavlick | http://arxiv.org/pdf/2406.15955v1 | null |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-06-22 | Shape2.5D: A Dataset of Texture-less Surfaces for Depth and Normals Estimation | Shape2.5D:用于深度和法线估计的无纹理表面数据集 | Muhammad Saif Ullah Khan, Muhammad Zeshan Afzal, Didier Stricker | http://arxiv.org/pdf/2406.15831v1 | link |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-06-22 | video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models | 视频-SALMONN:语音增强视听大型语言模型 | Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Yuxuan Wang, Chao Zhang | http://arxiv.org/pdf/2406.15704v1 | link |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-06-22 | LaneSegNet Design Study | LaneSegNet 设计研究 | William Stevens, Vishal Urs, Karthik Selvaraj, Gabriel Torres, Gaurish Lakhanpal | http://arxiv.org/pdf/2406.15946v1 | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-06-22 | Quality-guided Skin Tone Enhancement for Portrait Photography | 肖像摄影的品质引导肤色增强 | Shiqi Gao, Huiyu Duan, Xinyue Li, Kang Fu, Yicong Peng, Qihang Xu, Yuanyuan Chang, Jia Wang, Xiongkuo Min, Guangtao Zhai | http://arxiv.org/pdf/2406.15848v1 | null |
| 2024-06-22 | ObjectNLQ @ Ego4D Episodic Memory Challenge 2024 | ObjectNLQ @ Ego4D 情景记忆挑战 2024 | Yisen Feng, Haoyu Zhang, Yuquan Xie, Zaijing Li, Meng Liu, Liqiang Nie | http://arxiv.org/pdf/2406.15778v1 | null |
| 2024-06-22 | HCQA @ Ego4D EgoSchema Challenge 2024 | HCQA @ Ego4D EgoSchema 挑战赛 2024 | Haoyu Zhang, Yuquan Xie, Yisen Feng, Zaijing Li, Meng Liu, Liqiang Nie | http://arxiv.org/pdf/2406.15771v1 | link |
| 2024-06-22 | Evaluating Large Vision-and-Language Models on Children's Mathematical Olympiads | 评估儿童数学奥林匹克竞赛的大型视觉和语言模型 | Anoop Cherian, Kuan-Chuan Peng, Suhas Lohit, Joanna Matthiesen, Kevin Smith, Joshua B. Tenenbaum | http://arxiv.org/pdf/2406.15736v1 | null |
| 2024-06-22 | Predicting fluorescent labels in label-free microscopy images with pix2pix and adaptive loss in Light My Cells challenge | 在 Light My Cells 挑战赛中使用 pix2pix 和自适应损失预测无标签显微镜图像中的荧光标签 | Han Liu, Hao Li, Jiacheng Wang, Yubo Fan, Zhoubing Xu, Ipek Oguz | http://arxiv.org/pdf/2406.15716v1 | link |