Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-02 | DistilDIRE: A Small, Fast, Cheap and Lightweight Diffusion Synthesized Deepfake Detection | DistilDIRE:一种小型、快速、廉价且轻量的扩散合成 Deepfake 检测 | Yewon Lim, Changyeon Lee, Aerin Kim, Oren Etzioni | http://arxiv.org/pdf/2406.00856v1 | null |
2024-06-02 | Invisible Backdoor Attacks on Diffusion Models | 针对扩散模型的隐形后门攻击 | Sen Li, Junchi Ma, Minhao Cheng | http://arxiv.org/pdf/2406.00816v1 | link |
2024-06-02 | EchoNet-Synthetic: Privacy-preserving Video Generation for Safe Medical Data Sharing | EchoNet-Synthetic:保护隐私的视频生成,实现安全的医疗数据共享 | Hadrien Reynaud, Qingjie Meng, Mischa Dombrowski, Arijit Ghosh, Thomas Day, Alberto Gomez, Paul Leeson, Bernhard Kainz | http://arxiv.org/pdf/2406.00808v1 | null |
2024-06-04 | AI-Face: A Million-Scale Demographically Annotated AI-Generated Face Dataset and Fairness Benchmark | AI-Face:百万级人口统计学注释的 AI 生成人脸数据集和公平性基准 | Li Lin, Santosh, Xin Wang, Shu Hu | http://arxiv.org/pdf/2406.00783v2 | link |
2024-06-02 | Diffusion Features to Bridge Domain Gap for Semantic Segmentation | 扩散特征弥补语义分割领域的差距 | Yuxiang Ji, Boyong He, Chenyuan Qu, Zhuoyue Tan, Chuan Qin, Liaoni Wu | http://arxiv.org/pdf/2406.00777v1 | null |
2024-06-02 | Diffusion Tuning: Transferring Diffusion Models via Chain of Forgetting | 扩散调节:通过遗忘链转移扩散模型 | Jincheng Zhong, Xingzhuo Guo, Jiaxiang Dong, Mingsheng Long | http://arxiv.org/pdf/2406.00773v1 | null |
2024-06-04 | Unsupervised Contrastive Analysis for Salient Pattern Detection using Conditional Diffusion Models | 使用条件扩散模型进行显著模式检测的无监督对比分析 | Cristiano Patrício, Carlo Alberto Barbano, Attilio Fiandrotti, Riccardo Renzulli, Marco Grangetto, Luis F. Teixeira, João C. Neves | http://arxiv.org/pdf/2406.00772v2 | link |
2024-06-02 | Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaption | 一劳永逸:具有动态粒度自适应的可控生成图像压缩 | Anqi Li, Yuxi Liu, Huihui Bai, Feng Li, Runmin Cong, Meng Wang, Yao Zhao | http://arxiv.org/pdf/2406.00758v1 | link |
2024-06-02 | Freeplane: Unlocking Free Lunch in Triplane-Based Sparse-View Reconstruction Models | Freeplane:在基于三平面的稀疏视图重建模型中解锁免费午餐 | Wenqiang Sun, Zhengyi Wang, Shuo Chen, Yikai Wang, Zilong Chen, Jun Zhu, Jun Zhang | http://arxiv.org/pdf/2406.00750v1 | null |
2024-06-02 | Deciphering Oracle Bone Language with Diffusion Models | 用扩散模型解读甲骨文 | Haisu Guan, Huanxin Yang, Xinyu Wang, Shengwei Han, Yongge Liu, Lianwen Jin, Xiang Bai, Yuliang Liu | http://arxiv.org/pdf/2406.00684v1 | link |
2024-06-02 | An Early Investigation into the Utility of Multimodal Large Language Models in Medical Imaging | 对多模态大型语言模型在医学成像中的实用性的早期调查 | Sulaiman Khan, Md. Rafiul Biswas, Alina Murad, Hazrat Ali, Zubair Shah | http://arxiv.org/pdf/2406.00667v1 | null |
2024-06-02 | Ultrasound Report Generation with Cross-Modality Feature Alignment via Unsupervised Guidance | 通过无监督引导生成跨模态特征对齐的超声报告 | Jun Li, Tongkun Su, Baoliang Zhao, Faqin Lv, Qiong Wang, Nassir Navab, Ying Hu, Zhongliang Jiang | http://arxiv.org/pdf/2406.00644v1 | null |
2024-06-02 | T2LM: Long-Term 3D Human Motion Generation from Multiple Sentences | T2LM:从多个句子生成长期 3D 人体运动 | Taeryung Lee, Fabien Baradel, Thomas Lucas, Kyoung Mu Lee, Gregory Rogez | http://arxiv.org/pdf/2406.00636v1 | null |
2024-06-02 | Improving GFlowNets for Text-to-Image Diffusion Alignment | 改进 GFlowNets 以实现文本到图像的扩散对齐 | Dinghuai Zhang, Yizhe Zhang, Jiatao Gu, Ruixiang Zhang, Josh Susskind, Navdeep Jaitly, Shuangfei Zhai | http://arxiv.org/pdf/2406.00633v1 | null |
2024-06-02 | Diff-Mosaic: Augmenting Realistic Representations in Infrared Small Target Detection via Diffusion Prior | Diff-Mosaic:通过扩散先验增强红外小目标检测中的真实表示 | Yukai Shi, Yupei Lin, Pengxu Wei, Xiaoyu Xian, Tianshui Chen, Liang Lin | http://arxiv.org/pdf/2406.00632v1 | link |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-02 | OLIVE: Object Level In-Context Visual Embeddings | OLIVE:对象级上下文视觉嵌入 | Timothy Ossowski, Junjie Hu | http://arxiv.org/pdf/2406.00872v1 | null |
2024-06-02 | MGI: Multimodal Contrastive pre-training of Genomic and Medical Imaging | MGI:基因组和医学成像的多模态对比预训练 | Jiaying Zhou, Mingzhou Jiang, Junde Wu, Jiayuan Zhu, Ziyue Wang, Yueming Jin | http://arxiv.org/pdf/2406.00631v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-02 | PruNeRF: Segment-Centric Dataset Pruning via 3D Spatial Consistency | PruNeRF:通过 3D 空间一致性进行以段为中心的数据集修剪 | Yeonsung Jung, Heecheol Yun, Joonhyung Park, Jin-Hwa Kim, Eunho Yang | http://arxiv.org/pdf/2406.00798v1 | null |
2024-06-02 | Representing Animatable Avatar via Factorized Neural Fields | 通过分解神经场表示可动画的头像 | Chunjin Song, Zhijie Wu, Bastian Wandt, Leonid Sigal, Helge Rhodin | http://arxiv.org/pdf/2406.00637v1 | null |
2024-06-04 | SuperGaussian: Repurposing Video Models for 3D Super Resolution | SuperGaussian:重新利用视频模型实现 3D 超分辨率 | Yuan Shen, Duygu Ceylan, Paul Guerrero, Zexiang Xu, Niloy J. Mitra, Shenlong Wang, Anna Frühstück | http://arxiv.org/pdf/2406.00609v2 | null |
2024-06-02 | Efficient Neural Light Fields (ENeLF) for Mobile Devices | 适用于移动设备的高效神经光场 (ENeLF) | Austin Peng | http://arxiv.org/pdf/2406.00598v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-02 | Stealing Image-to-Image Translation Models With a Single Query | 通过单个查询窃取图像到图像的翻译模型 | Nurit Spingarn-Eliezer, Tomer Michaeli | http://arxiv.org/pdf/2406.00828v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-02 | Global High Categorical Resolution Land Cover Mapping via Weak Supervision | 通过弱监督进行全球高分类分辨率土地覆盖测绘 | Xin-Yi Tong, Runmin Dong, Xiao Xiang Zhu | http://arxiv.org/pdf/2406.00891v1 | null |
2024-06-02 | Visual place recognition for aerial imagery: A survey | 航空影像的视觉位置识别:一项调查 | Ivan Moskalenko, Anastasiia Kornilova, Gonzalo Ferrer | http://arxiv.org/pdf/2406.00885v1 | link |
2024-06-02 | Eating Smart: Advancing Health Informatics with the Grounding DINO based Dietary Assistant App | 明智饮食:利用基于 Grounding DINO 的饮食助理应用程序推进健康信息学 | Abdelilah Nossair, Hamza El Housni | http://arxiv.org/pdf/2406.00848v1 | null |
2024-06-02 | Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection | 用于开放词汇 3D 对象检测的协作式新对象发现和框引导跨模态对齐 | Yang Cao, Yihan Zeng, Hang Xu, Dan Xu | http://arxiv.org/pdf/2406.00830v1 | link |
2024-06-02 | Towards Point Cloud Compression for Machine Perception: A Simple and Strong Baseline by Learning the Octree Depth Level Predictor | 面向机器感知的点云压缩:通过学习八叉树深度级别预测器建立简单而强大的基线 | Lei Liu, Zhihao Hu, Zhenghao Chen | http://arxiv.org/pdf/2406.00791v1 | null |
2024-06-02 | CCF: Cross Correcting Framework for Pedestrian Trajectory Prediction | CCF:行人轨迹预测的交叉校正框架 | Pranav Singh Chib, Pravendra Singh | http://arxiv.org/pdf/2406.00749v1 | null |
2024-06-02 | A Survey of Deep Learning Based Radar and Vision Fusion for 3D Object Detection in Autonomous Driving | 基于深度学习的自动驾驶 3D 物体检测雷达与视觉融合研究 | Di Wu, Feng Yang, Benlian Xu, Pan Liao, Bo Liu | http://arxiv.org/pdf/2406.00714v1 | null |
2024-06-02 | An Optimized Toolbox for Advanced Image Processing with Tsetlin Machine Composites | 使用 Tsetlin Machine Composites 进行高级图像处理的优化工具箱 | Ylva Grønningsæter, Halvor S. Smørvik, Ole-Christoffer Granmo | http://arxiv.org/pdf/2406.00704v1 | null |
2024-06-02 | Towards General Robustness Verification of MaxPool-based Convolutional Neural Networks via Tightening Linear Approximation | 通过紧致线性近似实现基于 MaxPool 的卷积神经网络的通用鲁棒性验证 | Yuan Xiao, Shiqing Ma, Juan Zhai, Chunrong Fang, Jinyuan Jia, Zhenyu Chen | http://arxiv.org/pdf/2406.00699v1 | link |
2024-06-02 | Bilinear-Convolutional Neural Network Using a Matrix Similarity-based Joint Loss Function for Skin Disease Classification | 使用基于矩阵相似性的联合损失函数的双线性卷积神经网络进行皮肤病分类 | Belal Ahmad, Mohd Usama, Tanvir Ahmad, Adnan Saeed, Shabnam Khatoon, Long Hu | http://arxiv.org/pdf/2406.00696v1 | null |
2024-06-02 | Improving Accuracy-robustness Trade-off via Pixel Reweighted Adversarial Training | 通过像素加权对抗训练提高准确率和稳健性的权衡 | Jiacheng Zhang, Feng Liu, Dawei Zhou, Jingfeng Zhang, Tongliang Liu | http://arxiv.org/pdf/2406.00685v1 | null |
2024-06-02 | W-Net: A Facial Feature-Guided Face Super-Resolution Network | W-Net:面部特征引导的超分辨率网络 | Hao Liu, Yang Yang, Yunxia Liu | http://arxiv.org/pdf/2406.00676v1 | null |
2024-06-02 | Task-oriented Embedding Counts: Heuristic Clustering-driven Feature Fine-tuning for Whole Slide Image Classification | 面向任务的嵌入计数:启发式聚类驱动的全幻灯片图像分类特征微调 | Xuenian Wang, Shanshan Shi, Renao Yan, Qiehe Sun, Lianghui Zhu, Tian Guan, Yonghong He | http://arxiv.org/pdf/2406.00672v1 | null |
2024-06-02 | Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation | Cascade-CLIP:用于零样本语义分割的级联视觉语言嵌入对齐 | Yunheng Li, ZhongYu Li, Quansheng Zeng, Qibin Hou, Ming-Ming Cheng | http://arxiv.org/pdf/2406.00670v1 | null |
2024-06-02 | SimSAM: Zero-shot Medical Image Segmentation via Simulated Interaction | SimSAM:通过模拟交互实现零样本医学图像分割 | Benjamin Towle, Xin Chen, Ke Zhou | http://arxiv.org/pdf/2406.00663v1 | link |
2024-06-02 | An Information Compensation Framework for Zero-Shot Skeleton-based Action Recognition | 基于零样本骨架的动作识别信息补偿框架 | Haojun Xu, Yan Gao, Jie Li, Xinbo Gao | http://arxiv.org/pdf/2406.00639v1 | null |
2024-06-02 | SAM-LAD: Segment Anything Model Meets Zero-Shot Logic Anomaly Detection | SAM-LAD:任何事物分割模型与零样本逻辑异常检测相结合 | Yun Peng, Xiao Lin, Nachuan Ma, Jiayuan Du, Chuangwei Liu, Chengju Liu, Qijun Chen | http://arxiv.org/pdf/2406.00625v1 | null |
2024-06-02 | Kolmogorov-Arnold Network for Satellite Image Classification in Remote Sensing | 柯尔莫哥洛夫-阿诺德网络用于遥感卫星图像分类 | Minjong Cheon | http://arxiv.org/pdf/2406.00600v1 | null |
2024-06-02 | Semi-supervised Video Semantic Segmentation Using Unreliable Pseudo Labels for PVUW2024 | 使用不可靠伪标签对 PVUW2024 进行半监督视频语义分割 | Biao Wu, Diankai Zhang, Si Gao, Chengjian Zheng, Shaoli Liu, Ning Wang | http://arxiv.org/pdf/2406.00587v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-02 | Explore Internal and External Similarity for Single Image Deraining with Graph Neural Networks | 使用图神经网络探索单幅图像去雨的内部和外部相似性 | Cong Wang, Wei Wang, Chengjin Yu, Jie Mu | http://arxiv.org/pdf/2406.00721v1 | link |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-02 | Exploiting Frequency Correlation for Hyperspectral Image Reconstruction | 利用频率相关性进行高光谱图像重建 | Muge Yan, Lizhi Wang, Lin Zhu, Hua Huang | http://arxiv.org/pdf/2406.00683v1 | null |
2024-06-02 | Correlation Matching Transformation Transformers for UHD Image Restoration | 用于超高清图像恢复的相关匹配变换器 | Cong Wang, Jinshan Pan, Wei Wang, Gang Fu, Siyuan Liang, Mengzhu Wang, Xiao-Ming Wu, Jun Liu | http://arxiv.org/pdf/2406.00629v1 | link |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-04 | Lay-A-Scene: Personalized 3D Object Arrangement Using Text-to-Image Priors | Lay-A-Scene:使用文本到图像先验进行个性化 3D 对象排列 | Ohad Rahamim, Hilit Segev, Idan Achituve, Yuval Atzmon, Yoni Kasten, Gal Chechik | http://arxiv.org/pdf/2406.00687v2 | null |
2024-06-02 | Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering | 利用物理先验知识理解视频问答中的 4D 动态场景组合 | Xingrui Wang, Wufei Ma, Angtian Wang, Shuo Chen, Adam Kortylewski, Alan Yuille | http://arxiv.org/pdf/2406.00622v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-02 | Streaming quanta sensors for online, high-performance imaging and vision | 用于在线高性能成像和视觉的流式量子传感器 | Tianyi Zhang, Matthew Dutson, Vivek Boominathan, Mohit Gupta, Ashok Veeraraghavan | http://arxiv.org/pdf/2406.00859v1 | null |
2024-06-02 | End-to-End Hybrid Refractive-Diffractive Lens Design with Differentiable Ray-Wave Model | 采用可微分射线波模型的端到端混合折射衍射透镜设计 | Xinge Yang, Matheus Souza, Kunyi Wang, Praneeth Chakravarthula, Qiang Fu, Wolfgang Heidrich | http://arxiv.org/pdf/2406.00834v1 | null |
2024-06-02 | Developing an efficient corpus using Ensemble Data cleaning approach | 使用 Ensemble Data 清理方法开发高效语料库 | Md Taimur Ahad | http://arxiv.org/pdf/2406.00789v1 | null |
2024-06-02 | FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning | FuRL:视觉语言模型作为强化学习的模糊奖励 | Yuwei Fu, Haichao Zhang, Di Wu, Wei Xu, Benoit Boulet | http://arxiv.org/pdf/2406.00645v1 | link |
2024-06-02 | Robust Visual Tracking via Iterative Gradient Descent and Threshold Selection | 通过迭代梯度下降和阈值选择实现稳健的视觉跟踪 | Zhuang Qi, Junlin Zhang, Xin Qi | http://arxiv.org/pdf/2406.00589v1 | null |