[UPDATED!] 2024-06-02 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-06-02 | DistilDIRE: A Small, Fast, Cheap and Lightweight Diffusion Synthesized Deepfake Detection | DistilDIRE:一种小型、快速、廉价且轻量的扩散合成 Deepfake 检测 | Yewon Lim, Changyeon Lee, Aerin Kim, Oren Etzioni | http://arxiv.org/pdf/2406.00856v1 | null |
| 2024-06-02 | Invisible Backdoor Attacks on Diffusion Models | 针对扩散模型的隐形后门攻击 | Sen Li, Junchi Ma, Minhao Cheng | http://arxiv.org/pdf/2406.00816v1 | link |
| 2024-06-02 | EchoNet-Synthetic: Privacy-preserving Video Generation for Safe Medical Data Sharing | EchoNet-Synthetic:保护隐私的视频生成,实现安全的医疗数据共享 | Hadrien Reynaud, Qingjie Meng, Mischa Dombrowski, Arijit Ghosh, Thomas Day, Alberto Gomez, Paul Leeson, Bernhard Kainz | http://arxiv.org/pdf/2406.00808v1 | null |
| 2024-06-04 | AI-Face: A Million-Scale Demographically Annotated AI-Generated Face Dataset and Fairness Benchmark | AI-Face:百万级人口统计学注释的 AI 生成人脸数据集和公平性基准 | Li Lin, Santosh, Xin Wang, Shu Hu | http://arxiv.org/pdf/2406.00783v2 | link |
| 2024-06-02 | Diffusion Features to Bridge Domain Gap for Semantic Segmentation | 扩散特征弥补语义分割领域的差距 | Yuxiang Ji, Boyong He, Chenyuan Qu, Zhuoyue Tan, Chuan Qin, Liaoni Wu | http://arxiv.org/pdf/2406.00777v1 | null |
| 2024-06-02 | Diffusion Tuning: Transferring Diffusion Models via Chain of Forgetting | 扩散调节:通过遗忘链转移扩散模型 | Jincheng Zhong, Xingzhuo Guo, Jiaxiang Dong, Mingsheng Long | http://arxiv.org/pdf/2406.00773v1 | null |
| 2024-06-04 | Unsupervised Contrastive Analysis for Salient Pattern Detection using Conditional Diffusion Models | 使用条件扩散模型进行显著模式检测的无监督对比分析 | Cristiano Patrício, Carlo Alberto Barbano, Attilio Fiandrotti, Riccardo Renzulli, Marco Grangetto, Luis F. Teixeira, João C. Neves | http://arxiv.org/pdf/2406.00772v2 | link |
| 2024-06-02 | Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaption | 一劳永逸:具有动态粒度自适应的可控生成图像压缩 | Anqi Li, Yuxi Liu, Huihui Bai, Feng Li, Runmin Cong, Meng Wang, Yao Zhao | http://arxiv.org/pdf/2406.00758v1 | link |
| 2024-06-02 | Freeplane: Unlocking Free Lunch in Triplane-Based Sparse-View Reconstruction Models | Freeplane:在基于三平面的稀疏视图重建模型中解锁免费午餐 | Wenqiang Sun, Zhengyi Wang, Shuo Chen, Yikai Wang, Zilong Chen, Jun Zhu, Jun Zhang | http://arxiv.org/pdf/2406.00750v1 | null |
| 2024-06-02 | Deciphering Oracle Bone Language with Diffusion Models | 用扩散模型解读甲骨文 | Haisu Guan, Huanxin Yang, Xinyu Wang, Shengwei Han, Yongge Liu, Lianwen Jin, Xiang Bai, Yuliang Liu | http://arxiv.org/pdf/2406.00684v1 | link |
| 2024-06-02 | An Early Investigation into the Utility of Multimodal Large Language Models in Medical Imaging | 对多模态大型语言模型在医学成像中的实用性的早期调查 | Sulaiman Khan, Md. Rafiul Biswas, Alina Murad, Hazrat Ali, Zubair Shah | http://arxiv.org/pdf/2406.00667v1 | null |
| 2024-06-02 | Ultrasound Report Generation with Cross-Modality Feature Alignment via Unsupervised Guidance | 通过无监督引导生成跨模态特征对齐的超声报告 | Jun Li, Tongkun Su, Baoliang Zhao, Faqin Lv, Qiong Wang, Nassir Navab, Ying Hu, Zhongliang Jiang | http://arxiv.org/pdf/2406.00644v1 | null |
| 2024-06-02 | T2LM: Long-Term 3D Human Motion Generation from Multiple Sentences | T2LM:从多个句子生成长期 3D 人体运动 | Taeryung Lee, Fabien Baradel, Thomas Lucas, Kyoung Mu Lee, Gregory Rogez | http://arxiv.org/pdf/2406.00636v1 | null |
| 2024-06-02 | Improving GFlowNets for Text-to-Image Diffusion Alignment | 改进 GFlowNets 以实现文本到图像的扩散对齐 | Dinghuai Zhang, Yizhe Zhang, Jiatao Gu, Ruixiang Zhang, Josh Susskind, Navdeep Jaitly, Shuangfei Zhai | http://arxiv.org/pdf/2406.00633v1 | null |
| 2024-06-02 | Diff-Mosaic: Augmenting Realistic Representations in Infrared Small Target Detection via Diffusion Prior | Diff-Mosaic:通过扩散先验增强红外小目标检测中的真实表示 | Yukai Shi, Yupei Lin, Pengxu Wei, Xiaoyu Xian, Tianshui Chen, Liang Lin | http://arxiv.org/pdf/2406.00632v1 | link |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-06-02 | OLIVE: Object Level In-Context Visual Embeddings | OLIVE:对象级上下文视觉嵌入 | Timothy Ossowski, Junjie Hu | http://arxiv.org/pdf/2406.00872v1 | null |
| 2024-06-02 | MGI: Multimodal Contrastive pre-training of Genomic and Medical Imaging | MGI:基因组和医学成像的多模态对比预训练 | Jiaying Zhou, Mingzhou Jiang, Junde Wu, Jiayuan Zhu, Ziyue Wang, Yueming Jin | http://arxiv.org/pdf/2406.00631v1 | null |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-06-02 | PruNeRF: Segment-Centric Dataset Pruning via 3D Spatial Consistency | PruNeRF:通过 3D 空间一致性进行以段为中心的数据集修剪 | Yeonsung Jung, Heecheol Yun, Joonhyung Park, Jin-Hwa Kim, Eunho Yang | http://arxiv.org/pdf/2406.00798v1 | null |
| 2024-06-02 | Representing Animatable Avatar via Factorized Neural Fields | 通过分解神经场表示可动画的头像 | Chunjin Song, Zhijie Wu, Bastian Wandt, Leonid Sigal, Helge Rhodin | http://arxiv.org/pdf/2406.00637v1 | null |
| 2024-06-04 | SuperGaussian: Repurposing Video Models for 3D Super Resolution | SuperGaussian:重新利用视频模型实现 3D 超分辨率 | Yuan Shen, Duygu Ceylan, Paul Guerrero, Zexiang Xu, Niloy J. Mitra, Shenlong Wang, Anna Frühstück | http://arxiv.org/pdf/2406.00609v2 | null |
| 2024-06-02 | Efficient Neural Light Fields (ENeLF) for Mobile Devices | 适用于移动设备的高效神经光场 (ENeLF) | Austin Peng | http://arxiv.org/pdf/2406.00598v1 | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-06-02 | Stealing Image-to-Image Translation Models With a Single Query | 通过单个查询窃取图像到图像的翻译模型 | Nurit Spingarn-Eliezer, Tomer Michaeli | http://arxiv.org/pdf/2406.00828v1 | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-06-02 | Global High Categorical Resolution Land Cover Mapping via Weak Supervision | 通过弱监督进行全球高分类分辨率土地覆盖测绘 | Xin-Yi Tong, Runmin Dong, Xiao Xiang Zhu | http://arxiv.org/pdf/2406.00891v1 | null |
| 2024-06-02 | Visual place recognition for aerial imagery: A survey | 航空影像的视觉位置识别:一项调查 | Ivan Moskalenko, Anastasiia Kornilova, Gonzalo Ferrer | http://arxiv.org/pdf/2406.00885v1 | link |
| 2024-06-02 | Eating Smart: Advancing Health Informatics with the Grounding DINO based Dietary Assistant App | 明智饮食:利用基于 Grounding DINO 的饮食助理应用程序推进健康信息学 | Abdelilah Nossair, Hamza El Housni | http://arxiv.org/pdf/2406.00848v1 | null |
| 2024-06-02 | Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection | 用于开放词汇 3D 对象检测的协作式新对象发现和框引导跨模态对齐 | Yang Cao, Yihan Zeng, Hang Xu, Dan Xu | http://arxiv.org/pdf/2406.00830v1 | link |
| 2024-06-02 | Towards Point Cloud Compression for Machine Perception: A Simple and Strong Baseline by Learning the Octree Depth Level Predictor | 面向机器感知的点云压缩:通过学习八叉树深度级别预测器建立简单而强大的基线 | Lei Liu, Zhihao Hu, Zhenghao Chen | http://arxiv.org/pdf/2406.00791v1 | null |
| 2024-06-02 | CCF: Cross Correcting Framework for Pedestrian Trajectory Prediction | CCF:行人轨迹预测的交叉校正框架 | Pranav Singh Chib, Pravendra Singh | http://arxiv.org/pdf/2406.00749v1 | null |
| 2024-06-02 | A Survey of Deep Learning Based Radar and Vision Fusion for 3D Object Detection in Autonomous Driving | 基于深度学习的自动驾驶 3D 物体检测雷达与视觉融合研究 | Di Wu, Feng Yang, Benlian Xu, Pan Liao, Bo Liu | http://arxiv.org/pdf/2406.00714v1 | null |
| 2024-06-02 | An Optimized Toolbox for Advanced Image Processing with Tsetlin Machine Composites | 使用 Tsetlin Machine Composites 进行高级图像处理的优化工具箱 | Ylva Grønningsæter, Halvor S. Smørvik, Ole-Christoffer Granmo | http://arxiv.org/pdf/2406.00704v1 | null |
| 2024-06-02 | Towards General Robustness Verification of MaxPool-based Convolutional Neural Networks via Tightening Linear Approximation | 通过紧致线性近似实现基于 MaxPool 的卷积神经网络的通用鲁棒性验证 | Yuan Xiao, Shiqing Ma, Juan Zhai, Chunrong Fang, Jinyuan Jia, Zhenyu Chen | http://arxiv.org/pdf/2406.00699v1 | link |
| 2024-06-02 | Bilinear-Convolutional Neural Network Using a Matrix Similarity-based Joint Loss Function for Skin Disease Classification | 使用基于矩阵相似性的联合损失函数的双线性卷积神经网络进行皮肤病分类 | Belal Ahmad, Mohd Usama, Tanvir Ahmad, Adnan Saeed, Shabnam Khatoon, Long Hu | http://arxiv.org/pdf/2406.00696v1 | null |
| 2024-06-02 | Improving Accuracy-robustness Trade-off via Pixel Reweighted Adversarial Training | 通过像素加权对抗训练提高准确率和稳健性的权衡 | Jiacheng Zhang, Feng Liu, Dawei Zhou, Jingfeng Zhang, Tongliang Liu | http://arxiv.org/pdf/2406.00685v1 | null |
| 2024-06-02 | W-Net: A Facial Feature-Guided Face Super-Resolution Network | W-Net:面部特征引导的人脸超分辨率网络 | Hao Liu, Yang Yang, Yunxia Liu | http://arxiv.org/pdf/2406.00676v1 | null |
| 2024-06-02 | Task-oriented Embedding Counts: Heuristic Clustering-driven Feature Fine-tuning for Whole Slide Image Classification | 面向任务的嵌入计数:启发式聚类驱动的全切片图像分类特征微调 | Xuenian Wang, Shanshan Shi, Renao Yan, Qiehe Sun, Lianghui Zhu, Tian Guan, Yonghong He | http://arxiv.org/pdf/2406.00672v1 | null |
| 2024-06-02 | Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation | Cascade-CLIP:用于零样本语义分割的级联视觉语言嵌入对齐 | Yunheng Li, ZhongYu Li, Quansheng Zeng, Qibin Hou, Ming-Ming Cheng | http://arxiv.org/pdf/2406.00670v1 | null |
| 2024-06-02 | SimSAM: Zero-shot Medical Image Segmentation via Simulated Interaction | SimSAM:通过模拟交互实现零样本医学图像分割 | Benjamin Towle, Xin Chen, Ke Zhou | http://arxiv.org/pdf/2406.00663v1 | link |
| 2024-06-02 | An Information Compensation Framework for Zero-Shot Skeleton-based Action Recognition | 基于零样本骨架的动作识别信息补偿框架 | Haojun Xu, Yan Gao, Jie Li, Xinbo Gao | http://arxiv.org/pdf/2406.00639v1 | null |
| 2024-06-02 | SAM-LAD: Segment Anything Model Meets Zero-Shot Logic Anomaly Detection | SAM-LAD:任何事物分割模型与零样本逻辑异常检测相结合 | Yun Peng, Xiao Lin, Nachuan Ma, Jiayuan Du, Chuangwei Liu, Chengju Liu, Qijun Chen | http://arxiv.org/pdf/2406.00625v1 | null |
| 2024-06-02 | Kolmogorov-Arnold Network for Satellite Image Classification in Remote Sensing | 柯尔莫哥洛夫-阿诺德网络用于遥感卫星图像分类 | Minjong Cheon | http://arxiv.org/pdf/2406.00600v1 | null |
| 2024-06-02 | Semi-supervised Video Semantic Segmentation Using Unreliable Pseudo Labels for PVUW2024 | 使用不可靠伪标签对 PVUW2024 进行半监督视频语义分割 | Biao Wu, Diankai Zhang, Si Gao, Chengjian Zheng, Shaoli Liu, Ning Wang | http://arxiv.org/pdf/2406.00587v1 | null |

GNN

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-06-02 | Explore Internal and External Similarity for Single Image Deraining with Graph Neural Networks | 使用图神经网络探索单幅图像去雨的内部和外部相似性 | Cong Wang, Wei Wang, Chengjin Yu, Jie Mu | http://arxiv.org/pdf/2406.00721v1 | link |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-06-02 | Exploiting Frequency Correlation for Hyperspectral Image Reconstruction | 利用频率相关性进行高光谱图像重建 | Muge Yan, Lizhi Wang, Lin Zhu, Hua Huang | http://arxiv.org/pdf/2406.00683v1 | null |
| 2024-06-02 | Correlation Matching Transformation Transformers for UHD Image Restoration | 用于超高清图像恢复的相关匹配变换器 | Cong Wang, Jinshan Pan, Wei Wang, Gang Fu, Siyuan Liang, Mengzhu Wang, Xiao-Ming Wu, Jun Liu | http://arxiv.org/pdf/2406.00629v1 | link |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-06-04 | Lay-A-Scene: Personalized 3D Object Arrangement Using Text-to-Image Priors | Lay-A-Scene:使用文本到图像先验进行个性化 3D 对象排列 | Ohad Rahamim, Hilit Segev, Idan Achituve, Yuval Atzmon, Yoni Kasten, Gal Chechik | http://arxiv.org/pdf/2406.00687v2 | null |
| 2024-06-02 | Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering | 利用物理先验知识理解视频问答中的 4D 动态场景组合 | Xingrui Wang, Wufei Ma, Angtian Wang, Shuo Chen, Adam Kortylewski, Alan Yuille | http://arxiv.org/pdf/2406.00622v1 | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-06-02 | Streaming quanta sensors for online, high-performance imaging and vision | 用于在线高性能成像和视觉的流式量子传感器 | Tianyi Zhang, Matthew Dutson, Vivek Boominathan, Mohit Gupta, Ashok Veeraraghavan | http://arxiv.org/pdf/2406.00859v1 | null |
| 2024-06-02 | End-to-End Hybrid Refractive-Diffractive Lens Design with Differentiable Ray-Wave Model | 采用可微分射线波模型的端到端混合折射衍射透镜设计 | Xinge Yang, Matheus Souza, Kunyi Wang, Praneeth Chakravarthula, Qiang Fu, Wolfgang Heidrich | http://arxiv.org/pdf/2406.00834v1 | null |
| 2024-06-02 | Developing an efficient corpus using Ensemble Data cleaning approach | 使用 Ensemble Data 清理方法开发高效语料库 | Md Taimur Ahad | http://arxiv.org/pdf/2406.00789v1 | null |
| 2024-06-02 | FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning | FuRL:视觉语言模型作为强化学习的模糊奖励 | Yuwei Fu, Haichao Zhang, Di Wu, Wei Xu, Benoit Boulet | http://arxiv.org/pdf/2406.00645v1 | link |
| 2024-06-02 | Robust Visual Tracking via Iterative Gradient Descent and Threshold Selection | 通过迭代梯度下降和阈值选择实现稳健的视觉跟踪 | Zhuang Qi, Junlin Zhang, Xin Qi | http://arxiv.org/pdf/2406.00589v1 | null |