[UPDATED!] 2024-08-16 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-08-16 | HistoGym: A Reinforcement Learning Environment for Histopathological Image Analysis | HistoGym:用于组织病理学图像分析的强化学习环境 | Zhi-Bo Liu, Xiaobo Pang, Jizhao Wang, Shuai Liu, Chen Li | http://arxiv.org/pdf/2408.08847v1 | link |
| 2024-08-16 | PFDiff: Training-free Acceleration of Diffusion Models through the Gradient Guidance of Past and Future | PFDiff:通过过去和未来的梯度引导实现无需训练的扩散模型加速 | Guangyi Wang, Yuren Cai, Lijiang Li, Wei Peng, Songzhi Su | http://arxiv.org/pdf/2408.08822v1 | null |
| 2024-08-16 | Comparative Analysis of Generative Models: Enhancing Image Synthesis with VAEs, GANs, and Stable Diffusion | 生成模型的比较分析:使用 VAE、GAN 和稳定扩散增强图像合成 | Sanchayan Vivekananthan | http://arxiv.org/pdf/2408.08751v1 | null |
| 2024-08-16 | Beyond the Hype: A dispassionate look at vision-language models in medical scenario | 超越炒作:冷静看待医疗场景中的视觉语言模型 | Yang Nan, Huichi Zhou, Xiaodan Xing, Guang Yang | http://arxiv.org/pdf/2408.08704v1 | null |
| 2024-08-16 | Modeling the Neonatal Brain Development Using Implicit Neural Representations | 使用隐性神经表征对新生儿大脑发育进行建模 | Florentin Bieder, Paul Friedrich, Hélène Corbaz, Alicia Durrer, Julia Wolleb, Philippe C. Cattin | http://arxiv.org/pdf/2408.08647v1 | null |
| 2024-08-16 | Generative Dataset Distillation Based on Diffusion Model | 基于扩散模型的生成数据集蒸馏 | Duo Su, Junjie Hou, Guang Li, Ren Togo, Rui Song, Takahiro Ogawa, Miki Haseyama | http://arxiv.org/pdf/2408.08610v1 | link |
| 2024-08-16 | A New Chinese Landscape Paintings Generation Model based on Stable Diffusion using DreamBooth | 基于 DreamBooth 的稳定扩散中国山水画生成新模型 | Yujia Gu, Xinyu Fang, Xueyuan Deng | http://arxiv.org/pdf/2408.08561v1 | null |
| 2024-08-16 | Visual-Friendly Concept Protection via Selective Adversarial Perturbations | 通过选择性对抗扰动实现视觉友好概念保护 | Xiaoyue Mi, Fan Tang, Juan Cao, Peng Li, Yang Liu | http://arxiv.org/pdf/2408.08518v1 | link |
| 2024-08-16 | Efficient Image-to-Image Diffusion Classifier for Adversarial Robustness | 具有对抗鲁棒性的高效图像到图像扩散分类器 | Hefei Mei, Minjing Dong, Chang Xu | http://arxiv.org/pdf/2408.08502v1 | link |
| 2024-08-16 | Achieving Complex Image Edits via Function Aggregation with Diffusion Models | 通过扩散模型的功能聚合实现复杂的图像编辑 | Mohammadreza Samadi, Fred X. Han, Mohammad Salameh, Hao Wu, Fengyu Sun, Chunhua Zhou, Di Niu | http://arxiv.org/pdf/2408.08495v1 | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-08-16 | xGen-MM (BLIP-3): A Family of Open Large Multimodal Models | xGen-MM (BLIP-3):开放式大型多模式模型系列 | Le Xue, Manli Shu, Anas Awadalla, Jun Wang, An Yan, Senthil Purushwalkam, Honglu Zhou, Viraj Prabhu, Yutong Dai, Michael S Ryoo, et al. | http://arxiv.org/pdf/2408.08872v1 | null |
| 2024-08-16 | RGBT Tracking via All-layer Multimodal Interactions with Progressive Fusion Mamba | 通过与 Progressive Fusion Mamba 的全层多模态交互实现 RGBT 跟踪 | Andong Lu, Wanyu Wang, Chenglong Li, Jin Tang, Bin Luo | http://arxiv.org/pdf/2408.08827v1 | null |
| 2024-08-16 | Decoupling Feature Representations of Ego and Other Modalities for Incomplete Multi-modal Brain Tumor Segmentation | 分离自我特征表示和其他模态特征表示以实现不完全多模态脑肿瘤分割 | Kaixiang Yang, Wenqi Shan, Xudong Li, Xuan Wang, Xikai Yang, Xi Wang, Pheng-Ann Heng, Qiang Li, Zhiwei Wang | http://arxiv.org/pdf/2408.08708v1 | link |
| 2024-08-16 | TsCA: On the Semantic Consistency Alignment via Conditional Transport for Compositional Zero-Shot Learning | TsCA:通过条件传输实现组合零样本学习的语义一致性对齐 | Miaoge Li, Jingcai Guo, Richard Yi Da Xu, Dongsheng Wang, Xiaofeng Cao, Song Guo | http://arxiv.org/pdf/2408.08703v1 | null |
| 2024-08-16 | A Survey on Benchmarks of Multimodal Large Language Models | 多模态大型语言模型基准调查 | Jian Li, Weiheng Lu | http://arxiv.org/pdf/2408.08632v1 | link |
| 2024-08-16 | Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs | 告诉编解码器什么值得压缩:使用 LMM 进行机器的语义解缠图像编码 | Jinming Liu, Yuntao Wei, Junyan Lin, Shengyang Zhao, Heming Sun, Zhibo Chen, Wenjun Zeng, Xin Jin | http://arxiv.org/pdf/2408.08575v1 | null |
| 2024-08-16 | Scaling up Multimodal Pre-training for Sign Language Understanding | 扩大手语理解的多模式预训练 | Wengang Zhou, Weichao Zhao, Hezhen Hu, Zecheng Li, Houqiang Li | http://arxiv.org/pdf/2408.08544v1 | null |
| 2024-08-16 | Focus on Focus: Focus-oriented Representation Learning and Multi-view Cross-modal Alignment for Glioma Grading | 聚焦焦点:面向焦点的表征学习和多视图跨模态对齐用于胶质瘤分级 | Li Pan, Yupei Zhang, Qiushi Yang, Tan Li, Xiaohan Xing, Maximus C. F. Yeung, Zhen Chen | http://arxiv.org/pdf/2408.08527v1 | link |
| 2024-08-16 | CoSEC: A Coaxial Stereo Event Camera Dataset for Autonomous Driving | CoSEC:用于自动驾驶的同轴立体事件摄像机数据集 | Shihan Peng, Hanyu Zhou, Hao Dong, Zhiwei Shi, Haoyue Liu, Yuxing Duan, Yi Chang, Luxin Yan | http://arxiv.org/pdf/2408.08500v1 | null |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-08-16 | VF-NeRF: Learning Neural Vector Fields for Indoor Scene Reconstruction | VF-NeRF:学习神经矢量场用于室内场景重建 | Albert Gassol Puigjaner, Edoardo Mello Rella, Erik Sandström, Ajad Chhatkuli, Luc Van Gool | http://arxiv.org/pdf/2408.08766v1 | link |

3DGS

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-08-16 | Correspondence-Guided SfM-Free 3D Gaussian Splatting for NVS | 用于 NVS 的对应引导无 SfM 3D 高斯溅射 | Wei Sun, Xiaosong Zhang, Fang Wan, Yanzhao Zhou, Yuan Li, Qixiang Ye, Jianbin Jiao | http://arxiv.org/pdf/2408.08723v1 | null |
| 2024-08-16 | GS-ID: Illumination Decomposition on Gaussian Splatting via Diffusion Prior and Parametric Light Source Optimization | GS-ID:通过扩散先验和参数光源优化实现高斯散射光照分解 | Kang Du, Zhihao Liang, Zeyu Wang | http://arxiv.org/pdf/2408.08524v1 | link |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-08-16 | SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation | SAM2-UNet:Segment Anything 2 为自然和医学图像分割打造强大的编码器 | Xinyu Xiong, Zihuang Wu, Shuangyi Tan, Wenxue Li, Feilong Tang, Ying Chen, Siying Li, Jie Ma, Guanbin Li | http://arxiv.org/pdf/2408.08870v1 | link |
| 2024-08-16 | DPA: Dual Prototypes Alignment for Unsupervised Adaptation of Vision-Language Models | DPA:视觉语言模型无监督适应的双原型对齐 | Eman Ali, Sathira Silva, Muhammad Haris Khan | http://arxiv.org/pdf/2408.08855v1 | null |
| 2024-08-16 | Retrieval-augmented Few-shot Medical Image Segmentation with Foundation Models | 使用基础模型进行检索增强的少样本医学图像分割 | Lin Zhao, Xiao Chen, Eric Z. Chen, Yikang Liu, Terrence Chen, Shanhui Sun | http://arxiv.org/pdf/2408.08813v1 | null |
| 2024-08-16 | A Disease-Specific Foundation Model Using Over 100K Fundus Images: Release and Validation for Abnormality and Multi-Disease Classification on Downstream Tasks | 使用超过 10 万张眼底图像的疾病特定基础模型:下游任务中的异常和多疾病分类的发布和验证 | Boa Jang, Youngbin Ahn, Eun Kyung Choe, Chang Ki Yoon, Hyuk Jin Choi, Young-Gon Kim | http://arxiv.org/pdf/2408.08790v1 | null |
| 2024-08-16 | Towards Physical World Backdoor Attacks against Skeleton Action Recognition | 针对骨骼动作识别的物理世界后门攻击 | Qichen Zheng, Yi Yu, Siyuan Yang, Jun Liu, Kwok-Yan Lam, Alex Kot | http://arxiv.org/pdf/2408.08671v1 | null |
| 2024-08-16 | Extracting polygonal footprints in off-nadir images with Segment Anything Model | 使用 Segment Anything 模型提取非地面图像中的多边形足迹 | Kai Li, Jingbo Chen, Yupeng Deng, Yu Meng, Diyou Liu, Junxian Ma, Chenhao Wang | http://arxiv.org/pdf/2408.08645v1 | null |
| 2024-08-16 | SketchRef: A Benchmark Dataset and Evaluation Metrics for Automated Sketch Synthesis | SketchRef:自动草图合成的基准数据集和评估指标 | Xingyue Lin, Xingjian Hu, Shuai Peng, Jianhua Zhu, Liangcai Gao | http://arxiv.org/pdf/2408.08623v1 | null |
| 2024-08-16 | MM-UNet: A Mixed MLP Architecture for Improved Ophthalmic Image Segmentation | MM-UNet:一种用于改进眼科图像分割的混合 MLP 架构 | Zunjie Xiao, Xiaoqing Zhang, Risa Higashita, Jiang Liu | http://arxiv.org/pdf/2408.08600v1 | null |
| 2024-08-16 | Zero-Shot Dual-Path Integration Framework for Open-Vocabulary 3D Instance Segmentation | 用于开放词汇 3D 实例分割的零样本双路径集成框架 | Tri Ton, Ji Woo Hong, SooHwan Eom, Jun Yeop Shim, Junyeong Kim, Chang D. Yoo | http://arxiv.org/pdf/2408.08591v1 | null |
| 2024-08-16 | TAMER: Tree-Aware Transformer for Handwritten Mathematical Expression Recognition | TAMER:用于手写数学表达式识别的树感知变换器 | Jianhua Zhu, Wenqi Zhao, Yu Li, Xingjian Hu, Liangcai Gao | http://arxiv.org/pdf/2408.08578v1 | link |
| 2024-08-16 | Tuning a SAM-Based Model with Multi-Cognitive Visual Adapter to Remote Sensing Instance Segmentation | 使用多认知视觉适配器调整基于 SAM 的模型以进行遥感实例分割 | Linghao Zheng, Xinyang Pu, Feng Xu | http://arxiv.org/pdf/2408.08576v1 | null |
| 2024-08-16 | A training regime to learn unified representations from complementary breast imaging modalities | 一种从互补乳腺成像模式中学习统一表征的训练方案 | Umang Sharma, Jungkyu Park, Laura Heacock, Sumit Chopra, Krzysztof Geras | http://arxiv.org/pdf/2408.08560v1 | null |
| 2024-08-16 | Detection and tracking of MAVs using a LiDAR with rosette scanning pattern | 使用带有玫瑰花扫描模式的 LiDAR 检测和跟踪 MAV | Sándor Gazdag, Tom Möller, Tamás Filep, Anita Keszler, András L. Majdik | http://arxiv.org/pdf/2408.08555v1 | null |
| 2024-08-16 | Language-Driven Interactive Shadow Detection | 语言驱动的交互式阴影检测 | Hongqiu Wang, Wei Wang, Haipeng Zhou, Huihui Xu, Shaozhi Wu, Lei Zhu | http://arxiv.org/pdf/2408.08543v1 | link |
| 2024-08-16 | DFT-Based Adversarial Attack Detection in MRI Brain Imaging: Enhancing Diagnostic Accuracy in Alzheimer's Case Studies | 基于 DFT 的 MRI 脑成像对抗性攻击检测:提高阿尔茨海默病病例研究中的诊断准确性 | Mohammad Hossein Najafi, Mohammad Morsali, Mohammadmahdi Vahediahmar, Saeed Bagheri Shouraki | http://arxiv.org/pdf/2408.08489v1 | null |
| 2024-08-16 | TEXTOC: Text-driven Object-Centric Style Transfer | TEXTOC:文本驱动的以对象为中心的风格转换 | Jihun Park, Jongmin Gim, Kyoungmin Lee, Seunghun Lee, Sunghoon Im | http://arxiv.org/pdf/2408.08461v1 | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-08-16 | LLM-PCGC: Large Language Model-based Point Cloud Geometry Compression | LLM-PCGC:基于大型语言模型的点云几何压缩 | Yuqi Ye, Wei Gao | http://arxiv.org/pdf/2408.08682v1 | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-08-16 | PriorMapNet: Enhancing Online Vectorized HD Map Construction with Priors | PriorMapNet:利用 Priors 增强在线矢量化高清地图构建 | Rongxuan Wang, Xin Lu, Xiaoyang Liu, Xiaoyi Zou, Tongyi Cao, Ying Li | http://arxiv.org/pdf/2408.08802v1 | null |
| 2024-08-16 | PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders | PCP-MAE:学习预测点掩模自动编码器的中心 | Xiangdong Zhang, Shaofeng Zhang, Junchi Yan | http://arxiv.org/pdf/2408.08753v1 | null |
| 2024-08-16 | Task-Aware Dynamic Transformer for Efficient Arbitrary-Scale Image Super-Resolution | 任务感知动态变换器,用于高效任意尺度图像超分辨率 | Tianyi Xu, Yiji Zhou, Xiaotao Hu, Kai Zhang, Anran Zhang, Xingye Qiu, Jun Xu | http://arxiv.org/pdf/2408.08736v1 | null |
| 2024-08-16 | HyCoT: Hyperspectral Compression Transformer with an Efficient Training Strategy | HyCoT:具有高效训练策略的高光谱压缩变换器 | Martin Hermann Paul Fuchs, Behnood Rasti, Begüm Demir | http://arxiv.org/pdf/2408.08700v1 | null |
| 2024-08-16 | Adaptive Layer Selection for Efficient Vision Transformer Fine-Tuning | 自适应层选择,实现高效的视觉变换器微调 | Alessio Devoto, Federico Alvetreti, Jary Pomponi, Paolo Di Lorenzo, Pasquale Minervini, Simone Scardapane | http://arxiv.org/pdf/2408.08670v1 | null |
| 2024-08-16 | Learning A Low-Level Vision Generalist via Visual Task Prompt | 通过视觉任务提示学习低级视觉通才 | Xiangyu Chen, Yihao Liu, Yuandong Pu, Wenlong Zhang, Jiantao Zhou, Yu Qiao, Chao Dong | http://arxiv.org/pdf/2408.08601v1 | link |
| 2024-08-16 | EraW-Net: Enhance-Refine-Align W-Net for Scene-Associated Driver Attention Estimation | EraW-Net:增强-细化-对齐 W-Net,用于场景相关驾驶员注意力估计 | Jun Zhou, Chunsheng Liu, Faliang Chang, Wenqian Wang, Penghui Hao, Yiming Huang, Zhiqiang Yang | http://arxiv.org/pdf/2408.08570v1 | null |
| 2024-08-16 | Unsupervised Non-Rigid Point Cloud Matching through Large Vision Models | 通过大型视觉模型进行无监督非刚性点云匹配 | Zhangquan Chen, Puhua Jiang, Ruqi Huang | http://arxiv.org/pdf/2408.08568v1 | null |
| 2024-08-16 | S$^3$Attention: Improving Long Sequence Attention with Smoothed Skeleton Sketching | S$^3$Attention:通过平滑骨架草图提高长序列注意力 | Xue Wang, Tian Zhou, Jianqing Zhu, Jialin Liu, Kun Yuan, Tao Yao, Wotao Yin, Rong Jin, HanQin Cai | http://arxiv.org/pdf/2408.08567v1 | null |
| 2024-08-16 | Privacy-Preserving Vision Transformer Using Images Encrypted with Restricted Random Permutation Matrices | 使用受限随机置换矩阵加密图像的隐私保护视觉转换器 | Kouki Horio, Kiyoshi Nishikawa, Hitoshi Kiya | http://arxiv.org/pdf/2408.08529v1 | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-08-16 | Multi-task Learning Approach for Intracranial Hemorrhage Prognosis | 颅内出血预后的多任务学习方法 | Miriam Cobo, Amaia Pérez del Barrio, Pablo Menéndez Fernández-Miranda, Pablo Sanz Bellón, Lara Lloret Iglesias, Wilson Silva | http://arxiv.org/pdf/2408.08784v1 | link |
| 2024-08-16 | QMambaBSR: Burst Image Super-Resolution with Query State Space Model | QMambaBSR:使用查询状态空间模型实现突发图像超分辨率 | Xin Di, Long Peng, Peizhe Xia, Wenbo Li, Renjing Pei, Yang Cao, Yang Wang, Zheng-Jun Zha | http://arxiv.org/pdf/2408.08665v1 | null |
| 2024-08-16 | Reference-free Axial Super-resolution of 3D Microscopy Images using Implicit Neural Representation with a 2D Diffusion Prior | 使用隐式神经表征和二维扩散先验实现 3D 显微镜图像的无参考轴向超分辨率 | Kyungryun Lee, Won-Ki Jeong | http://arxiv.org/pdf/2408.08616v1 | link |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-08-16 | Assessing Generalization Capabilities of Malaria Diagnostic Models from Thin Blood Smears | 通过薄血涂片评估疟疾诊断模型的泛化能力 | Louise Guillon, Soheib Biga, Axel Puyo, Grégoire Pasquier, Valentin Foucher, Yendoubé E. Kantchire, Stéphane E. Sossou, Ameyo M. Dorkenoo, Laurent Bonnardot, Marc Thellier, et al. | http://arxiv.org/pdf/2408.08792v1 | null |
| 2024-08-16 | MicroSSIM: Improved Structural Similarity for Comparing Microscopy Data | MicroSSIM:改进的显微镜数据结构相似性比较方法 | Ashesh Ashesh, Joran Deschamps, Florian Jug | http://arxiv.org/pdf/2408.08747v1 | link |
| 2024-08-16 | Historical Printed Ornaments: Dataset and Tasks | 历史印刷装饰品:数据集和任务 | Sayan Kumar Chaki, Zeynep Sonat Baltaci, Elliot Vincent, Remi Emonet, Fabienne Vial-Bonacci, Christelle Bahier-Porte, Mathieu Aubry, Thierry Fournel | http://arxiv.org/pdf/2408.08633v1 | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-08-16 | Backward-Compatible Aligned Representations via an Orthogonal Transformation Layer | 通过正交变换层实现向后兼容的对齐表示 | Simone Ricci, Niccolò Biondi, Federico Pernici, Alberto Del Bimbo | http://arxiv.org/pdf/2408.08793v1 | null |
| 2024-08-16 | A lifted Bregman strategy for training unfolded proximal neural network Gaussian denoisers | 一种用于训练展开近端神经网络高斯降噪器的提升 Bregman 策略 | Xiaoyu Wang, Martin Benning, Audrey Repetti | http://arxiv.org/pdf/2408.08742v1 | null |
| 2024-08-16 | Bi-Directional Deep Contextual Video Compression | 双向深度上下文视频压缩 | Xihua Sheng, Li Li, Dong Liu, Shiqi Wang | http://arxiv.org/pdf/2408.08604v1 | null |
| 2024-08-16 | S-RAF: A Simulation-Based Robustness Assessment Framework for Responsible Autonomous Driving | S-RAF:基于模拟的负责任自动驾驶稳健性评估框架 | Daniel Omeiza, Pratik Somaiya, Jo-Ann Pattinson, Carolyn Ten-Holter, Jack Stilgoe, Marina Jirotka, Lars Kunze | http://arxiv.org/pdf/2408.08584v1 | link |