
2024-12-28.md


[UPDATED!] 2024-12-28 (Update Time)

3D Perception

| Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
| --- | --- | --- | --- | --- | --- |
| 2024-12-28 | Election of Collaborators via Reinforcement Learning for Federated Brain Tumor Segmentation | 基于强化学习的联邦脑肿瘤分割协作者选择 | Muhammad Irfan Khan, Elina Kontio, Suleiman A. Khan, Mojtaba Jafaritadi | http://arxiv.org/pdf/2412.20253v1 | None |
| 2024-12-28 | Towards Real-Time 2D Mapping: Harnessing Drones, AI, and Computer Vision for Advanced Insights | 迈向实时二维制图:利用无人机、人工智能和计算机视觉实现高级洞察 | Bharath Kumar Agnur | http://arxiv.org/pdf/2412.20210v1 | None |
| 2024-12-28 | Multi-Modality Driven LoRA for Adverse Condition Depth Estimation | 多模态驱动LoRA的恶劣条件深度估计 | Guanglei Yang, Rui Tian, Yongqiang Zhang, Zhun Zhong, Yongqiang Li, Wangmeng Zuo | http://arxiv.org/pdf/2412.20162v1 | None |
| 2024-12-28 | Enhancing Marine Debris Acoustic Monitoring by Optical Flow-Based Motion Vector Analysis | 基于光流运动矢量分析增强海洋垃圾声学监测 | Xiaoteng Zhou, Katsunori Mizuno | http://arxiv.org/pdf/2412.20085v1 | None |
| 2024-12-28 | MambaVO: Deep Visual Odometry Based on Sequential Matching Refinement and Training Smoothing | MambaVO:基于序列匹配优化和训练平滑的深度视觉里程计 | Shuo Wang, Wanting Li, Yongcai Wang, Zhaoxin Fan, Zhe Huang, Xudong Cai, Jian Zhao, Deying Li | http://arxiv.org/pdf/2412.20082v1 | None |
| 2024-12-28 | GSplatLoc: Ultra-Precise Camera Localization via 3D Gaussian Splatting | GSplatLoc:基于3D高斯散布的超精确相机定位 | Atticus J. Zeller | http://arxiv.org/pdf/2412.20056v1 | None |
| 2024-12-28 | DepthMamba with Adaptive Fusion | 深度Mamba自适应融合 | Zelin Meng, Zhichen Wang | http://arxiv.org/pdf/2412.19964v1 | None |

NeRF

| Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
| --- | --- | --- | --- | --- | --- |
| 2024-12-28 | DEGSTalk: Decomposed Per-Embedding Gaussian Fields for Hair-Preserving Talking Face Synthesis | DEGSTalk:分解的每嵌入高斯场用于保留头发的说话人脸合成 | Kaijun Deng, Dezhi Zheng, Jindong Xie, Jinbao Wang, Weicheng Xie, Linlin Shen, Siyang Song | http://arxiv.org/pdf/2412.20148v1 | https://github.com/CVI-SZU/DEGSTalk |
| 2024-12-28 | Canonical Factors for Hybrid Neural Fields | 混合神经场的规范因子 | Brent Yi, Weijia Zeng, Sam Buchanan, Yi Ma | http://arxiv.org/pdf/2308.15461v2 | None |
| 2024-12-28 | Comprehensive Review of EEG-to-Output Research: Decoding Neural Signals into Images, Videos, and Audio | 全面回顾脑电图到输出的研究:将神经信号解码为图像、视频和音频 | Yashvir Sabharwal, Balaji Rama | http://arxiv.org/pdf/2412.19999v1 | None |

Face Recognition/Processing

| Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
| --- | --- | --- | --- | --- | --- |
| 2024-12-28 | FSFM: A Generalizable Face Security Foundation Model via Self-Supervised Facial Representation Learning | FSFM:通过自监督面部表示学习构建的可泛化人脸安全基础模型 | Gaojian Wang, Feng Lin, Tong Wu, Zhenguang Liu, Zhongjie Ba, Kui Ren | http://arxiv.org/pdf/2412.12032v2 | None |

Action Recognition

| Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
| --- | --- | --- | --- | --- | --- |
| 2024-12-28 | Transformer-Based Contrastive Meta-Learning For Low-Resource Generalizable Activity Recognition | 基于Transformer的对比元学习用于低资源泛化活动识别 | Junyao Wang, Mohammad Abdullah Al Faruque | http://arxiv.org/pdf/2412.20290v1 | None |
| 2024-12-28 | SyncDiff: Synchronized Motion Diffusion for Multi-Body Human-Object Interaction Synthesis | 同步运动扩散:多人体与人-物体交互合成的同步运动 | Wenkun He, Yun Liu, Ruitao Liu, Li Yi | http://arxiv.org/pdf/2412.20104v1 | None |
| 2024-12-28 | VELoRA: A Low-Rank Adaptation Approach for Efficient RGB-Event based Recognition | VELoRA:一种用于高效RGB-事件识别的低秩自适应方法 | Lan Chen, Haoxiang Yang, Pengpeng Shao, Haoyu Song, Xiao Wang, Zhicheng Zhao, Yaowei Wang, Yonghong Tian | http://arxiv.org/pdf/2412.20064v1 | https://github.com/Event-AHU/VELoRA |

Image Restoration

| Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
| --- | --- | --- | --- | --- | --- |
| 2024-12-28 | UniRestorer: Universal Image Restoration via Adaptively Estimating Image Degradation at Proper Granularity | 通用图像修复:通过自适应估计适当粒度下的图像退化 | Jingbo Lin, Zhilu Zhang, Wenbo Li, Renjing Pei, Hang Xu, Hongzhi Zhang, Wangmeng Zuo | http://arxiv.org/pdf/2412.20157v1 | https://github.com/mrluin/UniRestorer |
| 2024-12-28 | MaIR: A Locality- and Continuity-Preserving Mamba for Image Restoration | MaIR:一种保留局部性和连续性的Mamba图像恢复方法 | Boyun Li, Haiyu Zhao, Wenxin Wang, Peng Hu, Yuanbiao Gou, Xi Peng | http://arxiv.org/pdf/2412.20066v1 | None |

Image Captioning

| Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
| --- | --- | --- | --- | --- | --- |
| 2024-12-28 | Altogether: Image Captioning via Re-aligning Alt-text | 整体:通过重新对齐替代文本进行图像描述 | Hu Xu, Po-Yao Huang, Xiaoqing Ellen Tan, Ching-Feng Yeh, Jacob Kahn, Christine Jou, Gargi Ghosh, Omer Levy | http://arxiv.org/pdf/2410.17251v3 | None |

Image Generation/Synthesis

| Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
| --- | --- | --- | --- | --- | --- |
| 2024-12-28 | Demystifying CLIP Data | 揭开CLIP数据的神秘面纱 | Hu Xu, Saining Xie, Xiaoqing Ellen Tan, Po-Yao Huang, Russell Howes, Vasu Sharma, Shang-Wen Li, Gargi Ghosh | http://arxiv.org/pdf/2309.16671v5 | https://github.com/facebookresearch/MetaCLIP |
| 2024-12-28 | ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving | 一致ID:多模态细粒度身份保持的肖像生成 | Jiehui Huang, Xiao Dong, Wenhui Song, Zheng Chong, Zhenchao Tang, Jun Zhou, Yuhao Cheng, Long Chen | http://arxiv.org/pdf/2404.16771v2 | None |
| 2024-12-28 | StyleAutoEncoder for manipulating image attributes using pre-trained StyleGAN | 风格自动编码器:利用预训练的StyleGAN操纵图像属性 | Andrzej Bedychaj, Jacek Tabor, Marek Śmieja | http://arxiv.org/pdf/2412.20164v1 | None |
| 2024-12-28 | ST$^3$: Accelerating Multimodal Large Language Model by Spatial-Temporal Visual Token Trimming | ST$^3$:通过时空视觉标记修剪加速多模态大型语言模型 | Jiedong Zhuang, Lu Lu, Ming Dai, Rui Hu, Jian Chen, Qiang Liu, Haoji Hu | http://arxiv.org/pdf/2412.20105v1 | None |
| 2024-12-28 | AdaDiff: Adaptive Step Selection for Fast Diffusion Models | AdaDiff:快速扩散模型的自适应步长选择 | Hui Zhang, Zuxuan Wu, Zhen Xing, Jie Shao, Yu-Gang Jiang | http://arxiv.org/pdf/2311.14768v2 | None |
| 2024-12-28 | Enhancing Diffusion Models for Inverse Problems with Covariance-Aware Posterior Sampling | 增强扩散模型在逆问题中的协方差感知后验采样 | Shayan Mohajer Hamidi, En-Hui Yang | http://arxiv.org/pdf/2412.20045v1 | None |
| 2024-12-28 | VersaGen: Unleashing Versatile Visual Control for Text-to-Image Synthesis | VersaGen:释放多功能的视觉控制以实现文本到图像的合成 | Zhipeng Chen, Lan Yang, Yonggang Qi, Honggang Zhang, Kaiyue Pang, Ke Li, Yi-Zhe Song | http://arxiv.org/pdf/2412.11594v3 | None |
| 2024-12-28 | An Ordinary Differential Equation Sampler with Stochastic Start for Diffusion Bridge Models | 具有随机起始的扩散桥模型常微分方程采样器 | Yuang Wang, Pengfei Jin, Li Zhang, Quanzheng Li, Zhiqiang Chen, Dufan Wu | http://arxiv.org/pdf/2412.19992v1 | None |
| 2024-12-28 | ChatGarment: Garment Estimation, Generation and Editing via Large Language Models | ChatGarment:通过大型语言模型进行服装估计、生成和编辑 | Siyuan Bian, Chenghao Xu, Yuliang Xiu, Artur Grigorev, Zhen Liu, Cewu Lu, Michael J. Black, Yao Feng | http://arxiv.org/pdf/2412.17811v2 | None |
| 2024-12-28 | DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation | DiMSUM:扩散曼巴——一种可扩展且统一的图像生成空间-频率方法 | Hao Phung, Quan Dao, Trung Dao, Hoang Phan, Dimitris Metaxas, Anh Tran | http://arxiv.org/pdf/2411.04168v2 | https://github.com/VinAIResearch/DiMSUM.git |

Image Editing/Processing

| Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
| --- | --- | --- | --- | --- | --- |
| 2024-12-28 | Face-StyleSpeech: Enhancing Zero-shot Speech Synthesis from Face Images with Improved Face-to-Speech Mapping | 面部风格语音:通过改进人脸到语音映射增强从人脸图像的零样本语音合成 | Minki Kang, Wooseok Han, Eunho Yang | http://arxiv.org/pdf/2311.05844v2 | None |
| 2024-12-28 | Cross-Modal Mapping: Eliminating the Modality Gap for Few-Shot Image Classification | 跨模态映射:消除小样本图像分类的模态差距 | Xi Yang, Pai Peng, Wulin Xie, Xiaohuan Lu, Jie Wen | http://arxiv.org/pdf/2412.20110v1 | None |
| 2024-12-28 | SwinIA: Self-Supervised Blind-Spot Image Denoising without Convolutions | SwinIA:无需卷积的自监督盲点图像去噪 | Mikhail Papkov, Pavel Chizhov, Leopold Parts | http://arxiv.org/pdf/2305.05651v2 | None |
| 2024-12-28 | ERUP-YOLO: Enhancing Object Detection Robustness for Adverse Weather Condition by Unified Image-Adaptive Processing | ERUP-YOLO:通过统一图像自适应处理增强恶劣天气条件下目标检测鲁棒性 | Yuka Ogino, Yuho Shoji, Takahiro Toizumi, Atsushi Ito | http://arxiv.org/pdf/2411.02799v4 | None |
| 2024-12-28 | On the Compositional Generalization of Multimodal LLMs for Medical Imaging | 关于多模态LLMs在医学影像中的组合泛化 | Zhenyang Cai, Junying Chen, Rongsheng Wang, Weihong Wang, Yonglin Deng, Dingjie Song, Yize Chen, Zixu Zhang | http://arxiv.org/pdf/2412.20070v1 | https://github.com/FreedomIntelligence/Med-MAT |
| 2024-12-28 | MADiff: Text-Guided Fashion Image Editing with Mask Prediction and Attention-Enhanced Diffusion | MADiff:基于掩码预测和注意力增强扩散的文本引导时尚图像编辑 | Zechao Zhan, Dehong Gao, Jinxia Zhang, Jiale Huang, Yang Hu, Xin Wang | http://arxiv.org/pdf/2412.20062v1 | None |
| 2024-12-28 | Unsupervised Cross-Domain Image Retrieval via Prototypical Optimal Transport | 无监督跨域图像检索通过原型最优传输 | Bin Li, Ye Shi, Qian Yu, Jingya Wang | http://arxiv.org/pdf/2402.18411v4 | None |
| 2024-12-28 | A Robust Adversarial Ensemble with Causal (Feature Interaction) Interpretations for Image Classification | 鲁棒对抗集成图像分类及其因果(特征交互)解释 | Chunheng Zhao, Pierluigi Pisu, Gurcan Comert, Negash Begashaw, Varghese Vaidyan, Nina Christine Hubig | http://arxiv.org/pdf/2412.20025v1 | None |
| 2024-12-28 | Uncertainty Quantified Deep Learning and Regression Analysis Framework for Image Segmentation of Skin Cancer Lesions | 皮肤癌病变图像分割的不确定性量化深度学习和回归分析框架 | Elhoucine Elfatimi, Pratik Shah | http://arxiv.org/pdf/2412.20007v1 | None |
| 2024-12-28 | SegKAN: High-Resolution Medical Image Segmentation with Long-Distance Dependencies | SegKAN:具有长距离依赖关系的高分辨率医学图像分割 | Shengbo Tan, Rundong Xue, Shipeng Luo, Zeyu Zhang, Xinran Wang, Lei Zhang, Daji Ergu, Zhang Yi | http://arxiv.org/pdf/2412.19990v1 | https://github.com/goblin327/SegKAN |

Instance Segmentation

| Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
| --- | --- | --- | --- | --- | --- |
| 2024-12-28 | Geo-ConvGRU: Geographically Masked Convolutional Gated Recurrent Unit for Bird-Eye View Segmentation | 地理掩码卷积门控循环单元在鸟瞰图分割中的应用 | Guanglei Yang, Yongqiang Zhang, Wanlong Li, Yu Tang, Weize Shang, Feng Wen, Hongbo Zhang, Mingli Ding | http://arxiv.org/pdf/2412.20171v1 | None |

Few-Shot Learning

| Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
| --- | --- | --- | --- | --- | --- |
| 2024-12-28 | TAVP: Task-Adaptive Visual Prompt for Cross-domain Few-shot Segmentation | 跨域小样本分割的任务自适应视觉提示 | Jiaqi Yang, Yaning Zhang, Jingxi Hu, Xiangjian He, Linlin Shen, Guoping Qiu | http://arxiv.org/pdf/2409.05393v2 | None |
| 2024-12-28 | Maintain Plasticity in Long-timescale Continual Test-time Adaptation | 保持长时标持续测试时自适应的塑性 | Yanshuo Wang, Xuesong Li, Jinguang Tong, Jie Hong, Jun Lan, Weiqiang Wang, Huijia Zhu, Haoxing Chen | http://arxiv.org/pdf/2412.20034v1 | None |

Model Compression

| Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
| --- | --- | --- | --- | --- | --- |
| 2024-12-28 | Few-shot Algorithm Assurance | 少量样本算法保证 | Dang Nguyen, Sunil Gupta | http://arxiv.org/pdf/2412.20275v1 | None |
| 2024-12-28 | An archaeological Catalog Collection Method Based on Large Vision-Language Models | 基于大型视觉-语言模型的考古目录集合方法 | Honglin Pang, Yi Chang, Tianjing Duan, Xi Yang | http://arxiv.org/pdf/2412.20088v1 | None |

Object Detection

| Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
| --- | --- | --- | --- | --- | --- |
| 2024-12-28 | Enhancing Transfer Learning for Medical Image Classification with SMOTE: A Comparative Study | 基于SMOTE增强医学图像分类的迁移学习:一项比较研究 | Md. Zehan Alam, Tonmoy Roy, H. M. Nahid Kawsar, Iffat Rimi | http://arxiv.org/pdf/2412.20235v1 | None |
| 2024-12-28 | Plastic Waste Classification Using Deep Learning: Insights from the WaDaBa Dataset | 塑料垃圾分类利用深度学习:WaDaBa数据集的见解 | Suman Kunwar, Banji Raphael Owabumoye, Abayomi Simeon Alade | http://arxiv.org/pdf/2412.20232v1 | None |
| 2024-12-28 | First qualitative observations on deep learning vision model YOLO and DETR for automated driving in Austria | 奥地利自动驾驶中的深度学习视觉模型YOLO和DETR的初步定性观察 | Stefan Schoder | http://arxiv.org/pdf/2312.12314v2 | None |
| 2024-12-28 | Shifting Spotlight for Co-supervision: A Simple yet Efficient Single-branch Network to See Through Camouflage | 移位焦点协同监督:一种简单高效的单一分支网络穿透伪装 | Yang Hu, Jinxia Zhang, Kaihua Zhang, Yin Yuan, Jiale Huang, Zechao Zhan, Xing Wang | http://arxiv.org/pdf/2404.08936v2 | None |
| 2024-12-28 | Mining Platoon Patterns from Traffic Videos | 从交通视频中挖掘车队模式 | Yijun Bei, Teng Ma, Dongxiang Zhang, Sai Wu, Kian-Lee Tan, Gang Chen | http://arxiv.org/pdf/2412.20177v1 | None |
| 2024-12-28 | On dataset transferability in medical image classification | 医学图像分类中的数据集迁移性研究 | Dovile Juodelyte, Enzo Ferrante, Yucheng Lu, Prabhant Singh, Joaquin Vanschoren, Veronika Cheplygina | http://arxiv.org/pdf/2412.20172v1 | https://github.com/DovileDo/transferability-in-medical-imaging |
| 2024-12-28 | CHASE: Learning Convex Hull Adaptive Shift for Skeleton-based Multi-Entity Action Recognition | CHASE:基于骨架的多实体动作识别中的凸包自适应偏移学习 | Yuhang Wen, Mengyuan Liu, Songtao Wu, Beichen Ding | http://arxiv.org/pdf/2410.07153v2 | https://github.com/Necolizer/CHASE |
| 2024-12-28 | Conformal Risk Control for Pulmonary Nodule Detection | 肺结节检测中的共形风险控制 | Roel Hulsman, Valentin Comte, Lorenzo Bertolini, Tobias Wiesenthal, Antonio Puertas Gallardo, Mario Ceresa | http://arxiv.org/pdf/2412.20167v1 | None |
| 2024-12-28 | A Cascaded Dilated Convolution Approach for Mpox Lesion Classification | 基于级联扩张卷积的猴痘病变分类方法 | Ayush Deshmukh | http://arxiv.org/pdf/2412.10106v2 | None |
| 2024-12-28 | Distilled Transformers with Locally Enhanced Global Representations for Face Forgery Detection | 具有局部增强全局表示的蒸馏Transformer用于人脸伪造检测 | Yaning Zhang, Qiufu Li, Zitong Yu, Linlin Shen | http://arxiv.org/pdf/2412.20156v1 | None |
| 2024-12-28 | Self-Calibrated Dual Contrasting for Annotation-Efficient Bacteria Raman Spectroscopy Clustering and Classification | 自校准双对比法在细菌拉曼光谱聚类和分类中的高效标注 | Haiming Yao, Wei Luo, Tao Zhou, Ang Gao, Xue Wang | http://arxiv.org/pdf/2412.20060v1 | None |
| 2024-12-28 | SimLTD: Simple Supervised and Semi-Supervised Long-Tailed Object Detection | SimLTD:简单监督和半监督长尾目标检测 | Phi Vu Tran | http://arxiv.org/pdf/2412.20047v1 | None |
| 2024-12-28 | DAVE: Diverse Atomic Visual Elements Dataset with High Representation of Vulnerable Road Users in Complex and Unpredictable Environments | DAVE:复杂和不可预测环境中具有高脆弱道路使用者代表性的多样化原子视觉元素数据集 | Xijun Wang, Pedro Sandoval-Segura, Chengyuan Zhang, Junyun Huang, Tianrui Guan, Ruiqi Xian, Fuxiao Liu, Rohan Chandra | http://arxiv.org/pdf/2412.20042v1 | None |
| 2024-12-28 | Searching from Area to Point: A Hierarchical Framework for Semantic-Geometric Combined Feature Matching | 从区域到点:语义-几何结合特征的层次化框架 | Yesheng Zhang, Xu Zhao | http://arxiv.org/pdf/2305.00194v6 | None |
| 2024-12-28 | Adversarial Robustness for Deep Learning-based Wildfire Detection Models | 基于深度学习的野火检测模型的对抗鲁棒性 | Ryo Ide, Lei Yang | http://arxiv.org/pdf/2412.20006v1 | None |
| 2024-12-28 | DFME: A New Benchmark for Dynamic Facial Micro-expression Recognition | 动态面部微表情识别新基准:DFME | Sirui Zhao, Huaying Tang, Xinglong Mao, Shifeng Liu, Yiming Zhang, Hao Wang, Tong Xu, Enhong Chen | http://arxiv.org/pdf/2301.00985v2 | None |

Vision-Language Understanding

| Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
| --- | --- | --- | --- | --- | --- |
| 2024-12-28 | Towards Visual Grounding: A Survey | 视觉定位:综述 | Linhui Xiao, Xiaoshan Yang, Xiangyuan Lan, Yaowei Wang, Changsheng Xu | http://arxiv.org/pdf/2412.20206v1 | https://github.com/linhuixiao/Awesome-Visual-Grounding |
| 2024-12-28 | B-AVIBench: Towards Evaluating the Robustness of Large Vision-Language Model on Black-box Adversarial Visual-Instructions | B-AVIBench:迈向评估大型视觉-语言模型在黑盒对抗视觉指令上的鲁棒性 | Hao Zhang, Wenqi Shao, Hong Liu, Yongqiang Ma, Ping Luo, Yu Qiao, Nanning Zheng, Kaipeng Zhang | http://arxiv.org/pdf/2403.09346v2 | https://github.com/zhanghao5201/B-AVIBench |
| 2024-12-28 | AI-based Wearable Vision Assistance System for the Visually Impaired: Integrating Real-Time Object Recognition and Contextual Understanding Using Large Vision-Language Models | 基于AI的视障人士可穿戴视觉辅助系统:利用大型视觉-语言模型整合实时物体识别和上下文理解 | Mirza Samad Ahmed Baig, Syeda Anshrah Gillani, Shahid Munir Shah, Mahmoud Aljawarneh, Abdul Akbar Khan, Muhammad Hamzah Siddiqui | http://arxiv.org/pdf/2412.20059v1 | None |
| 2024-12-28 | FashionFAE: Fine-grained Attributes Enhanced Fashion Vision-Language Pre-training | 时尚FAE:细粒度属性增强的时尚视觉-语言预训练 | Jiale Huang, Dehong Gao, Jinxia Zhang, Zechao Zhan, Yang Hu, Xin Wang | http://arxiv.org/pdf/2412.19997v1 | None |

Video Generation

| Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
| --- | --- | --- | --- | --- | --- |
| 2024-12-28 | Injecting Explainability and Lightweight Design into Weakly Supervised Video Anomaly Detection Systems | 将弱监督视频异常检测系统注入可解释性和轻量级设计 | Wen-Dong Jiang, Chih-Yung Chang, Hsiang-Chuan Chang, Ji-Yuan Chen, Diptendu Sinha Roy | http://arxiv.org/pdf/2412.20201v1 | None |
| 2024-12-28 | STNMamba: Mamba-based Spatial-Temporal Normality Learning for Video Anomaly Detection | STNMamba:基于Mamba的空间-时间正常性学习用于视频异常检测 | Zhangxun Li, Mengyang Zhao, Xuan Yang, Yang Liu, Jiamu Sheng, Xinhua Zeng, Tian Wang, Kewei Wu | http://arxiv.org/pdf/2412.20084v1 | None |
| 2024-12-28 | MAKIMA: Tuning-free Multi-Attribute Open-domain Video Editing via Mask-Guided Attention Modulation | MAKIMA:基于掩码引导的注意力调制,无需调优的多属性开放域视频编辑 | Haoyu Zheng, Wenqiao Zhang, Zheqi Lv, Yu Zhong, Yang Dai, Jianxiang An, Yongliang Shen, Juncheng Li | http://arxiv.org/pdf/2412.19978v1 | None |

Video Tracking

| Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
| --- | --- | --- | --- | --- | --- |
| 2024-12-28 | Learning Adaptive and View-Invariant Vision Transformer with Multi-Teacher Knowledge Distillation for Real-Time UAV Tracking | 学习自适应和视角不变视觉Transformer,通过多教师知识蒸馏实现实时无人机跟踪 | You Wu, Yongxin Li, Mengyuan Liu, Xucheng Wang, Xiangyang Yang, Hengzhou Ye, Dan Zeng, Qijun Zhao | http://arxiv.org/pdf/2412.20002v1 | None |

Semantic Segmentation

| Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
| --- | --- | --- | --- | --- | --- |
| 2024-12-28 | Recommender Engine Driven Client Selection in Federated Brain Tumor Segmentation | 联邦脑肿瘤分割中的推荐引擎驱动的客户端选择 | Muhammad Irfan Khan, Elina Kontio, Suleiman A. Khan, Mojtaba Jafaritadi | http://arxiv.org/pdf/2412.20250v1 | None |
| 2024-12-28 | MSDNet: Multi-Scale Decoder for Few-Shot Semantic Segmentation via Transformer-Guided Prototyping | MSDNet:基于Transformer引导的原型设计的多尺度解码器用于小样本语义分割 | Amirreza Fateh, Mohammad Reza Mohammadi, Mohammad Reza Jahed Motlagh | http://arxiv.org/pdf/2409.11316v2 | https://github.com/amirrezafateh/MSDNet |