Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
---|---|---|---|---|---|
2024-12-28 | Election of Collaborators via Reinforcement Learning for Federated Brain Tumor Segmentation | 基于强化学习的联邦脑肿瘤分割协作者选择 | Muhammad Irfan Khan, Elina Kontio, Suleiman A. Khan, Mojtaba Jafaritadi | http://arxiv.org/pdf/2412.20253v1 | None |
2024-12-28 | Towards Real-Time 2D Mapping: Harnessing Drones, AI, and Computer Vision for Advanced Insights | 迈向实时二维制图:利用无人机、人工智能和计算机视觉实现高级洞察 | Bharath Kumar Agnur | http://arxiv.org/pdf/2412.20210v1 | None |
2024-12-28 | Multi-Modality Driven LoRA for Adverse Condition Depth Estimation | 多模态驱动LoRA的恶劣条件深度估计 | Guanglei Yang, Rui Tian, Yongqiang Zhang, Zhun Zhong, Yongqiang Li, Wangmeng Zuo | http://arxiv.org/pdf/2412.20162v1 | None |
2024-12-28 | Enhancing Marine Debris Acoustic Monitoring by Optical Flow-Based Motion Vector Analysis | 基于光流运动矢量分析增强海洋垃圾声学监测 | Xiaoteng Zhou, Katsunori Mizuno | http://arxiv.org/pdf/2412.20085v1 | None |
2024-12-28 | MambaVO: Deep Visual Odometry Based on Sequential Matching Refinement and Training Smoothing | MambaVO:基于序列匹配优化和训练平滑的深度视觉里程计 | Shuo Wang, Wanting Li, Yongcai Wang, Zhaoxin Fan, Zhe Huang, Xudong Cai, Jian Zhao, Deying Li | http://arxiv.org/pdf/2412.20082v1 | None |
2024-12-28 | GSplatLoc: Ultra-Precise Camera Localization via 3D Gaussian Splatting | GSplatLoc:基于3D高斯散布的超精确相机定位 | Atticus J. Zeller | http://arxiv.org/pdf/2412.20056v1 | None |
2024-12-28 | DepthMamba with Adaptive Fusion | 深度Mamba自适应融合 | Zelin Meng, Zhichen Wang | http://arxiv.org/pdf/2412.19964v1 | None |
Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
---|---|---|---|---|---|
2024-12-28 | DEGSTalk: Decomposed Per-Embedding Gaussian Fields for Hair-Preserving Talking Face Synthesis | DEGSTalk:分解的每嵌入高斯场用于保留头发的说话人脸合成 | Kaijun Deng, Dezhi Zheng, Jindong Xie, Jinbao Wang, Weicheng Xie, Linlin Shen, Siyang Song | http://arxiv.org/pdf/2412.20148v1 | https://github.com/CVI-SZU/DEGSTalk |
2024-12-28 | Canonical Factors for Hybrid Neural Fields | 混合神经场的规范因子 | Brent Yi, Weijia Zeng, Sam Buchanan, Yi Ma | http://arxiv.org/pdf/2308.15461v2 | None |
2024-12-28 | Comprehensive Review of EEG-to-Output Research: Decoding Neural Signals into Images, Videos, and Audio | 全面回顾脑电图到输出的研究:将神经信号解码为图像、视频和音频 | Yashvir Sabharwal, Balaji Rama | http://arxiv.org/pdf/2412.19999v1 | None |
Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
---|---|---|---|---|---|
2024-12-28 | FSFM: A Generalizable Face Security Foundation Model via Self-Supervised Facial Representation Learning | FSFM:通过自监督面部表示学习构建的可泛化人脸安全基础模型 | Gaojian Wang, Feng Lin, Tong Wu, Zhenguang Liu, Zhongjie Ba, Kui Ren | http://arxiv.org/pdf/2412.12032v2 | None |
Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
---|---|---|---|---|---|
2024-12-28 | Transformer-Based Contrastive Meta-Learning For Low-Resource Generalizable Activity Recognition | 基于Transformer的对比元学习用于低资源泛化活动识别 | Junyao Wang, Mohammad Abdullah Al Faruque | http://arxiv.org/pdf/2412.20290v1 | None |
2024-12-28 | SyncDiff: Synchronized Motion Diffusion for Multi-Body Human-Object Interaction Synthesis | SyncDiff:面向多体人-物交互合成的同步运动扩散 | Wenkun He, Yun Liu, Ruitao Liu, Li Yi | http://arxiv.org/pdf/2412.20104v1 | None |
2024-12-28 | VELoRA: A Low-Rank Adaptation Approach for Efficient RGB-Event based Recognition | VELoRA:一种用于高效RGB-事件识别的低秩自适应方法 | Lan Chen, Haoxiang Yang, Pengpeng Shao, Haoyu Song, Xiao Wang, Zhicheng Zhao, Yaowei Wang, Yonghong Tian | http://arxiv.org/pdf/2412.20064v1 | https://github.com/Event-AHU/VELoRA |
Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
---|---|---|---|---|---|
2024-12-28 | UniRestorer: Universal Image Restoration via Adaptively Estimating Image Degradation at Proper Granularity | 通用图像修复:通过自适应估计适当粒度下的图像退化 | Jingbo Lin, Zhilu Zhang, Wenbo Li, Renjing Pei, Hang Xu, Hongzhi Zhang, Wangmeng Zuo | http://arxiv.org/pdf/2412.20157v1 | https://github.com/mrluin/UniRestorer |
2024-12-28 | MaIR: A Locality- and Continuity-Preserving Mamba for Image Restoration | MaIR:一种保留局部性和连续性的Mamba图像恢复方法 | Boyun Li, Haiyu Zhao, Wenxin Wang, Peng Hu, Yuanbiao Gou, Xi Peng | http://arxiv.org/pdf/2412.20066v1 | None |
Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
---|---|---|---|---|---|
2024-12-28 | Altogether: Image Captioning via Re-aligning Alt-text | 整体:通过重新对齐替代文本进行图像描述 | Hu Xu, Po-Yao Huang, Xiaoqing Ellen Tan, Ching-Feng Yeh, Jacob Kahn, Christine Jou, Gargi Ghosh, Omer Levy | http://arxiv.org/pdf/2410.17251v3 | None |
Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
---|---|---|---|---|---|
2024-12-28 | Demystifying CLIP Data | 揭开CLIP数据的神秘面纱 | Hu Xu, Saining Xie, Xiaoqing Ellen Tan, Po-Yao Huang, Russell Howes, Vasu Sharma, Shang-Wen Li, Gargi Ghosh | http://arxiv.org/pdf/2309.16671v5 | https://github.com/facebookresearch/MetaCLIP |
2024-12-28 | ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving | 一致ID:多模态细粒度身份保持的肖像生成 | Jiehui Huang, Xiao Dong, Wenhui Song, Zheng Chong, Zhenchao Tang, Jun Zhou, Yuhao Cheng, Long Chen | http://arxiv.org/pdf/2404.16771v2 | None |
2024-12-28 | StyleAutoEncoder for manipulating image attributes using pre-trained StyleGAN | 风格自动编码器:利用预训练的StyleGAN操纵图像属性 | Andrzej Bedychaj, Jacek Tabor, Marek Śmieja | http://arxiv.org/pdf/2412.20164v1 | None |
2024-12-28 | ST$^3$: Accelerating Multimodal Large Language Model by Spatial-Temporal Visual Token Trimming | ST$^3$:通过时空视觉标记修剪加速多模态大型语言模型 | Jiedong Zhuang, Lu Lu, Ming Dai, Rui Hu, Jian Chen, Qiang Liu, Haoji Hu | http://arxiv.org/pdf/2412.20105v1 | None |
2024-12-28 | AdaDiff: Adaptive Step Selection for Fast Diffusion Models | AdaDiff:快速扩散模型的自适应步长选择 | Hui Zhang, Zuxuan Wu, Zhen Xing, Jie Shao, Yu-Gang Jiang | http://arxiv.org/pdf/2311.14768v2 | None |
2024-12-28 | Enhancing Diffusion Models for Inverse Problems with Covariance-Aware Posterior Sampling | 增强扩散模型在逆问题中的协方差感知后验采样 | Shayan Mohajer Hamidi, En-Hui Yang | http://arxiv.org/pdf/2412.20045v1 | None |
2024-12-28 | VersaGen: Unleashing Versatile Visual Control for Text-to-Image Synthesis | VersaGen:释放多功能的视觉控制以实现文本到图像的合成 | Zhipeng Chen, Lan Yang, Yonggang Qi, Honggang Zhang, Kaiyue Pang, Ke Li, Yi-Zhe Song | http://arxiv.org/pdf/2412.11594v3 | None |
2024-12-28 | An Ordinary Differential Equation Sampler with Stochastic Start for Diffusion Bridge Models | 具有随机起始的扩散桥模型常微分方程采样器 | Yuang Wang, Pengfei Jin, Li Zhang, Quanzheng Li, Zhiqiang Chen, Dufan Wu | http://arxiv.org/pdf/2412.19992v1 | None |
2024-12-28 | ChatGarment: Garment Estimation, Generation and Editing via Large Language Models | ChatGarment:通过大型语言模型进行服装估计、生成和编辑 | Siyuan Bian, Chenghao Xu, Yuliang Xiu, Artur Grigorev, Zhen Liu, Cewu Lu, Michael J. Black, Yao Feng | http://arxiv.org/pdf/2412.17811v2 | None |
2024-12-28 | DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation | DiMSUM:扩散曼巴——一种可扩展且统一的图像生成空间-频率方法 | Hao Phung, Quan Dao, Trung Dao, Hoang Phan, Dimitris Metaxas, Anh Tran | http://arxiv.org/pdf/2411.04168v2 | https://github.com/VinAIResearch/DiMSUM.git |
Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
---|---|---|---|---|---|
2024-12-28 | Face-StyleSpeech: Enhancing Zero-shot Speech Synthesis from Face Images with Improved Face-to-Speech Mapping | 面部风格语音:通过改进人脸到语音映射增强从人脸图像的零样本语音合成 | Minki Kang, Wooseok Han, Eunho Yang | http://arxiv.org/pdf/2311.05844v2 | None |
2024-12-28 | Cross-Modal Mapping: Eliminating the Modality Gap for Few-Shot Image Classification | 跨模态映射:消除小样本图像分类的模态差距 | Xi Yang, Pai Peng, Wulin Xie, Xiaohuan Lu, Jie Wen | http://arxiv.org/pdf/2412.20110v1 | None |
2024-12-28 | SwinIA: Self-Supervised Blind-Spot Image Denoising without Convolutions | SwinIA:无需卷积的自监督盲点图像去噪 | Mikhail Papkov, Pavel Chizhov, Leopold Parts | http://arxiv.org/pdf/2305.05651v2 | None |
2024-12-28 | ERUP-YOLO: Enhancing Object Detection Robustness for Adverse Weather Condition by Unified Image-Adaptive Processing | ERUP-YOLO:通过统一图像自适应处理增强恶劣天气条件下目标检测鲁棒性 | Yuka Ogino, Yuho Shoji, Takahiro Toizumi, Atsushi Ito | http://arxiv.org/pdf/2411.02799v4 | None |
2024-12-28 | On the Compositional Generalization of Multimodal LLMs for Medical Imaging | 关于多模态LLMs在医学影像中的组合泛化 | Zhenyang Cai, Junying Chen, Rongsheng Wang, Weihong Wang, Yonglin Deng, Dingjie Song, Yize Chen, Zixu Zhang | http://arxiv.org/pdf/2412.20070v1 | https://github.com/FreedomIntelligence/Med-MAT |
2024-12-28 | MADiff: Text-Guided Fashion Image Editing with Mask Prediction and Attention-Enhanced Diffusion | MADiff:基于掩码预测和注意力增强扩散的文本引导时尚图像编辑 | Zechao Zhan, Dehong Gao, Jinxia Zhang, Jiale Huang, Yang Hu, Xin Wang | http://arxiv.org/pdf/2412.20062v1 | None |
2024-12-28 | Unsupervised Cross-Domain Image Retrieval via Prototypical Optimal Transport | 基于原型最优传输的无监督跨域图像检索 | Bin Li, Ye Shi, Qian Yu, Jingya Wang | http://arxiv.org/pdf/2402.18411v4 | None |
2024-12-28 | A Robust Adversarial Ensemble with Causal (Feature Interaction) Interpretations for Image Classification | 鲁棒对抗集成图像分类及其因果(特征交互)解释 | Chunheng Zhao, Pierluigi Pisu, Gurcan Comert, Negash Begashaw, Varghese Vaidyan, Nina Christine Hubig | http://arxiv.org/pdf/2412.20025v1 | None |
2024-12-28 | Uncertainty Quantified Deep Learning and Regression Analysis Framework for Image Segmentation of Skin Cancer Lesions | 皮肤癌病变图像分割的不确定性量化深度学习和回归分析框架 | Elhoucine Elfatimi, Pratik Shah | http://arxiv.org/pdf/2412.20007v1 | None |
2024-12-28 | SegKAN: High-Resolution Medical Image Segmentation with Long-Distance Dependencies | SegKAN:具有长距离依赖关系的高分辨率医学图像分割 | Shengbo Tan, Rundong Xue, Shipeng Luo, Zeyu Zhang, Xinran Wang, Lei Zhang, Daji Ergu, Zhang Yi | http://arxiv.org/pdf/2412.19990v1 | https://github.com/goblin327/SegKAN |
Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
---|---|---|---|---|---|
2024-12-28 | Geo-ConvGRU: Geographically Masked Convolutional Gated Recurrent Unit for Bird-Eye View Segmentation | 地理掩码卷积门控循环单元在鸟瞰图分割中的应用 | Guanglei Yang, Yongqiang Zhang, Wanlong Li, Yu Tang, Weize Shang, Feng Wen, Hongbo Zhang, Mingli Ding | http://arxiv.org/pdf/2412.20171v1 | None |
Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
---|---|---|---|---|---|
2024-12-28 | TAVP: Task-Adaptive Visual Prompt for Cross-domain Few-shot Segmentation | 跨域小样本分割的任务自适应视觉提示 | Jiaqi Yang, Yaning Zhang, Jingxi Hu, Xiangjian He, Linlin Shen, Guoping Qiu | http://arxiv.org/pdf/2409.05393v2 | None |
2024-12-28 | Maintain Plasticity in Long-timescale Continual Test-time Adaptation | 保持长时间尺度持续测试时自适应中的可塑性 | Yanshuo Wang, Xuesong Li, Jinguang Tong, Jie Hong, Jun Lan, Weiqiang Wang, Huijia Zhu, Haoxing Chen | http://arxiv.org/pdf/2412.20034v1 | None |
Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
---|---|---|---|---|---|
2024-12-28 | Few-shot Algorithm Assurance | 少量样本算法保证 | Dang Nguyen, Sunil Gupta | http://arxiv.org/pdf/2412.20275v1 | None |
2024-12-28 | An archaeological Catalog Collection Method Based on Large Vision-Language Models | 基于大型视觉-语言模型的考古目录集合方法 | Honglin Pang, Yi Chang, Tianjing Duan, Xi Yang | http://arxiv.org/pdf/2412.20088v1 | None |
Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
---|---|---|---|---|---|
2024-12-28 | Enhancing Transfer Learning for Medical Image Classification with SMOTE: A Comparative Study | 基于SMOTE增强医学图像分类的迁移学习:一项比较研究 | Md. Zehan Alam, Tonmoy Roy, H. M. Nahid Kawsar, Iffat Rimi | http://arxiv.org/pdf/2412.20235v1 | None |
2024-12-28 | Plastic Waste Classification Using Deep Learning: Insights from the WaDaBa Dataset | 塑料垃圾分类利用深度学习:WaDaBa数据集的见解 | Suman Kunwar, Banji Raphael Owabumoye, Abayomi Simeon Alade | http://arxiv.org/pdf/2412.20232v1 | None |
2024-12-28 | First qualitative observations on deep learning vision model YOLO and DETR for automated driving in Austria | 奥地利自动驾驶中的深度学习视觉模型YOLO和DETR的初步定性观察 | Stefan Schoder | http://arxiv.org/pdf/2312.12314v2 | None |
2024-12-28 | Shifting Spotlight for Co-supervision: A Simple yet Efficient Single-branch Network to See Through Camouflage | 移位焦点协同监督:一种简单高效的单一分支网络穿透伪装 | Yang Hu, Jinxia Zhang, Kaihua Zhang, Yin Yuan, Jiale Huang, Zechao Zhan, Xing Wang | http://arxiv.org/pdf/2404.08936v2 | None |
2024-12-28 | Mining Platoon Patterns from Traffic Videos | 从交通视频中挖掘车队模式 | Yijun Bei, Teng Ma, Dongxiang Zhang, Sai Wu, Kian-Lee Tan, Gang Chen | http://arxiv.org/pdf/2412.20177v1 | None |
2024-12-28 | On dataset transferability in medical image classification | 医学图像分类中的数据集迁移性研究 | Dovile Juodelyte, Enzo Ferrante, Yucheng Lu, Prabhant Singh, Joaquin Vanschoren, Veronika Cheplygina | http://arxiv.org/pdf/2412.20172v1 | https://github.com/DovileDo/transferability-in-medical-imaging |
2024-12-28 | CHASE: Learning Convex Hull Adaptive Shift for Skeleton-based Multi-Entity Action Recognition | CHASE:基于骨架的多实体动作识别中的凸包自适应偏移学习 | Yuhang Wen, Mengyuan Liu, Songtao Wu, Beichen Ding | http://arxiv.org/pdf/2410.07153v2 | https://github.com/Necolizer/CHASE |
2024-12-28 | Conformal Risk Control for Pulmonary Nodule Detection | 肺结节检测中的共形风险控制 | Roel Hulsman, Valentin Comte, Lorenzo Bertolini, Tobias Wiesenthal, Antonio Puertas Gallardo, Mario Ceresa | http://arxiv.org/pdf/2412.20167v1 | None |
2024-12-28 | A Cascaded Dilated Convolution Approach for Mpox Lesion Classification | 基于级联扩张卷积的猴痘病变分类方法 | Ayush Deshmukh | http://arxiv.org/pdf/2412.10106v2 | None |
2024-12-28 | Distilled Transformers with Locally Enhanced Global Representations for Face Forgery Detection | 具有局部增强全局表示的蒸馏Transformer用于人脸伪造检测 | Yaning Zhang, Qiufu Li, Zitong Yu, Linlin Shen | http://arxiv.org/pdf/2412.20156v1 | None |
2024-12-28 | Self-Calibrated Dual Contrasting for Annotation-Efficient Bacteria Raman Spectroscopy Clustering and Classification | 自校准双对比法在细菌拉曼光谱聚类和分类中的高效标注 | Haiming Yao, Wei Luo, Tao Zhou, Ang Gao, Xue Wang | http://arxiv.org/pdf/2412.20060v1 | None |
2024-12-28 | SimLTD: Simple Supervised and Semi-Supervised Long-Tailed Object Detection | SimLTD:简单监督和半监督长尾目标检测 | Phi Vu Tran | http://arxiv.org/pdf/2412.20047v1 | None |
2024-12-28 | DAVE: Diverse Atomic Visual Elements Dataset with High Representation of Vulnerable Road Users in Complex and Unpredictable Environments | DAVE:复杂和不可预测环境中具有高脆弱道路使用者代表性的多样化原子视觉元素数据集 | Xijun Wang, Pedro Sandoval-Segura, Chengyuan Zhang, Junyun Huang, Tianrui Guan, Ruiqi Xian, Fuxiao Liu, Rohan Chandra | http://arxiv.org/pdf/2412.20042v1 | None |
2024-12-28 | Searching from Area to Point: A Hierarchical Framework for Semantic-Geometric Combined Feature Matching | 从区域到点的搜索:语义-几何结合特征匹配的层次化框架 | Yesheng Zhang, Xu Zhao | http://arxiv.org/pdf/2305.00194v6 | None |
2024-12-28 | Adversarial Robustness for Deep Learning-based Wildfire Detection Models | 基于深度学习的野火检测模型的对抗鲁棒性 | Ryo Ide, Lei Yang | http://arxiv.org/pdf/2412.20006v1 | None |
2024-12-28 | DFME: A New Benchmark for Dynamic Facial Micro-expression Recognition | 动态面部微表情识别新基准:DFME | Sirui Zhao, Huaying Tang, Xinglong Mao, Shifeng Liu, Yiming Zhang, Hao Wang, Tong Xu, Enhong Chen | http://arxiv.org/pdf/2301.00985v2 | None |
Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
---|---|---|---|---|---|
2024-12-28 | Towards Visual Grounding: A Survey | 视觉定位:综述 | Linhui Xiao, Xiaoshan Yang, Xiangyuan Lan, Yaowei Wang, Changsheng Xu | http://arxiv.org/pdf/2412.20206v1 | https://github.com/linhuixiao/Awesome-Visual-Grounding |
2024-12-28 | B-AVIBench: Towards Evaluating the Robustness of Large Vision-Language Model on Black-box Adversarial Visual-Instructions | B-AVIBench:迈向评估大型视觉-语言模型在黑盒对抗视觉指令上的鲁棒性 | Hao Zhang, Wenqi Shao, Hong Liu, Yongqiang Ma, Ping Luo, Yu Qiao, Nanning Zheng, Kaipeng Zhang | http://arxiv.org/pdf/2403.09346v2 | https://github.com/zhanghao5201/B-AVIBench |
2024-12-28 | AI-based Wearable Vision Assistance System for the Visually Impaired: Integrating Real-Time Object Recognition and Contextual Understanding Using Large Vision-Language Models | 基于AI的视障人士可穿戴视觉辅助系统:利用大型视觉-语言模型整合实时物体识别和上下文理解 | Mirza Samad Ahmed Baig, Syeda Anshrah Gillani, Shahid Munir Shah, Mahmoud Aljawarneh, Abdul Akbar Khan, Muhammad Hamzah Siddiqui | http://arxiv.org/pdf/2412.20059v1 | None |
2024-12-28 | FashionFAE: Fine-grained Attributes Enhanced Fashion Vision-Language Pre-training | 时尚FAE:细粒度属性增强的时尚视觉-语言预训练 | Jiale Huang, Dehong Gao, Jinxia Zhang, Zechao Zhan, Yang Hu, Xin Wang | http://arxiv.org/pdf/2412.19997v1 | None |
Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
---|---|---|---|---|---|
2024-12-28 | Injecting Explainability and Lightweight Design into Weakly Supervised Video Anomaly Detection Systems | 将弱监督视频异常检测系统注入可解释性和轻量级设计 | Wen-Dong Jiang, Chih-Yung Chang, Hsiang-Chuan Chang, Ji-Yuan Chen, Diptendu Sinha Roy | http://arxiv.org/pdf/2412.20201v1 | None |
2024-12-28 | STNMamba: Mamba-based Spatial-Temporal Normality Learning for Video Anomaly Detection | STNMamba:基于Mamba的空间-时间正常性学习用于视频异常检测 | Zhangxun Li, Mengyang Zhao, Xuan Yang, Yang Liu, Jiamu Sheng, Xinhua Zeng, Tian Wang, Kewei Wu | http://arxiv.org/pdf/2412.20084v1 | None |
2024-12-28 | MAKIMA: Tuning-free Multi-Attribute Open-domain Video Editing via Mask-Guided Attention Modulation | MAKIMA:基于掩码引导的注意力调制,无需调优的多属性开放域视频编辑 | Haoyu Zheng, Wenqiao Zhang, Zheqi Lv, Yu Zhong, Yang Dai, Jianxiang An, Yongliang Shen, Juncheng Li | http://arxiv.org/pdf/2412.19978v1 | None |
Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
---|---|---|---|---|---|
2024-12-28 | Learning Adaptive and View-Invariant Vision Transformer with Multi-Teacher Knowledge Distillation for Real-Time UAV Tracking | 学习自适应和视角不变视觉Transformer,通过多教师知识蒸馏实现实时无人机跟踪 | You Wu, Yongxin Li, Mengyuan Liu, Xucheng Wang, Xiangyang Yang, Hengzhou Ye, Dan Zeng, Qijun Zhao | http://arxiv.org/pdf/2412.20002v1 | None |
Published | English Title | Chinese Title | Authors | PDF Link | Code Link |
---|---|---|---|---|---|
2024-12-28 | Recommender Engine Driven Client Selection in Federated Brain Tumor Segmentation | 联邦脑肿瘤分割中的推荐引擎驱动的客户端选择 | Muhammad Irfan Khan, Elina Kontio, Suleiman A. Khan, Mojtaba Jafaritadi | http://arxiv.org/pdf/2412.20250v1 | None |
2024-12-28 | MSDNet: Multi-Scale Decoder for Few-Shot Semantic Segmentation via Transformer-Guided Prototyping | MSDNet:基于Transformer引导的原型设计的多尺度解码器用于小样本语义分割 | Amirreza Fateh, Mohammad Reza Mohammadi, Mohammad Reza Jahed Motlagh | http://arxiv.org/pdf/2409.11316v2 | https://github.com/amirrezafateh/MSDNet |