Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-07-27 | Mamba? Catch The Hype Or Rethink What Really Helps for Image Registration | Mamba?赶上炒作还是重新思考图像配准的真正帮助 | Bailiang Jian, Jiazhen Pan, Morteza Ghahremani, Daniel Rueckert, Christian Wachinger, Benedikt Wiestler | http://arxiv.org/pdf/2407.19274v1 | null |
2024-07-27 | Fine-Grained Scene Graph Generation via Sample-Level Bias Prediction | 通过样本级偏差预测生成细粒度场景图 | Yansheng Li, Tingzhu Wang, Kang Wu, Linlin Wang, Xin Guo, Wenbin Wang | http://arxiv.org/pdf/2407.19259v1 | null |
2024-07-27 | Radio Frequency Signal based Human Silhouette Segmentation: A Sequential Diffusion Approach | 基于射频信号的人体轮廓分割:一种序贯扩散方法 | Penghui Wen, Kun Hu, Dong Yuan, Zhiyuan Ning, Changyang Li, Zhiyong Wang | http://arxiv.org/pdf/2407.19244v1 | null |
2024-07-27 | Channel Boosted CNN-Transformer-based Multi-Level and Multi-Scale Nuclei Segmentation | 基于通道增强 CNN-Transformer 的多级多尺度核分割 | Zunaira Rauf, Abdul Rehman Khan, Asifullah Khan | http://arxiv.org/pdf/2407.19186v1 | null |
2024-07-27 | Data Processing Techniques for Modern Multimodal Models | 现代多模态模型的数据处理技术 | Yinheng Li, Han Ding, Hang Chen | http://arxiv.org/pdf/2407.19180v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-07-27 | Integrating Large Language Models into a Tri-Modal Architecture for Automated Depression Classification | 将大型语言模型集成到三模态架构中,实现抑郁症的自动分类 | Santosh V. Patapati | http://arxiv.org/pdf/2407.19340v1 | null |
2024-07-27 | Harmfully Manipulated Images Matter in Multimodal Misinformation Detection | 有害操纵的图像在多模态错误信息检测中很重要 | Bing Wang, Shengsheng Wang, Changchun Li, Renchu Guan, Ximing Li | http://arxiv.org/pdf/2407.19192v1 | null |
2024-07-27 | LLaVA-Read: Enhancing Reading Ability of Multimodal Language Models | LLaVA-Read:增强多模态语言模型的阅读能力 | Ruiyi Zhang, Yufan Zhou, Jian Chen, Jiuxiang Gu, Changyou Chen, Tong Sun | http://arxiv.org/pdf/2407.19185v1 | null |
2024-07-27 | Robust Multimodal 3D Object Detection via Modality-Agnostic Decoding and Proximity-based Modality Ensemble | 通过模态无关解码和基于邻近度的模态集成实现稳健的多模态 3D 物体检测 | Juhan Cha, Minseok Joo, Jihwan Park, Sanghyeok Lee, Injae Kim, Hyunwoo J. Kim | http://arxiv.org/pdf/2407.19156v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-07-27 | Revisit Self-supervised Depth Estimation with Local Structure-from-Motion | 重新审视基于局部运动结构的自监督深度估计 | Shengjie Zhu, Xiaoming Liu | http://arxiv.org/pdf/2407.19166v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-07-27 | Sewer Image Super-Resolution with Depth Priors and Its Lightweight Network | 基于深度先验的下水道图像超分辨率及其轻量级网络 | Gang Pan, Chen Wang, Zhijie Sui, Shuai Guo, Yaozhi Lv, Honglie Li, Di Sun | http://arxiv.org/pdf/2407.19271v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-07-27 | Polyp segmentation in colonoscopy images using DeepLabV3++ | 使用 DeepLabV3++ 对结肠镜检查图像中的息肉进行分割 | Al Mohimanul Islam, Sadia Shakiba Bhuiyan, Mysun Mashira, Md. Rayhan Ahmed, Salekul Islam, Swakkhar Shatabda | http://arxiv.org/pdf/2407.19327v1 | null |
2024-07-27 | MSP-MVS: Multi-granularity Segmentation Prior Guided Multi-View Stereo | MSP-MVS:多粒度分割先验引导的多视图立体 | Zhenlong Yuan, Cong Liu, Fei Shen, Zhaoxin Li, Tianlu Mao, Zhaoqi Wang | http://arxiv.org/pdf/2407.19323v1 | null |
2024-07-27 | AResNet-ViT: A Hybrid CNN-Transformer Network for Benign and Malignant Breast Nodule Classification in Ultrasound Images | AResNet-ViT:一种用于超声图像中乳腺结节良恶性分类的混合 CNN-Transformer 网络 | Xin Zhao, Qianqian Zhu, Jialing Wu | http://arxiv.org/pdf/2407.19316v1 | null |
2024-07-27 | Ensembling convolutional neural networks for human skin segmentation | 集成卷积神经网络进行人体皮肤分割 | Patryk Kuban, Michal Kawulok | http://arxiv.org/pdf/2407.19310v1 | null |
2024-07-27 | Symmetrical Joint Learning Support-query Prototypes for Few-shot Segmentation | 用于小样本分割的对称联合学习支持查询原型 | Qun Li, Baoquan Sun, Fu Xiao, Yonggang Qi, Bir Bhanu | http://arxiv.org/pdf/2407.19306v1 | null |
2024-07-27 | GP-VLS: A general-purpose vision language model for surgery | GP-VLS:用于手术的通用视觉语言模型 | Samuel Schmidgall, Joseph Cho, Cyril Zakka, William Hiesinger | http://arxiv.org/pdf/2407.19305v1 | null |
2024-07-27 | Rethinking Attention Module Design for Point Cloud Analysis | 重新思考点云分析的注意力模块设计 | Chengzhi Wu, Kaige Wang, Zeyun Zhong, Hao Fu, Junwei Zheng, Jiaming Zhang, Julius Pfrommer, Jürgen Beyerer | http://arxiv.org/pdf/2407.19294v1 | null |
2024-07-27 | Optimizing Synthetic Data for Enhanced Pancreatic Tumor Segmentation | 优化合成数据以增强胰腺肿瘤分割 | Linkai Peng, Zheyuan Zhang, Gorkem Durak, Frank H. Miller, Alpay Medetalibeyoglu, Michael B. Wallace, Ulas Bagci | http://arxiv.org/pdf/2407.19284v1 | null |
2024-07-27 | Enhancing Tree Type Detection in Forest Fire Risk Assessment: Multi-Stage Approach and Color Encoding with Forest Fire Risk Evaluation Framework for UAV Imagery | 增强森林火灾风险评估中的树木类型检测:无人机图像森林火灾风险评估框架的多阶段方法和颜色编码 | Jinda Zhang, Michal Aibin | http://arxiv.org/pdf/2407.19184v1 | null |
2024-07-27 | Reducing Spurious Correlation for Federated Domain Generalization | 减少联邦域泛化的虚假相关性 | Shuran Ma, Weiying Xie, Daixun Li, Haowei Li, Yunsong Li | http://arxiv.org/pdf/2407.19174v1 | null |
2024-07-27 | Few-Shot Medical Image Segmentation with Large Kernel Attention | 具有大核注意力机制的少样本医学图像分割 | Xiaoxiao Wu, Xiaowei Chen, Zhenguo Gao, Shulei Qu, Yuanyuan Qiu | http://arxiv.org/pdf/2407.19148v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-07-27 | Multi-Expert Adaptive Selection: Task-Balancing for All-in-One Image Restoration | 多专家自适应选择:一体化图像修复的任务平衡 | Xiaoyan Yu, Shen Zhou, Huafeng Li, Liehuang Zhu | http://arxiv.org/pdf/2407.19139v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-07-27 | Faster Image2Video Generation: A Closer Look at CLIP Image Embedding's Impact on Spatio-Temporal Cross-Attentions | 更快的图像到视频生成:仔细研究 CLIP 图像嵌入对时空交叉注意力的影响 | Ashkan Taghipour, Morteza Ghahremani, Mohammed Bennamoun, Aref Miri Rekavandi, Zinuo Li, Hamid Laga, Farid Boussaid | http://arxiv.org/pdf/2407.19205v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-07-27 | A Bayesian Approach Toward Robust Multidimensional Ellipsoid-Specific Fitting | 一种稳健的多维椭球体拟合的贝叶斯方法 | Zhao Mingyang, Jia Xiaohong, Ma Lei, Shi Yuke, Jiang Jingen, Li Qizhai, Yan Dong-Ming, Huang Tiejun | http://arxiv.org/pdf/2407.19269v1 | null |
2024-07-27 | Magic3DSketch: Create Colorful 3D Models From Sketch-Based 3D Modeling Guided by Text and Language-Image Pre-Training | Magic3DSketch:通过文本和语言图像预训练指导基于草图的 3D 建模创建彩色 3D 模型 | Ying Zang, Yidong Han, Chaotao Ding, Jianqi Zhang, Tianrun Chen | http://arxiv.org/pdf/2407.19225v1 | null |
2024-07-27 | RePLAy: Remove Projective LiDAR Depthmap Artifacts via Exploiting Epipolar Geometry | RePLAy:利用对极几何去除投影 LiDAR 深度图伪影 | Shengjie Zhu, Girish Chandar Ganesan, Abhinav Kumar, Xiaoming Liu | http://arxiv.org/pdf/2407.19154v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-07-27 | Comprehensive Attribution: Inherently Explainable Vision Model with Feature Detector | 综合归因:具有特征检测器的固有可解释视觉模型 | Xianren Zhang, Dongwon Lee, Suhang Wang | http://arxiv.org/pdf/2407.19308v1 | null |
2024-07-27 | A self-supervised and adversarial approach to hyperspectral demosaicking and RGB reconstruction in surgical imaging | 手术成像中高光谱去马赛克和 RGB 重建的自监督和对抗方法 | Peichao Li, Oscar MacCormac, Jonathan Shapey, Tom Vercauteren | http://arxiv.org/pdf/2407.19282v1 | null |
2024-07-27 | Towards the Dynamics of a DNN Learning Symbolic Interactions | 面向学习符号交互的 DNN 动力学 | Qihan Ren, Yang Xu, Junpeng Zhang, Yue Xin, Dongrui Liu, Quanshi Zhang | http://arxiv.org/pdf/2407.19198v1 | null |
2024-07-27 | Power-LLaVA: Large Language and Vision Assistant for Power Transmission Line Inspection | Power-LLaVA:电力输电线路巡检大型语言和视觉助手 | Jiahao Wang, Mingxuan Li, Haichen Luo, Jinguo Zhu, Aijun Yang, Mingzhe Rong, Xiaohua Wang | http://arxiv.org/pdf/2407.19178v1 | null |