Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-24 | Sandwich GAN: Image Reconstruction from Phase Mask based Anti-dazzle Imaging | Sandwich GAN:基于相位掩模的防眩光图像重建 | Xiaopeng Peng, Erin F. Fleet, Abbie T. Watnik, Grover A. Swartzlander | http://arxiv.org/pdf/2402.15919v1 | null |
2024-02-24 | Enhanced Droplet Analysis Using Generative Adversarial Networks | 使用生成对抗网络增强液滴分析 | Tan-Hanh Pham, Kim-Doang Nguyen | http://arxiv.org/pdf/2402.15909v1 | null |
2024-02-24 | HIR-Diff: Unsupervised Hyperspectral Image Restoration Via Improved Diffusion Models | HIR-Diff:通过改进的扩散模型进行无监督高光谱图像恢复 | Li Pang, Xiangyu Rui, Long Cui, Hongzhong Wang, Deyu Meng, Xiangyong Cao | http://arxiv.org/pdf/2402.15865v1 | null |
2024-02-24 | A Generative Machine Learning Model for Material Microstructure 3D Reconstruction and Performance Evaluation | 用于材料微观结构 3D 重建和性能评估的生成机器学习模型 | Yilin Zheng, Zhigong Song | http://arxiv.org/pdf/2402.15815v1 | null |
2024-02-24 | Intelligent Director: An Automatic Framework for Dynamic Visual Composition using ChatGPT | 智能导演:使用 ChatGPT 的动态视觉合成自动框架 | Sixiao Zheng, Jingyang Huo, Yu Wang, Yanwei Fu | http://arxiv.org/pdf/2402.15746v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-24 | Bridging the Gap between 2D and 3D Visual Question Answering: A Fusion Approach for 3D VQA | 弥合 2D 和 3D 视觉问答之间的差距:3D VQA 的融合方法 | Wentao Mo, Yang Liu | http://arxiv.org/pdf/2402.15933v1 | null |
2024-02-24 | Multimodal Instruction Tuning with Conditional Mixture of LoRA | 使用 LoRA 的条件混合进行多模式指令调整 | Ying Shen, Zhiyang Xu, Qifan Wang, Yu Cheng, Wenpeng Yin, Lifu Huang | http://arxiv.org/pdf/2402.15896v1 | null |
2024-02-24 | FedMM: Federated Multi-Modal Learning with Modality Heterogeneity in Computational Pathology | FedMM:计算病理学中具有模态异质性的联合多模态学习 | Yuanzhe Peng, Jieming Bian, Jie Xu | http://arxiv.org/pdf/2402.15858v1 | null |
2024-02-24 | Parameter-efficient Prompt Learning for 3D Point Cloud Understanding | 用于 3D 点云理解的参数高效提示学习 | Hongyu Sun, Yongcai Wang, Wang Chen, Haoran Deng, Deying Li | http://arxiv.org/pdf/2402.15823v1 | null |
2024-02-24 | Increasing SAM Zero-Shot Performance on Multimodal Medical Images Using GPT-4 Generated Descriptive Prompts Without Human Annotation | 使用 GPT-4 生成的描述性提示(无需人工注释)提高多模态医学图像的 SAM 零样本性能 | Zekun Jiang, Dongjie Cheng, Ziyuan Qin, Jun Gao, Qicheng Lao, Kang Li, Le Zhang | http://arxiv.org/pdf/2402.15759v1 | null |
2024-02-24 | GAOKAO-MM: A Chinese Human-Level Benchmark for Multimodal Models Evaluation | GAOKAO-MM:中国人类水平的多模态模型评估基准 | Yi Zong, Xipeng Qiu | http://arxiv.org/pdf/2402.15745v1 | null |
2024-02-24 | CLIPose: Category-Level Object Pose Estimation with Pre-trained Vision-Language Knowledge | CLIPose:利用预先训练的视觉语言知识进行类别级物体姿态估计 | Xiao Lin, Minghao Zhu, Ronghao Dang, Guangliang Zhou, Shaolong Shu, Feng Lin, Chengju Liu, Qijun Chen | http://arxiv.org/pdf/2402.15726v1 | null |
2024-02-24 | DeepLight: Reconstructing High-Resolution Observations of Nighttime Light With Multi-Modal Remote Sensing Data | DeepLight:利用多模态遥感数据重建夜间光的高分辨率观测 | Lixian Zhang, Runmin Dong, Shuai Yuan, Jinxiao Zhang, Mengxuan Chen, Juepeng Zheng, Haohuan Fu | http://arxiv.org/pdf/2402.15659v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-24 | Spec-Gaussian: Anisotropic View-Dependent Appearance for 3D Gaussian Splatting | Spec-Gaussian:3D 高斯泼溅的各向异性视图相关外观 | Ziyi Yang, Xinyu Gao, Yangtian Sun, Yihua Huang, Xiaoyang Lyu, Wen Zhou, Shaohui Jiao, Xiaojuan Qi, Xiaogang Jin | http://arxiv.org/pdf/2402.15870v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-24 | DART: Depth-Enhanced Accurate and Real-Time Background Matting | DART:深度增强的准确实时背景抠图 | Hanxi Li, Guofeng Li, Bo Li, Lin Wu, Yan Cheng | http://arxiv.org/pdf/2402.15820v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-24 | Explainable Contrastive and Cost-Sensitive Learning for Cervical Cancer Classification | 宫颈癌分类的可解释对比和成本敏感学习 | Ashfiqun Mustari, Rushmia Ahmed, Afsara Tasnim, Jakia Sultana Juthi, G M Shahariar | http://arxiv.org/pdf/2402.15905v1 | null |
2024-02-24 | Multi-Object Tracking by Hierarchical Visual Representations | 通过分层视觉表示进行多目标跟踪 | Jinkun Cao, Jiangmiao Pang, Kris Kitani | http://arxiv.org/pdf/2402.15895v1 | null |
2024-02-24 | Multi-graph Graph Matching for Coronary Artery Semantic Labeling | 冠状动脉语义标记的多图图形匹配 | Chen Zhao, Zhihui Xu, Pukar Baral, Michel Esposito, Weihua Zhou | http://arxiv.org/pdf/2402.15894v1 | null |
2024-02-24 | Multiple Instance Learning for Glioma Diagnosis using Hematoxylin and Eosin Whole Slide Images: An Indian cohort Study | 使用苏木精和曙红全切片图像进行神经胶质瘤诊断的多实例学习:一项印度队列研究 | Ekansh Chauhan, Amit Sharma, Megha S Uppin, C. V. Jawahar, Vinod P. K | http://arxiv.org/pdf/2402.15832v1 | null |
2024-02-24 | Sequential Visual and Semantic Consistency for Semi-supervised Text Recognition | 半监督文本识别的顺序视觉和语义一致性 | Mingkun Yang, Biao Yang, Minghui Liao, Yingying Zhu, Xiang Bai | http://arxiv.org/pdf/2402.15806v1 | null |
2024-02-24 | IRConStyle: Image Restoration Framework Using Contrastive Learning and Style Transfer | IRConStyle:使用对比学习和风格迁移的图像恢复框架 | Dongqi Fan, Xin Zhao, Liang Chang | http://arxiv.org/pdf/2402.15784v1 | null |
2024-02-24 | Res-VMamba: Fine-Grained Food Category Visual Classification Using Selective State Space Models with Deep Residual Learning | Res-VMamba:使用选择性状态空间模型和深度残差学习进行细粒度食品类别视觉分类 | Chi-Sheng Chen, Guan-Ying Chen, Dong Zhou, Di Jiang, Dai-Shi Chen | http://arxiv.org/pdf/2402.15761v1 | null |
2024-02-24 | Detection Is Tracking: Point Cloud Multi-Sweep Deep Learning Models Revisited | 检测即跟踪:重新审视点云多重扫描深度学习模型 | Lingji Chen | http://arxiv.org/pdf/2402.15756v1 | null |
2024-02-24 | GiMeFive: Towards Interpretable Facial Emotion Classification | GiMeFive:迈向可解释的面部情绪分类 | Jiawen Wang, Leah Kawka | http://arxiv.org/pdf/2402.15662v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-24 | RAUCA: A Novel Physical Adversarial Attack on Vehicle Detectors via Robust and Accurate Camouflage Generation | RAUCA:通过强大而准确的伪装生成对车辆探测器进行新型物理对抗攻击 | Jiawei Zhou, Linye Lyu, Daojing He, Yu Li | http://arxiv.org/pdf/2402.15853v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-24 | NaVid: Video-based VLM Plans the Next Step for Vision-and-Language Navigation | NaVid:基于视频的 VLM 计划视觉和语言导航的下一步 | Jiazhao Zhang, Kunyu Wang, Rongtao Xu, Gengze Zhou, Yicong Hong, Xiaomeng Fang, Qi Wu, Zhizheng Zhang, Wang He | http://arxiv.org/pdf/2402.15852v1 | null |
2024-02-24 | Design, Implementation and Analysis of a Compressed Sensing Photoacoustic Projection Imaging System | 压缩感知光声投影成像系统的设计、实现与分析 | Markus Haltmeier, Matthias Ye, Karoline Felbermayer, Florian Hinterleitner, Peter Burgholzer | http://arxiv.org/pdf/2402.15750v1 | null |
2024-02-24 | Traditional Transformation Theory Guided Model for Learned Image Compression | 传统变换理论指导的学习图像压缩模型 | Zhiyuan Li, Chenyang Ge, Shun Li | http://arxiv.org/pdf/2402.15744v1 | null |
2024-02-24 | A Heterogeneous Dynamic Convolutional Neural Network for Image Super-resolution | 一种用于图像超分辨率的异构动态卷积神经网络 | Chunwei Tian, Xuanyu Zhang, Jia Ren, Wangmeng Zuo, Yanning Zhang, Chia-Wen Lin | http://arxiv.org/pdf/2402.15704v1 | null |
2024-02-24 | General Purpose Image Encoder DINOv2 for Medical Image Registration | 用于医学图像配准的通用图像编码器 DINOv2 | Xinrui Song, Xuanang Xu, Pingkun Yan | http://arxiv.org/pdf/2402.15687v1 | null |
2024-02-24 | Scalable Density-based Clustering with Random Projections | 具有随机投影的可扩展的基于密度的聚类 | Haochuan Xu, Ninh Pham | http://arxiv.org/pdf/2402.15679v1 | null |