Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-25 | Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities | 多模态路径:利用其他模态的不相关数据改进 Transformer | Yiyuan Zhang, Xiaohan Ding, Kaixiong Gong, Yixiao Ge, Ying Shan, Xiangyu Yue | http://arxiv.org/pdf/2401.14405v1 | link |
2024-01-25 | pix2gestalt: Amodal Segmentation by Synthesizing Wholes | pix2gestalt:通过综合整体进行无模态分割 | Ege Ozguroglu, Ruoshi Liu, Dídac Surís, Dian Chen, Achal Dave, Pavel Tokmakov, Carl Vondrick | http://arxiv.org/pdf/2401.14398v1 | link |
2024-01-25 | Rethinking Patch Dependence for Masked Autoencoders | 重新思考屏蔽自动编码器的补丁依赖性 | Letian Fu, Long Lian, Renhao Wang, Baifeng Shi, Xudong Wang, Adam Yala, Trevor Darrell, Alexei A. Efros, Ken Goldberg | http://arxiv.org/pdf/2401.14391v1 | null |
2024-01-25 | Inconsistency Masks: Removing the Uncertainty from Input-Pseudo-Label Pairs | 不一致掩码:消除输入伪标签对的不确定性 | Michael R. H. Vorndran, Bernhard F. Roeck | http://arxiv.org/pdf/2401.14387v1 | link |
2024-01-25 | UrbanGenAI: Reconstructing Urban Landscapes using Panoptic Segmentation and Diffusion Models | UrbanGenAI:使用全景分割和扩散模型重建城市景观 | Timo Kapsalis | http://arxiv.org/pdf/2401.14379v1 | null |
2024-01-25 | Progressive Multi-task Anti-Noise Learning and Distilling Frameworks for Fine-grained Vehicle Recognition | 用于细粒度车辆识别的渐进式多任务抗噪声学习和蒸馏框架 | Dichao Liu | http://arxiv.org/pdf/2401.14336v1 | link |
2024-01-25 | Unlocking Past Information: Temporal Embeddings in Cooperative Bird's Eye View Prediction | 解锁过去的信息:合作鸟瞰预测中的时间嵌入 | Dominik Rößle, Jeremias Gerner, Klaus Bogenberger, Daniel Cremers, Stefanie Schmidtner, Torsten Schön | http://arxiv.org/pdf/2401.14325v1 | null |
2024-01-25 | Producing Plankton Classifiers that are Robust to Dataset Shift | 生成对数据集转换具有鲁棒性的浮游生物分类器 | Cheng Chen, Sreenath Kyathanahally, Marta Reyes, Stefanie Merkli, Ewa Merz, Emanuele Francazi, Marvin Hoege, Francesco Pomati, Marco Baity-Jesi | http://arxiv.org/pdf/2401.14256v1 | null |
2024-01-25 | On generalisability of segment anything model for nuclear instance segmentation in histology images | 组织学图像中核实例分割的分段任意模型的通用性 | Kesi Xu, Lea Goetz, Nasir Rajpoot | http://arxiv.org/pdf/2401.14248v1 | null |
2024-01-25 | Exploring the Unexplored: Understanding the Impact of Layer Adjustments on Image Classification | 探索未探索的事物:了解图层调整对图像分类的影响 | Haixia Liu, Tim Brailsford, James Goulding, Gavin Smith, Larry Bull | http://arxiv.org/pdf/2401.14236v1 | null |
2024-01-25 | Clinical Melanoma Diagnosis with Artificial Intelligence: Insights from a Prospective Multicenter Study | 人工智能临床黑色素瘤诊断:前瞻性多中心研究的见解 | Lukas Heinlein, Roman C. Maron, Achim Hekler, Sarah Haggenmüller, Christoph Wies, Jochen S. Utikal, Friedegund Meier, Sarah Hobelsberger, Frank F. Gellrich, Mildred Sergon, et al. | http://arxiv.org/pdf/2401.14193v1 | null |
2024-01-25 | Vivim: a Video Vision Mamba for Medical Video Object Segmentation | Vivim:用于医疗视频对象分割的视频视觉 Mamba | Yijun Yang, Zhaohu Xing, Lei Zhu | http://arxiv.org/pdf/2401.14168v1 | null |
2024-01-25 | Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks | 扎根 SAM:为各种视觉任务组装开放世界模型 | Tianhe Ren, Shilong Liu, Ailing Zeng, Jing Lin, Kunchang Li, He Cao, Jiayu Chen, Xinyu Huang, Yukang Chen, Feng Yan, et al. | http://arxiv.org/pdf/2401.14159v1 | null |
2024-01-25 | Expression-aware video inpainting for HMD removal in XR applications | 用于在 XR 应用程序中移除 HMD 的表情感知视频修复 | Fatemeh Ghorbani Lohesara, Karen Egiazarian, Sebastian Knorr | http://arxiv.org/pdf/2401.14136v1 | null |
2024-01-25 | Attention-based Efficient Classification for 3D MRI Image of Alzheimer's Disease | 基于注意力的阿尔茨海默病 3D MRI 图像高效分类 | Yihao Lin, Ximeng Li, Yan Zhang, Jinshan Tang | http://arxiv.org/pdf/2401.14130v1 | null |
2024-01-25 | MIFI: MultI-camera Feature Integration for Robust 3D Distracted Driver Activity Recognition | MIFI:用于鲁棒 3D 分心驾驶员活动识别的多摄像头功能集成 | Jian Kuang, Wenjing Li, Fang Li, Jun Zhang, Zhongcheng Wu | http://arxiv.org/pdf/2401.14115v1 | link |
2024-01-25 | Double Trouble? Impact and Detection of Duplicates in Face Image Datasets | 双重麻烦?人脸图像数据集中重复项的影响和检测 | Torsten Schlett, Christian Rathgeb, Juan Tapia, Christoph Busch | http://arxiv.org/pdf/2401.14088v1 | link |
2024-01-25 | ProCNS: Progressive Prototype Calibration and Noise Suppression for Weakly-Supervised Medical Image Segmentation | ProCNS:弱监督医学图像分割的渐进式原型校准和噪声抑制 | Y. Liu, L. Lin, K. K. Y. Wong, X. Tang | http://arxiv.org/pdf/2401.14074v1 | link |
2024-01-25 | Unsupervised Spatial-Temporal Feature Enrichment and Fidelity Preservation Network for Skeleton based Action Recognition | 用于基于骨架的动作识别的无监督时空特征丰富和保真度网络 | Chuankun Li, Shuai Li, Yanbo Gao, Ping Chen, Jian Li, Wanqing Li | http://arxiv.org/pdf/2401.14034v1 | null |
2024-01-25 | PLCNet: Patch-wise Lane Correction Network for Automatic Lane Correction in High-definition Maps | PLCNet:用于高清地图中自动车道校正的分片车道校正网络 | Haiyang Peng, Yi Zhan, Benkang Wang, Hongtao Zhang | http://arxiv.org/pdf/2401.14024v1 | null |
2024-01-25 | WAL-Net: Weakly supervised auxiliary task learning network for carotid plaques classification | WAL-Net:用于颈动脉斑块分类的弱监督辅助任务学习网络 | Haitao Gan, Lingchao Fu, Ran Zhou, Weiyan Gan, Furong Wang, Xiaoyan Wu, Zhi Yang, Zhongwei Huang | http://arxiv.org/pdf/2401.13998v1 | null |
2024-01-25 | Deep Learning Innovations in Diagnosing Diabetic Retinopathy: The Potential of Transfer Learning and the DiaCNN Model | 诊断糖尿病视网膜病变的深度学习创新:迁移学习和 DiaCNN 模型的潜力 | Mohamed R. Shoaib, Heba M. Emara, Jun Zhao, Walid El-Shafai, Naglaa F. Soliman, Ahmed S. Mubarak, Osama A. Omer, Fathi E. Abd El-Samie, Hamada Esmaiel | http://arxiv.org/pdf/2401.13990v1 | null |
2024-01-25 | BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models | BootPIG:在预训练扩散模型中引导零样本个性化图像生成功能 | Senthil Purushwalkam, Akash Gokul, Shafiq Joty, Nikhil Naik | http://arxiv.org/pdf/2401.13974v1 | null |
2024-01-25 | Improving Pseudo-labelling and Enhancing Robustness for Semi-Supervised Domain Generalization | 改进伪标签并增强半监督域泛化的鲁棒性 | Adnan Khan, Mai A. Shaaban, Muhammad Haris Khan | http://arxiv.org/pdf/2401.13965v1 | link |
2024-01-25 | TriSAM: Tri-Plane SAM for zero-shot cortical blood vessel segmentation in VEM images | TriSAM:用于 VEM 图像中零次皮质血管分割的三平面 SAM | Jia Wan, Wanhua Li, Atmadeep Banerjee, Jason Ken Adhinarta, Evelina Sjostedt, Jingpeng Wu, Jeff Lichtman, Hanspeter Pfister, Donglai Wei | http://arxiv.org/pdf/2401.13961v1 | null |
2024-01-25 | A New Image Quality Database for Multiple Industrial Processes | 适用于多种工业流程的新图像质量数据库 | Xuanchao Ma, Zehan Wu, Hongyan Liu, Chengxu Zhou, Ke Gu | http://arxiv.org/pdf/2401.13956v1 | null |
2024-01-25 | AM-SORT: Adaptable Motion Predictor with Historical Trajectory Embedding for Multi-Object Tracking | AM-SORT:具有历史轨迹嵌入的自适应运动预测器,用于多对象跟踪 | Vitaliy Kim, Gunho Jung, Seong-Whan Lee | http://arxiv.org/pdf/2401.13950v1 | null |
2024-01-25 | Self-supervised Video Object Segmentation with Distillation Learning of Deformable Attention | 利用可变形注意力的蒸馏学习进行自监督视频对象分割 | Quang-Trung Truong, Duc Thanh Nguyen, Binh-Son Hua, Sai-Kit Yeung | http://arxiv.org/pdf/2401.13937v1 | null |
2024-01-25 | AscDAMs: Advanced SLAM-based channel detection and mapping system | AscDAMs:基于 SLAM 的高级通道检测和映射系统 | Tengfei Wang, Fucheng Lu, Jintao Qin, Taosheng Huang, Hui Kong, Ping Shen | http://arxiv.org/pdf/2401.13877v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-25 | Deep Clustering with Diffused Sampling and Hardness-aware Self-distillation | 具有扩散采样和硬度感知自蒸馏的深度聚类 | Hai-Xin Zhang, Dong Huang | http://arxiv.org/pdf/2401.14038v1 | null |
2024-01-25 | StyleInject: Parameter Efficient Tuning of Text-to-Image Diffusion Models | StyleInject:文本到图像扩散模型的参数高效调整 | Yalong Bai, Mohan Zhou, Qing Yang | http://arxiv.org/pdf/2401.13942v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-25 | Deconstructing Denoising Diffusion Models for Self-Supervised Learning | 解构自监督学习的去噪扩散模型 | Xinlei Chen, Zhuang Liu, Saining Xie, Kaiming He | http://arxiv.org/pdf/2401.14404v1 | null |
2024-01-25 | Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation | Sketch2NeRF:多视图草图引导的文本到 3D 生成 | Minglin Chen, Longguang Wang, Weihao Yuan, Yukun Wang, Zhe Sheng, Yisheng He, Zilong Dong, Liefeng Bo, Yulan Guo | http://arxiv.org/pdf/2401.14257v1 | null |
2024-01-25 | Scene Graph to Image Synthesis: Integrating CLIP Guidance with Graph Conditioning in Diffusion Models | 场景图到图像合成:将 CLIP 指导与扩散模型中的图调节相集成 | Rameshwar Mishra, A V Subramanyam | http://arxiv.org/pdf/2401.14111v1 | null |
2024-01-25 | CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion | CreativeSynth:基于多模态扩散的视觉艺术创意融合与合成 | Nisha Huang, Weiming Dong, Yuxin Zhang, Fan Tang, Ronghui Li, Chongyang Ma, Xiu Li, Changsheng Xu | http://arxiv.org/pdf/2401.14066v1 | link |
2024-01-25 | Diffusion-based Data Augmentation for Object Counting Problems | 针对对象计数问题的基于扩散的数据增强 | Zhen Wang, Yuelei Li, Jia Wan, Nuno Vasconcelos | http://arxiv.org/pdf/2401.13992v1 | null |
2024-01-25 | Appearance Debiased Gaze Estimation via Stochastic Subject-Wise Adversarial Learning | 通过随机主题对抗性学习进行外观去偏注视估计 | Suneung Kim, Woo-Jeoung Nam, Seong-Whan Lee | http://arxiv.org/pdf/2401.13865v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-25 | JUMP: A joint multimodal registration pipeline for neuroimaging with minimal preprocessing | JUMP:用于神经成像的联合多模式配准管道,只需最少的预处理 | Adria Casamitjana, Juan Eugenio Iglesias, Raul Tudela, Aida Ninerola-Baizan, Roser Sala-Llonch | http://arxiv.org/pdf/2401.14250v1 | link |
2024-01-25 | LanDA: Language-Guided Multi-Source Domain Adaptation | LanDA:语言引导的多源域适应 | Zhenbin Wang, Lei Zhang, Lituan Wang, Minjuan Zhu | http://arxiv.org/pdf/2401.14148v1 | null |
2024-01-25 | GauU-Scene: A Scene Reconstruction Benchmark on Large Scale 3D Reconstruction Dataset Using Gaussian Splatting | GauU-Scene:使用高斯泼溅的大规模 3D 重建数据集的场景重建基准 | Butian Xiong, Zhuo Li, Zhen Li | http://arxiv.org/pdf/2401.14032v1 | null |
2024-01-25 | MambaMorph: a Mamba-based Backbone with Contrastive Feature Learning for Deformable MR-CT Registration | MambaMorph:基于 Mamba 的骨干网,具有用于可变形 MR-CT 配准的对比特征学习 | Tao Guo, Yinuo Wang, Cai Meng | http://arxiv.org/pdf/2401.13934v1 | link |
2024-01-25 | Knowledge Graph Supported Benchmark and Video Captioning for Basketball | 知识图谱支持的篮球基准和视频字幕 | Zeyu Xi, Ge Shi, Lifang Wu, Xuefen Li, Junchi Yan, Liang Wang, Zilin Liu | http://arxiv.org/pdf/2401.13888v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-25 | Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression | 高保真神经图像压缩的语义集成损失和潜在细化 | Daxin Li, Yuanchao Bai, Kai Wang, Junjun Jiang, Xianming Liu | http://arxiv.org/pdf/2401.14007v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-25 | A real-time rendering method for high albedo anisotropic materials with multiple scattering | 一种多重散射高反照率各向异性材料的实时渲染方法 | Shun Fang, Xing Feng, Ming Cui | http://arxiv.org/pdf/2401.14051v1 | null |
2024-01-25 | Diverse and Lifespan Facial Age Transformation Synthesis with Identity Variation Rationality Metric | 具有身份变异理性度量的多样化和寿命面部年龄变换综合 | Jiu-Cheng Xie, Jun Yang, Wenqing Wang, Feng Xu, Hao Gao | http://arxiv.org/pdf/2401.14036v1 | null |
2024-01-25 | Learning to Manipulate Artistic Images | 学习操纵艺术图像 | Wei Guo, Yuqi Zhang, De Ma, Qian Zheng | http://arxiv.org/pdf/2401.13976v1 | link |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-25 | Learning Robust Generalizable Radiance Field with Visibility and Feature Augmented Point Representation | 学习具有可见性和特征增强点表示的鲁棒可泛化辐射场 | Jiaxu Wang, Ziyi Zhang, Renjing Xu | http://arxiv.org/pdf/2401.14354v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-25 | Range-Agnostic Multi-View Depth Estimation With Keyframe Selection | 通过关键帧选择进行与范围无关的多视图深度估计 | Andrea Conti, Matteo Poggi, Valerio Cambareri, Stefano Mattoccia | http://arxiv.org/pdf/2401.14401v1 | link |
2024-01-25 | Learning to navigate efficiently and precisely in real environments | 学习在真实环境中高效、精确地导航 | Guillaume Bono, Hervé Poirier, Leonid Antsfeld, Gianluca Monaci, Boris Chidlovskii, Christian Wolf | http://arxiv.org/pdf/2401.14349v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-25 | Adaptive Mobile Manipulation for Articulated Objects In the Open World | 开放世界中铰接物体的自适应移动操纵 | Haoyu Xiong, Russell Mendonca, Kenneth Shaw, Deepak Pathak | http://arxiv.org/pdf/2401.14403v1 | null |
2024-01-25 | Generalized People Diversity: Learning a Human Perception-Aligned Diversity Representation for People Images | 广义的人物多样性:学习人物图像的与人类感知一致的多样性表示 | Hansa Srinivasan, Candice Schumann, Aradhana Sinha, David Madras, Gbolahan Oluwafemi Olanubi, Alex Beutel, Susanna Ricco, Jilin Chen | http://arxiv.org/pdf/2401.14322v1 | null |
2024-01-25 | POUR-Net: A Population-Prior-Aided Over-Under-Representation Network for Low-Count PET Attenuation Map Generation | POUR-Net:用于生成低计数 PET 衰减图的群体优先辅助过度表示网络 | Bo Zhou, Jun Hou, Tianqi Chen, Yinchi Zhou, Xiongchao Chen, Huidong Xie, Qiong Liu, Xueqi Guo, Yu-Jung Tsai, Vladimir Y. Panin, et al. | http://arxiv.org/pdf/2401.14285v1 | null |
2024-01-25 | Energy-Based Concept Bottleneck Models: Unifying Prediction, Concept Intervention, and Conditional Interpretations | 基于能量的概念瓶颈模型:统一预测、概念干预和条件解释 | Xinyue Xu, Yi Qin, Lu Mi, Hao Wang, Xiaomeng Li | http://arxiv.org/pdf/2401.14142v1 | link |
2024-01-25 | Enabling Cross-Camera Collaboration for Video Analytics on Distributed Smart Cameras | 在分布式智能摄像机上实现视频分析的跨摄像机协作 | Chulhong Min, Juheon Yi, Utku Gunay Acer, Fahim Kawsar | http://arxiv.org/pdf/2401.14132v1 | null |
2024-01-25 | Incorporating Exemplar Optimization into Training with Dual Networks for Human Mesh Recovery | 将示例优化纳入双网络训练中以实现人体网格恢复 | Yongwei Nie, Mingxian Fan, Chengjiang Long, Qing Zhang, Jian Zhu, Xuemiao Xu | http://arxiv.org/pdf/2401.14121v1 | null |
2024-01-25 | Sparse and Transferable Universal Singular Vectors Attack | 稀疏且可转移的通用奇异向量攻击 | Kseniia Kuvshinova, Olga Tsymboi, Ivan Oseledets | http://arxiv.org/pdf/2401.14031v1 | null |
2024-01-25 | An Extensible Framework for Open Heterogeneous Collaborative Perception | 开放异构协作感知的可扩展框架 | Yifan Lu, Yue Hu, Yiqi Zhong, Dequan Wang, Siheng Chen, Yanfeng Wang | http://arxiv.org/pdf/2401.13964v1 | link |
2024-01-25 | Conditional Neural Video Coding with Spatial-Temporal Super-Resolution | 具有时空超分辨率的条件神经视频编码 | Henan Wang, Xiaohan Pan, Runsen Feng, Zongyu Guo, Zhibo Chen | http://arxiv.org/pdf/2401.13959v1 | null |