[UPDATED!] 2024-01-25 (Publish Time)

Classification/Detection/Recognition/Segmentation

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-25 | Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities | 多模态路径:利用其他模态的不相关数据改进 Transformer | Yiyuan Zhang, Xiaohan Ding, Kaixiong Gong, Yixiao Ge, Ying Shan, Xiangyu Yue | http://arxiv.org/pdf/2401.14405v1 | link |
| 2024-01-25 | pix2gestalt: Amodal Segmentation by Synthesizing Wholes | pix2gestalt:通过综合整体进行无模态分割 | Ege Ozguroglu, Ruoshi Liu, Dídac Surís, Dian Chen, Achal Dave, Pavel Tokmakov, Carl Vondrick | http://arxiv.org/pdf/2401.14398v1 | link |
| 2024-01-25 | Rethinking Patch Dependence for Masked Autoencoders | 重新思考屏蔽自动编码器的补丁依赖性 | Letian Fu, Long Lian, Renhao Wang, Baifeng Shi, Xudong Wang, Adam Yala, Trevor Darrell, Alexei A. Efros, Ken Goldberg | http://arxiv.org/pdf/2401.14391v1 | null |
| 2024-01-25 | Inconsistency Masks: Removing the Uncertainty from Input-Pseudo-Label Pairs | 不一致掩码:消除输入伪标签对的不确定性 | Michael R. H. Vorndran, Bernhard F. Roeck | http://arxiv.org/pdf/2401.14387v1 | link |
| 2024-01-25 | UrbanGenAI: Reconstructing Urban Landscapes using Panoptic Segmentation and Diffusion Models | UrbanGenAI:使用全景分割和扩散模型重建城市景观 | Timo Kapsalis | http://arxiv.org/pdf/2401.14379v1 | null |
| 2024-01-25 | Progressive Multi-task Anti-Noise Learning and Distilling Frameworks for Fine-grained Vehicle Recognition | 用于细粒度车辆识别的渐进式多任务抗噪声学习和蒸馏框架 | Dichao Liu | http://arxiv.org/pdf/2401.14336v1 | link |
| 2024-01-25 | Unlocking Past Information: Temporal Embeddings in Cooperative Bird's Eye View Prediction | 解锁过去的信息:合作鸟瞰预测中的时间嵌入 | Dominik Rößle, Jeremias Gerner, Klaus Bogenberger, Daniel Cremers, Stefanie Schmidtner, Torsten Schön | http://arxiv.org/pdf/2401.14325v1 | null |
| 2024-01-25 | Producing Plankton Classifiers that are Robust to Dataset Shift | 生成对数据集转换具有鲁棒性的浮游生物分类器 | Cheng Chen, Sreenath Kyathanahally, Marta Reyes, Stefanie Merkli, Ewa Merz, Emanuele Francazi, Marvin Hoege, Francesco Pomati, Marco Baity-Jesi | http://arxiv.org/pdf/2401.14256v1 | null |
| 2024-01-25 | On generalisability of segment anything model for nuclear instance segmentation in histology images | 组织学图像中核实例分割的分段任意模型的通用性 | Kesi Xu, Lea Goetz, Nasir Rajpoot | http://arxiv.org/pdf/2401.14248v1 | null |
| 2024-01-25 | Exploring the Unexplored: Understanding the Impact of Layer Adjustments on Image Classification | 探索未探索的事物:了解图层调整对图像分类的影响 | Haixia Liu, Tim Brailsford, James Goulding, Gavin Smith, Larry Bull | http://arxiv.org/pdf/2401.14236v1 | null |
| 2024-01-25 | Clinical Melanoma Diagnosis with Artificial Intelligence: Insights from a Prospective Multicenter Study | 人工智能临床黑色素瘤诊断:前瞻性多中心研究的见解 | Lukas Heinlein, Roman C. Maron, Achim Hekler, Sarah Haggenmüller, Christoph Wies, Jochen S. Utikal, Friedegund Meier, Sarah Hobelsberger, Frank F. Gellrich, Mildred Sergon, et al. | http://arxiv.org/pdf/2401.14193v1 | null |
| 2024-01-25 | Vivim: a Video Vision Mamba for Medical Video Object Segmentation | Vivim:用于医疗视频对象分割的视频视觉 Mamba | Yijun Yang, Zhaohu Xing, Lei Zhu | http://arxiv.org/pdf/2401.14168v1 | null |
| 2024-01-25 | Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks | 扎根 SAM:为各种视觉任务组装开放世界模型 | Tianhe Ren, Shilong Liu, Ailing Zeng, Jing Lin, Kunchang Li, He Cao, Jiayu Chen, Xinyu Huang, Yukang Chen, Feng Yan, et al. | http://arxiv.org/pdf/2401.14159v1 | null |
| 2024-01-25 | Expression-aware video inpainting for HMD removal in XR applications | 用于在 XR 应用程序中移除 HMD 的表情感知视频修复 | Fatemeh Ghorbani Lohesara, Karen Egiazarian, Sebastian Knorr | http://arxiv.org/pdf/2401.14136v1 | null |
| 2024-01-25 | Attention-based Efficient Classification for 3D MRI Image of Alzheimer's Disease | 基于注意力的阿尔茨海默病 3D MRI 图像高效分类 | Yihao Lin, Ximeng Li, Yan Zhang, Jinshan Tang | http://arxiv.org/pdf/2401.14130v1 | null |
| 2024-01-25 | MIFI: MultI-camera Feature Integration for Roust 3D Distracted Driver Activity Recognition | MIFI:用于 Roust 3D 分心驾驶员活动识别的多摄像头功能集成 | Jian Kuang, Wenjing Li, Fang Li, Jun Zhang, Zhongcheng Wu | http://arxiv.org/pdf/2401.14115v1 | link |
| 2024-01-25 | Double Trouble? Impact and Detection of Duplicates in Face Image Datasets | 双重麻烦?人脸图像数据集中重复项的影响和检测 | Torsten Schlett, Christian Rathgeb, Juan Tapia, Christoph Busch | http://arxiv.org/pdf/2401.14088v1 | link |
| 2024-01-25 | ProCNS: Progressive Prototype Calibration and Noise Suppression for Weakly-Supervised Medical Image Segmentation | ProCNS:弱监督医学图像分割的渐进式原型校准和噪声抑制 | Y. Liu, L. Lin, K. K. Y. Wong, X. Tang | http://arxiv.org/pdf/2401.14074v1 | link |
| 2024-01-25 | Unsupervised Spatial-Temporal Feature Enrichment and Fidelity Preservation Network for Skeleton based Action Recognition | 用于基于骨架的动作识别的无监督时空特征丰富和保真度网络 | Chuankun Li, Shuai Li, Yanbo Gao, Ping Chen, Jian Li, Wanqing Li | http://arxiv.org/pdf/2401.14034v1 | null |
| 2024-01-25 | PLCNet: Patch-wise Lane Correction Network for Automatic Lane Correction in High-definition Maps | PLCNet:用于高清地图中自动车道校正的分片车道校正网络 | Haiyang Peng, Yi Zhan, Benkang Wang, Hongtao Zhang | http://arxiv.org/pdf/2401.14024v1 | null |
| 2024-01-25 | WAL-Net: Weakly supervised auxiliary task learning network for carotid plaques classification | WAL-Net:用于颈动脉斑块分类的弱监督辅助任务学习网络 | Haitao Gan, Lingchao Fu, Ran Zhou, Weiyan Gan, Furong Wang, Xiaoyan Wu, Zhi Yang, Zhongwei Huang | http://arxiv.org/pdf/2401.13998v1 | null |
| 2024-01-25 | Deep Learning Innovations in Diagnosing Diabetic Retinopathy: The Potential of Transfer Learning and the DiaCNN Model | 诊断糖尿病视网膜病变的深度学习创新:迁移学习和 DiaCNN 模型的潜力 | Mohamed R. Shoaib, Heba M. Emara, Jun Zhao, Walid El-Shafai, Naglaa F. Soliman, Ahmed S. Mubarak, Osama A. Omer, Fathi E. Abd El-Samie, Hamada Esmaiel | http://arxiv.org/pdf/2401.13990v1 | null |
| 2024-01-25 | BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models | BootPIG:在预训练扩散模型中引导零样本个性化图像生成功能 | Senthil Purushwalkam, Akash Gokul, Shafiq Joty, Nikhil Naik | http://arxiv.org/pdf/2401.13974v1 | null |
| 2024-01-25 | Improving Pseudo-labelling and Enhancing Robustness for Semi-Supervised Domain Generalization | 改进伪标签并增强半监督域泛化的鲁棒性 | Adnan Khan, Mai A. Shaaban, Muhammad Haris Khan | http://arxiv.org/pdf/2401.13965v1 | link |
| 2024-01-25 | TriSAM: Tri-Plane SAM for zero-shot cortical blood vessel segmentation in VEM images | TriSAM:用于 VEM 图像中零次皮质血管分割的三平面 SAM | Jia Wan, Wanhua Li, Atmadeep Banerjee, Jason Ken Adhinarta, Evelina Sjostedt, Jingpeng Wu, Jeff Lichtman, Hanspeter Pfister, Donglai Wei | http://arxiv.org/pdf/2401.13961v1 | null |
| 2024-01-25 | A New Image Quality Database for Multiple Industrial Processes | 适用于多种工业流程的新图像质量数据库 | Xuanchao Ma, Zehan Wu, Hongyan Liu, Chengxu Zhou, Ke Gu | http://arxiv.org/pdf/2401.13956v1 | null |
| 2024-01-25 | AM-SORT: Adaptable Motion Predictor with Historical Trajectory Embedding for Multi-Object Tracking | AM-SORT:具有历史轨迹嵌入的自适应运动预测器,用于多对象跟踪 | Vitaliy Kim, Gunho Jung, Seong-Whan Lee | http://arxiv.org/pdf/2401.13950v1 | null |
| 2024-01-25 | Self-supervised Video Object Segmentation with Distillation Learning of Deformable Attention | 利用可变形注意力的蒸馏学习进行自监督视频对象分割 | Quang-Trung Truong, Duc Thanh Nguyen, Binh-Son Hua, Sai-Kit Yeung | http://arxiv.org/pdf/2401.13937v1 | null |
| 2024-01-25 | AscDAMs: Advanced SLAM-based channel detection and mapping system | AscDAMs:基于 SLAM 的高级通道检测和映射系统 | Tengfei Wang, Fucheng Lu, Jintao Qin, Taosheng Huang, Hui Kong, Ping Shen | http://arxiv.org/pdf/2401.13877v1 | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-25 | Deep Clustering with Diffused Sampling and Hardness-aware Self-distillation | 具有扩散采样和硬度感知自蒸馏的深度聚类 | Hai-Xin Zhang, Dong Huang | http://arxiv.org/pdf/2401.14038v1 | null |
| 2024-01-25 | StyleInject: Parameter Efficient Tuning of Text-to-Image Diffusion Models | StyleInject:文本到图像扩散模型的参数高效调整 | Yalong Bai, Mohan Zhou, Qing Yang | http://arxiv.org/pdf/2401.13942v1 | null |

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-25 | Deconstructing Denoising Diffusion Models for Self-Supervised Learning | 解构自监督学习的去噪扩散模型 | Xinlei Chen, Zhuang Liu, Saining Xie, Kaiming He | http://arxiv.org/pdf/2401.14404v1 | null |
| 2024-01-25 | Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation | Sketch2NeRF:多视图草图引导的文本到 3D 生成 | Minglin Chen, Longguang Wang, Weihao Yuan, Yukun Wang, Zhe Sheng, Yisheng He, Zilong Dong, Liefeng Bo, Yulan Guo | http://arxiv.org/pdf/2401.14257v1 | null |
| 2024-01-25 | Scene Graph to Image Synthesis: Integrating CLIP Guidance with Graph Conditioning in Diffusion Models | 场景图到图像合成:将 CLIP 指导与扩散模型中的图调节相集成 | Rameshwar Mishra, A V Subramanyam | http://arxiv.org/pdf/2401.14111v1 | null |
| 2024-01-25 | CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion | CreativeSynth:基于多模态扩散的视觉艺术创意融合与合成 | Nisha Huang, Weiming Dong, Yuxin Zhang, Fan Tang, Ronghui Li, Chongyang Ma, Xiu Li, Changsheng Xu | http://arxiv.org/pdf/2401.14066v1 | link |
| 2024-01-25 | Diffusion-based Data Augmentation for Object Counting Problems | 针对对象计数问题的基于扩散的数据增强 | Zhen Wang, Yuelei Li, Jia Wan, Nuno Vasconcelos | http://arxiv.org/pdf/2401.13992v1 | null |
| 2024-01-25 | Appearance Debiased Gaze Estimation via Stochastic Subject-Wise Adversarial Learning | 通过随机主题对抗性学习进行外观去偏注视估计 | Suneung Kim, Woo-Jeoung Nam, Seong-Whan Lee | http://arxiv.org/pdf/2401.13865v1 | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-25 | JUMP: A joint multimodal registration pipeline for neuroimaging with minimal preprocessing | JUMP:用于神经成像的联合多模式配准管道,只需最少的预处理 | Adria Casamitjana, Juan Eugenio Iglesias, Raul Tudela, Aida Ninerola-Baizan, Roser Sala-Llonch | http://arxiv.org/pdf/2401.14250v1 | link |
| 2024-01-25 | LanDA: Language-Guided Multi-Source Domain Adaptation | LanDA:语言引导的多源域适应 | Zhenbin Wang, Lei Zhang, Lituan Wang, Minjuan Zhu | http://arxiv.org/pdf/2401.14148v1 | null |
| 2024-01-25 | GauU-Scene: A Scene Reconstruction Benchmark on Large Scale 3D Reconstruction Dataset Using Gaussian Splatting | GauU-Scene:使用高斯泼溅的大规模 3D 重建数据集的场景重建基准 | Butian Xiong, Zhuo Li, Zhen Li | http://arxiv.org/pdf/2401.14032v1 | null |
| 2024-01-25 | MambaMorph: a Mamba-based Backbone with Contrastive Feature Learning for Deformable MR-CT Registration | MambaMorph:基于 Mamba 的骨干网,具有用于可变形 MR-CT 配准的对比特征学习 | Tao Guo, Yinuo Wang, Cai Meng | http://arxiv.org/pdf/2401.13934v1 | link |
| 2024-01-25 | Knowledge Graph Supported Benchmark and Video Captioning for Basketball | 知识图谱支持的篮球基准和视频字幕 | Zeyu Xi, Ge Shi, Lifang Wu, Xuefen Li, Junchi Yan, Liang Wang, Zilin Liu | http://arxiv.org/pdf/2401.13888v1 | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-25 | Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression | 高保真神经图像压缩的语义集成损失和潜在细化 | Daxin Li, Yuanchao Bai, Kai Wang, Junjun Jiang, Xianming Liu | http://arxiv.org/pdf/2401.14007v1 | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-25 | A real-time rendering method for high albedo anisotropic materials with multiple scattering | 一种多重散射高反照率各向异性材料的实时渲染方法 | Shun Fang, Xing Feng, Ming Cui | http://arxiv.org/pdf/2401.14051v1 | null |
| 2024-01-25 | Diverse and Lifespan Facial Age Transformation Synthesis with Identity Variation Rationality Metric | 具有身份变异理性度量的多样化和寿命面部年龄变换综合 | Jiu-Cheng Xie, Jun Yang, Wenqing Wang, Feng Xu, Hao Gao | http://arxiv.org/pdf/2401.14036v1 | null |
| 2024-01-25 | Learning to Manipulate Artistic Images | 学习操纵艺术图像 | Wei Guo, Yuqi Zhang, De Ma, Qian Zheng | http://arxiv.org/pdf/2401.13976v1 | link |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-25 | Learning Robust Generalizable Radiance Field with Visibility and Feature Augmented Point Representation | 学习具有可见性和特征增强点表示的鲁棒可泛化辐射场 | Jiaxu Wang, Ziyi Zhang, Renjing Xu | http://arxiv.org/pdf/2401.14354v1 | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-25 | Range-Agnostic Multi-View Depth Estimation With Keyframe Selection | 通过关键帧选择进行与范围无关的多视图深度估计 | Andrea Conti, Matteo Poggi, Valerio Cambareri, Stefano Mattoccia | http://arxiv.org/pdf/2401.14401v1 | link |
| 2024-01-25 | Learning to navigate efficiently and precisely in real environments | 学习在真实环境中高效、精确地导航 | Guillaume Bono, Hervé Poirier, Leonid Antsfeld, Gianluca Monaci, Boris Chidlovskii, Christian Wolf | http://arxiv.org/pdf/2401.14349v1 | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-25 | Adaptive Mobile Manipulation for Articulated Objects In the Open World | 开放世界中铰接物体的自适应移动操纵 | Haoyu Xiong, Russell Mendonca, Kenneth Shaw, Deepak Pathak | http://arxiv.org/pdf/2401.14403v1 | null |
| 2024-01-25 | Generalized People Diversity: Learning a Human Perception-Aligned Diversity Representation for People Images | 广义的人物多样性:学习人物图像的与人类感知一致的多样性表示 | Hansa Srinivasan, Candice Schumann, Aradhana Sinha, David Madras, Gbolahan Oluwafemi Olanubi, Alex Beutel, Susanna Ricco, Jilin Chen | http://arxiv.org/pdf/2401.14322v1 | null |
| 2024-01-25 | POUR-Net: A Population-Prior-Aided Over-Under-Representation Network for Low-Count PET Attenuation Map Generation | POUR-Net:用于生成低计数 PET 衰减图的群体优先辅助过度表示网络 | Bo Zhou, Jun Hou, Tianqi Chen, Yinchi Zhou, Xiongchao Chen, Huidong Xie, Qiong Liu, Xueqi Guo, Yu-Jung Tsai, Vladimir Y. Panin, et al. | http://arxiv.org/pdf/2401.14285v1 | null |
| 2024-01-25 | Energy-Based Concept Bottleneck Models: Unifying Prediction, Concept Intervention, and Conditional Interpretations | 基于能量的概念瓶颈模型:统一预测、概念干预和条件解释 | Xinyue Xu, Yi Qin, Lu Mi, Hao Wang, Xiaomeng Li | http://arxiv.org/pdf/2401.14142v1 | link |
| 2024-01-25 | Enabling Cross-Camera Collaboration for Video Analytics on Distributed Smart Cameras | 在分布式智能摄像机上实现视频分析的跨摄像机协作 | Chulhong Min, Juheon Yi, Utku Gunay Acer, Fahim Kawsar | http://arxiv.org/pdf/2401.14132v1 | null |
| 2024-01-25 | Incorporating Exemplar Optimization into Training with Dual Networks for Human Mesh Recovery | 将示例优化纳入双网络训练中以实现人体网格恢复 | Yongwei Nie, Mingxian Fan, Chengjiang Long, Qing Zhang, Jian Zhu, Xuemiao Xu | http://arxiv.org/pdf/2401.14121v1 | null |
| 2024-01-25 | Sparse and Transferable Universal Singular Vectors Attack | 稀疏且可转移的通用奇异向量攻击 | Kseniia Kuvshinova, Olga Tsymboi, Ivan Oseledets | http://arxiv.org/pdf/2401.14031v1 | null |
| 2024-01-25 | An Extensible Framework for Open Heterogeneous Collaborative Perception | 开放异构协作感知的可扩展框架 | Yifan Lu, Yue Hu, Yiqi Zhong, Dequan Wang, Siheng Chen, Yanfeng Wang | http://arxiv.org/pdf/2401.13964v1 | link |
| 2024-01-25 | Conditional Neural Video Coding with Spatial-Temporal Super-Resolution | 具有时空超分辨率的条件神经视频编码 | Henan Wang, Xiaohan Pan, Runsen Feng, Zongyu Guo, Zhibo Chen | http://arxiv.org/pdf/2401.13959v1 | null |