[UPDATED!] 2024-02-15 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-15 | Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation | 用于文本到图像生成的扩散模型的自玩微调 | Huizhuo Yuan, Zixiang Chen, Kaixuan Ji, Quanquan Gu | http://arxiv.org/pdf/2402.10210v1 | null |
| 2024-02-15 | Recovering the Pre-Fine-Tuning Weights of Generative Models | 恢复生成模型的预微调权重 | Eliahu Horwitz, Jonathan Kahana, Yedid Hoshen | http://arxiv.org/pdf/2402.10208v1 | null |
| 2024-02-15 | Radio-astronomical Image Reconstruction with Conditional Denoising Diffusion Model | 条件去噪扩散模型的射电天文图像重建 | Mariia Drozdova, Vitaliy Kinakh, Omkar Bait, Olga Taran, Erica Lastufka, Miroslava Dessauges-Zavadsky, Taras Holotyak, Daniel Schaerer, Slava Voloshynovskiy | http://arxiv.org/pdf/2402.10204v1 | null |
| 2024-02-15 | Robust semi-automatic vessel tracing in the human retinal image by an instance segmentation neural network | 通过实例分割神经网络在人类视网膜图像中进行鲁棒的半自动血管追踪 | Siyi Chen, Amir H. Kashani, Ji Yi | http://arxiv.org/pdf/2402.10055v1 | null |
| 2024-02-15 | Data Augmentation and Transfer Learning Approaches Applied to Facial Expressions Recognition | 应用于面部表情识别的数据增强和迁移学习方法 | Enrico Randellini, Leonardo Rigutini, Claudio Sacca' | http://arxiv.org/pdf/2402.09982v1 | null |
| 2024-02-15 | Textual Localization: Decomposing Multi-concept Images for Subject-Driven Text-to-Image Generation | 文本本地化:分解多概念图像以生成主题驱动的文本到图像 | Junjie Shentu, Matthew Watson, Noura Al Moubayed | http://arxiv.org/pdf/2402.09966v1 | null |
| 2024-02-15 | Lester: rotoscope animation through video object segmentation and tracking | Lester:通过视频对象分割和跟踪制作转描动画 | Ruben Tous | http://arxiv.org/pdf/2402.09883v1 | null |
| 2024-02-15 | DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization | DreamMatcher:外观匹配自我关注,实现语义一致的文本到图像个性化 | Jisu Nam, Heesu Kim, DongJae Lee, Siyoon Jin, Seungryong Kim, Seunggyu Chang | http://arxiv.org/pdf/2402.09812v1 | null |
| 2024-02-15 | Examining Pathological Bias in a Generative Adversarial Network Discriminator: A Case Study on a StyleGAN3 Model | 检查生成对抗网络鉴别器中的病理偏差:StyleGAN3 模型的案例研究 | Alvin Grissom II, Ryan F. Lei, Jeova Farias Sales Rocha Neto, Bailey Lin, Ryan Trotter | http://arxiv.org/pdf/2402.09786v1 | null |
| 2024-02-15 | Diffusion Model with Cross Attention as an Inductive Bias for Disentanglement | 具有交叉注意力的扩散模型作为解开的归纳偏差 | Tao Yang, Cuiling Lan, Yan Lu, Nanning zheng | http://arxiv.org/pdf/2402.09712v1 | null |
| 2024-02-15 | Prompt-based Personalized Federated Learning for Medical Visual Question Answering | 基于提示的个性化联合学习医学视觉问答 | He Zhu, Ren Togo, Takahiro Ogawa, Miki Haseyama | http://arxiv.org/pdf/2402.09677v1 | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-15 | MM-Point: Multi-View Information-Enhanced Multi-Modal Self-Supervised 3D Point Cloud Understanding | MM-Point:多视图信息增强的多模态自监督 3D 点云理解 | Hai-Tao Yu, Mofei Song | http://arxiv.org/pdf/2402.10002v1 | null |
| 2024-02-15 | LLMs as Bridges: Reformulating Grounded Multimodal Named Entity Recognition | 法学硕士作为桥梁:重新制定扎根多模态命名实体识别 | Jinyuan Li, Han Li, Di Sun, Jiahao Wang, Wenkun Zhang, Zan Wang, Gang Pan | http://arxiv.org/pdf/2402.09989v1 | null |
| 2024-02-15 | EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimodal Large Language Models | EFUF:有效的细粒度遗忘框架,用于减轻多模态大语言模型中的幻觉 | Shangyu Xing, Fei Zhao, Zhen Wu, Tuo An, Weihao Chen, Chunhui Li, Jianbing Zhang, Xinyu Dai | http://arxiv.org/pdf/2402.09801v1 | null |
| 2024-02-15 | Visually Dehallucinative Instruction Generation: Know What You Don't Know | 视觉去幻觉指令生成:知道你不知道的东西 | Sungguk Cha, Jusung Lee, Younghyun Lee, Cheoljong Yang | http://arxiv.org/pdf/2402.09717v1 | null |
| 2024-02-15 | Exploiting Alpha Transparency In Language And Vision-Based AI Systems | 在基于语言和视觉的人工智能系统中利用 Alpha 透明度 | David Noever, Forrest McKee | http://arxiv.org/pdf/2402.09671v1 | null |
| 2024-02-15 | VisIRNet: Deep Image Alignment for UAV-taken Visible and Infrared Image Pairs | VisIRNet:无人机拍摄的可见光和红外图像对的深度图像对齐 | Sedat Ozer, Alain P. Ndigande | http://arxiv.org/pdf/2402.09635v1 | null |

3DGS

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-15 | GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering | GES:用于高效辐射场渲染的广义指数泼溅 | Abdullah Hamdi, Luke Melas-Kyriazi, Guocheng Qian, Jinjie Mai, Ruoshi Liu, Carl Vondrick, Bernard Ghanem, Andrea Vedaldi | http://arxiv.org/pdf/2402.10128v1 | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-15 | Hybrid CNN Bi-LSTM neural network for Hyperspectral image classification | 用于高光谱图像分类的混合 CNN Bi-LSTM 神经网络 | Alok Ranjan Sahoo, Pavan Chakraborty | http://arxiv.org/pdf/2402.10026v1 | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-15 | Is Continual Learning Ready for Real-world Challenges? | 持续学习准备好应对现实世界的挑战了吗? | Theodora Kontogianni, Yuanwen Yue, Siyu Tang, Konrad Schindler | http://arxiv.org/pdf/2402.10130v1 | null |
| 2024-02-15 | MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations | MIM-Refiner:中间预训练表示的对比学习提升 | Benedikt Alkin, Lukas Miklautz, Sepp Hochreiter, Johannes Brandstetter | http://arxiv.org/pdf/2402.10093v1 | null |
| 2024-02-15 | Investigation of Federated Learning Algorithms for Retinal Optical Coherence Tomography Image Classification with Statistical Heterogeneity | 具有统计异质性的视网膜光学相干断层扫描图像分类的联邦学习算法研究 | Sanskar Amgain, Prashant Shrestha, Sophia Bano, Ignacio del Valle Torres, Michael Cunniffe, Victor Hernandez, Phil Beales, Binod Bhattarai | http://arxiv.org/pdf/2402.10035v1 | null |
| 2024-02-15 | SAWEC: Sensing-Assisted Wireless Edge Computing | SAWEC:传感辅助无线边缘计算 | Khandaker Foysal Haque, Francesca Meneghello, Md. Ebtidaul Karim, Francesco Restuccia | http://arxiv.org/pdf/2402.10021v1 | null |
| 2024-02-15 | TIAViz: A Browser-based Visualization Tool for Computational Pathology Models | TIAViz:基于浏览器的计算病理学模型可视化工具 | Mark Eastwood, John Pocock, Mostafa Jahanifar, Adam Shephard, Skiros Habib, Ethar Alzaid, Abdullah Alsalemi, Jan Lukas Robertus, Nasir Rajpoot, Shan Raza, et al. | http://arxiv.org/pdf/2402.09990v1 | null |
| 2024-02-15 | Current and future roles of artificial intelligence in retinopathy of prematurity | 人工智能当前和未来在早产儿视网膜病变中的作用 | Ali Jafarizadeh, Shadi Farabi Maleki, Parnia Pouya, Navid Sobhi, Mirsaeed Abdollahi, Siamak Pedrammehr, Chee Peng Lim, Houshyar Asadi, Roohallah Alizadehsani, Ru-San Tan, et al. | http://arxiv.org/pdf/2402.09975v1 | null |
| 2024-02-15 | ViGEO: an Assessment of Vision GNNs in Earth Observation | ViGEO:对地球观测中视觉 GNN 的评估 | Luca Colomba, Paolo Garza | http://arxiv.org/pdf/2402.09962v1 | null |
| 2024-02-15 | Social Reward: Evaluating and Enhancing Generative AI through Million-User Feedback from an Online Creative Community | 社会奖励:通过在线创意社区的数百万用户反馈评估和增强生成式人工智能 | Arman Isajanyan, Artur Shatveryan, David Kocharyan, Zhangyang Wang, Humphrey Shi | http://arxiv.org/pdf/2402.09872v1 | null |
| 2024-02-15 | Characterizing Accuracy Trade-offs of EEG Applications on Embedded HMPs | 表征嵌入式 HMP 上 EEG 应用的准确性权衡 | Zain Taufique, Muhammad Awais Bin Altaf, Antonio Miele, Pasi Liljeberg, Anil Kanduri | http://arxiv.org/pdf/2402.09867v1 | null |
| 2024-02-15 | Beyond Kalman Filters: Deep Learning-Based Filters for Improved Object Tracking | 超越卡尔曼滤波器:用于改进对象跟踪的基于深度学习的滤波器 | Momir Adžemović, Predrag Tadić, Andrija Petrović, Mladen Nikolić | http://arxiv.org/pdf/2402.09865v1 | null |
| 2024-02-15 | Mind the Modality Gap: Towards a Remote Sensing Vision-Language Model via Cross-modal Alignment | 注意模态差距:通过跨模态对齐实现遥感视觉语言模型 | Angelos Zavras, Dimitrios Michail, Begüm Demir, Ioannis Papoutsis | http://arxiv.org/pdf/2402.09816v1 | null |
| 2024-02-15 | TEXTRON: Weakly Supervised Multilingual Text Detection through Data Programming | TEXTRON:通过数据编程进行弱监督多语言文本检测 | Dhruv Kudale, Badri Vishal Kasuba, Venkatapathy Subramanian, Parag Chaudhuri, Ganesh Ramakrishnan | http://arxiv.org/pdf/2402.09811v1 | null |
| 2024-02-15 | A Comprehensive Review on Computer Vision Analysis of Aerial Data | 航空数据计算机视觉分析的综合综述 | Vivek Tetarwal, Sandeep Kumar | http://arxiv.org/pdf/2402.09781v1 | null |
| 2024-02-15 | Less is more: Ensemble Learning for Retinal Disease Recognition Under Limited Resources | 少即是多:有限资源下的视网膜疾病识别集成学习 | Jiahao Wang, Hong Peng, Shengchao Chen, Sufen Ren | http://arxiv.org/pdf/2402.09747v1 | null |
| 2024-02-15 | Region Feature Descriptor Adapted to High Affine Transformations | 适应高仿射变换的区域特征描述符 | Shaojie Zhang, Yinghui Wang, Peixuan Liu, Jinlong Yang, Tao Yan, Liangyi Huang, Mingfeng Wang | http://arxiv.org/pdf/2402.09724v1 | null |
| 2024-02-15 | Hand Shape and Gesture Recognition using Multiscale Template Matching, Background Subtraction and Binary Image Analysis | 使用多尺度模板匹配、背景扣除和二值图像分析进行手形和手势识别 | Ketan Suhaas Saichandran | http://arxiv.org/pdf/2402.09663v1 | null |
| 2024-02-15 | Spatiotemporal Disentanglement of Arteriovenous Malformations in Digital Subtraction Angiography | 数字减影血管造影中动静脉畸形的时空解缠 | Kathleen Baur, Xin Xiong, Erickson Torio, Rose Du, Parikshit Juvekar, Reuben Dorent, Alexandra Golby, Sarah Frisken, Nazim Haouchine | http://arxiv.org/pdf/2402.09636v1 | null |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-15 | X-maps: Direct Depth Lookup for Event-based Structured Light Systems | X-maps:基于事件的结构光系统的直接深度查找 | Wieland Morgenstern, Niklas Gard, Simon Baumann, Anna Hilsmann, Peter Eisert | http://arxiv.org/pdf/2402.10061v1 | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-15 | RS-DPO: A Hybrid Rejection Sampling and Direct Preference Optimization Method for Alignment of Large Language Models | RS-DPO:一种用于大型语言模型对齐的混合拒绝采样和直接偏好优化方法 | Saeed Khaki, JinJin Li, Lan Ma, Liu Yang, Prathap Ramachandra | http://arxiv.org/pdf/2402.10038v1 | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-15 | Any-Shift Prompting for Generalization over Distributions | Any-Shift 提示对分布的泛化 | Zehao Xiao, Jiayi Shen, Mohammad Mahdi Derakhshani, Shengcai Liao, Cees G. M. Snoek | http://arxiv.org/pdf/2402.10099v1 | null |
| 2024-02-15 | NYCTALE: Neuro-Evidence Transformer for Adaptive and Personalized Lung Nodule Invasiveness Prediction | NYCTALE:用于自适应和个性化肺结节侵袭性预测的神经证据变压器 | Sadaf Khademi, Anastasia Oikonomou, Konstantinos N. Plataniotis, Arash Mohammadi | http://arxiv.org/pdf/2402.10066v1 | null |
| 2024-02-15 | Feature Accentuation: Revealing 'What' Features Respond to in Natural Images | 特征强调:揭示自然图像中“什么”特征的反应 | Chris Hamblin, Thomas Fel, Srijani Saha, Talia Konkle, George Alvarez | http://arxiv.org/pdf/2402.10039v1 | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-15 | Reg-NF: Efficient Registration of Implicit Surfaces within Neural Fields | Reg-NF:神经场内隐式表面的有效配准 | Stephen Hausler, David Hall, Sutharsan Mahendren, Peyman Moghadam | http://arxiv.org/pdf/2402.09722v1 | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-15 | Seed Optimization with Frozen Generator for Superior Zero-shot Low-light Enhancement | 使用冷冻发生器进行种子优化,实现卓越的零次低光增强 | Yuxuan Gu, Yi Jin, Ben Wang, Zhixiang Wei, Xiaoxiao Ma, Pengyang Ling, Haoxuan Wang, Huaian Chen, Enhong Chen | http://arxiv.org/pdf/2402.09694v1 | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-15 | Enhancing signal detectability in learning-based CT reconstruction with a model observer inspired loss function | 利用模型观察者启发的损失函数增强基于学习的 CT 重建中的信号可检测性 | Megan Lantz, Emil Y. Sidky, Ingrid S. Reiser, Xiaochuan Pan, Gregory Ongie | http://arxiv.org/pdf/2402.10010v1 | null |
| 2024-02-15 | POBEVM: Real-time Video Matting via Progressively Optimize the Target Body and Edge | POBEVM:通过逐步优化目标主体和边缘进行实时视频抠图 | Jianming Xian | http://arxiv.org/pdf/2402.09731v1 | null |
| 2024-02-15 | Towards Precision Cardiovascular Analysis in Zebrafish: The ZACAF Paradigm | 实现斑马鱼精密心血管分析:ZACAF 范式 | Amir Mohammad Naderi, Jennifer G. Casey, Mao-Hsiang Huang, Rachelle Victorio, David Y. Chiang, Calum MacRae, Hung Cao, Vandana A. Gupta | http://arxiv.org/pdf/2402.09658v1 | null |
| 2024-02-15 | Foul prediction with estimated poses from soccer broadcast video | 根据足球转播视频中的估计姿势进行犯规预测 | Jiale Fang, Calvin Yeung, Keisuke Fujii | http://arxiv.org/pdf/2402.09650v1 | null |