Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-15 | Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation | 用于文本到图像生成的扩散模型的自博弈微调 | Huizhuo Yuan, Zixiang Chen, Kaixuan Ji, Quanquan Gu | http://arxiv.org/pdf/2402.10210v1 | null |
2024-02-15 | Recovering the Pre-Fine-Tuning Weights of Generative Models | 恢复生成模型的预微调权重 | Eliahu Horwitz, Jonathan Kahana, Yedid Hoshen | http://arxiv.org/pdf/2402.10208v1 | null |
2024-02-15 | Radio-astronomical Image Reconstruction with Conditional Denoising Diffusion Model | 条件去噪扩散模型的射电天文图像重建 | Mariia Drozdova, Vitaliy Kinakh, Omkar Bait, Olga Taran, Erica Lastufka, Miroslava Dessauges-Zavadsky, Taras Holotyak, Daniel Schaerer, Slava Voloshynovskiy | http://arxiv.org/pdf/2402.10204v1 | null |
2024-02-15 | Robust semi-automatic vessel tracing in the human retinal image by an instance segmentation neural network | 通过实例分割神经网络在人类视网膜图像中进行鲁棒的半自动血管追踪 | Siyi Chen, Amir H. Kashani, Ji Yi | http://arxiv.org/pdf/2402.10055v1 | null |
2024-02-15 | Data Augmentation and Transfer Learning Approaches Applied to Facial Expressions Recognition | 应用于面部表情识别的数据增强和迁移学习方法 | Enrico Randellini, Leonardo Rigutini, Claudio Sacca' | http://arxiv.org/pdf/2402.09982v1 | null |
2024-02-15 | Textual Localization: Decomposing Multi-concept Images for Subject-Driven Text-to-Image Generation | 文本定位:分解多概念图像以实现主题驱动的文本到图像生成 | Junjie Shentu, Matthew Watson, Noura Al Moubayed | http://arxiv.org/pdf/2402.09966v1 | null |
2024-02-15 | Lester: rotoscope animation through video object segmentation and tracking | Lester:通过视频对象分割和跟踪制作转描动画 | Ruben Tous | http://arxiv.org/pdf/2402.09883v1 | null |
2024-02-15 | DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization | DreamMatcher:外观匹配自注意力,实现语义一致的文本到图像个性化 | Jisu Nam, Heesu Kim, DongJae Lee, Siyoon Jin, Seungryong Kim, Seunggyu Chang | http://arxiv.org/pdf/2402.09812v1 | null |
2024-02-15 | Examining Pathological Bias in a Generative Adversarial Network Discriminator: A Case Study on a StyleGAN3 Model | 检查生成对抗网络鉴别器中的病理偏差:StyleGAN3 模型的案例研究 | Alvin Grissom II, Ryan F. Lei, Jeova Farias Sales Rocha Neto, Bailey Lin, Ryan Trotter | http://arxiv.org/pdf/2402.09786v1 | null |
2024-02-15 | Diffusion Model with Cross Attention as an Inductive Bias for Disentanglement | 以交叉注意力作为解耦归纳偏置的扩散模型 | Tao Yang, Cuiling Lan, Yan Lu, Nanning Zheng | http://arxiv.org/pdf/2402.09712v1 | null |
2024-02-15 | Prompt-based Personalized Federated Learning for Medical Visual Question Answering | 基于提示的个性化联邦学习用于医学视觉问答 | He Zhu, Ren Togo, Takahiro Ogawa, Miki Haseyama | http://arxiv.org/pdf/2402.09677v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-15 | MM-Point: Multi-View Information-Enhanced Multi-Modal Self-Supervised 3D Point Cloud Understanding | MM-Point:多视图信息增强的多模态自监督 3D 点云理解 | Hai-Tao Yu, Mofei Song | http://arxiv.org/pdf/2402.10002v1 | null |
2024-02-15 | LLMs as Bridges: Reformulating Grounded Multimodal Named Entity Recognition | 大语言模型作为桥梁:重新表述基于定位的多模态命名实体识别 | Jinyuan Li, Han Li, Di Sun, Jiahao Wang, Wenkun Zhang, Zan Wang, Gang Pan | http://arxiv.org/pdf/2402.09989v1 | null |
2024-02-15 | EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimodal Large Language Models | EFUF:有效的细粒度遗忘框架,用于减轻多模态大语言模型中的幻觉 | Shangyu Xing, Fei Zhao, Zhen Wu, Tuo An, Weihao Chen, Chunhui Li, Jianbing Zhang, Xinyu Dai | http://arxiv.org/pdf/2402.09801v1 | null |
2024-02-15 | Visually Dehallucinative Instruction Generation: Know What You Don't Know | 视觉去幻觉指令生成:知道你不知道的东西 | Sungguk Cha, Jusung Lee, Younghyun Lee, Cheoljong Yang | http://arxiv.org/pdf/2402.09717v1 | null |
2024-02-15 | Exploiting Alpha Transparency In Language And Vision-Based AI Systems | 在基于语言和视觉的人工智能系统中利用 Alpha 透明度 | David Noever, Forrest McKee | http://arxiv.org/pdf/2402.09671v1 | null |
2024-02-15 | VisIRNet: Deep Image Alignment for UAV-taken Visible and Infrared Image Pairs | VisIRNet:无人机拍摄的可见光和红外图像对的深度图像对齐 | Sedat Ozer, Alain P. Ndigande | http://arxiv.org/pdf/2402.09635v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-15 | GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering | GES:用于高效辐射场渲染的广义指数泼溅 | Abdullah Hamdi, Luke Melas-Kyriazi, Guocheng Qian, Jinjie Mai, Ruoshi Liu, Carl Vondrick, Bernard Ghanem, Andrea Vedaldi | http://arxiv.org/pdf/2402.10128v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-15 | Hybrid CNN Bi-LSTM neural network for Hyperspectral image classification | 用于高光谱图像分类的混合 CNN Bi-LSTM 神经网络 | Alok Ranjan Sahoo, Pavan Chakraborty | http://arxiv.org/pdf/2402.10026v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-15 | Is Continual Learning Ready for Real-world Challenges? | 持续学习准备好应对现实世界的挑战了吗? | Theodora Kontogianni, Yuanwen Yue, Siyu Tang, Konrad Schindler | http://arxiv.org/pdf/2402.10130v1 | null |
2024-02-15 | MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations | MIM-Refiner:中间预训练表示的对比学习提升 | Benedikt Alkin, Lukas Miklautz, Sepp Hochreiter, Johannes Brandstetter | http://arxiv.org/pdf/2402.10093v1 | null |
2024-02-15 | Investigation of Federated Learning Algorithms for Retinal Optical Coherence Tomography Image Classification with Statistical Heterogeneity | 具有统计异质性的视网膜光学相干断层扫描图像分类的联邦学习算法研究 | Sanskar Amgain, Prashant Shrestha, Sophia Bano, Ignacio del Valle Torres, Michael Cunniffe, Victor Hernandez, Phil Beales, Binod Bhattarai | http://arxiv.org/pdf/2402.10035v1 | null |
2024-02-15 | SAWEC: Sensing-Assisted Wireless Edge Computing | SAWEC:传感辅助无线边缘计算 | Khandaker Foysal Haque, Francesca Meneghello, Md. Ebtidaul Karim, Francesco Restuccia | http://arxiv.org/pdf/2402.10021v1 | null |
2024-02-15 | TIAViz: A Browser-based Visualization Tool for Computational Pathology Models | TIAViz:基于浏览器的计算病理学模型可视化工具 | Mark Eastwood, John Pocock, Mostafa Jahanifar, Adam Shephard, Skiros Habib, Ethar Alzaid, Abdullah Alsalemi, Jan Lukas Robertus, Nasir Rajpoot, Shan Raza, et al. | http://arxiv.org/pdf/2402.09990v1 | null |
2024-02-15 | Current and future roles of artificial intelligence in retinopathy of prematurity | 人工智能当前和未来在早产儿视网膜病变中的作用 | Ali Jafarizadeh, Shadi Farabi Maleki, Parnia Pouya, Navid Sobhi, Mirsaeed Abdollahi, Siamak Pedrammehr, Chee Peng Lim, Houshyar Asadi, Roohallah Alizadehsani, Ru-San Tan, et al. | http://arxiv.org/pdf/2402.09975v1 | null |
2024-02-15 | ViGEO: an Assessment of Vision GNNs in Earth Observation | ViGEO:对地球观测中视觉 GNN 的评估 | Luca Colomba, Paolo Garza | http://arxiv.org/pdf/2402.09962v1 | null |
2024-02-15 | Social Reward: Evaluating and Enhancing Generative AI through Million-User Feedback from an Online Creative Community | 社会奖励:通过在线创意社区的数百万用户反馈评估和增强生成式人工智能 | Arman Isajanyan, Artur Shatveryan, David Kocharyan, Zhangyang Wang, Humphrey Shi | http://arxiv.org/pdf/2402.09872v1 | null |
2024-02-15 | Characterizing Accuracy Trade-offs of EEG Applications on Embedded HMPs | 表征嵌入式 HMP 上 EEG 应用的准确性权衡 | Zain Taufique, Muhammad Awais Bin Altaf, Antonio Miele, Pasi Liljeberg, Anil Kanduri | http://arxiv.org/pdf/2402.09867v1 | null |
2024-02-15 | Beyond Kalman Filters: Deep Learning-Based Filters for Improved Object Tracking | 超越卡尔曼滤波器:用于改进对象跟踪的基于深度学习的滤波器 | Momir Adžemović, Predrag Tadić, Andrija Petrović, Mladen Nikolić | http://arxiv.org/pdf/2402.09865v1 | null |
2024-02-15 | Mind the Modality Gap: Towards a Remote Sensing Vision-Language Model via Cross-modal Alignment | 注意模态差距:通过跨模态对齐实现遥感视觉语言模型 | Angelos Zavras, Dimitrios Michail, Begüm Demir, Ioannis Papoutsis | http://arxiv.org/pdf/2402.09816v1 | null |
2024-02-15 | TEXTRON: Weakly Supervised Multilingual Text Detection through Data Programming | TEXTRON:通过数据编程进行弱监督多语言文本检测 | Dhruv Kudale, Badri Vishal Kasuba, Venkatapathy Subramanian, Parag Chaudhuri, Ganesh Ramakrishnan | http://arxiv.org/pdf/2402.09811v1 | null |
2024-02-15 | A Comprehensive Review on Computer Vision Analysis of Aerial Data | 航空数据计算机视觉分析的综合综述 | Vivek Tetarwal, Sandeep Kumar | http://arxiv.org/pdf/2402.09781v1 | null |
2024-02-15 | Less is more: Ensemble Learning for Retinal Disease Recognition Under Limited Resources | 少即是多:有限资源下的视网膜疾病识别集成学习 | Jiahao Wang, Hong Peng, Shengchao Chen, Sufen Ren | http://arxiv.org/pdf/2402.09747v1 | null |
2024-02-15 | Region Feature Descriptor Adapted to High Affine Transformations | 适应高仿射变换的区域特征描述符 | Shaojie Zhang, Yinghui Wang, Peixuan Liu, Jinlong Yang, Tao Yan, Liangyi Huang, Mingfeng Wang | http://arxiv.org/pdf/2402.09724v1 | null |
2024-02-15 | Hand Shape and Gesture Recognition using Multiscale Template Matching, Background Subtraction and Binary Image Analysis | 使用多尺度模板匹配、背景扣除和二值图像分析进行手形和手势识别 | Ketan Suhaas Saichandran | http://arxiv.org/pdf/2402.09663v1 | null |
2024-02-15 | Spatiotemporal Disentanglement of Arteriovenous Malformations in Digital Subtraction Angiography | 数字减影血管造影中动静脉畸形的时空解缠 | Kathleen Baur, Xin Xiong, Erickson Torio, Rose Du, Parikshit Juvekar, Reuben Dorent, Alexandra Golby, Sarah Frisken, Nazim Haouchine | http://arxiv.org/pdf/2402.09636v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-15 | X-maps: Direct Depth Lookup for Event-based Structured Light Systems | X-maps:基于事件的结构光系统的直接深度查找 | Wieland Morgenstern, Niklas Gard, Simon Baumann, Anna Hilsmann, Peter Eisert | http://arxiv.org/pdf/2402.10061v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-15 | RS-DPO: A Hybrid Rejection Sampling and Direct Preference Optimization Method for Alignment of Large Language Models | RS-DPO:一种用于大型语言模型对齐的混合拒绝采样和直接偏好优化方法 | Saeed Khaki, JinJin Li, Lan Ma, Liu Yang, Prathap Ramachandra | http://arxiv.org/pdf/2402.10038v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-15 | Any-Shift Prompting for Generalization over Distributions | Any-Shift 提示对分布的泛化 | Zehao Xiao, Jiayi Shen, Mohammad Mahdi Derakhshani, Shengcai Liao, Cees G. M. Snoek | http://arxiv.org/pdf/2402.10099v1 | null |
2024-02-15 | NYCTALE: Neuro-Evidence Transformer for Adaptive and Personalized Lung Nodule Invasiveness Prediction | NYCTALE:用于自适应和个性化肺结节侵袭性预测的神经证据 Transformer | Sadaf Khademi, Anastasia Oikonomou, Konstantinos N. Plataniotis, Arash Mohammadi | http://arxiv.org/pdf/2402.10066v1 | null |
2024-02-15 | Feature Accentuation: Revealing 'What' Features Respond to in Natural Images | 特征强调:揭示自然图像中“什么”特征的反应 | Chris Hamblin, Thomas Fel, Srijani Saha, Talia Konkle, George Alvarez | http://arxiv.org/pdf/2402.10039v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-15 | Reg-NF: Efficient Registration of Implicit Surfaces within Neural Fields | Reg-NF:神经场内隐式表面的有效配准 | Stephen Hausler, David Hall, Sutharsan Mahendren, Peyman Moghadam | http://arxiv.org/pdf/2402.09722v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-15 | Seed Optimization with Frozen Generator for Superior Zero-shot Low-light Enhancement | 使用冻结生成器进行种子优化,实现卓越的零样本低光增强 | Yuxuan Gu, Yi Jin, Ben Wang, Zhixiang Wei, Xiaoxiao Ma, Pengyang Ling, Haoxuan Wang, Huaian Chen, Enhong Chen | http://arxiv.org/pdf/2402.09694v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-15 | Enhancing signal detectability in learning-based CT reconstruction with a model observer inspired loss function | 利用模型观察者启发的损失函数增强基于学习的 CT 重建中的信号可检测性 | Megan Lantz, Emil Y. Sidky, Ingrid S. Reiser, Xiaochuan Pan, Gregory Ongie | http://arxiv.org/pdf/2402.10010v1 | null |
2024-02-15 | POBEVM: Real-time Video Matting via Progressively Optimize the Target Body and Edge | POBEVM:通过逐步优化目标主体和边缘进行实时视频抠图 | Jianming Xian | http://arxiv.org/pdf/2402.09731v1 | null |
2024-02-15 | Towards Precision Cardiovascular Analysis in Zebrafish: The ZACAF Paradigm | 实现斑马鱼精密心血管分析:ZACAF 范式 | Amir Mohammad Naderi, Jennifer G. Casey, Mao-Hsiang Huang, Rachelle Victorio, David Y. Chiang, Calum MacRae, Hung Cao, Vandana A. Gupta | http://arxiv.org/pdf/2402.09658v1 | null |
2024-02-15 | Foul prediction with estimated poses from soccer broadcast video | 根据足球转播视频中的估计姿势进行犯规预测 | Jiale Fang, Calvin Yeung, Keisuke Fujii | http://arxiv.org/pdf/2402.09650v1 | null |