[UPDATED!] 2024-04-05 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-05 | Identity Decoupling for Multi-Subject Personalization of Text-to-Image Models | 文本到图像模型多主体个性化的身份解耦 | Sangwon Jang, Jaehyeong Jo, Kimin Lee, Sung Ju Hwang | http://arxiv.org/pdf/2404.04243v1 | null |
| 2024-04-05 | Deep-learning Segmentation of Small Volumes in CT images for Radiotherapy Treatment Planning | 用于放射治疗计划的 CT 图像中小体积的深度学习分割 | Jianxin Zhou, Kadishe Fejza, Massimiliano Salvatori, Daniele Della Latta, Gregory M. Hermann, Angela Di Fulvio | http://arxiv.org/pdf/2404.04202v1 | null |
| 2024-04-05 | Dynamic Prompt Optimizing for Text-to-Image Generation | 文本到图像生成的动态提示优化 | Wenyi Mo, Tianyu Zhang, Yalong Bai, Bing Su, Ji-Rong Wen, Qing Yang | http://arxiv.org/pdf/2404.04095v1 | null |
| 2024-04-05 | Score identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation | 分数恒等蒸馏:用于一步生成的预训练扩散模型的指数快速蒸馏 | Mingyuan Zhou, Huangjie Zheng, Zhendong Wang, Mingzhang Yin, Hai Huang | http://arxiv.org/pdf/2404.04057v1 | null |
| 2024-04-05 | InstructHumans: Editing Animated 3D Human Textures with Instructions | InstructHumans:使用说明编辑动画 3D 人体纹理 | Jiayin Zhu, Linlin Yang, Angela Yao | http://arxiv.org/pdf/2404.04037v1 | null |
| 2024-04-05 | Physics-Inspired Synthesized Underwater Image Dataset | 受物理启发的合成水下图像数据集 | Reina Kaneko, Hiroshi Higashi, Yuichi Tanaka | http://arxiv.org/pdf/2404.03998v1 | null |
| 2024-04-05 | Concept Weaver: Enabling Multi-Concept Fusion in Text-to-Image Models | Concept Weaver:在文本到图像模型中实现多概念融合 | Gihyun Kwon, Simon Jenni, Dingzeyu Li, Joon-Young Lee, Jong Chul Ye, Fabian Caba Heilbron | http://arxiv.org/pdf/2404.03913v1 | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-05 | Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation | Sigma:用于多模态语义分割的暹罗曼巴网络 | Zifu Wan, Yuhao Wang, Silong Yong, Pingping Zhang, Simon Stepputtis, Katia Sycara, Yaqi Xie | http://arxiv.org/pdf/2404.04256v1 | null |
| 2024-04-05 | MM-Gaussian: 3D Gaussian-based Multi-modal Fusion for Localization and Reconstruction in Unbounded Scenes | MM-Gaussian:基于 3D 高斯的多模态融合,用于无界场景中的定位和重建 | Chenyang Wu, Yifan Duan, Xinran Zhang, Yu Sheng, Jianmin Ji, Yanyong Zhang | http://arxiv.org/pdf/2404.04026v1 | null |
| 2024-04-05 | Enhancing Breast Cancer Diagnosis in Mammography: Evaluation and Integration of Convolutional Neural Networks and Explainable AI | 增强乳房 X 光检查中的乳腺癌诊断:卷积神经网络和可解释人工智能的评估和集成 | Maryam Ahmed, Tooba Bibi, Rizwan Ahmed Khan, Sidra Nasir | http://arxiv.org/pdf/2404.03892v1 | null |
| 2024-04-05 | Mitigating Heterogeneity in Federated Multimodal Learning with Biomedical Vision-Language Pre-training | 通过生物医学视觉语言预训练减轻联合多模态学习中的异质性 | Zitao Shuai, Liyue Shen | http://arxiv.org/pdf/2404.03854v1 | null |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-05 | Robust Gaussian Splatting | 鲁棒高斯泼溅 | François Darmon, Lorenzo Porzi, Samuel Rota-Bulò, Peter Kontschieder | http://arxiv.org/pdf/2404.04211v1 | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-05 | Evaluating Adversarial Robustness: A Comparison Of FGSM, Carlini-Wagner Attacks, And The Role of Distillation as Defense Mechanism | 评估对抗鲁棒性:FGSM、Carlini-Wagner 攻击的比较以及蒸馏作为防御机制的作用 | Trilokesh Ranjan Sarkar, Nilanjan Das, Pralay Sankar Maitra, Bijoy Some, Ritwik Saha, Orijita Adhikary, Bishal Bose, Jaydip Sen | http://arxiv.org/pdf/2404.04245v1 | null |
| 2024-04-05 | Deep Learning for Satellite Image Time Series Analysis: A Review | 卫星图像时间序列分析的深度学习:综述 | Lynn Miller, Charlotte Pelletier, Geoffrey I. Webb | http://arxiv.org/pdf/2404.03936v1 | null |
| 2024-04-05 | VoltaVision: A Transfer Learning model for electronic component classification | VoltaVision:电子元件分类的迁移学习模型 | Anas Mohammad Ishfaqul Muktadir Osmani, Taimur Rahman, Salekul Islam | http://arxiv.org/pdf/2404.03898v1 | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-05 | Watermark-based Detection and Attribution of AI-Generated Content | 基于水印的 AI 生成内容检测和归因 | Zhengyuan Jiang, Moyang Guo, Yuepeng Hu, Neil Zhenqiang Gong | http://arxiv.org/pdf/2404.04254v1 | null |
| 2024-04-05 | Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation | 用于文本监督语义分割的图像-文本联合分解 | Ji-Jia Wu, Andy Chia-Hao Chang, Chieh-Yu Chuang, Chun-Pei Chen, Yu-Lun Liu, Min-Hung Chen, Hou-Ning Hu, Yung-Yu Chuang, Yen-Yu Lin | http://arxiv.org/pdf/2404.04231v1 | null |
| 2024-04-05 | SCAResNet: A ResNet Variant Optimized for Tiny Object Detection in Transmission and Distribution Towers | SCAResNet:针对输电和配电塔中微小物体检测而优化的 ResNet 变体 | Weile Li, Muqing Shi, Zhonghua Hong | http://arxiv.org/pdf/2404.04179v1 | null |
| 2024-04-05 | Noisy Label Processing for Classification: A Survey | 用于分类的噪声标签处理:调查 | Mengting Li, Chuang Zhu | http://arxiv.org/pdf/2404.04159v1 | null |
| 2024-04-05 | MarsSeg: Mars Surface Semantic Segmentation with Multi-level Extractor and Connector | MarsSeg:具有多级提取器和连接器的火星表面语义分割 | Junbo Li, Keyan Chen, Gengju Tian, Lu Li, Zhenwei Shi | http://arxiv.org/pdf/2404.04155v1 | null |
| 2024-04-05 | Improving Detection in Aerial Images by Capturing Inter-Object Relationships | 通过捕获对象间的关系来改进航空图像的检测 | Botao Ren, Botian Xu, Yifan Pu, Jingyi Wang, Zhidong Deng | http://arxiv.org/pdf/2404.04140v1 | null |
| 2024-04-05 | Label Propagation for Zero-shot Classification with Vision-Language Models | 使用视觉语言模型进行零样本分类的标签传播 | Vladan Stojnić, Yannis Kalantidis, Giorgos Tolias | http://arxiv.org/pdf/2404.04072v1 | null |
| 2024-04-05 | No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation | 没有时间训练:支持非参数网络进行少镜头 3D 场景分割 | Xiangyang Zhu, Renrui Zhang, Bowei He, Ziyu Guo, Jiaming Liu, Han Xiao, Chaoyou Fu, Hao Dong, Peng Gao | http://arxiv.org/pdf/2404.04050v1 | null |
| 2024-04-05 | Towards Efficient and Accurate CT Segmentation via Edge-Preserving Probabilistic Downsampling | 通过边缘保留概率下采样实现高效、准确的 CT 分割 | Shahzad Ali, Yu Rim Lee, Soo Young Park, Won Young Tak, Soon Ki Jung | http://arxiv.org/pdf/2404.03991v1 | null |
| 2024-04-05 | Learning Correlation Structures for Vision Transformers | 学习视觉变压器的相关结构 | Manjin Kim, Paul Hongsuck Seo, Cordelia Schmid, Minsu Cho | http://arxiv.org/pdf/2404.03924v1 | null |
| 2024-04-05 | LiDAR-Guided Cross-Attention Fusion for Hyperspectral Band Selection and Image Classification | LiDAR 引导的交叉注意融合用于高光谱波段选择和图像分类 | Judy X Yang, Jun Zhou, Jing Wang, Hui Tian, Wee Chung Liew | http://arxiv.org/pdf/2404.03883v1 | null |
| 2024-04-05 | Increasing Fairness in Classification of Out of Distribution Data for Facial Recognition | 提高面部识别的分布外数据分类的公平性 | Gianluca Barone, Aashrit Cunchala, Rudy Nunez | http://arxiv.org/pdf/2404.03876v1 | null |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-05 | 3D Facial Expressions through Analysis-by-Neural-Synthesis | 通过神经综合分析的 3D 面部表情 | George Retsinas, Panagiotis P. Filntisis, Radek Danecek, Victoria F. Abrevaya, Anastasios Roussos, Timo Bolkart, Petros Maragos | http://arxiv.org/pdf/2404.04104v1 | null |
| 2024-04-05 | LightOctree: Lightweight 3D Spatially-Coherent Indoor Lighting Estimation | LightOctree:轻量级 3D 空间相干室内照明估计 | Xuecan Wang, Shibang Xiao, Xiaohui Liang | http://arxiv.org/pdf/2404.03925v1 | null |
| 2024-04-05 | Deep Phase Coded Image Prior | 深相位编码图像先验 | Nimrod Shabtay, Eli Schwartz, Raja Giryes | http://arxiv.org/pdf/2404.03906v1 | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-05 | Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2) | 谁来评估评价?使用 T2IScoreScore (TS2) 客观地对文本到图像提示一致性指标进行评分 | Michael Saxon, Fatima Jahara, Mahsa Khoshnoodi, Yujie Lu, Aditya Sharma, William Yang Wang | http://arxiv.org/pdf/2404.04251v1 | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-05 | DiffOp-net: A Differential Operator-based Fully Convolutional Network for Unsupervised Deformable Image Registration | DiffOp-net:基于差分算子的全卷积网络,用于无监督可变形图像配准 | Jiong Wu | http://arxiv.org/pdf/2404.04244v1 | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-05 | Physical Property Understanding from Language-Embedded Feature Fields | 从语言嵌入的特征字段理解物理属性 | Albert J. Zhai, Yuan Shen, Emily Y. Chen, Gloria X. Wang, Xinlei Wang, Sheng Wang, Kaiyu Guan, Shenlong Wang | http://arxiv.org/pdf/2404.04242v1 | null |
| 2024-04-05 | RaSim: A Range-aware High-fidelity RGB-D Data Simulation Pipeline for Real-world Applications | RaSim:适用于实际应用的范围感知高保真 RGB-D 数据模拟管道 | Xingyu Liu, Chenyangguang Zhang, Gu Wang, Ruida Zhang, Xiangyang Ji | http://arxiv.org/pdf/2404.03962v1 | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-04-05 | Dynamic Risk Assessment Methodology with an LDM-based System for Parking Scenarios | 基于 LDM 的停车场景系统的动态风险评估方法 | Paola Natalia Cañas, Mikel García, Nerea Aranjuelo, Marcos Nieto, Aitor Iglesias, Igor Rodríguez | http://arxiv.org/pdf/2404.04040v1 | null |
| 2024-04-05 | Framework to generate perfusion map from CT and CTA images in patients with acute ischemic stroke: A longitudinal and cross-sectional study | 从急性缺血性中风患者的 CT 和 CTA 图像生成灌注图的框架:纵向和横断面研究 | Chayanin Tangwiriyasakul, Pedro Borges, Stefano Moriconi, Paul Wright, Yee-Haur Mah, James Teo, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso | http://arxiv.org/pdf/2404.04025v1 | null |
| 2024-04-05 | Neural-Symbolic VideoQA: Learning Compositional Spatio-Temporal Reasoning for Real-world Video Question Answering | 神经符号视频问答:学习现实世界视频问答的组合时空推理 | Lili Liang, Guanglu Sun, Jin Qiu, Lizhong Zhang | http://arxiv.org/pdf/2404.04007v1 | null |
| 2024-04-05 | Finsler-Laplace-Beltrami Operators with Application to Shape Analysis | Finsler-Laplace-Beltrami 算子在形状分析中的应用 | Simon Weber, Thomas Dagès, Maolin Gao, Daniel Cremers | http://arxiv.org/pdf/2404.03999v1 | null |
| 2024-04-05 | Rolling the dice for better deep learning performance: A study of randomness techniques in deep neural networks | 掷骰子以获得更好的深度学习性能:深度神经网络中随机性技术的研究 | Mohammed Ghaith Altarabichi, Sławomir Nowaczyk, Sepideh Pashami, Peyman Sheikholharam Mashhadi, Julia Handl | http://arxiv.org/pdf/2404.03992v1 | null |
| 2024-04-05 | Real-GDSR: Real-World Guided DSM Super-Resolution via Edge-Enhancing Residual Network | Real-GDSR:通过边缘增强残差网络实现现实世界引导的 DSM 超分辨率 | Daniel Panangian, Ksenia Bittner | http://arxiv.org/pdf/2404.03930v1 | null |