2024-11-09.md

[UPDATED!] 2024-11-09 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-11-09 | Exploring Out-of-distribution Detection for Sparse-view Computed Tomography with Diffusion Models | 基于扩散模型的稀疏视图计算机断层扫描分布外检测研究 | Ezgi Demircan-Tureyen, Felix Lucka, Tristan van Leeuwen | http://arxiv.org/pdf/2411.06308v1 | null |
| 2024-11-09 | Text2CAD: Text to 3D CAD Generation via Technical Drawings | 文本到CAD:通过技术图纸生成3D CAD | Mohsen Yavartanoo, Sangmin Hong, Reyhaneh Neshatavar, Kyoung Mu Lee | http://arxiv.org/pdf/2411.06206v1 | null |
| 2024-11-09 | Multi-object Tracking by Detection and Query: an efficient end-to-end manner | 基于检测与查询的多目标跟踪:一种高效端到端方法 | Shukun Jia, Yichao Cao, Feng Yang, Xin Lu, Xiaobo Lu | http://arxiv.org/pdf/2411.06197v1 | null |
| 2024-11-09 | Scalable, Tokenization-Free Diffusion Model Architectures with Efficient Initial Convolution and Fixed-Size Reusable Structures for On-Device Image Generation | 可扩展、无分词的扩散模型架构,带有高效的初始卷积和固定大小的可重复结构,用于设备上的图像生成 | Sanchar Palit, Sathya Veera Reddy Dendi, Mallikarjuna Talluri, Raj Narayana Gadde | http://arxiv.org/pdf/2411.06119v1 | null |
| 2024-11-09 | AI-Driven Stylization of 3D Environments | 人工智能驱动的3D环境风格化 | Yuanbo Chen, Yixiao Kang, Yukun Song, Cyrus Vachha, Sining Huang | http://arxiv.org/pdf/2411.06067v1 | null |
| 2024-11-09 | Towards Kinetic Manipulation of the Latent Space | 向潜在空间的动力学操作 | Diego Porres | http://arxiv.org/pdf/2409.09867v2 | link |
| 2024-11-09 | Neural Gaffer: Relighting Any Object via Diffusion | 神经光栅:通过扩散重光照任何物体 | Haian Jin, Yuan Li, Fujun Luan, Yuanbo Xiangli, Sai Bi, Kai Zhang, Zexiang Xu, Jin Sun, Noah Snavely | http://arxiv.org/pdf/2406.07520v2 | null |
| 2024-11-09 | Disentangling Hippocampal Shape Variations: A Study of Neurological Disorders Using Mesh Variational Autoencoder with Contrastive Learning | 解析海马体形状变化:基于对比学习的网格变分自编码器在神经疾病研究中的应用 | Jakaria Rabbi, Johannes Kiechle, Christian Beaulieu, Nilanjan Ray, Dana Cobzas | http://arxiv.org/pdf/2404.00785v3 | link |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-11-09 | Personalize to generalize: Towards a universal medical multi-modality generalization through personalization | 个性化到泛化:通过个性化实现通用医学多模态泛化 | Zhaorui Tan, Xi Yang, Tan Pan, Tianyi Liu, Chen Jiang, Xin Guo, Qiufeng Wang, Anh Nguyen, Yuan Qi, Kaizhu Huang, et al. | http://arxiv.org/pdf/2411.06106v1 | null |
| 2024-11-09 | An Empirical Analysis on Spatial Reasoning Capabilities of Large Multimodal Models | 大模态模型空间推理能力实证分析 | Fatemeh Shiri, Xiao-Yu Guo, Mona Golestan Far, Xin Yu, Gholamreza Haffari, Yuan-Fang Li | http://arxiv.org/pdf/2411.06048v1 | null |
| 2024-11-09 | COSMIC: Compress Satellite Images Efficiently via Diffusion Compensation | COSMIC:通过扩散补偿高效压缩卫星图像 | Ziyuan Zhang, Han Qiu, Maosen Zhang, Jun Liu, Bin Chen, Tianwei Zhang, Hewu Li | http://arxiv.org/pdf/2410.01698v2 | link |
| 2024-11-09 | DriveGPT4: Interpretable End-to-end Autonomous Driving via Large Language Model | DriveGPT4:通过大型语言模型实现的解释性端到端自动驾驶 | Zhenhua Xu, Yujia Zhang, Enze Xie, Zhen Zhao, Yong Guo, Kwan-Yee. K. Wong, Zhenguo Li, Hengshuang Zhao | http://arxiv.org/pdf/2310.01412v5 | null |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-11-09 | UC-NeRF: Uncertainty-aware Conditional Neural Radiance Fields from Endoscopic Sparse Views | UC-NeRF:基于内窥镜稀疏视图的感知不确定性的条件神经辐射场 | Jiaxin Guo, Jiangliu Wang, Ruofeng Wei, Di Kang, Qi Dou, Yun-hui Liu | http://arxiv.org/pdf/2409.02917v2 | link |

3DGS

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-11-09 | GaussianSpa: An "Optimizing-Sparsifying" Simplification Framework for Compact and High-Quality 3D Gaussian Splatting | 高斯斯普莱特优化稀疏简化框架:用于紧凑和高质量3D高斯斯普莱特的“优化-稀疏”简化框架 | Yangming Zhang, Wenqi Jia, Wei Niu, Miao Yin | http://arxiv.org/pdf/2411.06019v1 | null |
| 2024-11-09 | Implicit Gaussian Splatting with Efficient Multi-Level Tri-Plane Representation | 隐式高斯碎喷与高效多级三平面表示 | Minye Wu, Tinne Tuytelaars | http://arxiv.org/pdf/2408.10041v2 | null |
| 2024-11-09 | AtomGS: Atomizing Gaussian Splatting for High-Fidelity Radiance Field | 原子GS:高保真辐射场的高斯分层原子化 | Rong Liu, Rui Xu, Yue Hu, Meida Chen, Andrew Feng | http://arxiv.org/pdf/2405.12369v3 | link |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-11-09 | Zero-Shot NAS via the Suppression of Local Entropy Decrease | 通过抑制局部熵减少的零样本NAS方法 | Ning Wu, Han Huang, Yueting Xu, Zhifeng Hao | http://arxiv.org/pdf/2411.06236v1 | null |
| 2024-11-09 | Expansion Quantization Network: An Efficient Micro-emotion Annotation and Detection Framework | 扩展量化网络:一种高效的微表情标注与检测框架 | Jingyi Zhou, Senlin Luo, Haofan Chen | http://arxiv.org/pdf/2411.06160v1 | null |
| 2024-11-09 | Dynamic Textual Prompt For Rehearsal-free Lifelong Person Re-identification | 动态文本提示用于无需排练的终身人物重识别 | Hongyu Chen, Bingliang Jiao, Wenxuan Wang, Peng Wang | http://arxiv.org/pdf/2411.06023v1 | null |
| 2024-11-09 | Relational Self-supervised Distillation with Compact Descriptors for Image Copy Detection | 基于紧凑描述符的关系式自监督蒸馏图像复制检测 | Juntae Kim, Sungwon Woo, Jongho Nang | http://arxiv.org/pdf/2405.17928v5 | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-11-09 | Hidden in Plain Sight: Evaluating Abstract Shape Recognition in Vision-Language Models | 隐藏在平凡之中:评估视觉-语言模型中的抽象形状识别 | Arshia Hemmat, Adam Davies, Tom A. Lamb, Jianhao Yuan, Philip Torr, Ashkan Khakzar, Francesco Pinto | http://arxiv.org/pdf/2411.06287v1 | null |
| 2024-11-09 | LSSInst: Improving Geometric Modeling in LSS-Based BEV Perception with Instance Representation | LSSInst:基于实例表示提升LSS在BEV感知中几何建模的方法 | Weijie Ma, Jingwei Jiang, Yang Yang, Zehui Chen, Hao Chen | http://arxiv.org/pdf/2411.06173v1 | null |
| 2024-11-09 | LT-DARTS: An Architectural Approach to Enhance Deep Long-Tailed Learning | LT-DARTS:增强深度长尾学习的架构方法 | Yuhan Pan, Yanan Sun, Wei Gong | http://arxiv.org/pdf/2411.06098v1 | null |
| 2024-11-09 | Pattern Integration and Enhancement Vision Transformer for Self-Supervised Learning in Remote Sensing | 遥感中的自监督学习:模式集成与增强视觉Transformer | Kaixuan Lu, Ruiqian Zhang, Xiao Huang, Yuxing Xie, Xiaogang Ning, Hanchao Zhang, Mengke Yuan, Pan Zhang, Tao Wang, Tongkui Liao | http://arxiv.org/pdf/2411.06091v1 | null |
| 2024-11-09 | GlocalCLIP: Object-agnostic Global-Local Prompt Learning for Zero-shot Anomaly Detection | GlocalCLIP:面向零样本异常检测的对象无关全局-局部提示学习 | Jiyul Ham, Yonggon Jung, Jun-Geol Baek | http://arxiv.org/pdf/2411.06071v1 | null |
| 2024-11-09 | MagicFace: Training-free Universal-Style Human Image Customized Synthesis | 无训练通用风格人脸图像定制合成 | Yibin Wang, Weizhong Zhang, Cheng Jin | http://arxiv.org/pdf/2408.07433v4 | null |
| 2024-11-09 | MSA$^2$Net: Multi-scale Adaptive Attention-guided Network for Medical Image Segmentation | MSA$^2$Net:多尺度自适应注意力引导网络用于医学图像分割 | Sina Ghorbani Kolahi, Seyed Kamal Chaharsooghi, Toktam Khatibi, Afshin Bozorgpour, Reza Azad, Moein Heidari, Ilker Hacihaliloglu, Dorit Merhof | http://arxiv.org/pdf/2407.21640v3 | link |
| 2024-11-09 | AttEntropy: On the Generalization Ability of Supervised Semantic Segmentation Transformers to New Objects in New Domains | AttEntropy:关于监督语义分割Transformer对新领域新对象的泛化能力研究 | Krzysztof Lis, Matthias Rottmann, Annika Mütze, Sina Honari, Pascal Fua, Mathieu Salzmann | http://arxiv.org/pdf/2212.14397v2 | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-11-09 | Aquila-plus: Prompt-Driven Visual-Language Models for Pixel-Level Remote Sensing Image Understanding | Aquila-plus:基于提示的视觉语言模型,用于像素级遥感图像理解 | Kaixuan Lu | http://arxiv.org/pdf/2411.06142v1 | null |
| 2024-11-09 | Aquila: A Hierarchically Aligned Visual-Language Model for Enhanced Remote Sensing Image Comprehension | Aquila:一种用于增强遥感图像理解的层次对齐视觉-语言模型 | Kaixuan Lu, Ruiqian Zhang, Xiao Huang, Yuxing Xie | http://arxiv.org/pdf/2411.06074v1 | null |
| 2024-11-09 | Give me a hint: Can LLMs take a hint to solve math problems? | 大型语言模型能否根据提示解决数学问题? | Vansh Agrawal, Pratham Singla, Amitoj Singh Miglani, Shivank Garg, Ayush Mangal | http://arxiv.org/pdf/2410.05915v2 | null |
| 2024-11-09 | ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs | ConMe:重新思考现代视觉语言模型组合推理的评估方法 | Irene Huang, Wei Lin, M. Jehanzeb Mirza, Jacob A. Hansen, Sivan Doveh, Victor Ion Butoi, Roei Herzig, Assaf Arbelle, Hilde Kuhene, Trevor Darrel, et al. | http://arxiv.org/pdf/2406.08164v2 | link |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-11-09 | NeuReg: Domain-invariant 3D Image Registration on Human and Mouse Brains | NeuReg:人类和小鼠大脑的域不变3D图像配准 | Taha Razzaq, Asim Iqbal | http://arxiv.org/pdf/2411.06315v1 | null |
| 2024-11-09 | Adaptive Aspect Ratios with Patch-Mixup-ViT-based Vehicle ReID | 基于Patch-Mixup-ViT的车辆重识别自适应宽高比 | Mei Qiu, Lauren Ann Christopher, Stanley Chien, Lingxi Li | http://arxiv.org/pdf/2411.06297v1 | null |
| 2024-11-09 | TranSPORTmer: A Holistic Approach to Trajectory Understanding in Multi-Agent Sports | TranSPORTmer:多智能体运动轨迹理解的整体方法 | Guillem Capellera, Luis Ferraz, Antonio Rubio, Antonio Agudo, Francesc Moreno-Noguer | http://arxiv.org/pdf/2410.17785v2 | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-11-09 | Crowd3D++: Robust Monocular Crowd Reconstruction with Upright Space | Crowd3D++:基于竖直空间的鲁棒单目人群重建 | Jing Huang, Hao Wen, Tianyi Zhou, Haozhe Lin, Yu-Kun Lai, Kun Li | http://arxiv.org/pdf/2411.06232v1 | null |
| 2024-11-09 | Epi-NAF: Enhancing Neural Attenuation Fields for Limited-Angle CT with Epipolar Consistency Conditions | Epi-NAF:基于视差一致性条件的有限角度CT神经衰减场增强 | Daniel Gilo, Tzofi Klinghoffer, Or Litany | http://arxiv.org/pdf/2411.06181v1 | null |
| 2024-11-09 | PointCG: Self-supervised Point Cloud Learning via Joint Completion and Generation | 点云自监督学习:联合完成与生成 | Yun Liu, Peng Li, Xuefeng Yan, Liangliang Nan, Bing Wang, Honghua Chen, Lina Gong, Wei Zhao, Mingqiang Wei | http://arxiv.org/pdf/2411.06041v1 | null |
| 2024-11-09 | Transientangelo: Few-Viewpoint Surface Reconstruction Using Single-Photon Lidar | 瞬变天使:单光子激光雷达的少视角表面重建 | Weihan Luo, Anagh Malik, David B. Lindell | http://arxiv.org/pdf/2408.12191v3 | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-11-09 | HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics | HERMES:基于片段和语义的时间一致长形式理解 | Gueter Josmy Faure, Jia-Fong Yeh, Min-Hung Chen, Hung-Ting Su, Shang-Hong Lai, Winston H. Hsu | http://arxiv.org/pdf/2408.17443v3 | link |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-11-09 | Alleviating Hyperparameter-Tuning Burden in SVM Classifiers for Pulmonary Nodules Diagnosis with Multi-Task Bayesian Optimization | 减轻SVM分类器在肺结节诊断中多任务贝叶斯优化的超参数调整负担 | Wenhao Chi, Haiping Liu, Hongqiao Dong, Wenhua Liang, Bo Liu | http://arxiv.org/pdf/2411.06184v1 | null |