2024-11-10.md

[UPDATED!] 2024-11-10 (Publish Time)

Generative Models

Publish Date Title Title_CN Authors PDF Code
2024-11-10 DDIM-Driven Coverless Steganography Scheme with Real Key 基于DDIM的带真实密钥的无载体隐写方案 Mingyu Yu, Haonan Miao, Zhengping Jin, Sujuan Qing http://arxiv.org/pdf/2411.06486v1 null
2024-11-10 Improved Video VAE for Latent Video Diffusion Model 改进的视频VAE用于潜在视频扩散模型 Pingyu Wu, Kai Zhu, Yu Liu, Liming Zhao, Wei Zhai, Yang Cao, Zheng-Jun Zha http://arxiv.org/pdf/2411.06449v1 null
2024-11-10 Detecting AutoEncoder is Enough to Catch LDM Generated Images 检测自编码器足以捕捉LDM生成的图像 Dmitry Vesnin, Dmitry Levshun, Andrey Chechulin http://arxiv.org/pdf/2411.06441v1 null
2024-11-10 A Hybrid Approach for COVID-19 Detection: Combining Wasserstein GAN with Transfer Learning 基于Wasserstein GAN的COVID-19检测混合方法:结合迁移学习 Sumera Rounaq, Shahid Munir Shah, Mahmoud Aljawarneh, Sarah Khan, Ghulam Muhammad http://arxiv.org/pdf/2411.06397v1 null
2024-11-10 CityGuessr: City-Level Video Geo-Localization on a Global Scale CityGuessr:全球范围内的城市级别视频地理定位 Parth Parag Kulkarni, Gaurav Kumar Nayak, Mubarak Shah http://arxiv.org/pdf/2411.06344v1 null
2024-11-10 CausalDiff: Causality-Inspired Disentanglement via Diffusion Model for Adversarial Defense CausalDiff:基于扩散模型的因果启发解耦用于对抗防御 Mingkun Zhang, Keping Bi, Wei Chen, Quanrun Chen, Jiafeng Guo, Xueqi Cheng http://arxiv.org/pdf/2410.23091v2 null
2024-11-10 Uni-3DAD: GAN-Inversion Aided Universal 3D Anomaly Detection on Model-free Products 基于GAN反演的无模型产品通用3D异常检测:Uni-3DAD Jiayu Liu, Shancong Mou, Nathan Gaw, Yinan Wang http://arxiv.org/pdf/2408.16201v2 null
2024-11-10 AirSketch: Generative Motion to Sketch AirSketch:生成性运动至草图 Hui Xian Grace Lim, Xuanming Cui, Ser-Nam Lim, Yogesh S Rawat http://arxiv.org/pdf/2407.08906v2 null
2024-11-10 Automatic Fused Multimodal Deep Learning for Plant Identification 自动融合多模态深度学习植物识别 Alfreds Lapkovskis, Natalia Nefedova, Ali Beikmohammadi http://arxiv.org/pdf/2406.01455v2 link
2024-11-10 MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models MambaTalk:基于选择性状态空间模型的高效整体手势合成 Zunnan Xu, Yukang Lin, Haonan Han, Sicheng Yang, Ronghui Li, Yachao Zhang, Xiu Li http://arxiv.org/pdf/2403.09471v3 null
2024-11-10 UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control UniCtrl:通过无训练统一注意力控制提升文本到视频扩散模型的时空一致性 Tian Xia, Xuweiyi Chen, Sihan Xu http://arxiv.org/pdf/2403.02332v4 link
2024-11-10 Diffusion Models With Learned Adaptive Noise 学习自适应噪声的扩散模型 Subham Sekhar Sahoo, Aaron Gokaslan, Chris De Sa, Volodymyr Kuleshov http://arxiv.org/pdf/2312.13236v3 link

Multimodal

Publish Date Title Title_CN Authors PDF Code
2024-11-10 KMM: Key Frame Mask Mamba for Extended Motion Generation KMM:用于扩展运动生成的关键帧掩码Mamba Zeyu Zhang, Hang Gao, Akide Liu, Qi Chen, Feng Chen, Yiran Wang, Danning Li, Hao Tang http://arxiv.org/pdf/2411.06481v1 null
2024-11-10 A Multimodal Approach For Endoscopic VCE Image Classification Using BiomedCLIP-PubMedBERT 基于BiomedCLIP-PubMedBERT的多模态内镜VCE图像分类方法 Nagarajan Ganapathy, Podakanti Satyajith Chary, Teja Venkata Ramana Kumar Pithani, Pavan Kavati, Arun Kumar S http://arxiv.org/pdf/2410.19944v2 link
2024-11-10 TeaserGen: Generating Teasers for Long Documentaries TeaserGen:为长篇纪录片生成预告片 Weihan Xu, Paul Pu Liang, Haven Kim, Julian McAuley, Taylor Berg-Kirkpatrick, Hao-Wen Dong http://arxiv.org/pdf/2410.05586v2 null
2024-11-10 Visual Mamba: A Survey and New Outlooks 视觉Mamba:综述与新视角 Rui Xu, Shu Yang, Yihui Wang, Yu Cai, Bo Du, Hao Chen http://arxiv.org/pdf/2404.18861v3 link

NeRF

Publish Date Title Title_CN Authors PDF Code
2024-11-10 Through the Curved Cover: Synthesizing Cover Aberrated Scenes with Refractive Field 透过曲面盖板:利用折射场合成盖板畸变场景 Liuyue Xie, Jiancong Guo, Laszlo A. Jeni, Zhiheng Jia, Mingyang Li, Yunwen Zhou, Chao Guo http://arxiv.org/pdf/2411.06365v1 null

3DGS

Publish Date Title Title_CN Authors PDF Code
2024-11-10 Adaptive and Temporally Consistent Gaussian Surfels for Multi-view Dynamic Reconstruction 用于多视图动态重建的自适应且时间一致的高斯面元 Decai Chen, Brianne Oberson, Ingo Feldmann, Oliver Schreer, Anna Hilsmann, Peter Eisert http://arxiv.org/pdf/2411.06602v1 null
2024-11-10 SplatFormer: Point Transformer for Robust 3D Gaussian Splatting SplatFormer:用于鲁棒3D高斯Splatting的点变换器 Yutong Chen, Marko Mihajlovic, Xiyi Chen, Yiming Wang, Sergey Prokudin, Siyu Tang http://arxiv.org/pdf/2411.06390v1 null

Model Compression/Optimization

Publish Date Title Title_CN Authors PDF Code
2024-11-10 Diffusion Sampling Correction via Approximately 10 Parameters 基于约10个参数的扩散采样校正 Guangyi Wang, Wei Peng, Lijiang Li, Wenyu Chen, Yuren Cai, Songzhi Su http://arxiv.org/pdf/2411.06503v1 null
2024-11-10 RL-Pruner: Structured Pruning Using Reinforcement Learning for CNN Compression and Acceleration RL-Pruner:基于强化学习的CNN压缩与加速结构化剪枝 Boyao Wang, Volodymyr Kindratenko http://arxiv.org/pdf/2411.06463v1 null

Classification/Detection/Recognition/Segmentation/...

Publish Date Title Title_CN Authors PDF Code
2024-11-10 Few-shot Semantic Learning for Robust Multi-Biome 3D Semantic Mapping in Off-Road Environments 少样本语义学习在越野环境中的鲁棒多生物圈3D语义映射 Deegan Atha, Xianmei Lei, Shehryar Khattak, Anna Sabel, Elle Miller, Aurelio Noca, Grace Lim, Jeffrey Edlund, Curtis Padgett, Patrick Spieler http://arxiv.org/pdf/2411.06632v1 null
2024-11-10 Enhancing frozen histological section images using permanent-section-guided deep learning with nuclei attention 基于永久切片引导的核注意力深度学习增强冷冻组织切片图像 Elad Yoshai, Gil Goldinger, Miki Haifler, Natan T. Shaked http://arxiv.org/pdf/2411.06583v1 null
2024-11-10 Extended multi-stream temporal-attention module for skeleton-based human action recognition (HAR) 基于骨骼的动作识别(HAR)的扩展多流时间注意力模块 Faisal Mehmood, Xin Guo, Enqing Chen, Muhammad Azeem Akbar, Arif Ali Khan, Sami Ullah http://arxiv.org/pdf/2411.06553v1 null
2024-11-10 Image Segmentation from Shadow-Hints using Minimum Spanning Trees 基于最小生成树的阴影提示图像分割 Moritz Heep, Eduard Zell http://arxiv.org/pdf/2411.06530v1 null
2024-11-10 PRISM: Privacy-preserving Inter-Site MRI Harmonization via Disentangled Representation Learning PRISM:基于解耦表示学习的隐私保护跨站点MRI调和 Sarang Galada, Tanurima Halder, Kunal Deo, Ram P Krish, Kshitij Jadhav http://arxiv.org/pdf/2411.06513v1 null
2024-11-10 Understanding the Role of Equivariance in Self-supervised Learning 理解等变性在自监督学习中的作用 Yifei Wang, Kaiwen Hu, Sharut Gupta, Ziyu Ye, Yisen Wang, Stefanie Jegelka http://arxiv.org/pdf/2411.06508v1 null
2024-11-10 Mitigating covariate shift in non-colocated data with learned parameter priors 减轻非同位数据中的协变量偏移:利用学习参数先验 Behraj Khan, Behroz Mirza, Nouman Durrani, Tahir Syed http://arxiv.org/pdf/2411.06499v1 null
2024-11-10 Superpixel Segmentation: A Long-Lasting Ill-Posed Problem 超像素分割:一个经久不衰的病态问题 Rémi Giraud, Michaël Clément http://arxiv.org/pdf/2411.06478v1 null
2024-11-10 SAN: Structure-Aware Network for Complex and Long-tailed Chinese Text Recognition 结构感知网络:用于复杂和长尾中文文本识别 Junyi Zhang, Chang Liu, Chun Yang http://arxiv.org/pdf/2411.06381v1 null
2024-11-10 PKF: Probabilistic Data Association Kalman Filter for Multi-Object Tracking PKF:用于多目标跟踪的概率数据关联卡尔曼滤波器 Hanwen Cao, George J. Pappas, Nikolay Atanasov http://arxiv.org/pdf/2411.06378v1 null
2024-11-10 Layer-Wise Feature Metric of Semantic-Pixel Matching for Few-Shot Learning 语义像素匹配的分层特征度量在少样本学习中的应用 Hao Tang, Junhao Lu, Guoheng Huang, Ming Li, Xuhang Chen, Guo Zhong, Zhengguang Tan, Zinuo Li http://arxiv.org/pdf/2411.06363v1 null
2024-11-10 Deep Active Learning in the Open World 开放世界中的深度主动学习 Tian Xie, Jifan Zhang, Haoyue Bai, Robert Nowak http://arxiv.org/pdf/2411.06353v1 null
2024-11-10 Classification in Japanese Sign Language Based on Dynamic Facial Expressions 基于动态面部表情的日本手语分类 Yui Tatsumi, Shoko Tanaka, Shunsuke Akamatsu, Takahiro Shindo, Hiroshi Watanabe http://arxiv.org/pdf/2411.06347v1 null
2024-11-10 Self-supervised Representation Learning for Cell Event Recognition through Time Arrow Prediction 基于时间箭头预测的细胞事件识别的自监督表征学习 Cangxiong Chen, Vinay P. Namboodiri, Julia E. Sero http://arxiv.org/pdf/2411.03924v2 null
2024-11-10 RSNet: A Light Framework for The Detection of Multi-scale Remote Sensing Targets RSNet:用于多尺度遥感目标检测的轻量级框架 Hongyu Chen, Chengcheng Chen, Fei Wang, Yuhu Shi, Weiming Zeng http://arxiv.org/pdf/2410.23073v3 null
2024-11-10 Multi-Stage Airway Segmentation in Lung CT Based on Multi-scale Nested Residual UNet 基于多尺度嵌套残差UNet的肺CT多阶段气道分割 Bingyu Yang, Huai Liao, Xinyan Huang, Qingyao Tian, Jinlin Wu, Jingdi Hu, Hongbin Liu http://arxiv.org/pdf/2410.18456v2 null
2024-11-10 AlphaChimp: Tracking and Behavior Recognition of Chimpanzees AlphaChimp:黑猩猩追踪与行为识别 Xiaoxuan Ma, Yutang Lin, Yuan Xu, Stephan P. Kaufhold, Jack Terwilliger, Andres Meza, Yixin Zhu, Federico Rossano, Yizhou Wang http://arxiv.org/pdf/2410.17136v2 link
2024-11-10 Attention Normalization Impacts Cardinality Generalization in Slot Attention 注意力归一化对槽位注意力中的基数泛化影响 Markus Krimmel, Jan Achterhold, Joerg Stueckler http://arxiv.org/pdf/2407.04170v2 link
2024-11-10 SegNet4D: Efficient Instance-Aware 4D LiDAR Semantic Segmentation for Driving Scenarios SegNet4D:高效实例感知4D激光雷达语义分割用于驾驶场景 Neng Wang, Ruibin Guo, Chenghao Shi, Ziyue Wang, Hui Zhang, Huimin Lu, Zhiqiang Zheng, Xieyuanli Chen http://arxiv.org/pdf/2406.16279v2 link
2024-11-10 CLIPScope: Enhancing Zero-Shot OOD Detection with Bayesian Scoring CLIPScope:利用贝叶斯评分增强零样本OOD检测 Hao Fu, Naman Patel, Prashanth Krishnamurthy, Farshad Khorrami http://arxiv.org/pdf/2405.14737v2 null
2024-11-10 Perturbing the Gradient for Alleviating Meta Overfitting 扰动梯度以减轻元过拟合 Manas Gogoi, Sambhavi Tiwari, Shekhar Verma http://arxiv.org/pdf/2405.12299v2 link
2024-11-10 Hierarchical Randomized Smoothing 分层随机平滑 Yan Scholten, Jan Schuchardt, Aleksandar Bojchevski, Stephan Günnemann http://arxiv.org/pdf/2310.16221v5 null

GNN

Publish Date Title Title_CN Authors PDF Code
2024-11-10 Graph Neural Networks for modelling breast biomechanical compression 基于图神经网络的乳腺生物力学压缩建模 Hadeel Awwad, Eloy García, Robert Martí http://arxiv.org/pdf/2411.06596v1 null

Transformer

Publish Date Title Title_CN Authors PDF Code
2024-11-10 Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement 区域感知的基于硬绑定和软细化的文本到图像生成 Zhennan Chen, Yajie Li, Haofan Wang, Zhibo Chen, Zhengkai Jiang, Jun Li, Qian Wang, Jian Yang, Ying Tai http://arxiv.org/pdf/2411.06558v1 null
2024-11-10 Local Implicit Wavelet Transformer for Arbitrary-Scale Super-Resolution 局部隐式小波变换器在任意尺度超分辨率中的应用 Minghong Duan, Linhao Qu, Shaolei Liu, Manning Wang http://arxiv.org/pdf/2411.06442v1 null
2024-11-10 SEM-Net: Efficient Pixel Modelling for image inpainting with Spatially Enhanced SSM SEM-Net:基于空间增强SSM的图像修复高效像素建模 Shuang Chen, Haozheng Zhang, Amir Atapour-Abarghouei, Hubert P. H. Shum http://arxiv.org/pdf/2411.06318v1 null
2024-11-10 FilterViT and DropoutViT FilterViT与DropoutViT Bohang Sun http://arxiv.org/pdf/2410.22709v3 null
2024-11-10 Ada-VE: Training-Free Consistent Video Editing Using Adaptive Motion Prior Ada-VE:基于自适应运动先验的无训练一致性视频编辑 Tanvir Mahmud, Mustafa Munir, Radu Marculescu, Diana Marculescu http://arxiv.org/pdf/2406.04873v2 null

3D/CG

Publish Date Title Title_CN Authors PDF Code
2024-11-10 A novel algorithm for optimizing bundle adjustment in image sequence alignment 图像序列对齐中优化捆绑调整的新型算法 Hailin Xu, Hongxia Wang, Huanshui Zhang http://arxiv.org/pdf/2411.06343v1 null

Learning Paradigms

Publish Date Title Title_CN Authors PDF Code
2024-11-10 SelEx: Self-Expertise in Fine-Grained Generalized Category Discovery SelEx:细粒度泛化类别发现的自我专长 Sarah Rastegar, Mohammadreza Salehi, Yuki M. Asano, Hazel Doughty, Cees G. M. Snoek http://arxiv.org/pdf/2408.14371v2 link

Other

Publish Date Title Title_CN Authors PDF Code
2024-11-10 I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength I2VControl-Camera:可调节运动强度的精确视频摄像头控制 Wanquan Feng, Jiawei Liu, Pengqi Tu, Tianhao Qi, Mingzhen Sun, Tianxiang Ma, Songtao Zhao, Siyu Zhou, Qian He http://arxiv.org/pdf/2411.06525v1 null
2024-11-10 Offline Handwritten Signature Verification Using a Stream-Based Approach 基于流式方法的离线手写签名验证 Kecia G. de Moura, Rafael M. O. Cruz, Robert Sabourin http://arxiv.org/pdf/2411.06510v1 null
2024-11-10 Dropout the High-rate Downsampling: A Novel Design Paradigm for UHD Image Restoration 高率下采样Dropout:UHD图像恢复的新型设计范式 Chen Wu, Ling Wang, Long Peng, Dianjie Lu, Zhuoran Zheng http://arxiv.org/pdf/2411.06456v1 null
2024-11-10 SamRobNODDI: Q-Space Sampling-Augmented Continuous Representation Learning for Robust and Generalized NODDI SamRobNODDI:基于Q空间采样的连续表示学习,用于鲁棒和泛化的NODDI Taohui Xiao, Jian Cheng, Wenxin Fan, Enqing Dong, Hairong Zheng, Shanshan Wang http://arxiv.org/pdf/2411.06444v1 null
2024-11-10 Activation Map Compression through Tensor Decomposition for Deep Learning 深度学习中的张量分解激活图压缩 Le-Trung Nguyen, Aël Quélennec, Enzo Tartaglione, Samuel Tardieu, Van-Tam Nguyen http://arxiv.org/pdf/2411.06346v1 null
2024-11-10 Comparing ImageNet Pre-training with Digital Pathology Foundation Models for Whole Slide Image-Based Survival Analysis 比较ImageNet预训练与数字病理基础模型在基于全切片图像的生存分析中的应用 Kleanthis Marios Papadopoulos http://arxiv.org/pdf/2405.17446v2 null
2024-11-10 Variational Imbalanced Regression: Fair Uncertainty Quantification via Probabilistic Smoothing 变分不平衡回归:通过概率平滑实现公平的不确定性量化 Ziyan Wang, Hao Wang http://arxiv.org/pdf/2306.06599v8 null