Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-11-08 | StdGEN: Semantic-Decomposed 3D Character Generation from Single Images | StdGEN:基于语义分解的单张图像3D角色生成 | Yuze He, Yanning Zhou, Wang Zhao, Zhongkai Wu, Kaiwen Xiao, Wei Yang, Yong-Jin Liu, Xiao Han | http://arxiv.org/pdf/2411.05738v1 | null |
2024-11-08 | Image2Text2Image: A Novel Framework for Label-Free Evaluation of Image-to-Text Generation with Text-to-Image Diffusion Models | 图像到文本到图像:用于无标签评估图像到文本生成与文本到图像扩散模型的创新框架 | Jia-Hong Huang, Hongyi Zhu, Yixian Shen, Stevan Rudinac, Evangelos Kanoulas | http://arxiv.org/pdf/2411.05706v1 | null |
2024-11-08 | Towards Lifelong Few-Shot Customization of Text-to-Image Diffusion | 迈向终身少量样本的文本到图像扩散定制 | Nan Song, Xiaofeng Yang, Ze Yang, Guosheng Lin | http://arxiv.org/pdf/2411.05544v1 | null |
2024-11-08 | Improving image synthesis with diffusion-negative sampling | 提升图像生成中的扩散负采样技术 | Alakh Desai, Nuno Vasconcelos | http://arxiv.org/pdf/2411.05473v1 | null |
2024-11-08 | POC-SLT: Partial Object Completion with SDF Latent Transformers | 基于SDF潜在变换器的部分物体补全 | Faezeh Zakeri, Raphael Braun, Lukas Ruppert, Henrik P. A. Lensch | http://arxiv.org/pdf/2411.05419v1 | null |
2024-11-08 | A Real-time Face Mask Detection and Social Distancing System for COVID-19 using Attention-InceptionV3 Model | 基于Attention-InceptionV3模型的COVID-19实时口罩检测与社交距离监控系统 | Abdullah Al Asif, Farhana Chowdhury Tisha | http://arxiv.org/pdf/2411.05312v1 | null |
2024-11-08 | Adaptive Whole-Body PET Image Denoising Using 3D Diffusion Models with ControlNet | 基于控制网络的3D扩散模型自适应全身PET图像降噪 | Boxiao Yu, Kuang Gong | http://arxiv.org/pdf/2411.05302v1 | null |
2024-11-08 | SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models | SVDQuant:通过低秩成分吸收异常值以用于4位扩散模型 | Muyang Li, Yujun Lin, Zhekai Zhang, Tianle Cai, Xiuyu Li, Junxian Guo, Enze Xie, Chenlin Meng, Jun-Yan Zhu, Song Han | http://arxiv.org/pdf/2411.05007v2 | link |
2024-11-08 | A Cat Is A Cat (Not A Dog!): Unraveling Information Mix-ups in Text-to-Image Encoders through Causal Analysis and Embedding Optimization | 猫就是猫(不是狗!):通过因果分析和嵌入优化揭示文本到图像编码器中的信息混淆 | Chieh-Yun Chen, Chiang Tseng, Li-Wu Tsao, Hong-Han Shuai | http://arxiv.org/pdf/2410.00321v5 | link |
2024-11-08 | TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation | 时间对齐字幕:多场景文本到视频生成的时间对齐字幕 | Hritik Bansal, Yonatan Bitton, Michal Yarom, Idan Szpektor, Aditya Grover, Kai-Wei Chang | http://arxiv.org/pdf/2405.04682v4 | null |
2024-11-08 | A Survey on Future Frame Synthesis: Bridging Deterministic and Generative Approaches | 关于未来帧合成的调查:连接确定性和生成性方法 | Ruibo Ming, Zhewei Huang, Zhuoxuan Ju, Jianming Hu, Lihui Peng, Shuchang Zhou | http://arxiv.org/pdf/2401.14718v5 | null |
2024-11-08 | Text-to-image Diffusion Models in Generative AI: A Survey | 生成AI中的文本到图像扩散模型:综述 | Chenshuang Zhang, Chaoning Zhang, Mengchun Zhang, In So Kweon, Junmo Kim | http://arxiv.org/pdf/2303.07909v3 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-11-08 | Tell What You Hear From What You See -- Video to Audio Generation Through Text | 从所见之音中讲述你所听——基于文本的视频音频生成 | Xiulong Liu, Kun Su, Eli Shlizerman | http://arxiv.org/pdf/2411.05679v1 | null |
2024-11-08 | Predicting Stroke through Retinal Graphs and Multimodal Self-supervised Learning | 通过视网膜图和多模态自监督学习预测中风 | Yuqing Huang, Bastian Wittmann, Olga Demler, Bjoern Menze, Neda Davoudi | http://arxiv.org/pdf/2411.05597v1 | null |
2024-11-08 | AuthFormer: Adaptive Multimodal biometric authentication transformer for middle-aged and elderly people | 自适应多模态生物特征认证Transformer,适用于中老年人群 | Yang rui, Meng ling-tao, Zhang qiu-yu | http://arxiv.org/pdf/2411.05395v1 | null |
2024-11-08 | ZOPP: A Framework of Zero-shot Offboard Panoptic Perception for Autonomous Driving | ZOPP:自动驾驶零样本离线全景感知框架 | Tao Ma, Hongbin Zhou, Qiusheng Huang, Xuemeng Yang, Jianfei Guo, Bo Zhang, Min Dou, Yu Qiao, Botian Shi, Hongsheng Li | http://arxiv.org/pdf/2411.05311v1 | null |
2024-11-08 | Hierarchical Visual Feature Aggregation for OCR-Free Document Understanding | 基于层次视觉特征聚合的无OCR文档理解 | Jaeyoo Park, Jin Young Choi, Jeonghyung Park, Bohyung Han | http://arxiv.org/pdf/2411.05254v1 | null |
2024-11-08 | SaSR-Net: Source-Aware Semantic Representation Network for Enhancing Audio-Visual Question Answering | SaSR-Net:增强音视频问答的源感知语义表示网络 | Tianyu Yang, Yiyang Nan, Lisen Dai, Zhenwen Liang, Yapeng Tian, Xiangliang Zhang | http://arxiv.org/pdf/2411.04933v2 | null |
2024-11-08 | Enhancing Osteoporosis Detection: An Explainable Multi-Modal Learning Framework with Feature Fusion and Variable Clustering | 提升骨质疏松症检测:一种具有特征融合和变量聚类的可解释多模态学习框架 | Mehdi Hosseini Chagahi, Saeed Mohammadi Dashtaki, Niloufar Delfan, Nadia Mohammadi, Alireza Samari, Behzad Moshiri, Md. Jalil Piran, Oliver Faust | http://arxiv.org/pdf/2411.00916v2 | null |
2024-11-08 | Accessible, At-Home Detection of Parkinson's Disease via Multi-task Video Analysis | 基于多任务视频分析的易于在家检测帕金森病 | Md Saiful Islam, Tariq Adnan, Jan Freyberg, Sangwu Lee, Abdelrahman Abdelkader, Meghan Pawlik, Cathe Schwartz, Karen Jaffe, Ruth B. Schneider, E Ray Dorsey, et al. | http://arxiv.org/pdf/2406.14856v3 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-11-08 | A Nerf-Based Color Consistency Method for Remote Sensing Images | 基于Nerf的遥感影像色彩一致性方法 | Zongcheng Zuo, Yuanxiang Li, Tongtong Zhang | http://arxiv.org/pdf/2411.05557v1 | null |
2024-11-08 | From Transparent to Opaque: Rethinking Neural Implicit Surfaces with α-NeuS | 从透明到不透明:α-NeuS 重构神经网络隐式曲面 | Haoran Zhang, Junkai Deng, Xuhui Chen, Fei Hou, Wencheng Wang, Hong Qin, Chen Qian, Ying He | http://arxiv.org/pdf/2411.05362v1 | null |
2024-11-08 | Rate-aware Compression for NeRF-based Volumetric Video | 基于NeRF的体积视频的速率感知压缩 | Zhiyu Zhang, Guo Lu, Huanxiong Liang, Zhengxue Cheng, Anni Tang, Li Song | http://arxiv.org/pdf/2411.05322v1 | null |
2024-11-08 | Efficient Dynamic-NeRF Based Volumetric Video Coding with Rate Distortion Optimization | 基于高效动态NeRF的体素视频编码及速率失真优化 | Zhiyu Zhang, Guo Lu, Huanxiong Liang, Anni Tang, Qiang Hu, Li Song | http://arxiv.org/pdf/2402.01380v2 | null |
2024-11-08 | $α$Surf: Implicit Surface Reconstruction for Semi-Transparent and Thin Objects with Decoupled Geometry and Opacity | $α$Surf:基于解耦几何和透明度的半透明和薄物体隐式表面重建 | Tianhao Wu, Hanxue Liang, Fangcheng Zhong, Gernot Riegler, Shimon Vainer, Jiankang Deng, Cengiz Oztireli | http://arxiv.org/pdf/2303.10083v2 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-11-08 | GS2Pose: Two-stage 6D Object Pose Estimation Guided by Gaussian Splatting | GS2Pose:基于高斯散布的二级6D目标姿态估计 | Jilan Mei, Junbo Li, Cai Meng | http://arxiv.org/pdf/2411.03807v3 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-11-08 | FGGP: Fixed-Rate Gradient-First Gradual Pruning | FGGP:固定率梯度优先渐变剪枝 | Lingkai Zhu, Can Deniz Bezek, Orcun Goksel | http://arxiv.org/pdf/2411.05500v1 | null |
2024-11-08 | SASWISE-UE: Segmentation and Synthesis with Interpretable Scalable Ensembles for Uncertainty Estimation | SASWISE-UE:用于不确定性估计的可解释可扩展集成分割与合成 | Weijie Chen, Alan McMillan | http://arxiv.org/pdf/2411.05324v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-11-08 | Curriculum Learning for Few-Shot Domain Adaptation in CT-based Airway Tree Segmentation | 基于CT的气道树分割中少样本领域自适应的课程学习 | Maxime Jacovella, Ali Keshavarzi, Elsa Angelini | http://arxiv.org/pdf/2411.05779v1 | null |
2024-11-08 | FisherMask: Enhancing Neural Network Labeling Efficiency in Image Classification Using Fisher Information | FisherMask:利用Fisher信息提升图像分类中神经网络标记效率 | Shreen Gul, Mohamed Elmahallawy, Sanjay Madria, Ardhendu Tripathy | http://arxiv.org/pdf/2411.05752v1 | null |
2024-11-08 | WavShadow: Wavelet Based Shadow Segmentation and Removal | 基于小波的阴影分割与去除 | Shreyans Jain, Aadya Arora, Viraj Vekaria, Karan Gandhi | http://arxiv.org/pdf/2411.05747v1 | null |
2024-11-08 | Scaling Laws for Task-Optimized Models of the Primate Visual Ventral Stream | 灵长类视觉腹侧流任务优化模型的可扩展性法则 | Abdulkadir Gokce, Martin Schrimpf | http://arxiv.org/pdf/2411.05712v1 | null |
2024-11-08 | Visual-TCAV: Concept-based Attribution and Saliency Maps for Post-hoc Explainability in Image Classification | 视觉-TCAV:图像分类后验可解释性中的基于概念的可解释性和显著性图 | Antonio De Santis, Riccardo Campi, Matteo Bianchi, Marco Brambilla | http://arxiv.org/pdf/2411.05698v1 | null |
2024-11-08 | Autoregressive Adaptive Hypergraph Transformer for Skeleton-based Activity Recognition | 自回归自适应超图Transformer在骨骼动作识别中的应用 | Abhisek Ray, Ayush Raj, Maheshkumar H. Kolekar | http://arxiv.org/pdf/2411.05692v1 | null |
2024-11-08 | Online-LoRA: Task-free Online Continual Learning via Low Rank Adaptation | 在线LoRA:基于低秩调整的无任务在线持续学习 | Xiwen Wei, Guihong Li, Radu Marculescu | http://arxiv.org/pdf/2411.05663v1 | null |
2024-11-08 | Video RWKV: Video Action Recognition Based RWKV | 视频RWKV:基于RWKV的视频动作识别 | Zhuowen Yin, Chengru Li, Xingbo Dong | http://arxiv.org/pdf/2411.05636v1 | null |
2024-11-08 | SynDroneVision: A Synthetic Dataset for Image-Based Drone Detection | SynDroneVision:基于图像的无人机检测合成数据集 | Tamara R. Lenhard, Andreas Weinmann, Kai Franke, Tobias Koch | http://arxiv.org/pdf/2411.05633v1 | null |
2024-11-08 | Efficient Audio-Visual Fusion for Video Classification | 高效的视频分类中的视听融合 | Mahrukh Awan, Asmar Nadeem, Armin Mustafa | http://arxiv.org/pdf/2411.05603v1 | null |
2024-11-08 | Open-set object detection: towards unified problem formulation and benchmarking | 开放集目标检测:迈向统一问题表述和基准测试 | Hejer Ammar, Nikita Kiselov, Guillaume Lapouge, Romaric Audigier | http://arxiv.org/pdf/2411.05564v1 | null |
2024-11-08 | Training objective drives the consistency of representational similarity across datasets | 训练目标驱动数据集间表征相似性的一致性 | Laure Ciernik, Lorenz Linhardt, Marco Morik, Jonas Dippel, Simon Kornblith, Lukas Muttenthaler | http://arxiv.org/pdf/2411.05561v1 | null |
2024-11-08 | DeepArUco++: Improved detection of square fiducial markers in challenging lighting conditions | DeepArUco++:在复杂光照条件下改进的方形标识符标记检测 | Rafael Berral-Soler, Rafael Muñoz-Salinas, Rafael Medina-Carnicer, Manuel J. Marín-Jiménez | http://arxiv.org/pdf/2411.05552v1 | null |
2024-11-08 | Do Histopathological Foundation Models Eliminate Batch Effects? A Comparative Study | 病理学基础模型能否消除批效应?一项比较研究 | Jonah Kömen, Hannah Marienwald, Jonas Dippel, Julius Hense | http://arxiv.org/pdf/2411.05489v1 | null |
2024-11-08 | Comparative Study of Probabilistic Atlas and Deep Learning Approaches for Automatic Brain Tissue Segmentation from MRI Using N4 Bias Field Correction and Anisotropic Diffusion Pre-processing Techniques | 基于N4偏场校正和各向异性扩散预处理技术的MRI自动脑组织分割中概率图谱与深度学习方法的比较研究 | Mohammad Imran Hossain, Muhammad Zain Amin, Daniel Tweneboah Anyimadu, Taofik Ahmed Suleiman | http://arxiv.org/pdf/2411.05456v1 | null |
2024-11-08 | Agricultural Landscape Understanding At Country-Scale | 国家尺度下的农业景观理解 | Radhika Dua, Nikita Saxena, Aditi Agarwal, Alex Wilson, Gaurav Singh, Hoang Tran, Ishan Deshpande, Amandeep Kaur, Gaurav Aggarwal, Chandan Nath, et al. | http://arxiv.org/pdf/2411.05359v1 | null |
2024-11-08 | Enhancing Visual Classification using Comparative Descriptors | 利用比较描述符增强视觉分类 | Hankyeol Lee, Gawon Seo, Wonseok Choi, Geunyoung Jung, Kyungwoo Song, Jiyoung Jung | http://arxiv.org/pdf/2411.05357v1 | null |
2024-11-08 | A Quality-Centric Framework for Generic Deepfake Detection | 基于质量的通用深度伪造检测框架 | Wentang Song, Zhiyuan Yan, Yuzhen Lin, Taiping Yao, Changsheng Chen, Shen Chen, Yandan Zhao, Shouhong Ding, Bin Li | http://arxiv.org/pdf/2411.05335v1 | null |
2024-11-08 | Revisiting Network Perturbation for Semi-Supervised Semantic Segmentation | 重新审视网络扰动在半监督语义分割中的应用 | Sien Li, Tao Wang, Ruizhe Hu, Wenxi Liu | http://arxiv.org/pdf/2411.05307v1 | null |
2024-11-08 | SimpleBEV: Improved LiDAR-Camera Fusion Architecture for 3D Object Detection | SimpleBEV:改进的激光雷达-摄像头融合架构用于3D目标检测 | Yun Zhao, Zhan Gong, Peiru Zheng, Hong Zhu, Shaohua Wu | http://arxiv.org/pdf/2411.05292v1 | null |
2024-11-08 | Cancer-Net SCa-Synth: An Open Access Synthetically Generated 2D Skin Lesion Dataset for Skin Cancer Classification | 癌症网络SCa-Synth:一个开放获取的合成生成2D皮肤病变数据集,用于皮肤癌分类 | Chi-en Amy Tai, Oustan Ding, Alexander Wong | http://arxiv.org/pdf/2411.05269v1 | null |
2024-11-08 | ZAHA: Introducing the Level of Facade Generalization and the Large-Scale Point Cloud Facade Semantic Segmentation Benchmark Dataset | ZAHA:引入立面泛化级别和大规模点云立面语义分割基准数据集 | Olaf Wysocki, Yue Tan, Thomas Froech, Yan Xia, Magdalena Wysocki, Ludwig Hoegner, Daniel Cremers, Christoph Holst | http://arxiv.org/pdf/2411.04865v2 | link |
2024-11-08 | From CNN to ConvRNN: Adapting Visualization Techniques for Time-Series Anomaly Detection | 从CNN到ConvRNN:适应时间序列异常检测的可视化技术 | Fabien Poirier | http://arxiv.org/pdf/2411.04707v2 | null |
2024-11-08 | Region-Guided Attack on the Segment Anything Model (SAM) | 基于区域引导的Segment Anything Model(SAM)攻击 | Xiaoliang Liu, Furao Shen, Jian Zhao | http://arxiv.org/pdf/2411.02974v2 | null |
2024-11-08 | ROAD-Waymo: Action Awareness at Scale for Autonomous Driving | 道路-Waymo:大规模自动驾驶的动作意识 | Salman Khan, Izzeddin Teeti, Reza Javanmard Alitappeh, Mihaela C. Stoian, Eleonora Giunchiglia, Gurkirt Singh, Andrew Bradley, Fabio Cuzzolin | http://arxiv.org/pdf/2411.01683v2 | link |
2024-11-08 | Lung tumor segmentation in MRI mice scans using 3D nnU-Net with minimum annotations | 基于3D nnU-Net的MRI小鼠扫描肺癌分割及最小标注需求 | Piotr Kaniewski, Fariba Yousefi, Yeman Brhane Hagos, Talha Qaiser, Nikolay Burlutskiy | http://arxiv.org/pdf/2411.00922v2 | null |
2024-11-08 | Spiking Neural Network as Adaptive Event Stream Slicer | 脉冲神经网络作为自适应事件流切片器 | Jiahang Cao, Mingyuan Sun, Ziqing Wang, Hao Cheng, Qiang Zhang, Shibo Zhou, Renjing Xu | http://arxiv.org/pdf/2410.02249v2 | null |
2024-11-08 | Rethinking Pre-trained Feature Extractor Selection in Multiple Instance Learning for Whole Slide Image Classification | 重新思考多实例学习中全切片图像分类的预训练特征提取器选择 | Bryan Wong, Mun Yong Yi | http://arxiv.org/pdf/2408.01167v2 | null |
2024-11-08 | Leveraging Bi-Focal Perspectives and Granular Feature Integration for Accurate Reliable Early Alzheimer's Detection | 利用双焦点视角和粒度特征集成进行准确可靠的早期阿尔茨海默病检测 | Pandiyaraju V, Shravan Venkatraman, Abeshek A, Aravintakshan S A, Pavan Kumar S, Kannan A | http://arxiv.org/pdf/2407.10921v2 | null |
2024-11-08 | Bounding Boxes and Probabilistic Graphical Models: Video Anomaly Detection Simplified | 边界框与概率图模型:简化视频异常检测 | Mia Siemon, Thomas B. Moeslund, Barry Norton, Kamal Nasrollahi | http://arxiv.org/pdf/2407.06000v2 | null |
2024-11-08 | Boosting 3D Object Detection with Semantic-Aware Multi-Branch Framework | 基于语义感知多分支框架增强3D物体检测 | Hao Jing, Anhong Wang, Lijun Zhao, Yakun Yang, Donghan Bu, Jing Zhang, Yifan Zhang, Junhui Hou | http://arxiv.org/pdf/2407.05769v2 | null |
2024-11-08 | IMDL-BenCo: A Comprehensive Benchmark and Codebase for Image Manipulation Detection & Localization | IMDL-BenCo:图像操纵检测与定位的全面基准和代码库 | Xiaochen Ma, Xuekang Zhu, Lei Su, Bo Du, Zhuohang Jiang, Bingkui Tong, Zeyu Lei, Xinyu Yang, Chi-Man Pun, Jiancheng Lv, et al. | http://arxiv.org/pdf/2406.10580v2 | link |
2024-11-08 | HPE-CogVLM: Advancing Vision Language Models with a Head Pose Grounding Task | HPE-CogVLM:通过头部姿态定位任务推动视觉语言模型的发展 | Yu Tian, Tianqi Shao, Tsukasa Demizu, Xuyang Wu, Hsin-Tai Wu | http://arxiv.org/pdf/2406.01914v2 | null |
2024-11-08 | *: Improving the 3D detector by introducing Voxel2Pillar feature encoding and extracting multi-scale features | 通过引入Voxel2Pillar特征编码和提取多尺度特征改进3D检测器 | Xusheng Li, Chengliang Wang, Shumao Wang, Zhuo Zeng, Ji Liu | http://arxiv.org/pdf/2405.09828v3 | null |
2024-11-08 | Enhancing Vision-Language Few-Shot Adaptation with Negative Learning | 增强视觉-语言小样本适应性的负学习 | Ce Zhang, Simon Stepputtis, Katia Sycara, Yaqi Xie | http://arxiv.org/pdf/2403.12964v2 | link |
2024-11-08 | VM-UNet: Vision Mamba UNet for Medical Image Segmentation | VM-UNet:用于医学图像分割的视觉Mamba UNet | Jiacheng Ruan, Jincheng Li, Suncheng Xiang | http://arxiv.org/pdf/2402.02491v2 | link |
2024-11-08 | Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation | Hi-SAM:融合Segment Anything Model进行分层文本分割 | Maoyuan Ye, Jing Zhang, Juhua Liu, Chenyu Liu, Baocai Yin, Cong Liu, Bo Du, Dacheng Tao | http://arxiv.org/pdf/2401.17904v2 | link |
2024-11-08 | Fight Fire with Fire: Combating Adversarial Patch Attacks using Pattern-randomized Defensive Patches | 以火攻火:利用模式随机化防御补丁对抗对抗性补丁攻击 | Jianan Feng, Jiachun Li, Changqing Miao, Jianjun Huang, Wei You, Wenchang Shi, Bin Liang | http://arxiv.org/pdf/2311.06122v2 | link |
2024-11-08 | CompaCT: Fractal-Based Heuristic Pixel Segmentation for Lossless Compression of High-Color DICOM Medical Images | CompaCT:基于分形启发式像素分割的高彩色DICOM医学图像无损压缩 | Taaha Khan | http://arxiv.org/pdf/2308.13097v2 | link |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-11-08 | A Two-Step Concept-Based Approach for Enhanced Interpretability and Trust in Skin Lesion Diagnosis | 基于概念的两步法提升皮肤病变诊断的可解释性和可信度 | Cristiano Patrício, Luís F. Teixeira, João C. Neves | http://arxiv.org/pdf/2411.05609v1 | null |
2024-11-08 | VISTA: Visual Integrated System for Tailored Automation in Math Problem Generation Using LLM | VISTA:基于大型语言模型在数学问题生成中定制自动化视觉集成系统 | Jeongwoo Lee, Kwangsuk Park, Jihyeon Park | http://arxiv.org/pdf/2411.05423v1 | null |
2024-11-08 | Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks | 针对语言模型防御越狱攻击的鲁棒提示优化 | Andy Zhou, Bo Li, Haohan Wang | http://arxiv.org/pdf/2401.17263v5 | link |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-11-08 | PEP-GS: Perceptually-Enhanced Precise Structured 3D Gaussians for View-Adaptive Rendering | PEP-GS:感知增强精确结构化3D高斯函数用于视域自适应渲染 | Junxi Jin, Xiulai Li, Haiping Huang, Lianjun Liu, Yujie Sun | http://arxiv.org/pdf/2411.05731v1 | null |
2024-11-08 | STARS: Sensor-agnostic Transformer Architecture for Remote Sensing | STARS:适用于遥感的光传感器无关的Transformer架构 | Ethan King, Jaime Rodriguez, Diego Llanes, Timothy Doster, Tegan Emerson, James Koch | http://arxiv.org/pdf/2411.05714v1 | null |
2024-11-08 | Image inpainting enhancement by replacing the original mask with a self-attended region from the input image | 基于输入图像自注意力区域的原始掩码替换的图像修复增强 | Kourosh Kiani, Razieh Rastgoo, Alireza Chaji, Sergio Escalera | http://arxiv.org/pdf/2411.05705v1 | null |
2024-11-08 | CFPNet: Improving Lightweight ToF Depth Completion via Cross-zone Feature Propagation | CFPNet:通过跨区域特征传播提升轻量级ToF深度补全 | Laiyan Ding, Hualie Jiang, Rui Xu, Rui Huang | http://arxiv.org/pdf/2411.04480v2 | link |
2024-11-08 | Bootstrapping Top-down Information for Self-modulating Slot Attention | 自调制槽注意力机制的顶层信息自举 | Dongwon Kim, Seoyeon Kim, Suha Kwak | http://arxiv.org/pdf/2411.01801v2 | null |
2024-11-08 | Benchmarking Ultra-High-Definition Image Reflection Removal | 超高清图像反射去除性能评估 | Zhenyuan Zhang, Zhenbo Song, Kaihao Zhang, Zhaoxin Fan, Jianfeng Lu | http://arxiv.org/pdf/2308.00265v2 | link |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-11-08 | Alignment of 3D woodblock geometrical models and 2D orthographic projection image | 三维木版几何模型与二维正射投影图像的对齐 | Minh Duc Nguyen, Cong Thuong Le, Trong Lam Nguyen | http://arxiv.org/pdf/2411.05524v1 | null |
2024-11-08 | Untrained neural networks can demonstrate memorization-independent abstract reasoning | 未训练神经网络的记忆无关抽象推理 | Tomer Barak, Yonatan Loewenstein | http://arxiv.org/pdf/2407.17791v2 | link |
2024-11-08 | "Where am I?" Scene Retrieval with Language | “我在哪里?”基于语言的场景检索 | Jiaqi Chen, Daniel Barath, Iro Armeni, Marc Pollefeys, Hermann Blum | http://arxiv.org/pdf/2404.14565v2 | null |
2024-11-08 | Physically Realizable Natural-Looking Clothing Textures Evade Person Detectors via 3D Modeling | 通过3D建模实现逼真衣物纹理以规避人体检测器 | Zhanhao Hu, Wenda Chu, Xiaopei Zhu, Hui Zhang, Bo Zhang, Xiaolin Hu | http://arxiv.org/pdf/2307.01778v2 | link |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-11-08 | Sketched Equivariant Imaging Regularization and Deep Internal Learning for Inverse Problems | 草图等变成像正则化与深度内部学习在逆问题中的应用 | Guixian Xu, Jinglai Li, Junqi Tang | http://arxiv.org/pdf/2411.05771v1 | null |
2024-11-08 | End-to-End Navigation with Vision Language Models: Transforming Spatial Reasoning into Question-Answering | 端到端视觉语言模型导航:将空间推理转化为问答 | Dylan Goetting, Himanshu Gaurav Singh, Antonio Loquercio | http://arxiv.org/pdf/2411.05755v1 | null |
2024-11-08 | Advancing Meteorological Forecasting: AI-based Approach to Synoptic Weather Map Analysis | 推进气象预报:基于AI的天气图分析技术 | Yo-Hwan Choi, Seon-Yu Kang, Minjong Cheon | http://arxiv.org/pdf/2411.05384v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-11-08 | ASL STEM Wiki: Dataset and Benchmark for Interpreting STEM Articles | ASL STEM Wiki:STEM文章翻译数据集和基准 | Kayo Yin, Chinmay Singh, Fyodor O. Minakov, Vanessa Milan, Hal Daumé III, Cyril Zhang, Alex X. Lu, Danielle Bragg | http://arxiv.org/pdf/2411.05783v1 | null |
2024-11-08 | GazeSearch: Radiology Findings Search Benchmark | 视觉搜索:放射学发现搜索基准 | Trong Thang Pham, Tien-Phat Nguyen, Yuki Ikebe, Akash Awasthi, Zhigang Deng, Carol C. Wu, Hien Nguyen, Ngan Le | http://arxiv.org/pdf/2411.05780v1 | null |
2024-11-08 | Poze: Sports Technique Feedback under Data Constraints | 基于数据约束的体育技巧反馈 | Agamdeep Singh, Sujit PB, Mayank Vatsa | http://arxiv.org/pdf/2411.05734v1 | null |
2024-11-08 | Towards Scalable Foundation Models for Digital Dermatology | 面向数字皮肤病学可扩展的基础模型 | Fabian Gröger, Philippe Gottfrois, Ludovic Amruthalingam, Alvaro Gonzalez-Jimenez, Simone Lionetti, Luis R. Soenksen-Martinez, Alexander A. Navarini, Marc Pouly | http://arxiv.org/pdf/2411.05514v1 | null |
2024-11-08 | Tightly-Coupled, Speed-aided Monocular Visual-Inertial Localization in Topological Map | 紧耦合、速度辅助的单目视觉惯性定位在拓扑地图中 | Chanuk Yang, Hayeon O, Kunsoo Huh | http://arxiv.org/pdf/2411.05497v1 | null |
2024-11-08 | WeatherGFM: Learning A Weather Generalist Foundation Model via In-context Learning | 天气通用基础模型:通过上下文学习学习天气通用模型 | Xiangyu Zhao, Zhiwang Zhou, Wenlong Zhang, Yihao Liu, Xiangyu Chen, Junchao Gong, Hao Chen, Ben Fei, Shiqi Chen, Wanli Ouyang, et al. | http://arxiv.org/pdf/2411.05420v1 | null |
2024-11-08 | Image Decomposition: Theory, Numerical Schemes, and Performance Evaluation | 图像分解:理论、数值方法和性能评估 | Jerome Gilles | http://arxiv.org/pdf/2411.05265v1 | null |
2024-11-08 | Decoding Report Generators: A Cyclic Vision-Language Adapter for Counterfactual Explanations | 解码报告生成器:循环视觉语言适配器用于反事实解释 | Yingying Fang, Zihao Jin, Shaojie Guo, Jinda Liu, Yijian Gao, Junzhi Ning, Zhiling Yue, Zhi Li, Simon LF Walsh, Guang Yang | http://arxiv.org/pdf/2411.05261v1 | null |
2024-11-08 | Super-resolution in disordered media using neural networks | 基于神经网络的失序介质超分辨率 | Alexander Christie, Matan Leibovich, Miguel Moscoso, Alexei Novikov, George Papanicolaou, Chrysoula Tsogka | http://arxiv.org/pdf/2410.21556v3 | null |
2024-11-08 | Improvement of Spiking Neural Network with Bit Planes and Color Models | 基于位平面和颜色模型的脉冲神经网络改进 | Nhan T. Luu, Duong T. Luu, Nam N. Pham, Thang C. Truong | http://arxiv.org/pdf/2410.08229v2 | null |
2024-11-08 | TropNNC: Structured Neural Network Compression Using Tropical Geometry | TropNNC:使用热带几何的神经网络结构压缩 | Konstantinos Fotopoulos, Petros Maragos, Panagiotis Misiakos | http://arxiv.org/pdf/2409.03945v2 | null |
2024-11-08 | MLAAN: Scaling Supervised Local Learning with Multilaminar Leap Augmented Auxiliary Network | MLAAN:基于多层跳跃增强辅助网络的监督局部学习扩展 | Yuming Zhang, Shouxin Zhang, Peizhe Wang, Feiyu Zhu, Dongzhi Guan, Junhao Su, Jiabin Liu, Changpeng Cai | http://arxiv.org/pdf/2406.16633v5 | null |