Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-24 | FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models | FreeTraj:视频扩散模型中的免调优轨迹控制 | Haonan Qiu, Zhaoxi Chen, Zhouxia Wang, Yingqing He, Menghan Xia, Ziwei Liu | http://arxiv.org/pdf/2406.16863v1 | link |
2024-06-24 | Dreamitate: Real-World Visuomotor Policy Learning via Video Generation | Dreamitate:通过视频生成进行现实世界的视觉运动策略学习 | Junbang Liang, Ruoshi Liu, Ege Ozguroglu, Sruthi Sudhakar, Achal Dave, Pavel Tokmakov, Shuran Song, Carl Vondrick | http://arxiv.org/pdf/2406.16862v1 | null |
2024-06-24 | DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation | DreamBench++:个性化图像生成的人机对齐基准 | Yuang Peng, Yuxin Cui, Haomiao Tang, Zekun Qi, Runpei Dong, Jing Bai, Chunrui Han, Zheng Ge, Xiangyu Zhang, Shu-Tao Xia | http://arxiv.org/pdf/2406.16855v1 | link |
2024-06-24 | Portrait3D: 3D Head Generation from Single In-the-wild Portrait Image | Portrait3D:从单个野生肖像图像生成 3D 头部 | Jinkun Hao, Junshu Tang, Jiangning Zhang, Ran Yi, Yijia Hong, Moran Li, Weijian Cao, Yating Wang, Lizhuang Ma | http://arxiv.org/pdf/2406.16710v1 | null |
2024-06-24 | Geometry-Aware Score Distillation via 3D Consistent Noising and Gradient Consistency Modeling | 通过 3D 一致性噪声和梯度一致性建模实现几何感知分数蒸馏 | Min-Seop Kwak, Donghoon Ahn, Ines Hyeonsu Kim, Jin-wha Kim, Seungryong Kim | http://arxiv.org/pdf/2406.16695v1 | null |
2024-06-24 | Repulsive Score Distillation for Diverse Sampling of Diffusion Models | 扩散模型多样化采样的排斥分数蒸馏 | Nicolas Zilberstein, Morteza Mardani, Santiago Segarra | http://arxiv.org/pdf/2406.16683v1 | null |
2024-06-24 | Do As I Do: Pose Guided Human Motion Copy | 照我做:姿势引导人体动作复制 | Sifan Wu, Zhenguang Liu, Beibei Zhang, Roger Zimmermann, Zhongjie Ba, Xiaosong Zhang, Kui Ren | http://arxiv.org/pdf/2406.16601v1 | null |
2024-06-24 | FASTC: A Fast Attentional Framework for Semantic Traversability Classification Using Point Cloud | FASTC:使用点云进行语义可遍历性分类的快速注意力框架 | Yirui Chen, Pengjin Wei, Zhenhuan Liu, Bingchao Wang, Jie Yang, Wei Liu | http://arxiv.org/pdf/2406.16564v1 | link |
2024-06-24 | EvalAlign: Evaluating Text-to-Image Models through Precision Alignment of Multimodal Large Models with Supervised Fine-Tuning to Human Annotations | EvalAlign:通过对多模态大型模型进行精确对齐并对人工注释进行监督微调来评估文本到图像模型 | Zhiyu Tan, Xiaomeng Yang, Luozheng Qin, Mengping Yang, Cheng Zhang, Hao Li | http://arxiv.org/pdf/2406.16562v1 | link |
2024-06-24 | GIM: A Million-scale Benchmark for Generative Image Manipulation Detection and Localization | GIM:用于生成图像处理检测和定位的百万级基准 | Yirui Chen, Xudong Huang, Quan Zhang, Wei Li, Mingjian Zhu, Qiangyu Yan, Simiao Li, Hanting Chen, Hailin Hu, Jie Yang, et al. | http://arxiv.org/pdf/2406.16531v1 | null |
2024-06-24 | DaLPSR: Leverage Degradation-Aligned Language Prompt for Real-World Image Super-Resolution | DaLPSR:利用降级对齐语言提示实现真实世界图像超分辨率 | Aiwen Jiang, Zhi Wei, Long Peng, Feiqiang Liu, Wenbo Li, Mingwen Wang | http://arxiv.org/pdf/2406.16477v1 | null |
2024-06-24 | ResMaster: Mastering High-Resolution Image Generation via Structural and Fine-Grained Guidance | ResMaster:通过结构化和细粒度指导掌握高分辨率图像生成 | Shuwei Shi, Wenbo Li, Yuechen Zhang, Jingwen He, Biao Gong, Yinqiang Zheng | http://arxiv.org/pdf/2406.16476v1 | null |
2024-06-24 | Improving Generative Adversarial Networks for Video Super-Resolution | 改进视频超分辨率的生成对抗网络 | Daniel Wen | http://arxiv.org/pdf/2406.16359v1 | null |
2024-06-24 | Prompt-Consistency Image Generation (PCIG): A Unified Framework Integrating LLMs, Knowledge Graphs, and Controllable Diffusion Models | 提示一致性图像生成 (PCIG):集成 LLM、知识图谱和可控扩散模型的统一框架 | Yichen Sun, Zhixuan Chu, Zhan Qin, Kui Ren | http://arxiv.org/pdf/2406.16333v1 | null |
2024-06-24 | YouDream: Generating Anatomically Controllable Consistent Text-to-3D Animals | YouDream:生成解剖学上可控制的一致文本到 3D 动物 | Sandeep Mishra, Oindrila Saha, Alan C. Bovik | http://arxiv.org/pdf/2406.16273v1 | null |
2024-06-24 | Repairing Catastrophic-Neglect in Text-to-Image Diffusion Models via Attention-Guided Feature Enhancement | 通过注意力引导特征增强修复文本到图像扩散模型中的灾难性忽视 | Zhiyuan Chang, Mingyang Li, Junjie Wang, Yi Liu, Qing Wang, Yang Liu | http://arxiv.org/pdf/2406.16272v1 | null |
2024-06-24 | Video-Infinity: Distributed Long Video Generation | Video-Infinity:分布式长视频生成 | Zhenxiong Tan, Xingyi Yang, Songhua Liu, Xinchao Wang | http://arxiv.org/pdf/2406.16260v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-24 | Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models | 重新审视大型多模态模型时代的指称表达理解评估 | Jierun Chen, Fangyun Wei, Jinjing Zhao, Sizhe Song, Bohuai Wu, Zhuoxuan Peng, S.-H. Gary Chan, Hongyang Zhang | http://arxiv.org/pdf/2406.16866v1 | link |
2024-06-24 | Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs | Cambrian-1:全面开放、以视觉为中心的多模态大语言模型探索 | Shengbang Tong, Ellis Brown, Penghao Wu, Sanghyun Woo, Manoj Middepogu, Sai Charitha Akula, Jihan Yang, Shusheng Yang, Adithya Iyer, Xichen Pan, et al. | http://arxiv.org/pdf/2406.16860v1 | null |
2024-06-24 | Long Context Transfer from Language to Vision | 从语言到视觉的长上下文迁移 | Peiyuan Zhang, Kaichen Zhang, Bo Li, Guangtao Zeng, Jingkang Yang, Yuanhan Zhang, Ziyue Wang, Haoran Tan, Chunyuan Li, Ziwei Liu | http://arxiv.org/pdf/2406.16852v1 | link |
2024-06-24 | Losing Visual Needles in Image Haystacks: Vision Language Models are Easily Distracted in Short and Long Contexts | 在图像大海捞针:视觉语言模型在短距离和长距离语境中容易分心 | Aditya Sharma, Michael Saxon, William Yang Wang | http://arxiv.org/pdf/2406.16851v1 | null |
2024-06-24 | From Perfect to Noisy World Simulation: Customizable Embodied Multi-modal Perturbations for SLAM Robustness Benchmarking | 从完美到嘈杂的世界模拟:用于 SLAM 鲁棒性基准测试的可定制体现多模态扰动 | Xiaohao Xu, Tianyi Zhang, Sibo Wang, Xiang Li, Yongqi Chen, Ye Li, Bhiksha Raj, Matthew Johnson-Roberson, Xiaonan Huang | http://arxiv.org/pdf/2406.16850v1 | link |
2024-06-24 | Vision-Language Consistency Guided Multi-modal Prompt Learning for Blind AI Generated Image Quality Assessment | 视觉语言一致性引导的多模态提示学习,用于 AI 生成图像的盲质量评估 | Jun Fu, Wei Zhou, Qiuping Jiang, Hantao Liu, Guangtao Zhai | http://arxiv.org/pdf/2406.16641v1 | null |
2024-06-24 | OmAgent: A Multi-modal Agent Framework for Complex Video Understanding with Task Divide-and-Conquer | OmAgent:一种用于复杂视频理解的具有任务分而治之的多模式代理框架 | Lu Zhang, Tiancheng Zhao, Heting Ying, Yibo Ma, Kyusong Lee | http://arxiv.org/pdf/2406.16620v1 | null |
2024-06-24 | Multi-Modal Vision Transformers for Crop Mapping from Satellite Image Time Series | 利用卫星图像时间序列进行农作物制图的多模态视觉变换器 | Theresa Follath, David Mickisch, Jan Hemmerling, Stefan Erasmi, Marcel Schwieder, Begüm Demir | http://arxiv.org/pdf/2406.16513v1 | null |
2024-06-24 | InterCLIP-MEP: Interactive CLIP and Memory-Enhanced Predictor for Multi-modal Sarcasm Detection | InterCLIP-MEP:用于多模态讽刺检测的交互式 CLIP 和记忆增强预测器 | Junjie Chen, Subin Huang | http://arxiv.org/pdf/2406.16464v1 | link |
2024-06-24 | EmoLLM: Multimodal Emotional Understanding Meets Large Language Models | EmoLLM:多模态情感理解与大型语言模型的结合 | Qu Yang, Mang Ye, Bo Du | http://arxiv.org/pdf/2406.16442v1 | null |
2024-06-24 | Directed Domain Fine-Tuning: Tailoring Separate Modalities for Specific Training Tasks | 定向域微调:为特定训练任务定制单独的模式 | Daniel Wen, Nafisa Hussain | http://arxiv.org/pdf/2406.16346v1 | null |
2024-06-24 | VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models | VideoHallucer:评估大型视频语言模型中的内在和外在幻觉 | Yuxuan Wang, Yueqian Wang, Dongyan Zhao, Cihang Xie, Zilong Zheng | http://arxiv.org/pdf/2406.16338v1 | null |
2024-06-24 | Priorformer: A UGC-VQA Method with content and distortion priors | Priorformer:具有内容和失真先验的 UGC-VQA 方法 | Yajing Pei, Shiyu Huang, Yiting Lu, Xin Li, Zhibo Chen | http://arxiv.org/pdf/2406.16297v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-24 | Articulate your NeRF: Unsupervised articulated object modeling via conditional view synthesis | 清晰表达你的 NeRF:通过条件视图合成实现无监督铰接物体建模 | Jianning Deng, Kartic Subr, Hakan Bilen | http://arxiv.org/pdf/2406.16623v1 | null |
2024-06-24 | Crowd-Sourced NeRF: Collecting Data from Production Vehicles for 3D Street View Reconstruction | 众包 NeRF:从生产车辆收集数据以进行 3D 街景重建 | Tong Qin, Changze Li, Haoyang Ye, Shaowei Wan, Minzhen Li, Hongwei Liu, Ming Yang | http://arxiv.org/pdf/2406.16289v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-24 | ClotheDreamer: Text-Guided Garment Generation with 3D Gaussians | ClotheDreamer:使用 3D 高斯算法生成文本引导的服装 | Yufei Liu, Junshu Tang, Chu Zheng, Shijie Zhang, Jinkun Hao, Junwei Zhu, Dongjin Huang | http://arxiv.org/pdf/2406.16815v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-24 | Seeking Certainty In Uncertainty: Dual-Stage Unified Framework Solving Uncertainty in Dynamic Facial Expression Recognition | 在不确定性中寻求确定性:解决动态面部表情识别不确定性的双阶段统一框架 | Haoran Wang, Xinji Mai, Zeng Tao, Xuan Tong, Junxiong Lin, Yan Wang, Jiawen Yu, Boyang Wang, Shaoqi Yan, Qing Zhao, et al. | http://arxiv.org/pdf/2406.16473v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-24 | Unsupervised Domain Adaptation for Pediatric Brain Tumor Segmentation | 儿童脑肿瘤分割的无监督域自适应 | Jingru Fu, Simone Bendazzoli, Örjan Smedby, Rodrigo Moreno | http://arxiv.org/pdf/2406.16848v1 | null |
2024-06-24 | The Progression of Transformers from Language to Vision to MOT: A Literature Review on Multi-Object Tracking with Transformers | Transformer 从语言到视觉再到 MOT 的进展:使用 Transformer 进行多目标跟踪的文献综述 | Abhi Kamboj | http://arxiv.org/pdf/2406.16784v1 | null |
2024-06-24 | Instance Consistency Regularization for Semi-Supervised 3D Instance Segmentation | 半监督 3D 实例分割的实例一致性正则化 | Yizheng Wu, Zhiyu Pan, Kewei Wang, Xingyi Li, Jiahao Cui, Liwen Xiao, Guosheng Lin, Zhiguo Cao | http://arxiv.org/pdf/2406.16776v1 | link |
2024-06-24 | μ-Net: A Deep Learning-Based Architecture for μ-CT Segmentation | μ-Net:基于深度学习的 μ-CT 分割架构 | Pierangela Bruno, Edoardo De Rose, Carlo Adornetto, Francesco Calimeri, Sandro Donato, Raffaele Giuseppe Agostino, Daniela Amelio, Riccardo Barberi, Maria Carmela Cerra, Maria Caterina Crocco, et al. | http://arxiv.org/pdf/2406.16724v1 | null |
2024-06-24 | Demystifying the Effect of Receptive Field Size in U-Net Models for Medical Image Segmentation | 揭秘医学图像分割 U-Net 模型中感受野大小的影响 | Vincent Loos, Rohit Pardasani, Navchetan Awasthi | http://arxiv.org/pdf/2406.16701v1 | null |
2024-06-24 | Feature Fusion for Human Activity Recognition using Parameter-Optimized Multi-Stage Graph Convolutional Network and Transformer Models | 使用参数优化的多阶段图卷积网络和 Transformer 模型进行特征融合以实现人类活动识别 | Mohammad Belal, Taimur Hassan, Abdelfatah Ahmed, Ahmad Aljarah, Nael Alsheikh, Irfan Hussain | http://arxiv.org/pdf/2406.16638v1 | null |
2024-06-24 | The Championship-Winning Solution for the 5th CLVISION Challenge 2024 | 第五届 CLVISION 挑战赛 2024 冠军解决方案 | Sishun Pan, Tingmin Li, Yang Yang | http://arxiv.org/pdf/2406.16615v1 | null |
2024-06-24 | Toward Fairer Face Recognition Datasets | 迈向更公平的人脸识别数据集 | Alexandre Fournier-Mongieux, Michael Soumm, Adrian Popescu, Bertrand Luvison, Hervé Le Borgne | http://arxiv.org/pdf/2406.16592v1 | null |
2024-06-24 | Personalized federated learning based on feature fusion | 基于特征融合的个性化联邦学习 | Wolong Xing, Zhenkui Shi, Hongyan Peng, Xiantao Hu, Xianxian Li | http://arxiv.org/pdf/2406.16583v1 | null |
2024-06-24 | Improving robustness to corruptions with multiplicative weight perturbations | 利用乘性权重扰动提高对数据损坏的鲁棒性 | Trung Trinh, Markus Heinonen, Luigi Acerbi, Samuel Kaski | http://arxiv.org/pdf/2406.16540v1 | null |
2024-06-24 | Character-Adapter: Prompt-Guided Region Control for High-Fidelity Character Customization | 角色适配器:提示引导区域控制,实现高保真角色定制 | Yuhang Ma, Wenting Xu, Jiji Tang, Qinfeng Jin, Rongsheng Zhang, Zeng Zhao, Changjie Fan, Zhipeng Hu | http://arxiv.org/pdf/2406.16537v1 | null |
2024-06-24 | Vision Mamba-based autonomous crack segmentation on concrete, asphalt, and masonry surfaces | 基于 Vision Mamba 的混凝土、沥青和砖石表面裂缝自动分割 | Zhaohui Chen, Elyas Asadi Shamsabadi, Sheng Jiang, Luming Shen, Daniel Dias-da-Costa | http://arxiv.org/pdf/2406.16518v1 | null |
2024-06-24 | LOGCAN++: Local-global class-aware network for semantic segmentation of remote sensing images | LOGCAN++:用于遥感图像语义分割的局部-全局类感知网络 | Xiaowen Ma, Rongrong Lian, Zhenkai Wu, Hongbo Guo, Mengting Ma, Sensen Wu, Zhenhong Du, Siyang Song, Wei Zhang | http://arxiv.org/pdf/2406.16502v1 | link |
2024-06-24 | UNICAD: A Unified Approach for Attack Detection, Noise Reduction and Novel Class Identification | UNICAD:一种用于攻击检测、降噪和新类别识别的统一方法 | Alvaro Lopez Pellicer, Kittipos Giatgong, Yi Li, Neeraj Suri, Plamen Angelov | http://arxiv.org/pdf/2406.16501v1 | null |
2024-06-24 | Improving Quaternion Neural Networks with Quaternionic Activation Functions | 使用四元数激活函数改进四元数神经网络 | Johannes Pöppelbaum, Andreas Schwung | http://arxiv.org/pdf/2406.16481v1 | null |
2024-06-24 | Evaluating Visual and Cultural Interpretation: The K-Viscuit Benchmark with Human-VLM Collaboration | 评估视觉和文化解释:K-Viscuit 基准与 Human-VLM 协作 | Yujin Baek, ChaeHun Park, Jaeseok Kim, Yu-Jung Heo, Du-Seong Chang, Jaegul Choo | http://arxiv.org/pdf/2406.16469v1 | null |
2024-06-24 | SLOctolyzer: Fully automatic analysis toolkit for segmentation and feature extracting in scanning laser ophthalmoscopy images | SLOctolyzer:用于扫描激光检眼镜图像分割和特征提取的全自动分析工具包 | Jamie Burke, Samuel Gibbon, Justin Engelmann, Adam Threlfall, Ylenia Giarratano, Charlene Hamid, Stuart King, Ian J. C. MacCormick, Tom MacGillivray | http://arxiv.org/pdf/2406.16466v1 | null |
2024-06-24 | Exploring Test-Time Adaptation for Object Detection in Continually Changing Environments | 探索不断变化的环境中物体检测的测试时间适应性 | Shilei Cao, Yan Liu, Juepeng Zheng, Weijia Li, Runmin Dong, Haohuan Fu | http://arxiv.org/pdf/2406.16439v1 | null |
2024-06-24 | Multi-threshold Deep Metric Learning for Facial Expression Recognition | 用于面部表情识别的多阈值深度度量学习 | Wenwu Yang, Jinyi Yu, Tuo Chen, Zhenguang Liu, Xun Wang, Jianbing Shen | http://arxiv.org/pdf/2406.16434v1 | null |
2024-06-24 | Dynamic Pseudo Label Optimization in Point-Supervised Nuclei Segmentation | 点监督核分割中的动态伪标签优化 | Ziyue Wang, Ye Zhang, Yifeng Wang, Linghan Cai, Yongbing Zhang | http://arxiv.org/pdf/2406.16427v1 | null |
2024-06-24 | Exploring Cross-Domain Few-Shot Classification via Frequency-Aware Prompting | 通过频率感知提示探索跨领域小样本分类 | Tiange Zhang, Qing Cai, Feng Gao, Lin Qi, Junyu Dong | http://arxiv.org/pdf/2406.16422v1 | link |
2024-06-24 | Lesion-Aware Cross-Phase Attention Network for Renal Tumor Subtype Classification on Multi-Phase CT Scans | 用于多期 CT 扫描中肾肿瘤亚型分类的病变感知跨期注意网络 | Kwang-Hyun Uhm, Seung-Won Jung, Sung-Hoo Hong, Sung-Jea Ko | http://arxiv.org/pdf/2406.16322v1 | null |
2024-06-24 | Artistic-style text detector and a new Movie-Poster dataset | 艺术风格文本检测器和新的电影海报数据集 | Aoxiang Ning, Yiting Wei, Minglong Xue, Senming Zhong | http://arxiv.org/pdf/2406.16307v1 | null |
2024-06-24 | SegNet4D: Effective and Efficient 4D LiDAR Semantic Segmentation in Autonomous Driving Environments | SegNet4D:自动驾驶环境中有效且高效的 4D LiDAR 语义分割 | Neng Wang, Ruibin Guo, Chenghao Shi, Hui Zhang, Huimin Lu, Zhiqiang Zheng, Xieyuanli Chen | http://arxiv.org/pdf/2406.16279v1 | link |
2024-06-24 | Feature-prompting GBMSeg: One-Shot Reference Guided Training-Free Prompt Engineering for Glomerular Basement Membrane Segmentation | 特征提示 GBMSeg:用于肾小球基底膜分割的一次性参考引导无训练提示工程 | Xueyu Liu, Guangze Shi, Rui Wang, Yexin Lai, Jianan Zhang, Lele Sun, Quan Yang, Yongfei Wu, Ming Li, Weixia Han, et al. | http://arxiv.org/pdf/2406.16271v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-24 | StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal | StableNormal:减少扩散方差以实现稳定且锐利的法线估计 | Chongjie Ye, Lingteng Qiu, Xiaodong Gu, Qi Zuo, Yushuang Wu, Zilong Dong, Liefeng Bo, Yuliang Xiu, Xiaoguang Han | http://arxiv.org/pdf/2406.16864v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-24 | GPT-4V Explorations: Mining Autonomous Driving | GPT-4V 探索:采矿自动驾驶 | Zixuan Li | http://arxiv.org/pdf/2406.16817v1 | null |
2024-06-24 | MIRReS: Multi-bounce Inverse Rendering using Reservoir Sampling | MIRReS:使用蓄水池采样进行多次反弹逆向渲染 | Yuxin Dai, Qi Wang, Jingsen Zhu, Dianbing Xi, Yuchi Huo, Chen Qian, Ying He | http://arxiv.org/pdf/2406.16360v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-24 | UBiSS: A Unified Framework for Bimodal Semantic Summarization of Videos | UBiSS:视频双模态语义摘要的统一框架 | Yuting Mei, Linli Yao, Qin Jin | http://arxiv.org/pdf/2406.16301v1 | link |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-24 | Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image Generation | 超越赞/踩:解决文本到图像生成的细粒度反馈的挑战 | Katherine M. Collins, Najoung Kim, Yonatan Bitton, Verena Rieser, Shayegan Omidshafiei, Yushi Hu, Sherol Chen, Senjuti Dutta, Minsuk Chang, Kimin Lee, et al. | http://arxiv.org/pdf/2406.16807v1 | null |
2024-06-24 | Suppressing Uncertainties in Degradation Estimation for Blind Super-Resolution | 抑制盲超分辨率退化估计中的不确定性 | Junxiong Lin, Zeng Tao, Xuan Tong, Xinji Mai, Haoran Wang, Boyang Wang, Yan Wang, Qing Zhao, Jiawen Yu, Yuxuan Lin, et al. | http://arxiv.org/pdf/2406.16459v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-06-24 | The MRI Scanner as a Diagnostic: Image-less Active Sampling | MRI 扫描仪作为诊断手段:无图像主动采样 | Yuning Du, Rohan Dharmakumar, Sotirios A. Tsaftaris | http://arxiv.org/pdf/2406.16754v1 | null |
2024-06-24 | Sampling Strategies in Bayesian Inversion: A Study of RTO and Langevin Methods | 贝叶斯反演中的采样策略:RTO 和朗之万方法的研究 | Remi Laumont, Yiqiu Dong, Martin Skovgaard Andersen | http://arxiv.org/pdf/2406.16658v1 | null |
2024-06-24 | MLAAN: Scaling Supervised Local Learning with Multilaminar Leap Augmented Auxiliary Network | MLAAN:利用多层跳跃增强辅助网络扩展监督局部学习 | Yuming Zhang, Shouxin Zhang, Peizhe Wang, Feiyu Zhu, Dongzhi Guan, Jiabin Liu, Changpeng Cai | http://arxiv.org/pdf/2406.16633v1 | null |
2024-06-24 | When Invariant Representation Learning Meets Label Shift: Insufficiency and Theoretical Insights | 当不变表征学习遇到标签转移:不足与理论见解 | You-Wei Luo, Chuan-Xian Ren | http://arxiv.org/pdf/2406.16608v1 | null |
2024-06-24 | Measuring the Recyclability of Electronic Components to Assist Automatic Disassembly and Sorting Waste Printed Circuit Boards | 测量电子元件的可回收性,协助自动拆卸和分类废弃印刷电路板 | Muhammad Mohsin, Xianlai Zeng, Stefano Rovetta, Francesco Masulli | http://arxiv.org/pdf/2406.16593v1 | null |
2024-06-24 | Hierarchical B-frame Video Coding for Long Group of Pictures | 针对长组图像的分层 B 帧视频编码 | Ivan Kirillov, Denis Parkhomenko, Kirill Chernyshev, Alexander Pletnev, Yibo Shi, Kai Lin, Dmitry Babin | http://arxiv.org/pdf/2406.16544v1 | null |
2024-06-24 | Evaluating and Analyzing Relationship Hallucinations in LVLMs | 评估和分析 LVLM 中的关系幻觉 | Mingrui Wu, Jiayi Ji, Oucheng Huang, Jiale Li, Yuhang Wu, Xiaoshuai Sun, Rongrong Ji | http://arxiv.org/pdf/2406.16449v1 | null |
2024-06-24 | High-resolution open-vocabulary object 6D pose estimation | 高分辨率开放词汇对象 6D 姿态估计 | Jaime Corsetti, Davide Boscaini, Francesco Giuliari, Changjae Oh, Andrea Cavallaro, Fabio Poiesi | http://arxiv.org/pdf/2406.16384v1 | null |