Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-30 | Denoising Monte Carlo Renders With Diffusion Models | 使用扩散模型对蒙特卡洛渲染进行去噪 | Vaibhav Vavilala, Rahul Vasanth, David Forsyth | http://arxiv.org/pdf/2404.00491v1 | null |
2024-03-30 | DiffHuman: Probabilistic Photorealistic 3D Reconstruction of Humans | DiffHuman:概率真实感 3D 人体重建 | Akash Sengupta, Thiemo Alldieck, Nikos Kolotouros, Enric Corona, Andrei Zanfir, Cristian Sminchisescu | http://arxiv.org/pdf/2404.00485v1 | null |
2024-03-30 | Score-Based Diffusion Models for Photoacoustic Tomography Image Reconstruction | 用于光声断层扫描图像重建的基于分数的扩散模型 | Sreemanti Dey, Snigdha Saha, Berthy T. Feng, Manxiu Cui, Laure Delisle, Oscar Leong, Lihong V. Wang, Katherine L. Bouman | http://arxiv.org/pdf/2404.00471v1 | null |
2024-03-30 | Towards Variable and Coordinated Holistic Co-Speech Motion Generation | 实现可变且协调的整体语音动作生成 | Yifei Liu, Qiong Cao, Yandong Wen, Huaiguang Jiang, Changxing Ding | http://arxiv.org/pdf/2404.00368v1 | null |
2024-03-30 | Spread Your Wings: A Radial Strip Transformer for Image Deblurring | 张开翅膀:用于图像去模糊的径向条形变压器 | Duosheng Chen, Shihao Zhou, Jinshan Pan, Jinglei Shi, Lishen Qu, Jufeng Yang | http://arxiv.org/pdf/2404.00358v1 | null |
2024-03-30 | Grid Diffusion Models for Text-to-Video Generation | 用于文本到视频生成的网格扩散模型 | Taegyeong Lee, Soyeong Kwon, Taehwan Kim | http://arxiv.org/pdf/2404.00234v1 | null |
2024-03-30 | Latent Watermark: Inject and Detect Watermarks in Latent Diffusion Space | 潜在水印:在潜在扩散空间中注入和检测水印 | Zheling Meng, Bo Peng, Jing Dong | http://arxiv.org/pdf/2404.00230v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-30 | SceneGraphLoc: Cross-Modal Coarse Visual Localization on 3D Scene Graphs | SceneGraphLoc:3D 场景图上的跨模态粗略视觉定位 | Yang Miao, Francis Engelmann, Olga Vysotska, Federico Tombari, Marc Pollefeys, Dániel Béla Baráth | http://arxiv.org/pdf/2404.00469v1 | null |
2024-03-30 | MaGRITTe: Manipulative and Generative 3D Realization from Image, Topview and Text | MaGRITTe:从图像、俯视图和文本中进行操作和生成 3D 实现 | Takayuki Hara, Tatsuya Harada | http://arxiv.org/pdf/2404.00345v1 | null |
2024-03-30 | Learned Scanpaths Aid Blind Panoramic Video Quality Assessment | 学习的扫描路径有助于盲式全景视频质量评估 | Kanglong Fan, Wen Wen, Mu Li, Yifan Peng, Kede Ma | http://arxiv.org/pdf/2404.00252v1 | null |
2024-03-30 | Design as Desired: Utilizing Visual Question Answering for Multimodal Pre-training | 根据需要进行设计:利用视觉问答进行多模式预训练 | Tongkun Su, Jun Li, Xi Zhang, Haibo Jin, Hao Chen, Qiong Wang, Faqin Lv, Baoliang Zhao, Yin Hu | http://arxiv.org/pdf/2404.00226v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-30 | 3DGSR: Implicit Surface Reconstruction with 3D Gaussian Splatting | 3DGSR:使用 3D 高斯泼溅进行隐式表面重建 | Xiaoyang Lyu, Yang-Tian Sun, Yi-Hua Huang, Xiuzhe Wu, Ziyi Yang, Yilun Chen, Jiangmiao Pang, Xiaojuan Qi | http://arxiv.org/pdf/2404.00409v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-30 | Orchestrate Latent Expertise: Advancing Online Continual Learning with Multi-Level Supervision and Reverse Self-Distillation | 协调潜在专业知识:通过多级监督和反向自我蒸馏推进在线持续学习 | HongWei Yan, Liyuan Wang, Kaisheng Ma, Yi Zhong | http://arxiv.org/pdf/2404.00417v1 | null |
2024-03-30 | TTD: Text-Tag Self-Distillation Enhancing Image-Text Alignment in CLIP to Alleviate Single Tag Bias | TTD:文本标签自蒸馏增强 CLIP 中的图像文本对齐,以减轻单标签偏差 | Sanghyun Jo, Soohyun Ryu, Sungyub Kim, Eunho Yang, Kyungsu Kim | http://arxiv.org/pdf/2404.00384v1 | link |
2024-03-30 | Long-Tailed Recognition on Binary Networks by Calibrating A Pre-trained Model | 通过校准预训练模型进行二元网络长尾识别 | Jihun Kim, Dahyun Kim, Hyungrok Jung, Taeil Oh, Jonghyun Choi | http://arxiv.org/pdf/2404.00285v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-30 | DHR: Dual Features-Driven Hierarchical Rebalancing in Inter- and Intra-Class Regions for Weakly-Supervised Semantic Segmentation | DHR:弱监督语义分割的类间和类内区域的双特征驱动的层次再平衡 | Sanghyun Jo, Fei Pan, In-Jae Yu, Kyungsu Kim | http://arxiv.org/pdf/2404.00380v1 | link |
2024-03-30 | The Devil is in the Edges: Monocular Depth Estimation with Edge-aware Consistency Fusion | 魔鬼在边缘:具有边缘感知一致性融合的单目深度估计 | Pengzhi Li, Yikang Ding, Haohan Wang, Chengshuai Tang, Zhiheng Li | http://arxiv.org/pdf/2404.00373v1 | null |
2024-03-30 | Efficient Multi-branch Segmentation Network for Situation Awareness in Autonomous Navigation | 用于自主导航态势感知的高效多分支分割网络 | Guan-Cheng Zhou, Chen Cheng, Yan-zhou Chen | http://arxiv.org/pdf/2404.00366v1 | null |
2024-03-30 | Rethinking Attention-Based Multiple Instance Learning for Whole-Slide Pathological Image Classification: An Instance Attribute Viewpoint | 重新思考基于注意力的多实例学习用于全幻灯片病理图像分类:实例属性观点 | Linghan Cai, Shenjin Huang, Ye Zhang, Jinpeng Lu, Yongbing Zhang | http://arxiv.org/pdf/2404.00351v1 | null |
2024-03-30 | YNetr: Dual-Encoder architecture on Plain Scan Liver Tumors (PSLT) | YNetr:平扫肝脏肿瘤 (PSLT) 的双编码器架构 | Wen Sheng, Zhong Zheng, Jiajun Liu, Han Lu, Hanyuan Zhang, Zhengyong Jiang, Zhihong Zhang, Daoping Zhu | http://arxiv.org/pdf/2404.00327v1 | null |
2024-03-30 | CLIP-driven Outliers Synthesis for few-shot OOD detection | 用于小样本 OOD 检测的 CLIP 驱动的离群值合成 | Hao Sun, Rundong He, Zhongyi Han, Zhicong Lin, Yongshun Gong, Yilong Yin | http://arxiv.org/pdf/2404.00323v1 | null |
2024-03-30 | Instrument-tissue Interaction Detection Framework for Surgical Video Understanding | 用于手术视频理解的仪器-组织相互作用检测框架 | Wenjun Lin, Yan Hu, Huazhu Fu, Mingming Yang, Chin-Boon Chng, Ryo Kawasaki, Cheekong Chui, Jiang Liu | http://arxiv.org/pdf/2404.00322v1 | null |
2024-03-30 | Bayesian Exploration of Pre-trained Models for Low-shot Image Classification | 用于低样本图像分类的预训练模型的贝叶斯探索 | Yibo Miao, Yu Lei, Feng Zhou, Zhijie Deng | http://arxiv.org/pdf/2404.00312v1 | null |
2024-03-30 | HSIMamba: Hyperpsectral Imaging Efficient Feature Learning with Bidirectional State Space for Classification | HSIMamba:使用双向状态空间进行分类的超光谱成像高效特征学习 | Judy X Yang, Jun Zhou, Jing Wang, Hui Tian, Alan Wee Chung Liew | http://arxiv.org/pdf/2404.00272v1 | null |
2024-03-30 | Image-to-Image Matching via Foundation Models: A New Perspective for Open-Vocabulary Semantic Segmentation | 通过基础模型进行图像到图像匹配:开放词汇语义分割的新视角 | Yuan Wang, Rui Sun, Naisong Luo, Yuwen Pan, Tianzhu Zhang | http://arxiv.org/pdf/2404.00262v1 | null |
2024-03-30 | YOLOOC: YOLO-based Open-Class Incremental Object Detection with Novel Class Discovery | YOLOOC:基于 YOLO 的开放类增量对象检测与新类发现 | Qian Wan, Xiang Xiang, Qinhao Zhou | http://arxiv.org/pdf/2404.00257v1 | null |
2024-03-30 | Attention-based Shape-Deformation Networks for Artifact-Free Geometry Reconstruction of Lumbar Spine from MR Images | 基于注意力的形状变形网络,用于从 MR 图像中进行腰椎无伪影几何重建 | Linchen Qian, Jiasong Chen, Linhai Ma, Timur Urakov, Weiyong Gu, Liang Liang | http://arxiv.org/pdf/2404.00231v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-30 | Constrained Layout Generation with Factor Graphs | 使用因子图生成约束布局 | Mohammed Haroon Dupty, Yanfei Dong, Sicong Leng, Guoji Fu, Yong Liang Goh, Wei Lu, Wee Sun Lee | http://arxiv.org/pdf/2404.00385v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-30 | Reusable Architecture Growth for Continual Stereo Matching | 用于持续立体匹配的可重用架构增长 | Chenghao Zhang, Gaofeng Meng, Bin Fan, Kun Tian, Zhaoxiang Zhang, Shiming Xiang, Chunhong Pan | http://arxiv.org/pdf/2404.00360v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-30 | SVGCraft: Beyond Single Object Text-to-SVG Synthesis with Comprehensive Canvas Layout | SVGCraft:超越单一对象文本到 SVG 合成,具有全面的画布布局 | Ayan Banerjee, Nityanand Mathur, Josep Lladós, Umapada Pal, Anjan Dutta | http://arxiv.org/pdf/2404.00412v1 | null |
2024-03-30 | Exploring Unseen Environments with Robots using Large Language and Vision Models through a Procedurally Generated 3D Scene Representation | 通过程序生成的 3D 场景表示,使用大型语言和视觉模型与机器人一起探索看不见的环境 | Arjun P S, Andrew Melnik, Gora Chand Nandi | http://arxiv.org/pdf/2404.00318v1 | null |
2024-03-30 | ST-LLM: Large Language Models Are Effective Temporal Learners | ST-LLM:大型语言模型是有效的时间学习者 | Ruyang Liu, Chen Li, Haoran Tang, Yixiao Ge, Ying Shan, Ge Li | http://arxiv.org/pdf/2404.00308v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-30 | Multiway Point Cloud Mosaicking with Diffusion and Global Optimization | 具有扩散和全局优化的多路点云镶嵌 | Shengze Jin, Iro Armeni, Marc Pollefeys, Daniel Barath | http://arxiv.org/pdf/2404.00429v1 | null |
2024-03-30 | SGDFormer: One-stage Transformer-based Architecture for Cross-Spectral Stereo Image Guided Denoising | SGDFormer:基于变压器的一级架构,用于跨光谱立体图像引导去噪 | Runmin Zhang, Zhu Yu, Zehua Sheng, Jiacheng Ying, Si-Yuan Cao, Shu-Jie Chen, Bailin Yang, Junwei Li, Hui-Liang Shen | http://arxiv.org/pdf/2404.00349v1 | null |
2024-03-30 | Seeing the Unseen: A Frequency Prompt Guided Transformer for Image Restoration | 看到看不见的东西:用于图像恢复的频率提示引导变压器 | Shihao Zhou, Jinshan Pan, Jinglei Shi, Duosheng Chen, Lishen Qu, Jufeng Yang | http://arxiv.org/pdf/2404.00288v1 | null |
2024-03-30 | Look-Around Before You Leap: High-Frequency Injected Transformer for Image Restoration | 跳跃前环顾四周:用于图像恢复的高频注入变压器 | Shihao Zhou, Duosheng Chen, Jinshan Pan, Jufeng Yang | http://arxiv.org/pdf/2404.00279v1 | null |
2024-03-30 | IPoD: Implicit Field Learning with Point Diffusion for Generalizable 3D Object Reconstruction from Single RGB-D Images | IPoD:使用点扩散进行隐式场学习,用于从单个 RGB-D 图像重建可泛化的 3D 对象 | Yushuang Wu, Luyue Shi, Junhao Cai, Weihao Yuan, Lingteng Qiu, Zilong Dong, Liefeng Bo, Shuguang Cui, Xiaoguang Han | http://arxiv.org/pdf/2404.00269v1 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-30 | Monocular Identity-Conditioned Facial Reflectance Reconstruction | 单目身份条件面部反射率重建 | Xingyu Ren, Jiankang Deng, Yuhao Cheng, Jia Guo, Chao Ma, Yichao Yan, Wenhan Zhu, Xiaokang Yang | http://arxiv.org/pdf/2404.00301v1 | null |
2024-04-02 | HOI-M3: Capture Multiple Humans and Objects Interaction within Contextual Environment | HOI-M3:捕捉情境环境中的多个人与物体的交互 | Juze Zhang, Jingyan Zhang, Zining Song, Zhanhe Shi, Chengfeng Zhao, Ye Shi, Jingyi Yu, Lan Xu, Jingya Wang | http://arxiv.org/pdf/2404.00299v2 | null |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-30 | Continual Learning for Autonomous Robots: A Prototype-based Approach | 自主机器人的持续学习:基于原型的方法 | Elvin Hajizada, Balachandran Swaminathan, Yulia Sandamirskaya | http://arxiv.org/pdf/2404.00418v1 | null |
2024-04-02 | InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning | InfLoRA:用于持续学习的无干扰低阶适应 | Yan-Shuo Liang, Wu-Jun Li | http://arxiv.org/pdf/2404.00228v2 | link |
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-30 | 94% on CIFAR-10 in 3.29 Seconds on a Single GPU | 在单个 GPU 上,CIFAR-10 在 3.29 秒内达到 94% | Keller Jordan | http://arxiv.org/pdf/2404.00498v1 | null |
2024-03-30 | Extracting Manifold Information from Point Clouds | 从点云中提取流形信息 | Patrick Guidotti | http://arxiv.org/pdf/2404.00427v1 | null |
2024-03-30 | Do Vision-Language Models Understand Compound Nouns? | 视觉语言模型能理解复合名词吗? | Sonal Kumar, Sreyan Ghosh, S Sakshi, Utkarsh Tyagi, Dinesh Manocha | http://arxiv.org/pdf/2404.00419v1 | null |
2024-03-30 | STBA: Towards Evaluating the Robustness of DNNs for Query-Limited Black-box Scenario | STBA:针对查询受限的黑盒场景评估 DNN 的鲁棒性 | Renyang Liu, Kwok-Yan Lam, Wei Zhou, Sixing Wu, Jun Zhao, Dongting Hu, Mingming Gong | http://arxiv.org/pdf/2404.00362v1 | null |
2024-03-30 | Learing Trimaps via Clicks for Image Matting | 通过点击图像抠图来学习 Trimap | Chenyi Zhang, Yihan Hu, Henghui Ding, Humphrey Shi, Yao Zhao, Yunchao Wei | http://arxiv.org/pdf/2404.00335v1 | null |
2024-03-30 | Memory-Scalable and Simplified Functional Map Learning | 内存可扩展且简化的功能图学习 | Robin Magnet, Maks Ovsjanikov | http://arxiv.org/pdf/2404.00330v1 | null |
2024-03-30 | Harmonizing Light and Darkness: A Symphony of Prior-guided Data Synthesis and Adaptive Focus for Nighttime Flare Removal | 协调光明与黑暗:预先引导的数据合成和夜间耀斑去除的自适应聚焦的交响乐 | Lishen Qu, Shihao Zhou, Jinshan Pan, Jinglei Shi, Duosheng Chen, Jufeng Yang | http://arxiv.org/pdf/2404.00313v1 | null |
2024-03-30 | LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion | LAKE-RED:通过潜在背景知识检索增强扩散生成伪装图像 | Pancheng Zhao, Peng Xu, Pengda Qin, Deng-Ping Fan, Zhicheng Zhang, Guoli Jia, Bowen Zhou, Jufeng Yang | http://arxiv.org/pdf/2404.00292v1 | null |
2024-03-30 | Exploiting Self-Supervised Constraints in Image Super-Resolution | 利用图像超分辨率中的自监督约束 | Gang Wu, Junjun Jiang, Kui Jiang, Xianming Liu | http://arxiv.org/pdf/2404.00260v1 | null |