!UPDATED -- 2024-01-08

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-08 | Unifying Graph Contrastive Learning via Graph Message Augmentation | 通过图消息增强统一图对比学习 | Ziyan Zhang, Bo Jiang, Jin Tang, Bin Luo | http://arxiv.org/pdf/2401.03638v1 | null |
| 2024-01-08 | Automated Detection of Myopic Maculopathy in MMAC 2023: Achievements in Classification, Segmentation, and Spherical Equivalent Prediction | MMAC 2023 中近视黄斑病变的自动检测:分类、分割和球面等效预测方面的成就 | Yihao Li, Philippe Zhang, Yubo Tan, Jing Zhang, Zhihan Wang, Weili Jiang, Pierre-Henri Conze, Mathieu Lamard, Gwenolé Quellec, Mostafa El Habib Daho | http://arxiv.org/pdf/2401.03615v1 | null |

Classification/Detection/Recognition/Segmentation

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-08 | Dr$^2$Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning | Dr$^2$Net:用于内存高效微调的动态可逆双残差网络 | Chen Zhao, Shuming Liu, Karttikeya Mangalam, Guocheng Qian, Fatimah Zohra, Abdulmohsen Alghannam, Jitendra Malik, Bernard Ghanem | http://arxiv.org/pdf/2401.04105v1 | null |
| 2024-01-08 | Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification | 用于音视频分类的高效多尺度多模态瓶颈变压器 | Wentao Zhu | http://arxiv.org/pdf/2401.04023v1 | null |
| 2024-01-08 | MS-DETR: Efficient DETR Training with Mixed Supervision | MS-DETR:混合监督下的高效 DETR 训练 | Chuyang Zhao, Yifan Sun, Wenhao Wang, Qiang Chen, Errui Ding, Yi Yang, Jingdong Wang | http://arxiv.org/pdf/2401.03989v1 | null |
| 2024-01-08 | Multi-scale attention-based instance segmentation for measuring crystals with large size variation | 基于多尺度注意力的实例分割,用于测量尺寸变化较大的晶体 | Theresa Neubauer, Astrid Berg, Maria Wimmer, Dimitrios Lenis, David Major, Philip Matthias Winter, Gaia Romana De Paolis, Johannes Novotny, Daniel Lüftner, Katja Reinharter, et al. | http://arxiv.org/pdf/2401.03939v1 | null |
| 2024-01-08 | RoboFusion: Towards Robust Multi-Modal 3D Object Detection via SAM | RoboFusion:通过 SAM 实现稳健的多模态 3D 物体检测 | Ziying Song, Guoxing Zhang, Lin Liu, Lei Yang, Shaoqing Xu, Caiyan Jia, Feiyang Jia, Li Wang | http://arxiv.org/pdf/2401.03907v1 | null |
| 2024-01-08 | A New Dataset and a Distractor-Aware Architecture for Transparent Object Tracking | 用于透明对象跟踪的新数据集和干扰感知架构 | Alan Lukezic, Ziga Trojer, Jiri Matas, Matej Kristan | http://arxiv.org/pdf/2401.03872v1 | null |
| 2024-01-08 | UFO: Unidentified Foreground Object Detection in 3D Point Cloud | UFO:3D 点云中的不明前景物体检测 | Hyunjun Choi, Hawook Jeong, Jin Young Choi | http://arxiv.org/pdf/2401.03846v1 | null |
| 2024-01-08 | Fully Attentional Networks with Self-emerging Token Labeling | 具有自我出现的令牌标签的完全注意力网络 | Bingyin Zhao, Zhiding Yu, Shiyi Lan, Yutao Cheng, Anima Anandkumar, Yingjie Lao, Jose M. Alvarez | http://arxiv.org/pdf/2401.03844v1 | null |
| 2024-01-08 | WidthFormer: Toward Efficient Transformer-based BEV View Transformation | WidthFormer:实现基于 Transformer 的高效 BEV 视图转换 | Chenhongyi Yang, Tianwei Lin, Lichao Huang, Elliot J. Crowley | http://arxiv.org/pdf/2401.03836v1 | null |
| 2024-01-08 | A multimodal gesture recognition dataset for desktop human-computer interaction | 用于桌面人机交互的多模态手势识别数据集 | Qi Wang, Fengchao Zhu, Guangming Zhu, Liang Zhang, Ning Li, Eryang Gao | http://arxiv.org/pdf/2401.03828v1 | null |
| 2024-01-08 | Color-$S^{4}L$: Self-supervised Semi-supervised Learning with Image Colorization | Color-$S^{4}L$:具有图像着色的自监督半监督学习 | Hanxiao Chen | http://arxiv.org/pdf/2401.03753v1 | null |
| 2024-01-08 | Flying Bird Object Detection Algorithm in Surveillance Video | 监控视频中的飞鸟目标检测算法 | Ziwei Sun, Zexi Hua, Hengchao Li, Yan Li | http://arxiv.org/pdf/2401.03749v1 | null |
| 2024-01-08 | Flowmind2Digital: The First Comprehensive Flowmind Recognition and Conversion Approach | Flowmind2Digital:第一个全面的 Flowmind 识别和转换方法 | Huanyu Liu, Jianfeng Cai, Tingjia Zhang, Hongsheng Li, Siyuan Wang, Guangming Zhu, Syed Afaq Ali Shah, Mohammed Bennamoun, Liang Zhang | http://arxiv.org/pdf/2401.03742v1 | null |
| 2024-01-08 | A Large-scale Empirical Study on Improving the Fairness of Deep Learning Models | 提高深度学习模型公平性的大规模实证研究 | Junjie Yang, Jiajun Jiang, Zeyu Sun, Junjie Chen | http://arxiv.org/pdf/2401.03695v1 | link |
| 2024-01-08 | Primitive Geometry Segment Pre-training for 3D Medical Image Segmentation | 3D 医学图像分割的原始几何分割预训练 | Ryu Tadokoro, Ryosuke Yamada, Kodai Nakashima, Ryo Nakamura, Hirokatsu Kataoka | http://arxiv.org/pdf/2401.03665v1 | null |
| 2024-01-08 | Dual-Channel Reliable Breast Ultrasound Image Classification Based on Explainable Attribution and Uncertainty Quantification | 基于可解释归因和不确定性量化的双通道可靠乳腺超声图像分类 | Shuge Lei, Haonan Hu, Dasheng Sun, Huabin Zhang, Kehong Yuan, Jian Dai, Jijun Tang, Yan Tong | http://arxiv.org/pdf/2401.03664v1 | null |
| 2024-01-08 | Inverse-like Antagonistic Scene Text Spotting via Reading-Order Estimation and Dynamic Sampling | 通过阅读顺序估计和动态采样进行类逆对抗场景文本识别 | Shi-Xue Zhang, Chun Yang, Xiaobin Zhu, Hongyang Zhou, Hongfa Wang, Xu-Cheng Yin | http://arxiv.org/pdf/2401.03637v1 | null |

OCR

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-08 | D3PRefiner: A Diffusion-based Denoise Method for 3D Human Pose Refinement | D3PRefiner:用于 3D 人体姿势细化的基于扩散的降噪方法 | Danqi Yan, Qing Gao, Yuepeng Qian, Xinxing Chen, Chenglong Fu, Yuquan Leng | http://arxiv.org/pdf/2401.03914v1 | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-08 | AGG: Amortized Generative 3D Gaussians for Single Image to 3D | AGG:用于单图像到 3D 的摊销生成 3D 高斯 | Dejia Xu, Ye Yuan, Morteza Mardani, Sifei Liu, Jiaming Song, Zhangyang Wang, Arash Vahdat | http://arxiv.org/pdf/2401.04099v1 | null |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-08 | A Survey on 3D Gaussian Splatting | 3D 高斯泼溅综述 | Guikun Chen, Wenguan Wang | http://arxiv.org/pdf/2401.03890v1 | null |
| 2024-01-08 | NeRFmentation: NeRF-based Augmentation for Monocular Depth Estimation | NeRFmentation:基于 NeRF 的单目深度估计增强 | Casimir Feldmann, Niall Siegenheim, Nikolas Hars, Lovro Rabuzin, Mert Ertugrul, Luca Wolfart, Marc Pollefeys, Zuria Bauer, Martin R. Oswald | http://arxiv.org/pdf/2401.03771v1 | null |

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-08 | GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation | GPT-4V(ision) 是一款用于文本转 3D 生成的人性化评估器 | Tong Wu, Guandao Yang, Zhibing Li, Kai Zhang, Ziwei Liu, Leonidas Guibas, Dahua Lin, Gordon Wetzstein | http://arxiv.org/pdf/2401.04092v1 | null |
| 2024-01-08 | TIER: Text and Image Encoder-based Regression for AIGC Image Quality Assessment | TIER:用于 AIGC 图像质量评估的基于文本和图像编码器的回归 | Jiquan Yuan, Xinyan Cao, Jinming Che, Qinyuan Wang, Sen Liang, Wei Ren, Jinlong Lin, Xixin Cao | http://arxiv.org/pdf/2401.03854v1 | null |
| 2024-01-08 | 3D-SSGAN: Lifting 2D Semantics for 3D-Aware Compositional Portrait Synthesis | 3D-SSGAN:提升 2D 语义以实现 3D 感知构图合成 | Ruiqi Liu, Peng Zheng, Ye Wang, Rui Ma | http://arxiv.org/pdf/2401.03764v1 | null |
| 2024-01-08 | Deep Learning for Visual Neuroprosthesis | 视觉神经假体的深度学习 | Peter Beech, Shanshan Jia, Zhaofei Yu, Jian K. Liu | http://arxiv.org/pdf/2401.03639v1 | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-08 | Aligned with LLM: a new multi-modal training paradigm for encoding fMRI activity in visual cortex | 与 LLM 对齐:一种新的多模态训练范式,用于编码视觉皮层的功能磁共振成像活动 | Shuxiao Ma, Linyuan Wang, Senbao Hou, Bin Yan | http://arxiv.org/pdf/2401.03851v1 | null |
| 2024-01-08 | FM-AE: Frequency-masked Multimodal Autoencoder for Zinc Electrolysis Plate Contact Abnormality Detection | FM-AE:用于锌电解板接触异常检测的频率屏蔽多模态自动编码器 | Canzong Zhou, Can Zhou, Hongqiu Zhu, Tianhao Liu | http://arxiv.org/pdf/2401.03806v1 | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-08 | Attention-Guided Erasing: A Novel Augmentation Method for Enhancing Downstream Breast Density Classification | 注意力引导擦除:一种增强下游乳腺密度分类的新型增强方法 | Adarsh Bhandary Panambur, Hui Yu, Sheethal Bhat, Prathmesh Madhu, Siming Bayer, Andreas Maier | http://arxiv.org/pdf/2401.03912v1 | null |
| 2024-01-08 | STAIR: Spatial-Temporal Reasoning with Auditable Intermediate Results for Video Question Answering | STAIR:用于视频问答的具有可审核中间结果的时空推理 | Yueqian Wang, Yuxuan Wang, Kai Chen, Dongyan Zhao | http://arxiv.org/pdf/2401.03901v1 | null |
| 2024-01-08 | Gramformer: Learning Crowd Counting via Graph-Modulated Transformer | Gramformer:通过图形调制变压器学习人群计数 | Hui Lin, Zhiheng Ma, Xiaopeng Hong, Qinnan Shangguan, Deyu Meng | http://arxiv.org/pdf/2401.03870v1 | null |
| 2024-01-08 | Monitoring water contaminants in coastal areas through ML algorithms leveraging atmospherically corrected Sentinel-2 data | 利用经大气校正的 Sentinel-2 数据通过机器学习算法监测沿海地区的水污染物 | Francesca Razzano, Francesco Mauro, Pietro Di Stasio, Gabriele Meoni, Marco Esposito, Gilda Schirinzi, Silvia Liberata Ullo | http://arxiv.org/pdf/2401.03792v1 | null |
| 2024-01-08 | Identifying Important Group of Pixels using Interactions | 使用交互识别重要的像素组 | Kosuke Sumiyasu, Kazuhiko Kawamoto, Hiroshi Kera | http://arxiv.org/pdf/2401.03785v1 | null |
| 2024-01-08 | FMA-Net: Flow-Guided Dynamic Filtering and Iterative Feature Refinement with Multi-Attention for Joint Video Super-Resolution and Deblurring | FMA-Net:流引导动态过滤和迭代特征细化,用于联合视频超分辨率和去模糊 | Geunhyuk Youk, Jihyong Oh, Munchurl Kim | http://arxiv.org/pdf/2401.03707v1 | null |
| 2024-01-08 | GloTSFormer: Global Video Text Spotting Transformer | GloTSFormer:全球视频文本识别变压器 | Han Wang, Yanjie Wang, Yang Li, Can Huang | http://arxiv.org/pdf/2401.03694v1 | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-08 | Structure-focused Neurodegeneration Convolutional Neural Network for Modeling and Classification of Alzheimer's Disease | 用于阿尔茨海默氏病建模和分类的结构聚焦神经变性卷积神经网络 | Simisola Odimayo, Chollette C. Olisah, Khadija Mohammed | http://arxiv.org/pdf/2401.03922v1 | null |
| 2024-01-08 | InvariantOODG: Learning Invariant Features of Point Clouds for Out-of-Distribution Generalization | InvariantOODG:学习点云的不变特征以实现分布外泛化 | Zhimin Zhang, Xiang Gao, Wei Hu | http://arxiv.org/pdf/2401.03765v1 | null |
| 2024-01-08 | Sur2f: A Hybrid Representation for High-Quality and Efficient Surface Reconstruction from Multi-view Images | Sur2f:从多视图图像中实现高质量和高效表面重建的混合表示 | Zhangjin Huang, Zhihao Liang, Haojie Zhang, Yangkai Lin, Kui Jia | http://arxiv.org/pdf/2401.03704v1 | null |
| 2024-01-08 | DME-Driver: Integrating Human Decision Logic and 3D Scene Perception in Autonomous Driving | DME-Driver:在自动驾驶中集成人类决策逻辑和 3D 场景感知 | Wencheng Han, Dongqian Guo, Cheng-Zhong Xu, Jianbing Shen | http://arxiv.org/pdf/2401.03641v1 | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-08 | RudolfV: A Foundation Model by Pathologists for Pathologists | RudolfV:病理学家为病理学家提供的基础模型 | Jonas Dippel, Barbara Feulner, Tobias Winterhoff, Simon Schallenberg, Gabriel Dernbach, Andreas Kunft, Stephan Tietz, Philipp Jurmeister, David Horst, Lukas Ruff, et al. | http://arxiv.org/pdf/2401.04079v1 | null |
| 2024-01-08 | Fun with Flags: Robust Principal Directions via Flag Manifolds | 旗帜的乐趣:通过旗帜流形实现稳健的主要方向 | Nathan Mankovich, Gustau Camps-Valls, Tolga Birdal | http://arxiv.org/pdf/2401.04071v1 | null |
| 2024-01-08 | Behavioural Cloning in VizDoom | VizDoom 中的行为克隆 | Ryan Spick, Timothy Bradley, Ayush Raina, Pierluigi Vito Amadori, Guy Moss | http://arxiv.org/pdf/2401.03993v1 | null |
| 2024-01-08 | Limitations of Data-Driven Spectral Reconstruction -- An Optics-Aware Analysis | 数据驱动的光谱重建的局限性——光学感知分析 | Qiang Fu, Matheus Souza, Eunsue Choi, Suhyun Shin, Seung-Hwan Baek, Wolfgang Heidrich | http://arxiv.org/pdf/2401.03835v1 | null |
| 2024-01-08 | A foundation for exact binarized morphological neural networks | 精确二值化形态神经网络的基础 | Theodore Aouad, Hugues Talbot | http://arxiv.org/pdf/2401.03830v1 | null |
| 2024-01-08 | Gnuastro: visualizing the full dynamic range in color images | Gnuastro:可视化彩色图像的完整动态范围 | Raúl Infante-Sainz, Mohammad Akhlaghi | http://arxiv.org/pdf/2401.03814v1 | null |
| 2024-01-08 | MvKSR: Multi-view Knowledge-guided Scene Recovery for Hazy and Rainy Degradation | MvKSR:多视图知识引导的雾霾和雨天退化场景恢复 | Dong Yang, Wenyu Xu, Yuxu Lu, Yuan Gao, Jingming Zhang, Yu Guo | http://arxiv.org/pdf/2401.03800v1 | null |
| 2024-01-08 | Low-light Image Enhancement via CLIP-Fourier Guided Wavelet Diffusion | 通过 CLIP-傅立叶引导小波扩散实现低光图像增强 | Minglong Xue, Jinhong He, Yanyi He, Zhipu Liu, Wenhai Wang, Mingliang Zhou | http://arxiv.org/pdf/2401.03788v1 | null |
| 2024-01-08 | Machine Learning Applications in Traumatic Brain Injury Diagnosis and Prognosis: A Spotlight on Mild TBI and CT Imaging | 机器学习在创伤性脑损伤诊断和预后中的应用:聚焦轻度 TBI 和 CT 成像 | Hanem Ellethy, Shekhar S. Chandra, Viktor Vegh | http://arxiv.org/pdf/2401.03621v1 | null |