Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-08 | Unifying Graph Contrastive Learning via Graph Message Augmentation | 通过图消息增强统一图对比学习 | Ziyan Zhang, Bo Jiang, Jin Tang, Bin Luo | http://arxiv.org/pdf/2401.03638v1 | null |
2024-01-08 | Automated Detection of Myopic Maculopathy in MMAC 2023: Achievements in Classification, Segmentation, and Spherical Equivalent Prediction | MMAC 2023 中近视黄斑病变的自动检测:分类、分割和球面等效预测方面的成就 | Yihao Li, Philippe Zhang, Yubo Tan, Jing Zhang, Zhihan Wang, Weili Jiang, Pierre-Henri Conze, Mathieu Lamard, Gwenolé Quellec, Mostafa El Habib Daho | http://arxiv.org/pdf/2401.03615v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-08 | Dr$^2$Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning | Dr$^2$Net:用于内存高效微调的动态可逆双残差网络 | Chen Zhao, Shuming Liu, Karttikeya Mangalam, Guocheng Qian, Fatimah Zohra, Abdulmohsen Alghannam, Jitendra Malik, Bernard Ghanem | http://arxiv.org/pdf/2401.04105v1 | null |
2024-01-08 | Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification | 用于音视频分类的高效多尺度多模态瓶颈 Transformer | Wentao Zhu | http://arxiv.org/pdf/2401.04023v1 | null |
2024-01-08 | MS-DETR: Efficient DETR Training with Mixed Supervision | MS-DETR:混合监督下的高效 DETR 训练 | Chuyang Zhao, Yifan Sun, Wenhao Wang, Qiang Chen, Errui Ding, Yi Yang, Jingdong Wang | http://arxiv.org/pdf/2401.03989v1 | null |
2024-01-08 | Multi-scale attention-based instance segmentation for measuring crystals with large size variation | 基于多尺度注意力的实例分割,用于测量尺寸变化较大的晶体 | Theresa Neubauer, Astrid Berg, Maria Wimmer, Dimitrios Lenis, David Major, Philip Matthias Winter, Gaia Romana De Paolis, Johannes Novotny, Daniel Lüftner, Katja Reinharter, et al. | http://arxiv.org/pdf/2401.03939v1 | null |
2024-01-08 | RoboFusion: Towards Robust Multi-Modal 3D Object Detection via SAM | RoboFusion:通过 SAM 实现稳健的多模态 3D 物体检测 | Ziying Song, Guoxing Zhang, Lin Liu, Lei Yang, Shaoqing Xu, Caiyan Jia, Feiyang Jia, Li Wang | http://arxiv.org/pdf/2401.03907v1 | null |
2024-01-08 | A New Dataset and a Distractor-Aware Architecture for Transparent Object Tracking | 用于透明对象跟踪的新数据集和干扰感知架构 | Alan Lukezic, Ziga Trojer, Jiri Matas, Matej Kristan | http://arxiv.org/pdf/2401.03872v1 | null |
2024-01-08 | UFO: Unidentified Foreground Object Detection in 3D Point Cloud | UFO:3D 点云中的不明前景物体检测 | Hyunjun Choi, Hawook Jeong, Jin Young Choi | http://arxiv.org/pdf/2401.03846v1 | null |
2024-01-08 | Fully Attentional Networks with Self-emerging Token Labeling | 具有自我出现的令牌标签的完全注意力网络 | Bingyin Zhao, Zhiding Yu, Shiyi Lan, Yutao Cheng, Anima Anandkumar, Yingjie Lao, Jose M. Alvarez | http://arxiv.org/pdf/2401.03844v1 | null |
2024-01-08 | WidthFormer: Toward Efficient Transformer-based BEV View Transformation | WidthFormer:实现基于 Transformer 的高效 BEV 视图转换 | Chenhongyi Yang, Tianwei Lin, Lichao Huang, Elliot J. Crowley | http://arxiv.org/pdf/2401.03836v1 | null |
2024-01-08 | A multimodal gesture recognition dataset for desktop human-computer interaction | 用于桌面人机交互的多模态手势识别数据集 | Qi Wang, Fengchao Zhu, Guangming Zhu, Liang Zhang, Ning Li, Eryang Gao | http://arxiv.org/pdf/2401.03828v1 | null |
2024-01-08 | Color-$S^{4}L$: Self-supervised Semi-supervised Learning with Image Colorization | Color-$S^{4}L$:具有图像着色的自监督半监督学习 | Hanxiao Chen | http://arxiv.org/pdf/2401.03753v1 | null |
2024-01-08 | Flying Bird Object Detection Algorithm in Surveillance Video | 监控视频中的飞鸟目标检测算法 | Ziwei Sun, Zexi Hua, Hengchao Li, Yan Li | http://arxiv.org/pdf/2401.03749v1 | null |
2024-01-08 | Flowmind2Digital: The First Comprehensive Flowmind Recognition and Conversion Approach | Flowmind2Digital:第一个全面的 Flowmind 识别和转换方法 | Huanyu Liu, Jianfeng Cai, Tingjia Zhang, Hongsheng Li, Siyuan Wang, Guangming Zhu, Syed Afaq Ali Shah, Mohammed Bennamoun, Liang Zhang | http://arxiv.org/pdf/2401.03742v1 | null |
2024-01-08 | A Large-scale Empirical Study on Improving the Fairness of Deep Learning Models | 提高深度学习模型公平性的大规模实证研究 | Junjie Yang, Jiajun Jiang, Zeyu Sun, Junjie Chen | http://arxiv.org/pdf/2401.03695v1 | link |
2024-01-08 | Primitive Geometry Segment Pre-training for 3D Medical Image Segmentation | 3D 医学图像分割的原始几何分割预训练 | Ryu Tadokoro, Ryosuke Yamada, Kodai Nakashima, Ryo Nakamura, Hirokatsu Kataoka | http://arxiv.org/pdf/2401.03665v1 | null |
2024-01-08 | Dual-Channel Reliable Breast Ultrasound Image Classification Based on Explainable Attribution and Uncertainty Quantification | 基于可解释归因和不确定性量化的双通道可靠乳腺超声图像分类 | Shuge Lei, Haonan Hu, Dasheng Sun, Huabin Zhang, Kehong Yuan, Jian Dai, Jijun Tang, Yan Tong | http://arxiv.org/pdf/2401.03664v1 | null |
2024-01-08 | Inverse-like Antagonistic Scene Text Spotting via Reading-Order Estimation and Dynamic Sampling | 通过阅读顺序估计和动态采样进行类逆对抗场景文本识别 | Shi-Xue Zhang, Chun Yang, Xiaobin Zhu, Hongyang Zhou, Hongfa Wang, Xu-Cheng Yin | http://arxiv.org/pdf/2401.03637v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-08 | D3PRefiner: A Diffusion-based Denoise Method for 3D Human Pose Refinement | D3PRefiner:用于 3D 人体姿势细化的基于扩散的降噪方法 | Danqi Yan, Qing Gao, Yuepeng Qian, Xinxing Chen, Chenglong Fu, Yuquan Leng | http://arxiv.org/pdf/2401.03914v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-08 | AGG: Amortized Generative 3D Gaussians for Single Image to 3D | AGG:用于单图像到 3D 的摊销生成 3D 高斯 | Dejia Xu, Ye Yuan, Morteza Mardani, Sifei Liu, Jiaming Song, Zhangyang Wang, Arash Vahdat | http://arxiv.org/pdf/2401.04099v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-08 | A Survey on 3D Gaussian Splatting | 3D 高斯泼溅综述 | Guikun Chen, Wenguan Wang | http://arxiv.org/pdf/2401.03890v1 | null |
2024-01-08 | NeRFmentation: NeRF-based Augmentation for Monocular Depth Estimation | NeRFmentation:基于 NeRF 的单目深度估计增强 | Casimir Feldmann, Niall Siegenheim, Nikolas Hars, Lovro Rabuzin, Mert Ertugrul, Luca Wolfart, Marc Pollefeys, Zuria Bauer, Martin R. Oswald | http://arxiv.org/pdf/2401.03771v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-08 | GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation | GPT-4V(ision) 是一款用于文本转 3D 生成的人性化评估器 | Tong Wu, Guandao Yang, Zhibing Li, Kai Zhang, Ziwei Liu, Leonidas Guibas, Dahua Lin, Gordon Wetzstein | http://arxiv.org/pdf/2401.04092v1 | null |
2024-01-08 | TIER: Text and Image Encoder-based Regression for AIGC Image Quality Assessment | TIER:用于 AIGC 图像质量评估的基于文本和图像编码器的回归 | Jiquan Yuan, Xinyan Cao, Jinming Che, Qinyuan Wang, Sen Liang, Wei Ren, Jinlong Lin, Xixin Cao | http://arxiv.org/pdf/2401.03854v1 | null |
2024-01-08 | 3D-SSGAN: Lifting 2D Semantics for 3D-Aware Compositional Portrait Synthesis | 3D-SSGAN:提升 2D 语义以实现 3D 感知构图合成 | Ruiqi Liu, Peng Zheng, Ye Wang, Rui Ma | http://arxiv.org/pdf/2401.03764v1 | null |
2024-01-08 | Deep Learning for Visual Neuroprosthesis | 视觉神经假体的深度学习 | Peter Beech, Shanshan Jia, Zhaofei Yu, Jian K. Liu | http://arxiv.org/pdf/2401.03639v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-08 | Aligned with LLM: a new multi-modal training paradigm for encoding fMRI activity in visual cortex | 与 LLM 对齐:一种新的多模态训练范式,用于编码视觉皮层的功能磁共振成像活动 | Shuxiao Ma, Linyuan Wang, Senbao Hou, Bin Yan | http://arxiv.org/pdf/2401.03851v1 | null |
2024-01-08 | FM-AE: Frequency-masked Multimodal Autoencoder for Zinc Electrolysis Plate Contact Abnormality Detection | FM-AE:用于锌电解板接触异常检测的频率屏蔽多模态自动编码器 | Canzong Zhou, Can Zhou, Hongqiu Zhu, Tianhao Liu | http://arxiv.org/pdf/2401.03806v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-08 | Attention-Guided Erasing: A Novel Augmentation Method for Enhancing Downstream Breast Density Classification | 注意力引导擦除:一种增强下游乳腺密度分类的新型增强方法 | Adarsh Bhandary Panambur, Hui Yu, Sheethal Bhat, Prathmesh Madhu, Siming Bayer, Andreas Maier | http://arxiv.org/pdf/2401.03912v1 | null |
2024-01-08 | STAIR: Spatial-Temporal Reasoning with Auditable Intermediate Results for Video Question Answering | STAIR:用于视频问答的具有可审核中间结果的时空推理 | Yueqian Wang, Yuxuan Wang, Kai Chen, Dongyan Zhao | http://arxiv.org/pdf/2401.03901v1 | null |
2024-01-08 | Gramformer: Learning Crowd Counting via Graph-Modulated Transformer | Gramformer:通过图调制 Transformer 学习人群计数 | Hui Lin, Zhiheng Ma, Xiaopeng Hong, Qinnan Shangguan, Deyu Meng | http://arxiv.org/pdf/2401.03870v1 | null |
2024-01-08 | Monitoring water contaminants in coastal areas through ML algorithms leveraging atmospherically corrected Sentinel-2 data | 利用经大气校正的 Sentinel-2 数据通过机器学习算法监测沿海地区的水污染物 | Francesca Razzano, Francesco Mauro, Pietro Di Stasio, Gabriele Meoni, Marco Esposito, Gilda Schirinzi, Silvia Liberata Ullo | http://arxiv.org/pdf/2401.03792v1 | null |
2024-01-08 | Identifying Important Group of Pixels using Interactions | 使用交互识别重要的像素组 | Kosuke Sumiyasu, Kazuhiko Kawamoto, Hiroshi Kera | http://arxiv.org/pdf/2401.03785v1 | null |
2024-01-08 | FMA-Net: Flow-Guided Dynamic Filtering and Iterative Feature Refinement with Multi-Attention for Joint Video Super-Resolution and Deblurring | FMA-Net:流引导动态过滤和迭代特征细化,用于联合视频超分辨率和去模糊 | Geunhyuk Youk, Jihyong Oh, Munchurl Kim | http://arxiv.org/pdf/2401.03707v1 | null |
2024-01-08 | GloTSFormer: Global Video Text Spotting Transformer | GloTSFormer:全局视频文本识别 Transformer | Han Wang, Yanjie Wang, Yang Li, Can Huang | http://arxiv.org/pdf/2401.03694v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-08 | Structure-focused Neurodegeneration Convolutional Neural Network for Modeling and Classification of Alzheimer's Disease | 用于阿尔茨海默氏病建模和分类的结构聚焦神经变性卷积神经网络 | Simisola Odimayo, Chollette C. Olisah, Khadija Mohammed | http://arxiv.org/pdf/2401.03922v1 | null |
2024-01-08 | InvariantOODG: Learning Invariant Features of Point Clouds for Out-of-Distribution Generalization | InvariantOODG:学习点云的不变特征以实现分布外泛化 | Zhimin Zhang, Xiang Gao, Wei Hu | http://arxiv.org/pdf/2401.03765v1 | null |
2024-01-08 | Sur2f: A Hybrid Representation for High-Quality and Efficient Surface Reconstruction from Multi-view Images | Sur2f:从多视图图像中实现高质量和高效表面重建的混合表示 | Zhangjin Huang, Zhihao Liang, Haojie Zhang, Yangkai Lin, Kui Jia | http://arxiv.org/pdf/2401.03704v1 | null |
2024-01-08 | DME-Driver: Integrating Human Decision Logic and 3D Scene Perception in Autonomous Driving | DME-Driver:在自动驾驶中集成人类决策逻辑和 3D 场景感知 | Wencheng Han, Dongqian Guo, Cheng-Zhong Xu, Jianbing Shen | http://arxiv.org/pdf/2401.03641v1 | null |

Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-01-08 | RudolfV: A Foundation Model by Pathologists for Pathologists | RudolfV:病理学家为病理学家提供的基础模型 | Jonas Dippel, Barbara Feulner, Tobias Winterhoff, Simon Schallenberg, Gabriel Dernbach, Andreas Kunft, Stephan Tietz, Philipp Jurmeister, David Horst, Lukas Ruff, et al. | http://arxiv.org/pdf/2401.04079v1 | null |
2024-01-08 | Fun with Flags: Robust Principal Directions via Flag Manifolds | 旗帜的乐趣:通过旗帜流形实现稳健的主要方向 | Nathan Mankovich, Gustau Camps-Valls, Tolga Birdal | http://arxiv.org/pdf/2401.04071v1 | null |
2024-01-08 | Behavioural Cloning in VizDoom | VizDoom 中的行为克隆 | Ryan Spick, Timothy Bradley, Ayush Raina, Pierluigi Vito Amadori, Guy Moss | http://arxiv.org/pdf/2401.03993v1 | null |
2024-01-08 | Limitations of Data-Driven Spectral Reconstruction -- An Optics-Aware Analysis | 数据驱动的光谱重建的局限性——光学感知分析 | Qiang Fu, Matheus Souza, Eunsue Choi, Suhyun Shin, Seung-Hwan Baek, Wolfgang Heidrich | http://arxiv.org/pdf/2401.03835v1 | null |
2024-01-08 | A foundation for exact binarized morphological neural networks | 精确二值化形态神经网络的基础 | Theodore Aouad, Hugues Talbot | http://arxiv.org/pdf/2401.03830v1 | null |
2024-01-08 | Gnuastro: visualizing the full dynamic range in color images | Gnuastro:可视化彩色图像的完整动态范围 | Raúl Infante-Sainz, Mohammad Akhlaghi | http://arxiv.org/pdf/2401.03814v1 | null |
2024-01-08 | MvKSR: Multi-view Knowledge-guided Scene Recovery for Hazy and Rainy Degradation | MvKSR:多视图知识引导的雾霾和雨天退化场景恢复 | Dong Yang, Wenyu Xu, Yuxu Lu, Yuan Gao, Jingming Zhang, Yu Guo | http://arxiv.org/pdf/2401.03800v1 | null |
2024-01-08 | Low-light Image Enhancement via CLIP-Fourier Guided Wavelet Diffusion | 通过 CLIP-傅立叶引导小波扩散实现低光图像增强 | Minglong Xue, Jinhong He, Yanyi He, Zhipu Liu, Wenhai Wang, Mingliang Zhou | http://arxiv.org/pdf/2401.03788v1 | null |
2024-01-08 | Machine Learning Applications in Traumatic Brain Injury Diagnosis and Prognosis: A Spotlight on Mild TBI and CT Imaging | 机器学习在创伤性脑损伤诊断和预后中的应用:聚焦轻度 TBI 和 CT 成像 | Hanem Ellethy, Shekhar S. Chandra, Viktor Vegh | http://arxiv.org/pdf/2401.03621v1 | null |