[UPDATED!] 2024-09-28 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-09-28 | Introducing SDICE: An Index for Assessing Diversity of Synthetic Medical Datasets | SDICE:评估合成医疗数据集多样性的指标研究 | Mohammed Talha Alam, Raza Imam, Mohammad Areeb Qazi, Asim Ukaye, Karthik Nandakumar | http://arxiv.org/pdf/2409.19436v1 | null |
| 2024-09-28 | Efficient Semantic Diffusion Architectures for Model Training on Synthetic Echocardiograms | 高效语义扩散架构在合成超声心动图模型训练中的应用 | David Stojanovski, Mariana da Silva, Pablo Lamata, Arian Beqiri, Alberto Gomez | http://arxiv.org/pdf/2409.19371v1 | null |
| 2024-09-28 | Conditional Image Synthesis with Diffusion Models: A Survey | 条件图像合成中的扩散模型:综述 | Zheyuan Zhan, Defang Chen, Jian-Ping Mei, Zhenghe Zhao, Jiawei Chen, Chun Chen, Siwei Lyu, Can Wang | http://arxiv.org/pdf/2409.19365v1 | null |
| 2024-09-28 | CausalVE: Face Video Privacy Encryption via Causal Video Prediction | CausalVE:基于因果视频预测的人脸视频隐私加密方法 | Yubo Huang, Wenhao Feng, Xin Lai, Zixi Wang, Jingzehua Xu, Shuai Zhang, Hongjie He, Fan Chen | http://arxiv.org/pdf/2409.19306v1 | null |
| 2024-09-28 | FINE: Factorizing Knowledge for Initialization of Variable-sized Diffusion Models | FINE:分解知识以初始化可变尺寸扩散模型 | Yucheng Xie, Fu Feng, Ruixiao Shi, Jing Wang, Xin Geng | http://arxiv.org/pdf/2409.19289v1 | null |
| 2024-09-28 | WcDT: World-centric Diffusion Transformer for Traffic Scene Generation | WcDT:面向交通场景生成的以世界为中心的扩散Transformer模型 | Chen Yang, Yangfan He, Aaron Xuxiang Tian, Dong Chen, Tianyu Shi, Arsalan Heydarian | http://arxiv.org/pdf/2404.02082v2 | link |
| 2024-09-28 | UKnow: A Unified Knowledge Protocol with Multimodal Knowledge Graph Datasets for Reasoning and Vision-Language Pre-Training | UKnow:面向推理与视觉-语言预训练的统一知识协议多模态知识图谱数据集 | Biao Gong, Shuai Tan, Yutong Feng, Xiaoying Xie, Yuyuan Li, Chaochao Chen, Kecheng Zheng, Yujun Shen, Deli Zhao | http://arxiv.org/pdf/2302.06891v4 | null |
| 2024-09-28 | Three-stage binarization of color document images based on discrete wavelet transform and generative adversarial networks | 基于离散小波变换和生成对抗网络的颜色文档图像三阶段二值化方法 | Rui-Yang Ju, Yu-Shian Lin, Yanlin Jin, Chih-Chia Chen, Chun-Tse Chien, Jen-Shiun Chiang | http://arxiv.org/pdf/2211.16098v8 | link |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-09-28 | MedCLIP-SAMv2: Towards Universal Text-Driven Medical Image Segmentation | MedCLIP-SAMv2:迈向通用文本驱动的医学图像分割 | Taha Koleilat, Hojat Asgariandehkordi, Hassan Rivaz, Yiming Xiao | http://arxiv.org/pdf/2409.19483v1 | null |
| 2024-09-28 | FairPIVARA: Reducing and Assessing Biases in CLIP-Based Multimodal Models | 公平PIVARA:降低和评估基于CLIP的多模态模型中的偏见 | Diego A. B. Moreira, Alef Iury Ferreira, Gabriel Oliveira dos Santos, Luiz Pereira, João Medrado Gondim, Gustavo Bonil, Helena Maia, Nádia da Silva, Simone Tiemi Hashiguti, Jefersson A. dos Santos, et al. | http://arxiv.org/pdf/2409.19474v1 | null |
| 2024-09-28 | Contrastive ground-level image and remote sensing pre-training improves representation learning for natural world imagery | 对比地物图像与遥感预训练提升自然世界图像表征学习 | Andy V. Huynh, Lauren E. Gillespie, Jael Lopez-Saucedo, Claire Tang, Rohan Sikand, Moisés Expósito-Alonso | http://arxiv.org/pdf/2409.19439v1 | null |
| 2024-09-28 | From Unimodal to Multimodal: Scaling up Projectors to Align Modalities | 从单模态到多模态:扩展投影器以对齐模态 | Mayug Maniparambil, Raiymbek Akshulakov, Yasser Abdelaziz Dahou Djilali, Sanath Narayan, Ankit Singh, Noel E. O'Connor | http://arxiv.org/pdf/2409.19425v1 | null |
| 2024-09-28 | Multi-sensor Learning Enables Information Transfer across Different Sensory Data and Augments Multi-modality Imaging | 多传感器学习实现不同感官数据间的信息传递并增强多模态成像能力 | Lingting Zhu, Yizheng Chen, Lianli Liu, Lei Xing, Lequan Yu | http://arxiv.org/pdf/2409.19420v1 | null |
| 2024-09-28 | X-Prompt: Multi-modal Visual Prompt for Video Object Segmentation | X-Prompt:面向视频目标分割的多模态视觉提示方法 | Pinxue Guo, Wanyun Li, Hao Huang, Lingyi Hong, Xinyu Zhou, Zhaoyu Chen, Jinglun Li, Kaixun Jiang, Wei Zhang, Wenqiang Zhang | http://arxiv.org/pdf/2409.19342v1 | null |
| 2024-09-28 | Visual Question Decomposition on Multimodal Large Language Models | 视觉问题分解在多模态大型语言模型上的研究 | Haowei Zhang, Jianzhe Liu, Zhen Han, Shuo Chen, Bailan He, Volker Tresp, Zhiqiang Xu, Jindong Gu | http://arxiv.org/pdf/2409.19339v1 | null |
| 2024-09-28 | 3D-CT-GPT: Generating 3D Radiology Reports through Integration of Large Vision-Language Models | 3D-CT-GPT:通过集成大型视觉-语言模型生成三维放射学报告 | Hao Chen, Wei Zhao, Yingli Li, Tianyang Zhong, Yisong Wang, Youlan Shang, Lei Guo, Junwei Han, Tianming Liu, Jun Liu, et al. | http://arxiv.org/pdf/2409.19330v1 | null |
| 2024-09-28 | CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling | CLIP-MoE:构建具有多样化多重回收的CLIP混合专家模型 | Jihai Zhang, Xiaoye Qu, Tong Zhu, Yu Cheng | http://arxiv.org/pdf/2409.19291v1 | null |
| 2024-09-28 | TrojVLM: Backdoor Attack Against Vision Language Models | TrojVLM:针对视觉语言模型的后门攻击 | Weimin Lyu, Lu Pang, Tengfei Ma, Haibin Ling, Chao Chen | http://arxiv.org/pdf/2409.19232v1 | null |
| 2024-09-28 | Multimodal-Enhanced Objectness Learner for Corner Case Detection in Autonomous Driving | 多模态增强的目标性学习器在自动驾驶角案例检测中的应用 | Lixing Xiao, Ruixiao Shi, Xiaoyang Tang, Yi Zhou | http://arxiv.org/pdf/2402.02026v2 | link |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-09-28 | G3R: Gradient Guided Generalizable Reconstruction | G3R:梯度引导的可泛化重建方法 | Yun Chen, Jingkang Wang, Ze Yang, Sivabalan Manivasagam, Raquel Urtasun | http://arxiv.org/pdf/2409.19405v1 | null |
| 2024-09-28 | GeoTransfer: Generalizable Few-Shot Multi-View Reconstruction via Transfer Learning | GeoTransfer:基于迁移学习的可泛化少样本多视角重建方法 | Shubhendu Jena, Franck Multon, Adnane Boukhayma | http://arxiv.org/pdf/2408.14724v2 | null |

3DGS

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-09-28 | GS-EVT: Cross-Modal Event Camera Tracking based on Gaussian Splatting | GS-EVT:基于高斯展开的跨模态事件相机追踪算法 | Tao Liu, Runze Yuan, Yi'ang Ju, Xun Xu, Jiaqi Yang, Xiangting Meng, Xavier Lagorce, Laurent Kneip | http://arxiv.org/pdf/2409.19228v1 | null |
| 2024-09-28 | 1st Place Solution to the 8th HANDS Workshop Challenge -- ARCTIC Track: 3DGS-based Bimanual Category-agnostic Interaction Reconstruction | 第八届HANDS研讨会挑战赛ARCTIC赛道一等奖解决方案:基于3DGS的双手类别无关交互重建 | Jeongwan On, Kyeonghwan Gwak, Gunyoung Kang, Hyein Hwang, Soohyun Hwang, Junuk Cha, Jaewook Han, Seungryul Baek | http://arxiv.org/pdf/2409.19215v1 | null |
| 2024-09-28 | SplatSim: Zero-Shot Sim2Real Transfer of RGB Manipulation Policies Using Gaussian Splatting | SplatSim:基于高斯扩散的RGB操作策略零样本Sim2Real迁移 | Mohammad Nomaan Qureshi, Sparsh Garg, Francisco Yandun, David Held, George Kantor, Abhisesh Silwal | http://arxiv.org/pdf/2409.10161v2 | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-09-28 | Mind the Gap: Promoting Missing Modality Brain Tumor Segmentation with Alignment | 填补空白:利用对齐促进缺失模态脑肿瘤分割 | Tianyi Liu, Zhaorui Tan, Haochuan Jiang, Xi Yang, Kaizhu Huang | http://arxiv.org/pdf/2409.19366v1 | null |
| 2024-09-28 | MOC-RVQ: Multilevel Codebook-Assisted Digital Generative Semantic Communication | MOC-RVQ:多级码本辅助的数字生成语义通信 | Yingbin Zhou, Yaping Sun, Guanying Chen, Xiaodong Xu, Hao Chen, Binhong Huang, Shuguang Cui, Ping Zhang | http://arxiv.org/pdf/2401.01272v2 | link |
| 2024-09-28 | Adaptive Depth Networks with Skippable Sub-Paths | 自适应深度网络与可跳过子路径 | Woochul Kang, Hyungseop Lee | http://arxiv.org/pdf/2312.16392v3 | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-09-28 | Accelerating Malware Classification: A Vision Transformer Solution | 加速恶意软件分类:一种视觉变换器解决方案 | Shrey Bavishi, Shrey Modi | http://arxiv.org/pdf/2409.19461v1 | null |
| 2024-09-28 | On the universality of neural encodings in CNNs | 神经编码在卷积神经网络中的普遍性研究 | Florentin Guth, Brice Ménard | http://arxiv.org/pdf/2409.19460v1 | null |
| 2024-09-28 | See Where You Read with Eye Gaze Tracking and Large Language Model | 基于眼动追踪与大型语言模型的可视化阅读位置识别研究 | Sikai Yang, Gang Yan | http://arxiv.org/pdf/2409.19454v1 | null |
| 2024-09-28 | Canonical Correlation Guided Deep Neural Network | 规范相关引导的深度神经网络 | Zhiwen Chen, Siwen Mo, Haobin Ke, Steven X. Ding, Zhaohui Jiang, Chunhua Yang, Weihua Gui | http://arxiv.org/pdf/2409.19396v1 | null |
| 2024-09-28 | DOTA: Distributional Test-Time Adaptation of Vision-Language Models | DOTA:视觉语言模型的分布式测试时自适应调整 | Zongbo Han, Jialong Yang, Junfan Li, Qinghua Hu, Qianli Xu, Mike Zheng Shou, Changqing Zhang | http://arxiv.org/pdf/2409.19375v1 | null |
| 2024-09-28 | MambaEviScrib: Mamba and Evidence-Guided Consistency Make CNN Work Robustly for Scribble-Based Weakly Supervised Ultrasound Image Segmentation | 曼巴EviScrib:基于曼巴与证据引导一致性的CNN在基于涂鸦弱监督超声图像分割中的稳健工作 | Xiaoxiang Han, Xinyu Li, Jiang Shang, Yiman Liu, Keyan Chen, Qiaohong Liu, Qi Zhang | http://arxiv.org/pdf/2409.19370v1 | null |
| 2024-09-28 | Sparse Modelling for Feature Learning in High Dimensional Data | 高维数据特征学习的稀疏建模方法 | Harish Neelam, Koushik Sai Veerella, Souradip Biswas | http://arxiv.org/pdf/2409.19361v1 | null |
| 2024-09-28 | Toward Deep Learning-based Segmentation and Quantitative Analysis of Cervical Spinal Cord Magnetic Resonance Images | 基于深度学习的颈椎脊髓磁共振图像分割与定量分析研究 | Maryam Tavakol Elahi | http://arxiv.org/pdf/2409.19354v1 | null |
| 2024-09-28 | VLAD-BuFF: Burst-aware Fast Feature Aggregation for Visual Place Recognition | VLAD-BuFF:面向视觉地点识别的突发感知快速特征聚合方法 | Ahmad Khaliq, Ming Xu, Stephen Hausler, Michael Milford, Sourav Garg | http://arxiv.org/pdf/2409.19293v1 | link |
| 2024-09-28 | Beyond Euclidean: Dual-Space Representation Learning for Weakly Supervised Video Violence Detection | 超越欧几里得:双空间表示学习在弱监督视频暴力检测中的应用 | Jiaxu Leng, Zhanjie Wu, Mingpi Tan, Yiran Liu, Ji Gan, Haosheng Chen, Xinbo Gao | http://arxiv.org/pdf/2409.19252v1 | null |
| 2024-09-28 | Cauchy activation function and XNet | 柯西激活函数与XNet研究 | Xin Li, Zhihong Xia, Hongkun Zhang | http://arxiv.org/pdf/2409.19221v1 | null |
| 2024-09-28 | Learning to Obstruct Few-Shot Image Classification over Restricted Classes | 学习在受限类别上阻碍少样本图像分类 | Amber Yijia Zheng, Chiao-An Yang, Raymond A. Yeh | http://arxiv.org/pdf/2409.19210v1 | null |
| 2024-09-28 | TextGaze: Gaze-Controllable Face Generation with Natural Language | 文本注视:基于自然语言的注视可控人脸生成技术 | Hengfei Wang, Zhongqun Zhang, Yihua Cheng, Hyung Jin Chang | http://arxiv.org/pdf/2404.17486v3 | null |
| 2024-09-28 | Machine Vision-Based Assessment of Fall Color Changes and its Relationship with Leaf Nitrogen Concentration | 基于机器视觉的秋季叶色变化评估及其与叶片氮浓度关系研究 | Achyut Paudel, Jostan Brown, Priyanka Upadhyaya, Atif Bilal Asad, Safal Kshetri, Joseph R. Davidson, Cindy Grimm, Ashley Thompson, Bernardita Sallato, Matthew D. Whiting, et al. | http://arxiv.org/pdf/2404.14653v2 | null |
| 2024-09-28 | AnyPattern: Towards In-context Image Copy Detection | AnyPattern:迈向上下文内图像复制检测 | Wenhao Wang, Yifan Sun, Zhentao Tan, Yi Yang | http://arxiv.org/pdf/2404.13788v3 | link |
| 2024-09-28 | RPMArt: Towards Robust Perception and Manipulation for Articulated Objects | RPMArt:面向关节物体的稳健感知与操作研究 | Junbo Wang, Wenhai Liu, Qiaojun Yu, Yang You, Liu Liu, Weiming Wang, Cewu Lu | http://arxiv.org/pdf/2403.16023v2 | link |
| 2024-09-28 | ProMISe: Promptable Medical Image Segmentation using SAM | ProMISe:基于SAM的可提示医疗图像分割方法 | Jinfeng Wang, Sifan Song, Xinkun Wang, Yiyi Wang, Yiyi Miao, Jionglong Su, S. Kevin Zhou | http://arxiv.org/pdf/2403.04164v3 | link |
| 2024-09-28 | YOLOv8-AM: YOLOv8 Based on Effective Attention Mechanisms for Pediatric Wrist Fracture Detection | 基于有效注意力机制的YOLOv8-AM在小儿腕部骨折检测中的应用 | Chun-Tse Chien, Rui-Yang Ju, Kuang-Yi Chou, Enkaer Xieerke, Jen-Shiun Chiang | http://arxiv.org/pdf/2402.09329v5 | link |
| 2024-09-28 | Improving Image Coding for Machines through Optimizing Encoder via Auxiliary Loss | 优化辅助损失函数以提升机器图像编码器的性能 | Kei Iino, Shunsuke Akamatsu, Hiroshi Watanabe, Shohei Enomoto, Akira Sakamoto, Takeharu Eda | http://arxiv.org/pdf/2402.08267v2 | null |
| 2024-09-28 | G2D: From Global to Dense Radiography Representation Learning via Vision-Language Pre-training | G2D:基于视觉-语言预训练的从全局到密集X射线表征学习 | Che Liu, Cheng Ouyang, Sibo Cheng, Anand Shah, Wenjia Bai, Rossella Arcucci | http://arxiv.org/pdf/2312.01522v2 | null |
| 2024-09-28 | Exploring the Coordination of Frequency and Attention in Masked Image Modeling | 探索遮罩图像建模中频率与注意力的协调机制 | Jie Gui, Tuo Chen, Minjing Dong, Zhengqi Liu, Hao Luo, James Tin-Yau Kwok, Yuan Yan Tang | http://arxiv.org/pdf/2211.15362v3 | link |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-09-28 | Towards Croppable Implicit Neural Representations | 面向可裁剪的隐式神经表示方法 | Maor Ashkenazi, Eran Treister | http://arxiv.org/pdf/2409.19472v1 | link |
| 2024-09-28 | Restore Anything with Masks: Leveraging Mask Image Modeling for Blind All-in-One Image Restoration | 带掩膜的任意物体恢复:利用掩膜图像建模实现盲全合一图像修复 | Chu-Jie Qin, Rui-Qi Wu, Zikun Liu, Xin Lin, Chun-Le Guo, Hyun Hee Park, Chongyi Li | http://arxiv.org/pdf/2409.19403v1 | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-09-28 | DENEB: A Hallucination-Robust Automatic Evaluation Metric for Image Captioning | DENEB:一种对图像字幕生成具有幻觉鲁棒性的自动评价指标 | Kazuki Matsuda, Yuiga Wada, Komei Sugiura | http://arxiv.org/pdf/2409.19255v1 | null |
| 2024-09-28 | Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers | Chat-Scene: 利用对象标识符桥接三维场景与大型语言模型 | Haifeng Huang, Yilun Chen, Zehan Wang, Rongjie Huang, Runsen Xu, Tai Wang, Luping Liu, Xize Cheng, Yang Zhao, Jiangmiao Pang, et al. | http://arxiv.org/pdf/2312.08168v4 | link |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-09-28 | Fast Encoding and Decoding for Implicit Video Representation | 快速编码与解码隐式视频表示 | Hao Chen, Saining Xie, Ser-Nam Lim, Abhinav Shrivastava | http://arxiv.org/pdf/2409.19429v1 | null |
| 2024-09-28 | Steering Prediction via a Multi-Sensor System for Autonomous Racing | 多传感器系统在自动驾驶赛车中的转向预测研究 | Zhuyun Zhou, Zongwei Wu, Florian Bolli, Rémi Boutteau, Fan Yang, Radu Timofte, Dominique Ginhac, Tobi Delbruck | http://arxiv.org/pdf/2409.19356v1 | null |
| 2024-09-28 | Unveil Benign Overfitting for Transformer in Vision: Training Dynamics, Convergence, and Generalization | 揭示视觉Transformer中的良性过拟合:训练动态、收敛性与泛化能力 | Jiarui Jiang, Wei Huang, Miao Zhang, Taiji Suzuki, Liqiang Nie | http://arxiv.org/pdf/2409.19345v1 | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-09-28 | Solution of Multiview Egocentric Hand Tracking Challenge ECCV2024 | 多视角自我中心手部跟踪挑战的解决方案 ECCV2024 | Minqiang Zou, Zhi Lv, Riqiang Jin, Tian Zhan, Mochen Yu, Yao Tang, Jiajun Liang | http://arxiv.org/pdf/2409.19362v1 | null |
| 2024-09-28 | Scalable Cloud-Native Pipeline for Efficient 3D Model Reconstruction from Monocular Smartphone Images | 可扩展云原生管道:高效从单目智能手机图像重建三维模型 | Potito Aghilar, Vito Walter Anelli, Michelantonio Trizio, Tommaso Di Noia | http://arxiv.org/pdf/2409.19322v1 | null |
| 2024-09-28 | PDCFNet: Enhancing Underwater Images through Pixel Difference Convolution | PDCFNet:通过像素差分卷积增强水下图像 | Song Zhang, Daoliang Li, Ran Zhao | http://arxiv.org/pdf/2409.19269v1 | link |
| 2024-09-28 | ReLoo: Reconstructing Humans Dressed in Loose Garments from Monocular Video in the Wild | ReLoo:从野外单目视频中重建穿着宽松衣物的人类形态 | Chen Guo, Tianjian Jiang, Manuel Kaufmann, Chengwei Zheng, Julien Valentin, Jie Song, Otmar Hilliges | http://arxiv.org/pdf/2409.15269v2 | null |
| 2024-09-28 | CT-AGRG: Automated Abnormality-Guided Report Generation from 3D Chest CT Volumes | CT-AGRG:基于三维胸部CT体积的自动化异常引导报告生成 | Theo Di Piazza | http://arxiv.org/pdf/2408.11965v3 | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-09-28 | MicroSSIM: Improved Structural Similarity for Comparing Microscopy Data | 微SSIM:用于比较显微数据改进的结构相似性度量 | Ashesh Ashesh, Joran Deschamps, Florian Jug | http://arxiv.org/pdf/2408.08747v2 | link |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-09-28 | Language-guided Robust Navigation for Mobile Robots in Dynamically-changing Environments | 动态环境下语言引导的移动机器人鲁棒导航方法 | Cody Simons, Zhichao Liu, Brandon Marcus, Amit K. Roy-Chowdhury, Konstantinos Karydis | http://arxiv.org/pdf/2409.19459v1 | null |
| 2024-09-28 | Brain-JEPA: Brain Dynamics Foundation Model with Gradient Positioning and Spatiotemporal Masking | 脑-JEPA:基于梯度定位与时空掩膜的脑动力学基础模型 | Zijian Dong, Ruilin Li, Yilei Wu, Thuan Tinh Nguyen, Joanna Su Xian Chong, Fang Ji, Nathanael Ren Jie Tong, Christopher Li Hsian Chen, Juan Helen Zhou | http://arxiv.org/pdf/2409.19407v1 | null |
| 2024-09-28 | Projected Tensor-Tensor Products for Efficient Computation of Optimal Multiway Data Representations | 投影张量-张量积:高效计算最优多向数据表示的算法 | Katherine Keegan, Elizabeth Newman | http://arxiv.org/pdf/2409.19402v1 | null |
| 2024-09-28 | EEPNet: Efficient Edge Pixel-based Matching Network for Cross-Modal Dynamic Registration between LiDAR and Camera | 高效边缘像素匹配网络EEPNet:用于LiDAR与摄像头跨模态动态配准 | Yuanchao Yue, Hui Yuan, Suai Li, Qi Jiang | http://arxiv.org/pdf/2409.19305v1 | null |
| 2024-09-28 | Summit Vitals: Multi-Camera and Multi-Signal Biosensing at High Altitudes | 高峰生命体征:高海拔环境下多摄像头与多信号生物传感技术研究 | Ke Liu, Jiankai Tang, Zhang Jiang, Yuntao Wang, Xiaojing Liu, Dong Li, Yuanchun Shi | http://arxiv.org/pdf/2409.19223v1 | null |
| 2024-09-28 | Extending Depth of Field for Varifocal Multiview Images | 扩展变焦多视角图像的景深范围 | Zhilong Li, Kejun Wu, Qiong Liu, You Yang | http://arxiv.org/pdf/2409.19220v1 | null |
| 2024-09-28 | What Makes for Good Image Captions? | 良好的图像标题应具备哪些特点? | Delong Chen, Samuel Cahyawijaya, Etsuko Ishii, Ho Shu Chan, Yejin Bang, Pascale Fung | http://arxiv.org/pdf/2405.00485v2 | null |
| 2024-09-28 | Energy-Based Concept Bottleneck Models: Unifying Prediction, Concept Intervention, and Probabilistic Interpretations | 基于能量的概念瓶颈模型:统一预测、概念干预与概率解释 | Xinyue Xu, Yi Qin, Lu Mi, Hao Wang, Xiaomeng Li | http://arxiv.org/pdf/2401.14142v3 | link |
| 2024-09-28 | IMMA: Immunizing text-to-image Models against Malicious Adaptation | IMMA:对文本到图像模型的恶意适应性免疫保护研究 | Amber Yijia Zheng, Raymond A. Yeh | http://arxiv.org/pdf/2311.18815v3 | link |