GitHub - SKDDJ/cv-arxiv-daily: 🎓 Automatically Update CV Papers Daily using GitHub Actions (Updates Every 12 Hours)

Updated on 2024.12.21

Table of Contents

PEFT
Text-to-Image Generation
Vision-Language Models
Generative Weight Space Modeling
Data Distillation
Schrodinger Bridge
Dataset Distillation
Synthetic Data Generation

PEFT

Publish Date	Title	Authors	PDF	Code
2024-12-19	FedPIA -- Permuting and Integrating Adapters leveraging Wasserstein Barycenters for Finetuning Foundation Models in Multi-Modal Federated Learning	Pramit Saha et.al.	2412.14424	null
2024-12-18	Parameter-efficient Fine-tuning for improved Convolutional Baseline for Brain Tumor Segmentation in Sub-Saharan Africa Adult Glioma Dataset	Bijay Adhikari et.al.	2412.14100	null
2024-12-18	A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Method-Level Code Smell Detection	Beiqi Zhang et.al.	2412.13801	null
2024-12-18	Refining Salience-Aware Sparse Fine-Tuning Strategies for Language Models	Xinxin Liu et.al.	2412.13488	null
2024-12-17	Train More Parameters But Mind Their Placement: Insights into Language Adaptation with PEFT	Jenny Kunz et.al.	2412.12674	link
2024-12-16	Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering	Jinhe Bi et.al.	2412.12359	link
2024-12-16	A LoRA is Worth a Thousand Pictures	Chenxi Liu et.al.	2412.12048	null
2024-12-11	Adaptive Principal Components Allocation with the $\ell_{2,g}$-regularized Gaussian Graphical Model for Efficient Fine-Tuning Large Models	Jingjing Zheng et.al.	2412.08592	link
2024-12-10	PETALface: Parameter Efficient Transfer Learning for Low-resolution Face Recognition	Kartik Narayan et.al.	2412.07771	null
2024-12-10	MoDULA: Mixture of Domain-Specific and Universal LoRA for Multi-Task Learning	Yufei Ma et.al.	2412.07405	null
2024-12-13	Crack-EdgeSAM Self-Prompting Crack Segmentation System for Edge Devices	Yingchu Wang et.al.	2412.07205	null
2024-12-08	Taming Sensitive Weights: Noise Perturbation Fine-tuning for Robust LLM Quantization	Dongwei Wang et.al.	2412.06858	null
2024-12-09	BoRA: Bi-dimensional Weight-Decomposed Low-Rank Adaptation	Qiushi Wang et.al.	2412.06441	null
2024-12-19	S$^{2}$FT: Efficient, Scalable and Generalizable LLM Fine-tuning by Structured Sparsity	Xinyu Yang et.al.	2412.06289	null
2024-12-08	KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models	Fan Wang et.al.	2412.06071	link
2024-12-07	Training-Free Bayesianization for Low-Rank Adapters of Large Language Models	Haizhou Shi et.al.	2412.05723	link
2024-12-06	PETapter: Leveraging PET-style classification heads for modular few-shot parameter-efficient fine-tuning	Jonas Rieger et.al.	2412.04975	null
2024-12-04	Prompting Large Language Models for Clinical Temporal Relation Extraction	Jianping He et.al.	2412.04512	null
2024-12-05	SoRA: Singular Value Decomposed Low-Rank Adaptation for Domain Generalizable Representation Learning	Seokju Yun et.al.	2412.04077	link
2024-12-04	Improving Linguistic Diversity of Large Language Models with Possibility Exploration Fine-Tuning	Long Mai et.al.	2412.03343	link
2024-12-03	Mixture of Physical Priors Adapter for Parameter-Efficient Fine-Tuning	Zhaozhi Wang et.al.	2412.02759	null
2024-12-03	CPP-UT-Bench: Can LLMs Write Complex Unit Tests in C++?	Vaishnavi Bhargava et.al.	2412.02735	null
2024-12-03	LoRA Diffusion: Zero-Shot LoRA Synthesis for Diffusion Model Personalization	Ethan Smith et.al.	2412.02352	null
2024-12-03	A Comprehensive Evaluation of Large Language Models on Aspect-Based Sentiment Analysis	Changzhi Zhou et.al.	2412.02279	null
2024-11-30	Unified Parameter-Efficient Unlearning for LLMs	Chenlu Ding et.al.	2412.00383	null
2024-11-29	SURE-VQA: Systematic Understanding of Robustness Evaluation in Medical VQA Tasks	Kim-Celine Kahl et.al.	2411.19688	link
2024-11-28	Parameter-Efficient Transfer Learning for Music Foundation Models	Yiwei Ding et.al.	2411.19371	link
2024-11-28	PEFT-as-an-Attack! Jailbreaking Language Models during Federated Parameter-Efficient Fine-Tuning	Shenghui Li et.al.	2411.19335	null
2024-11-28	Enhancing Parameter-Efficient Fine-Tuning of Vision Transformers through Frequency-Based Adaptation	Son Thai Ly et.al.	2411.19297	link
2024-11-27	Challenges in Adapting Multilingual LLMs to Low-Resource Languages using LoRA PEFT Tuning	Omkar Khade et.al.	2411.18571	null
2024-11-26	PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning	Zhen Sun et.al.	2411.17453	null
2024-11-29	Promptable Anomaly Segmentation with SAM Through Self-Perception Tuning	Hui-Yue Yang et.al.	2411.17217	null
2024-11-25	Towards Efficient Model-Heterogeneity Federated Learning for Large Models	Ruofan Jia et.al.	2411.16796	null
2024-11-25	Parameter Efficient Instruction Tuning: An Empirical Study	Pengfei He et.al.	2411.16775	null
2024-11-25	Graph Adapter of EEG Foundation Models for Parameter Efficient Fine Tuning	Toyotaro Suzumura et.al.	2411.16155	null
2024-11-24	Efficient and Private: Memorisation under differentially private parameter-efficient fine-tuning in language models	Olivia Ma et.al.	2411.15831	null
2024-11-21	Parameter Efficient Mamba Tuning via Projector-targeted Diagonal-centric Linear Transformation	Seokil Ham et.al.	2411.15224	null
2024-11-22	LoRA-FAIR: Federated LoRA Fine-Tuning with Aggregation and Initialization Refinement	Jieming Bian et.al.	2411.14961	null
2024-11-21	Multi LoRA Meets Vision: Merging multiple adapters to create a multi task model	Ege Kesim et.al.	2411.14064	null
2024-11-17	F$^3$OCUS -- Federated Finetuning of Vision-Language Foundation Models with Optimal Client Layer Updating Strategy via Multi-objective Meta-Heuristics	Pramit Saha et.al.	2411.11912	null
2024-11-16	HELENE: Hessian Layer-wise Clipping and Gradient Annealing for Accelerating Fine-tuning LLM with Zeroth-order Optimization	Huaqin Zhao et.al.	2411.10696	null
2024-11-12	PERFT: Parameter-Efficient Routed Fine-Tuning for Mixture-of-Expert Model	Yilun Liu et.al.	2411.08212	null
2024-11-10	Prompt-Efficient Fine-Tuning for GPT-like Deep Models to Reduce Hallucination and to Improve Reproducibility in Scientific Text Generation Using Stochastic Optimisation Techniques	Daniil Sulimov et.al.	2411.06445	null
2024-11-06	MambaPEFT: Exploring Parameter-Efficient Fine-Tuning for Mamba	Masakazu Yoshimura et.al.	2411.03855	null
2024-11-04	PipeLLM: Fast and Confidential Large Language Model Services with Speculative Pipelined Encryption	Yifan Tan et.al.	2411.03357	null
2024-11-05	Efficient and Effective Adaptation of Multimodal Foundation Models in Sequential Recommendation	Junchen Fu et.al.	2411.02992	null
2024-11-04	Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test Generation: An Empirical Study	André Storhaug et.al.	2411.02462	null
2024-11-04	Expanding Sparse Tuning for Low Memory Usage	Shufan Shen et.al.	2411.01800	link
2024-11-15	Visual Fourier Prompt Tuning	Runjia Zeng et.al.	2411.01327	link
2024-10-31	CleaR: Towards Robust and Generalized Parameter-Efficient Fine-Tuning for Noisy Label Learning	Yeachan Kim et.al.	2411.00873	null
2024-10-30	FPE-LLM: Highly Intelligent Time-Series Forecasting and Language Interaction LLM in Energy Systems	Zihang Qiu et.al.	2411.00852	null
2024-11-01	Dual Low-Rank Adaptation for Continual Learning with Pre-Trained Models	Huancheng Chen et.al.	2411.00623	null
2024-11-01	Is Multiple Object Tracking a Matter of Specialization?	Gianluca Mancusi et.al.	2411.00553	null
2024-11-01	C2A: Client-Customized Adaptation for Parameter-Efficient Federated Learning	Yeachan Kim et.al.	2411.00311	link
2024-10-29	Preserving Pre-trained Representation Space: On Effectiveness of Prefix-tuning for Large Multi-modal Models	Donghoon Kim et.al.	2411.00029	null
2024-10-30	Efficient Adaptation of Pre-trained Vision Transformer via Householder Transformation	Wei Dong et.al.	2410.22952	null
2024-10-30	MALoRA: Mixture of Asymmetric Low-Rank Adaptation for Enhanced Multi-Task Learning	Xujia Wang et.al.	2410.22782	null
2024-10-29	Meta-Learning Adaptable Foundation Models	Jacob L. Block et.al.	2410.22264	null
2024-10-29	Capacity Control is an Effective Memorization Mitigation Mechanism in Text-Conditional Diffusion Models	Raman Dutt et.al.	2410.22149	link
2024-10-30	IntLoRA: Integral Low-rank Adaptation of Quantized Diffusion Models	Hang Guo et.al.	2410.21759	link
2024-10-28	KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation	Rambod Azimi et.al.	2410.20777	link
2024-10-27	Get Large Language Models Ready to Speak: A Late-fusion Approach for Speech Generation	Maohao Shen et.al.	2410.20336	null
2024-11-01	Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies	Luping Wang et.al.	2410.19878	null
2024-10-23	MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning	Jingfan Zhang et.al.	2410.18035	null
2024-10-22	Towards Real Zero-Shot Camouflaged Object Segmentation without Camouflaged Annotations	Cheng Lei et.al.	2410.16953	null
2024-10-22	MoRE: Multi-Modal Contrastive Pre-training with Transformers on X-Rays, ECGs, and Diagnostic Report	Samrajya Thapa et.al.	2410.16239	link
2024-10-21	Natural GaLore: Accelerating GaLore for memory-efficient LLM Training and Fine-tuning	Arijit Das et.al.	2410.16029	link
2024-10-18	Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation	Shuai Zhao et.al.	2410.14425	link
2024-10-17	LoLDU: Low-Rank Adaptation via Lower-Diag-Upper Decomposition for Parameter-Efficient Fine-Tuning	Yiming Shi et.al.	2410.13618	link
2024-10-16	Communication-Efficient and Tensorized Federated Fine-Tuning of Large Language Models	Sajjad Ghiasvand et.al.	2410.13097	null
2024-10-17	Prompt Compression for Large Language Models: A Survey	Zongqian Li et.al.	2410.12388	link
2024-10-15	Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models	Kai Yao et.al.	2410.11772	link
2024-10-15	LoKO: Low-Rank Kalman Optimizer for Online Fine-Tuning of Large Models	Hossein Abdi et.al.	2410.11551	null
2024-10-15	RoCoFT: Efficient Finetuning of Large Language Models with Row-Column Updates	Md Kowsher et.al.	2410.10075	link
2024-10-13	BiDoRA: Bi-level Optimization-Based Weight-Decomposed Low-Rank Adaptation	Peijia Qin et.al.	2410.09758	null
2024-10-12	Towards Efficient Visual-Language Alignment of the Q-Former for Visual Reasoning Tasks	Sungkyung Kim et.al.	2410.09489	link
2024-10-15	MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning	Yaming Yang et.al.	2410.09437	null
2024-10-09	Parameter-Efficient Fine-Tuning via Selective Discrete Cosine Transform	Yixian Shen et.al.	2410.09103	null
2024-10-04	BIPEFT: Budget-Guided Iterative Search for Parameter Efficient Fine-Tuning of Large Pretrained Language Models	Aofei Chang et.al.	2410.09079	null
2024-10-11	Parameter-Efficient Fine-Tuning of State Space Models	Kevin Galim et.al.	2410.09016	link
2024-10-10	Parameter-Efficient Fine-Tuning in Spectral Domain for Point Cloud Learning	Dingkang Liang et.al.	2410.08114	link
2024-10-10	SLIM: Let LLM Learn More and Forget Less with Soft LoRA and Identity Mixture	Jiayi Han et.al.	2410.07739	null
2024-10-10	Enhancing Zeroth-order Fine-tuning for Language Models with Low-rank Structures	Yiming Chen et.al.	2410.07698	link
2024-10-09	SparseGrad: A Selective Method for Efficient Fine-tuning of MLP Layers	Viktoriia Chekalina et.al.	2410.07383	link
2024-10-09	Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs	Ruijia Niu et.al.	2410.06431	null
2024-10-08	Are Large Language Models State-of-the-art Quality Estimators for Machine Translation of User-generated Content?	Shenbin Qian et.al.	2410.06338	link
2024-10-15	LoRTA: Low Rank Tensor Adaptation of Large Language Models	Ignacio Hounie et.al.	2410.04060	null
2024-10-03	Llama SLayer 8B: Shallow Layers Hold the Key to Knowledge Injection	Tianxiang Chen et.al.	2410.02330	link
2024-10-02	TPP-LLM: Modeling Temporal Point Processes by Efficiently Fine-Tuning Large Language Models	Zefang Liu et.al.	2410.02062	link
2024-10-02	NEAT: Nonlinear Parameter-efficient Adaptation of Pre-trained Models	Yibo Zhong et.al.	2410.01870	null
2024-09-27	A GEN AI Framework for Medical Note Generation	Hui Yi Leong et.al.	2410.01841	null
2024-10-02	DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models	Yuxuan Zhang et.al.	2410.01497	link
2024-10-01	PrivTuner with Homomorphic Encryption and LoRA: A P3EFT Scheme for Privacy-Preserving Parameter-Efficient Fine-Tuning of AI Foundation Models	Yang Li et.al.	2410.00433	null
2024-09-30	Adapting LLMs for the Medical Domain in Portuguese: A Study on Fine-Tuning and Model Evaluation	Pedro Henrique Paiola et.al.	2410.00163	null
2024-09-30	Resource Allocation for Stable LLM Training in Mobile Edge Computing	Chang Liu et.al.	2409.20247	null
2024-09-30	Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models	Luohe Shi et.al.	2409.20181	null
2024-09-28	FINE: Factorizing Knowledge for Initialization of Variable-sized Diffusion Models	Yucheng Xie et.al.	2409.19289	null
2024-10-01	Backdoor Attacks for LLMs with Weak-To-Strong Knowledge Distillation	Shuai Zhao et.al.	2409.17946	null
2024-09-26	PEDRO: Parameter-Efficient Fine-tuning with Prompt DEpenDent Representation MOdification	Tianfang Xie et.al.	2409.17834	null
2024-09-30	Efficient In-Domain Question Answering for Resource-Constrained Environments	Isaac Chung et.al.	2409.17648	null
2024-10-07	PACE: marrying generalization in PArameter-efficient fine-tuning with Consistency rEgularization	Yao Ni et.al.	2409.17137	link
2024-09-25	Parameter-efficient Bayesian Neural Networks for Uncertainty-aware Depth Estimation	Richard D. Paul et.al.	2409.17085	null
2024-10-02	Bone: Block Affine Transformation as Parameter Efficient Fine-tuning Methods for Large Language Models	Jiale Kang et.al.	2409.15371	link
2024-09-22	Flat-LoRA: Low-Rank Adaption over a Flat Loss Landscape	Tao Li et.al.	2409.14396	null
2024-10-01	Obliviate: Neutralizing Task-agnostic Backdoors within the Parameter-efficient Fine-tuning Paradigm	Jaehan Kim et.al.	2409.14119	link
2024-09-20	HUT: A More Computation Efficient Fine-Tuning Method With Hadamard Updated Transformation	Geyuan Zhang et.al.	2409.13501	null
2024-09-17	THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models	Mengfei Liang et.al.	2409.11353	link
2024-09-17	LPT++: Efficient Training on Mixture of Long-tailed Experts	Bowen Dong et.al.	2409.11323	null
2024-09-17	Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models	Divij Gupta et.al.	2409.11302	null
2024-09-18	Propulsion: Steering LLM with Tiny Fine-Tuning	Md Kowsher et.al.	2409.10927	link
2024-09-16	From Text to Emoji: How PEFT-Driven Personality Manipulation Unleashes the Emoji Potential in LLMs	Navya Jain et.al.	2409.10245	null
2024-09-14	COMFORT: A Continual Fine-Tuning Framework for Foundation Models Targeted at Consumer Healthcare	Chia-Hao Li et.al.	2409.09549	null
2024-09-14	Comparing Retrieval-Augmentation and Parameter-Efficient Fine-Tuning for Privacy-Preserving Personalization of Large Language Models	Alireza Salemi et.al.	2409.09510	link
2024-09-13	Risks When Sharing LoRA Fine-Tuned Diffusion Model Weights	Dixi Yao et.al.	2409.08482	null
2024-09-12	Do Vision Foundation Models Enhance Domain Generalization in Medical Image Segmentation?	Kerem Cekmeceli et.al.	2409.07960	link
2024-09-11	Efficient Localized Adaptation of Neural Weather Forecasting: A Case Study in the MENA Region	Muhammad Akhtar Munir et.al.	2409.07585	link
2024-09-10	Sam2Rad: A Segmentation Model for Medical Images with Learnable Prompts	Assefa Seyoum Wahd et.al.	2409.06821	link
2024-09-11	Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models	Yao Shu et.al.	2409.06277	link
2024-09-09	SVFit: Parameter-Efficient Fine-Tuning of Large Pre-Trained Models Using Singular Values	Chengwei Sun et.al.	2409.05926	null
2024-09-10	Improving Multimodal Emotion Recognition by Leveraging Acoustic Adaptation and Visual Alignment	Zhixian Zhao et.al.	2409.05015	null
2024-09-06	Customizing Large Language Model Generation Style using Parameter-Efficient Finetuning	Xinyue Liu et.al.	2409.04574	null
2024-09-04	iConFormer: Dynamic Parameter-Efficient Tuning with Input-Conditioned Adaptation	Hayeon Jo et.al.	2409.02838	null
2024-09-04	Deconfounded Causality-aware Parameter-Efficient Fine-Tuning for Problem-Solving Improvement of LLMs	Ruoyu Wang et.al.	2409.02686	null
2024-09-04	Robust Federated Finetuning of Foundation Models via Alternating Minimization of LoRA	Shuangyi Chen et.al.	2409.02346	null
2024-09-02	Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning	Chongjie Si et.al.	2409.01035	link
2024-08-28	3-in-1: 2D Rotary Adaptation for Efficient Finetuning, Efficient Batching and Composability	Baohao Liao et.al.	2409.00119	link
2024-08-21	SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models	Yang Cao et.al.	2409.00055	link
2024-08-30	MoRe Fine-Tuning with 10x Fewer Parameters	Wenxuan Tan et.al.	2408.17383	link
2024-09-02	Instant Adversarial Purification with Adversarial Consistency Distillation	Chun Tong Lei et.al.	2408.17064	null
2024-08-28	Scaling Up Summarization: Leveraging Large Language Models for Long Text Extractive Summarization	Léo Hemamou et.al.	2408.15801	null
2024-08-27	GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs	Maxim Zhelnin et.al.	2408.15300	link
2024-08-27	Pre-training Everywhere: Parameter-Efficient Fine-Tuning for Medical Image Analysis via Target Parameter Pre-training	Xingliang Lei et.al.	2408.15011	null
2024-08-27	CVPT: Cross-Attention help Visual Prompt Tuning adapt visual task	Lingyun Huang et.al.	2408.14961	link
2024-08-27	Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of Large Language Models	Aradhye Agarwal et.al.	2408.14470	link
2024-08-24	Advancing Enterprise Spatio-Temporal Forecasting Applications: Data Mining Meets Instruction Tuning of Language Models For Multi-modal Time Series Analysis in Low-Resource Settings	Sagar Srinivas Sakhinana et.al.	2408.13622	null
2024-08-21	Positional Prompt Tuning for Efficient 3D Representation Learning	Shaochen Zhang et.al.	2408.11567	link
2024-08-20	Pluto and Charon: A Time and Memory Efficient Collaborative Edge AI Framework for Personal LLMs Fine-Tuning	Bei Ouyang et.al.	2408.10746	null
2024-08-20	TDS-CLIP: Temporal Difference Side Network for Image-to-Video Transfer Learning	Bin Wang et.al.	2408.10688	link
2024-08-19	TeamLoRA: Boosting Low-Rank Adaptation with Expert Collaboration and Competition	Tianwei Lin et.al.	2408.09856	link
2024-08-16	Learning to Route for Dynamic Adapter Composition in Continual Learning with Language Models	Vladimir Araujo et.al.	2408.09053	null
2024-08-14	KIND: Knowledge Integration and Diversion in Diffusion Models	Yucheng Xie et.al.	2408.07337	null
2024-08-30	TaSL: Task Skill Localization and Consolidation for Language Model Continual Learning	Yujie Feng et.al.	2408.05200	link
2024-08-08	Bias-Aware Low-Rank Adaptation: Mitigating Catastrophic Inheritance of Large Language Models	Yupeng Chang et.al.	2408.04556	link
2024-08-06	SARA: Singular-Value Based Adaptive Low-Rank Adaption	Jihao Gu et.al.	2408.03290	null
2024-08-06	Leveraging Parameter Efficient Training Methods for Low Resource Text Classification: A Case Study in Marathi	Pranita Deshmukh et.al.	2408.03172	null
2024-08-03	TS-SAM: Fine-Tuning Segment-Anything Model for Downstream Tasks	Yang Yu et.al.	2408.01835	link
2024-08-02	MoDE: Effective Multi-task Parameter Efficient Fine-Tuning with a Mixture of Dyadic Experts	Lin Ning et.al.	2408.01505	null
2024-08-02	Tensor Train Low-rank Approximation (TT-LoRA): Democratizing AI with Accelerated LLMs	Afia Anjum et.al.	2408.01008	null
2024-07-31	A Federated Learning-Friendly Approach for Parameter-Efficient Fine-Tuning of SAM in 3D Segmentation	Mothilal Asokan et.al.	2407.21739	null
2024-07-28	Forecast-PEFT: Parameter-Efficient Fine-Tuning for Pre-trained Motion Forecasting Models	Jifeng Wang et.al.	2407.19564	link
2024-07-24	Parameter-Efficient Fine-Tuning for Continual Learning: A Neural Tangent Kernel Perspective	Jingren Liu et.al.	2407.17120	null
2024-07-22	Zero-Shot Embeddings Inform Learning and Forgetting with Vision-Language Encoders	Laura Niss et.al.	2407.15731	null
2024-07-21	Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization	Jiajun Hu et.al.	2407.15085	null
2024-07-16	InstructAV: Instruction Fine-tuning Large Language Models for Authorship Verification	Yujia Hu et.al.	2407.12882	link
2024-07-18	Turning Generative Models Degenerate: The Power of Data Poisoning Attacks	Shuli Jiang et.al.	2407.12281	null
2024-07-16	Probing the Efficacy of Federated Parameter-Efficient Fine-Tuning of Vision Transformers for Medical Image Classification	Naif Alkhunaizi et.al.	2407.11573	null
2024-07-16	An efficient framework based on large foundation model for cervical cytopathology whole slide image screening	Jialong Huang et.al.	2407.11486	link
2024-07-10	RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization	Xijie Huang et.al.	2407.08044	link
2024-07-10	ROSA: Random Subspace Adaptation for Efficient Fine-Tuning	Marawan Gamal Abdel Hameed et.al.	2407.07802	link
2024-07-10	Parameter Efficient Fine Tuning for Multi-scanner PET to PET Reconstruction	Yumin Kim et.al.	2407.07517	null
2024-07-09	Reprogramming Distillation for Medical Foundation Models	Yuhang Zhou et.al.	2407.06504	null
2024-07-07	See Further for Parameter Efficient Fine-tuning by Standing on the Shoulders of Decomposition	Chongjie Si et.al.	2407.05417	link
2024-07-16	LoRA-GA: Low-Rank Adaptation with Gradient Approximation	Shaowen Wang et.al.	2407.05000	link
2024-07-05	GPT vs RETRO: Exploring the Intersection of Retrieval and Parameter-Efficient Fine-Tuning	Aleksander Ficek et.al.	2407.04528	null
2024-07-04	Deep Content Understanding Toward Entity and Aspect Target Sentiment Analysis on Foundation Models	Vorakit Vorakitphan et.al.	2407.04050	link
2024-07-04	ASteISR: Adapting Single Image Super-resolution Pre-trained Model for Efficient Stereo Image Super-resolution	Yuanbo Zhou et.al.	2407.03598	null
2024-07-03	Knowledge Composition using Task Vectors with Learned Anisotropic Scaling	Frederic Z. Zhang et.al.	2407.02880	link
2024-07-03	Exploring the Capabilities of LLMs for Code Change Related Tasks	Lishui Fan et.al.	2407.02824	link
2024-07-02	FineCLIPER: Multi-modal Fine-grained CLIP for Dynamic Facial Expression Recognition with AdaptERs	Haodong Chen et.al.	2407.02157	null
2024-07-02	CatMemo at the FinLLM Challenge Task: Fine-Tuning Large Language Models using Data Fusion in Financial Applications	Yupeng Cao et.al.	2407.01953	null
2024-07-05	Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models	Zihan Wang et.al.	2407.01906	link
2024-07-01	A Fingerprint for Large Language Models	Zhiguang Yang et.al.	2407.01235	null
2024-07-02	Embedded Prompt Tuning: Towards Enhanced Calibration of Pretrained Models for Medical Images	Wenqiang Zu et.al.	2407.01003	link
2024-06-25	Structured Unrestricted-Rank Matrices for Parameter Efficient Fine-tuning	Arijit Sehanobish et.al.	2406.17740	null
2024-06-19	Parameter Training Efficiency Aware Resource Allocation for AIGC in Space-Air-Ground Integrated Networks	Liangxin Qian et.al.	2406.13602	null
2024-06-19	Sparse High Rank Adapters	Kartikeya Bhardwaj et.al.	2406.13175	null
2024-06-18	Bayesian-LoRA: LoRA based Parameter Efficient Fine-Tuning using Optimal Quantization levels and Rank Values trough Differentiable Bayesian Gates	Cristian Meo et.al.	2406.13046	null
2024-06-18	Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation	Branislav Pecher et.al.	2406.12471	link
2024-06-17	A Semantic-based Layer Freezing Approach to Efficient Fine-Tuning of Language Models	Jian Gu et.al.	2406.11753	null
2024-06-16	ExPLoRA: Parameter-Efficient Extended Pre-Training to Adapt Vision Transformers under Domain Shifts	Samar Khanna et.al.	2406.10973	null
2024-06-16	ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning via Shared Low-Rank Adaptation	Yurun Song et.al.	2406.10785	null
2024-06-16	RoseLoRA: Row and Column-wise Sparse Low-rank Adaptation of Pre-trained Language Model for Knowledge Editing and Fine-tuning	Haoyu Wang et.al.	2406.10777	null
2024-06-15	Benchmarking Children's ASR with Supervised and Self-supervised Speech Foundation Models	Ruchao Fan et.al.	2406.10507	link
2024-06-15	Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts	Zhaoxuan Tan et.al.	2406.10471	link
2024-06-13	Reflecting on the State of Rehearsal-free Continual Learning with Pretrained Models	Lukas Thede et.al.	2406.09384	null
2024-06-12	Exploring Fact Memorization and Style Imitation in LLMs Using QLoRA: An Experimental Study and Quality Assessment Methods	Eugene Vyborov et.al.	2406.08582	null
2024-06-12	The Impact of Initialization on LoRA Finetuning Dynamics	Soufiane Hayou et.al.	2406.08447	null
2024-06-20	Low-Rank Quantization-Aware Training for LLMs	Yelysei Bondarenko et.al.	2406.06385	link
2024-06-10	A Parameter-efficient Language Extension Framework for Multilingual ASR	Wei Liu et.al.	2406.06329	null
2024-06-09	A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Automated Program Repair	Guochang Li et.al.	2406.05639	link
2024-06-07	Efficient Differentially Private Fine-Tuning of Diffusion Models	Jing Liu et.al.	2406.05257	null
2024-06-07	CorDA: Context-Oriented Decomposition Adaptation of Large Language Models	Yibo Yang et.al.	2406.05223	link
2024-06-07	An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models	Xiongtao Zhou et.al.	2406.05130	link
2024-06-07	MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter	Jitai Hao et.al.	2406.04984	link
2024-06-06	Time Sensitive Knowledge Editing through Efficient Finetuning	Xiou Ge et.al.	2406.04496	link
2024-06-06	VHDL-Eval: A Framework for Evaluating Large Language Models in VHDL Code Generation	Prashanth Vijayaraghavan et.al.	2406.04379	null
2024-06-10	Hypernetworks for Personalizing ASR to Atypical Speech	Max Müller-Eberstein et.al.	2406.04240	null
2024-06-06	Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning	Naibin Gu et.al.	2406.03792	link
2024-06-05	Choice of PEFT Technique in Continual Learning: Prompt Tuning is Not All You Need	Martin Wistuba et.al.	2406.03216	null
2024-06-06	Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision	Minglei Li et.al.	2406.03051	null
2024-05-31	Mamba State-Space Models Can Be Strong Downstream Learners	John T. Halloran et.al.	2406.00209	null
2024-05-30	ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections	Massimo Bini et.al.	2405.20271	link
2024-05-30	SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors	Vijay Lingam et.al.	2405.19597	link
2024-05-29	MemControl: Mitigating Memorization in Medical Diffusion Models via Automated Parameter Selection	Raman Dutt et.al.	2405.19458	link
2024-05-29	MLAE: Masked LoRA Experts for Parameter-Efficient Fine-Tuning	Junjie Wang et.al.	2405.18897	link
2024-05-29	Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation	Zelin Peng et.al.	2405.18840	null
2024-06-01	Low-Rank Few-Shot Adaptation of Vision-Language Models	Maxime Zanella et.al.	2405.18541	null
2024-05-28	Semantic are Beacons: A Semantic Perspective for Unveiling Parameter-Efficient Fine-Tuning in Knowledge Learning	Renzhi Wang et.al.	2405.18292	null
2024-05-28	VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections	Roy Miles et.al.	2405.17991	link
2024-05-28	Sparsity- and Hybridity-Inspired Visual Parameter-Efficient Fine-Tuning for Medical Diagnosis	Mingyuan Liu et.al.	2405.17877	null
2024-05-27	LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters	Klaudia Bałazy et.al.	2405.17604	link
2024-05-23	EMR-Merging: Tuning-Free High-Performance Model Merging	Chenyu Huang et.al.	2405.17461	link
2024-05-28	DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank Distribution	Yulong Mao et.al.	2405.17357	link
2024-05-27	$\textit{Trans-LoRA}$: towards data-free Transferable Parameter Efficient Finetuning	Runqian Wang et.al.	2405.17258	null
2024-05-30	Sparse Matrix in Large Language Model Fine-tuning	Haoze He et.al.	2405.15525	null
2024-05-24	Prompt Tuning Strikes Back: Customizing Foundation Models with Low-Rank Prompt Adaptation	Abhinav Jain et.al.	2405.15282	link
2024-05-27	VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks	Yang Li et.al.	2405.15179	link
2024-05-23	Bitune: Bidirectional Instruction-Tuning	Dawid J. Kopiczko et.al.	2405.14862	null
2024-05-23	Sparse-Tuning: Adapting Vision Transformers with Efficient Fine-tuning and Inference	Ting Liu et.al.	2405.14700	link
2024-05-22	Spectral Adapter: Fine-Tuning in Spectral Space	Fangzhao Zhang et.al.	2405.13952	link
2024-05-24	MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models	Jingwei Xu et.al.	2405.13053	link
2024-05-20	FeTT: Continual Class Incremental Learning via Feature Transformation Tuning	Sunyuan Qiang et.al.	2405.11822	null
2024-05-21	HARIS: Human-Like Attention for Reference Image Segmentation	Mengxi Zhang et.al.	2405.10707	null
2024-05-28	DP-DyLoRA: Fine-Tuning Transformer-Based Models On-Device under Differentially Private Federated Learning using Dynamic Low-Rank Adaptation	Jie Xu et.al.	2405.06368	null
2024-05-09	Selective Fine-tuning on LLM-labeled Data May Reduce Reliance on Human Annotation: A Case Study Using Schedule-of-Event Table Detection	Bhawesh Kumar et.al.	2405.06093	null
2024-05-09	Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning	Shibo Jie et.al.	2405.05615	link
2024-05-07	Refining Joint Text and Source Code Embeddings for Retrieval Task with Parameter-Efficient Fine-Tuning	Karim Galliamov et.al.	2405.04126	link
2024-05-04	Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning	Jing Xu et.al.	2405.02596	link
2024-03-16	Empirical Studies of Parameter Efficient Methods for Large Language Models of Code and Knowledge Transfer to R	Amirreza Esmaeili et.al.	2405.01553	null
2024-05-02	NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment	Gerald Shen et.al.	2405.01481	link
2024-04-29	LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report	Justin Zhao et.al.	2405.00732	link
2024-05-01	Investigating Automatic Scoring and Feedback using Large Language Models	Gloria Ashiya Katuka et.al.	2405.00602	null
2024-05-01	MoPEFT: A Mixture-of-PEFTs for the Segment Anything Model	Rajat Sahay et.al.	2405.00293	null
2024-04-30	SPAFIT: Stratified Progressive Adaptation Fine-tuning for Pre-trained Large Language Models	Samir Arora et.al.	2405.00201	null
2024-05-23	HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning	Chunlin Tian et.al.	2404.19245	link
2024-05-25	FeDeRA: Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition	Yuxuan Yan et.al.	2404.18848	null
2024-04-25	Efficiency in Focus: LayerNorm as a Catalyst for Fine-tuning Medical Visual Language Pre-trained Models	Jiawei Chen et.al.	2404.16385	null
2024-05-23	MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts	Dengchun Li et.al.	2404.15159	link
2024-04-22	ColA: Collaborative Adaptation with Gradient Learning	Enmao Diao et.al.	2404.13844	link
2024-04-23	Parameter Efficient Fine Tuning: A Comprehensive Analysis Across Applications	Charith Chandra Sai Balne et.al.	2404.13506	null
2024-04-18	SKIP: Skill-Localized Prompt Tuning for Inference Speed Boost-Up	Nakyeong Yang et.al.	2404.11916	null
2024-04-16	Shears: Unstructured Sparsity with Neural Low-rank Adapter Search	J. Pablo Muñoz et.al.	2404.10934	link
2024-04-16	Exact and Efficient Unlearning for Large Language Model-based Recommendation	Zhiyu Hu et.al.	2404.10327	null
2024-04-15	LoRA Dropout as a Sparsity Regularizer for Overfitting Control	Yang Lin et.al.	2404.09610	null
2024-04-21	Analyzing the Impact of Data Selection and Fine-Tuning on Economic and Political Biases in LLMs	Ahmed Agiza et.al.	2404.08699	link
2024-04-08	Certified PEFTSmoothing: Parameter-Efficient Fine-Tuning with Randomized Smoothing	Chengyan Fu et.al.	2404.05350	null
2024-04-08	DLoRA: Distributed Parameter-Efficient Fine-Tuning Solution for Large Language Model	Chao Gao et.al.	2404.05182	null
2024-04-12	Q-PEFT: Query-dependent Parameter Efficient Fine-tuning for Text Reranking with Large Language Models	Zhiyuan Peng et.al.	2404.04522	null
2024-04-05	Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation	Tong Su et.al.	2404.04212	null
2024-05-22	ReFT: Representation Finetuning for Language Models	Zhengxuan Wu et.al.	2404.03592	link
2024-06-11	Personalized LLM Response Generation with Parameterized Memory Injection	Kai Zhang et.al.	2404.03565	null
2024-06-20	Eigenpruning: an Interpretability-Inspired PEFT Method	Tomás Vergara-Browne et.al.	2404.03147	link
2024-05-28	PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models	Fanxu Meng et.al.	2404.02948	link
2024-04-03	Enhancing Low-Resource LLMs Classification with PEFT and Synthetic Data	Parth Patwa et.al.	2404.02422	null
2024-04-11	IISAN: Efficiently Adapting Multimodal Representation for Sequential Recommendation with Decoupled PEFT	Junchen Fu et.al.	2404.02059	link
2024-03-31	Query-driven Relevant Paragraph Extraction from Legal Judgments	T. Y. S. S Santosh et.al.	2404.00595	null
2024-03-30	Edinburgh Clinical NLP at SemEval-2024 Task 2: Fine-tune your model unless you have access to GPT-4	Aryo Pradipta Gema et.al.	2404.00484	link
2024-04-03	InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning	Yan-Shuo Liang et.al.	2404.00228	link
2024-03-27	Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation	Mateusz Klimaszewski et.al.	2403.18804	link
2024-03-26	The Unreasonable Ineffectiveness of the Deeper Layers	Andrey Gromov et.al.	2403.17887	null
2024-04-15	ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models	Zequan Liu et.al.	2403.16187	null
2024-03-22	KnowLA: Enhancing Parameter-efficient Finetuning with Knowledgeable Adaptation	Xindi Luo et.al.	2403.14950	link
2024-03-22	A Single Linear Layer Yields Task-Adapted Low-Rank Matrices	Hwichan Kim et.al.	2403.14946	null
2024-03-21	AutoRE: Document-Level Relation Extraction with Large Language Models	Xue Lilong et.al.	2403.14888	link
2024-04-29	Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey	Zeyu Han et.al.	2403.14608	null
2024-03-20	Harnessing Large Language Models for Text-Rich Sequential Recommendation	Zhi Zheng et.al.	2403.13325	link
2024-04-16	AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models	Zeyu Liu et.al.	2403.13269	null
2024-03-18	Improving LoRA in Privacy-preserving Federated Learning	Youbang Sun et.al.	2403.12313	null
2024-03-18	Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation	Wangbo Zhao et.al.	2403.11808	link
2024-03-18	Let's Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model	Haoyun Xu et.al.	2403.11621	null
2024-03-19	JORA: JAX Tensor-Parallel LoRA Library for Retrieval Augmented Fine-Tuning	Anique Tahir et.al.	2403.11366	link
2024-03-14	Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks	Tingyu Qu et.al.	2403.09377	link
2024-03-14	PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation	Yizhe Xiong et.al.	2403.09192	link
2024-03-13	Data-oriented Dynamic Fine-tuning Parameter Selection Strategy for FISH Mask based Efficient Fine-tuning	Ming Dong et.al.	2403.08484	null

Text-to-Image Generation

Publish Date	Title	Authors	PDF	Code
2024-12-19	LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis	Hanlin Wang et.al.	2412.15214	null
2024-12-19	Flowing from Words to Pixels: A Framework for Cross-Modality Evolution	Qihao Liu et.al.	2412.15213	null
2024-12-19	Generative Multiview Relighting for 3D Reconstruction under Extreme Illumination Variation	Hadi Alzayer et.al.	2412.15211	null
2024-12-19	AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation	Moayed Haji-Ali et.al.	2412.15191	null
2024-12-19	LlamaFusion: Adapting Pretrained Language Models for Multimodal Generation	Weijia Shi et.al.	2412.15188	null
2024-12-19	Tiled Diffusion	Or Madar et.al.	2412.15185	null
2024-12-19	SqueezeMe: Efficient Gaussian Avatars for VR	Shunsuke Saito et.al.	2412.15171	null
2024-12-19	OnlineVPO: Align Video Diffusion Model with Online Video-Centric Preference Optimization	Jiacheng Zhang et.al.	2412.15159	null
2024-12-19	Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM	Yatai Ji et.al.	2412.15156	link
2024-12-19	Jet: A Modern Transformer-Based Normalizing Flow	Alexander Kolesnikov et.al.	2412.15129	null
2024-12-19	Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation	Yang Tian et.al.	2412.15109	null
2024-12-19	Learning Disentangled Equivariant Representation for Explicitly Controllable 3D Molecule Generation	Haoran Liu et.al.	2412.15086	null
2024-12-19	Eigenstate Preparation on Quantum Computers	Joey Bonitati et.al.	2412.15081	null
2024-12-19	Uni-Renderer: Unifying Rendering and Inverse Rendering Via Dual Stream Diffusion	Zhifei Chen et.al.	2412.15050	null
2024-12-19	DCTdiff: Intriguing Properties of Image Generative Modeling in the DCT Space	Mang Ning et.al.	2412.15032	link
2024-12-18	AniDoc: Animation Creation Made Easier	Yihao Meng et.al.	2412.14173	null
2024-12-19	E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modeling	Zhihang Yuan et.al.	2412.14170	null
2024-12-18	Autoregressive Video Generation without Vector Quantization	Haoge Deng et.al.	2412.14169	link
2024-12-18	VideoDPO: Omni-Preference Alignment for Video Diffusion Generation	Runtao Liu et.al.	2412.14167	null
2024-12-18	MetaMorph: Multimodal Understanding and Generation via Instruction Tuning	Shengbang Tong et.al.	2412.14164	null
2024-12-18	MCMat: Multiview-Consistent and Physically Accurate PBR Material Generation	Shenhao Zhu et.al.	2412.14148	null
2024-12-18	Event-based Photometric Bundle Adjustment	Shuang Guo et.al.	2412.14111	null
2024-12-18	Future Research Avenues for Artificial Intelligence in Digital Gaming: An Exploratory Report	Markus Dablander et.al.	2412.14085	null
2024-12-18	SurgSora: Decoupled RGBD-Flow Diffusion Model for Controllable Surgical Video Generation	Tong Chen et.al.	2412.14018	null
2024-12-18	Comparative Analysis of Machine Learning-Based Imputation Techniques for Air Quality Datasets with High Missing Data Rates	Sen Yan et.al.	2412.13966	null
2024-12-18	A Rose by Any Other Name: LLM-Generated Explanations Are Good Proxies for Human Explanations to Collect Label Distributions on NLI	Beiduo Chen et.al.	2412.13942	null
2024-12-18	Development of a High-Resolution, High-Dynamic-Range Charge Detector for Ion Beam Monitoring	O. Adriani et.al.	2412.13934	null
2024-12-18	Investigating the Effects of Diffusion-based Conditional Generative Speech Models Used for Speech Enhancement on Dysarthric Speech	Joanna Reszka et.al.	2412.13933	null
2024-12-18	Graph-Driven Models for Gas Mixture Identification and Concentration Estimation on Heterogeneous Sensor Array Signals	Ding Wang et.al.	2412.13891	null
2024-12-18	Navigating limitations with precision: A fine-grained ensemble approach to wrist pathology recognition on a limited x-ray dataset	Ammar Ahmed et.al.	2412.13884	null
2024-12-17	CoMPaSS: Enhancing Spatial Understanding in Text-to-Image Diffusion Models	Gaoyang Zhang et.al.	2412.13195	link
2024-12-17	StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models	Yunzhi Yan et.al.	2412.13188	null
2024-12-17	Move-in-2D: 2D-Conditioned Human Motion Generation	Hsin-Ping Huang et.al.	2412.13185	null
2024-12-17	F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration	Lu Liu et.al.	2412.13155	null
2024-12-17	Prompt Augmentation for Self-supervised Text-guided Image Manipulation	Rumeysa Bodur et.al.	2412.13081	null
2024-12-17	3D MedDiffusion: A 3D Medical Diffusion Model for Controllable and High-quality Medical Image Generation	Haoshen Wang et.al.	2412.13059	null
2024-12-17	Guiding Generative Protein Language Models with Reinforcement Learning	Filippo Stocco et.al.	2412.12979	null
2024-12-18	Attentive Eraser: Unleashing Diffusion Model's Object Removal Potential via Self-Attention Redirection Guidance	Wenhao Sun et.al.	2412.12974	link
2024-12-17	ArchesWeather & ArchesWeatherGen: a deterministic and generative model for efficient ML weather forecasting	Guillaume Couairon et.al.	2412.12971	link
2024-12-17	Modified UNIFAC 2.0 -- A Group-Contribution Method Completed with Machine Learning	Nicolas Hayer et.al.	2412.12962	null
2024-12-17	MOPO: Multi-Objective Prompt Optimization for Affective Text Generation	Yarik Menchaca Resendiz et.al.	2412.12948	null
2024-12-17	Generation of cosmic ray trajectories by a Diffusion Model trained on test particles in 3D magnetohydrodynamic turbulence	Johannes Martin et.al.	2412.12923	null
2024-12-17	Unsupervised Region-Based Image Editing of Denoising Diffusion Models	Zixiang Li et.al.	2412.12912	null
2024-12-18	ArtAug: Enhancing Text-to-Image Generation through Synthesis-Understanding Interaction	Zhongjie Duan et.al.	2412.12888	link
2024-12-17	Memory-minimal quantum generation of stochastic processes: spectral invariants of quantum hidden Markov models	Magdalini Zonnios et.al.	2412.12812	null
2024-12-16	Causal Diffusion Transformers for Generative Modeling	Chaorui Deng et.al.	2412.12095	link
2024-12-16	CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models	Felix Taubner et.al.	2412.12093	null
2024-12-16	Wonderland: Navigating 3D Scenes from a Single Image	Hanwen Liang et.al.	2412.12091	null
2024-12-16	A LoRA is Worth a Thousand Pictures	Chenxi Liu et.al.	2412.12048	null
2024-12-16	LLMs for Cold-Start Cutting Plane Separator Configuration	Connor Lawless et.al.	2412.12038	null
2024-12-16	Learning to Navigate in Mazes with Novel Layouts using Abstract Top-down Maps	Linfeng Zhao et.al.	2412.12024	null
2024-12-16	The entropic optimal (self-)transport problem: Limit distributions for decreasing regularization with application to score function estimation	Gilles Mordant et.al.	2412.12007	null
2024-12-16	Controllable Shadow Generation with Single-Step Diffusion Models from Synthetic Data	Onur Tasar et.al.	2412.11972	null
2024-12-16	The Erdős unit distance problem for small point sets	Boris Alexeev et.al.	2412.11914	null
2024-12-16	CharacterBench: Benchmarking Character Customization of Large Language Models	Jinfeng Zhou et.al.	2412.11912	link
2024-12-16	Towards Understanding Systems Trade-offs in Retrieval-Augmented Generation Model Inference	Michael Shen et.al.	2412.11854	null
2024-12-16	ColorFlow: Retrieval-Augmented Image Sequence Colorization	Junhao Zhuang et.al.	2412.11815	null
2024-12-16	InterDyn: Controllable Interactive Dynamics with Video Diffusion Models	Rick Akkerman et.al.	2412.11785	null
2024-12-16	Joint Reconstruction of the Activity and the Attenuation in PET by Diffusion Posterior Sampling: a Feasibility Study	Clémentine Phung-Ngoc et.al.	2412.11776	null
2024-12-17	No More Adam: Learning Rate Scaling at Initialization is All You Need	Minghao Xu et.al.	2412.11768	link
2024-12-13	Towards a foundation model for heavy-ion collision experiments through point cloud diffusion	Manjunath Omana Kuttan et.al.	2412.10352	null
2024-12-13	BrushEdit: All-In-One Image Inpainting and Editing	Yaowei Li et.al.	2412.10316	null
2024-12-13	Iterating the Transient Light Transport Matrix for Non-Line-of-Sight Imaging	Talha Sultan et.al.	2412.10300	null
2024-12-13	Coherent 3D Scene Diffusion From a Single RGB Image	Manuel Dahnert et.al.	2412.10294	null
2024-12-13	Adversarial Robustness of Bottleneck Injected Deep Neural Networks for Task-Oriented Communication	Alireza Furutanpey et.al.	2412.10265	null
2024-12-13	Targeted Angular Reversal of Weights (TARS) for Knowledge Removal in Large Language Models	Harry J. Davies et.al.	2412.10257	null
2024-12-13	Exploring the Frontiers of Animation Video Generation in the Sora Era: Method, Dataset and Benchmark	Yudong Jiang et.al.	2412.10255	null
2024-12-13	Radiator Tailoring for Enhanced Performance in InAs-Based Near-Field Thermophotovoltaics	Mathieu Giroux et.al.	2412.10217	null
2024-12-13	GAF: Gaussian Avatar Reconstruction from Monocular Videos via Multi-view Diffusion	Jiapeng Tang et.al.	2412.10209	null
2024-12-13	Efficient Generative Modeling with Residual Vector Quantization-Based Tokens	Jaehyeon Kim et.al.	2412.10208	null
2024-12-13	Simple Guidance Mechanisms for Discrete Diffusion Models	Yair Schiff et.al.	2412.10193	link
2024-12-13	SwiftTry: Fast and Consistent Video Virtual Try-On with Diffusion Models	Hung Nguyen et.al.	2412.10178	null
2024-12-13	Learning payoffs while routing in skill-based queues	Sanne van Kempen et.al.	2412.10168	null
2024-12-13	The Art of Deception: Color Visual Illusions and Diffusion Models	Alex Gomez-Villa et.al.	2412.10122	null
2024-12-13	Familiarity: Better Evaluation of Zero-Shot Named Entity Recognition by Quantifying Label Shifts in Synthetic Training Data	Jonas Golde et.al.	2412.10121	null
2024-12-12	FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion	Haonan Qiu et.al.	2412.09626	null
2024-12-12	Illusion3D: 3D Multiview Illusion with 2D Diffusion Priors	Yue Feng et.al.	2412.09625	null
2024-12-12	GenEx: Generating an Explorable World	Taiming Lu et.al.	2412.09624	null
2024-12-12	OmniDrag: Enabling Motion Control for Omnidirectional Image-to-Video Generation	Weiqi Li et.al.	2412.09623	null
2024-12-12	LoRACLR: Contrastive Adaptation for Customization of Diffusion Models	Enis Simsar et.al.	2412.09622	null
2024-12-12	SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training	Dongting Hu et.al.	2412.09619	null
2024-12-12	EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM	Zhuofan Zong et.al.	2412.09618	null
2024-12-12	Context Canvas: Enhancing Text-to-Image Diffusion Models with Knowledge Graph-Based RAG	Kavana Venkatesh et.al.	2412.09614	null
2024-12-13	Olympus: A Universal Task Router for Computer Vision Tasks	Yuanze Lin et.al.	2412.09612	link
2024-12-12	Owl-1: Omni World Model for Consistent Long Video Generation	Yuanhui Huang et.al.	2412.09600	link
2024-12-12	LiftImage3D: Lifting Any Single Image to 3D Gaussians with Video Generation Priors	Yabo Chen et.al.	2412.09597	null
2024-12-12	Neural LightRig: Unlocking Accurate Object Normal and Material Estimation with Multi-Light Diffusion	Zexin He et.al.	2412.09593	null
2024-12-12	Improving the Reliability of Cable Broadband Networks via Proactive Network Maintenance	Jiyao Hu et.al.	2412.09564	null
2024-12-12	Meshtron: High-Fidelity, Artist-Like 3D Mesh Generation at Scale	Zekun Hao et.al.	2412.09548	null
2024-12-12	SimAvatar: Simulation-Ready Avatars with Layered Hair and Clothing	Xueting Li et.al.	2412.09545	null
2024-12-11	Generative Semantic Communication: Architectures, Technologies, and Applications	Jinke Ren et.al.	2412.08642	null
2024-12-11	DMin: Scalable Training Data Influence Estimation for Diffusion Models	Huawei Lin et.al.	2412.08637	link
2024-12-11	Multimodal Latent Language Modeling with Next-Token Diffusion	Yutao Sun et.al.	2412.08635	link
2024-12-11	An SDR-Based Monostatic Wi-Fi System with Analog Self-Interference Cancellation for Sensing	Andreas Toftegaard Kristensen et.al.	2412.08612	null
2024-12-12	Design2GarmentCode: Turning Design Concepts to Tangible Garments Through Program Synthesis	Feng Zhou et.al.	2412.08603	null
2024-12-11	TryOffAnyone: Tiled Cloth Generation from a Dressed Person	Ioannis Xarchakos et.al.	2412.08573	link
2024-12-12	Watermarking Training Data of Music Generation Models	Pascal Epple et.al.	2412.08549	null
2024-12-11	Orderly Management of Packets in RDMA by Eunomia	Sana Mahmood et.al.	2412.08540	null
2024-12-11	Ensemble-Based Quantum-Token Protocol Benchmarked on IBM Quantum Processors	Lucas Tsunaki et.al.	2412.08530	null
2024-12-11	Comparative Opinion Mining in Product Reviews: Multi-perspective Prompt-based Learning	Hai-Yen Thi Nguyen et.al.	2412.08508	null
2024-12-11	Open-Loop and Model Predictive Control for Electric Vehicle Charging to Manage Excess Renewable Energy Supply in Texas	Kelsey M. Nelson et.al.	2412.08505	null
2024-12-11	Learning Flow Fields in Attention for Controllable Person Image Generation	Zijian Zhou et.al.	2412.08486	link
2024-12-11	InvDiff: Invariant Guidance for Bias Mitigation in Diffusion Models	Min Hou et.al.	2412.08480	link
2024-12-11	CC-Diff: Enhancing Contextual Coherence in Remote Sensing Image Synthesis	Mu Zhang et.al.	2412.08464	null
2024-12-11	Federated Learning for Traffic Flow Prediction with Synthetic Data Augmentation	Fermin Orozco et.al.	2412.08460	null
2024-12-10	Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets	Zhen Liu et.al.	2412.07775	null
2024-12-10	UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics	Xi Chen et.al.	2412.07774	null
2024-12-10	From Slow Bidirectional to Fast Causal Video Generators	Tianwei Yin et.al.	2412.07772	null
2024-12-10	Make-A-Texture: Fast Shape-Aware Texture Generation in 3 Seconds	Xiaoyu Xiang et.al.	2412.07766	null
2024-12-10	Bayesian Optimization of Antibodies Informed by a Generative Model of Evolving Sequences	Alan Nawzad Amin et.al.	2412.07763	link
2024-12-10	Repurposing Pre-trained Video Diffusion Models for Event-based Video Interpolation	Jingxi Chen et.al.	2412.07761	null
2024-12-10	SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints	Jianhong Bai et.al.	2412.07760	link
2024-12-10	PortraitTalk: Towards Customizable One-Shot Audio-to-Talking Face Generation	Fatemeh Nazarieh et.al.	2412.07754	null
2024-12-10	Multi-Shot Character Consistency for Text-to-Video Generation	Yuval Atzmon et.al.	2412.07750	null
2024-12-10	StyleMaster: Stylize Your Video with Artistic Generation and Translation	Zixuan Ye et.al.	2412.07744	null
2024-12-10	STIV: Scalable Text and Image Conditioned Video Generation	Zongyu Lin et.al.	2412.07730	null
2024-12-10	ObjCtrl-2.5D: Training-free Object Control with Camera Poses	Zhouxia Wang et.al.	2412.07721	null
2024-12-10	ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer	Jinyi Hu et.al.	2412.07720	link
2024-12-10	Privacy-Preserving Customer Support: A Framework for Secure and Scalable Interactions	Anant Prakash Awasthi et.al.	2412.07687	null
2024-12-10	Optimizing Sensor Redundancy in Sequential Decision-Making Problems	Jonas Nüßlein et.al.	2412.07686	null
2024-12-10	[MASK] is All You Need	Vincent Tao Hu et.al.	2412.06787	link
2024-12-09	Tactile DreamFusion: Exploiting Tactile Sensing for 3D Generation	Ruihan Gao et.al.	2412.06785	link
2024-12-09	Diverse Score Distillation	Yanbo Xu et.al.	2412.06780	null
2024-12-09	Visual Lexicon: Rich Image Features in Language Space	XuDong Wang et.al.	2412.06774	null
2024-12-09	InstantRestore: Single-Step Personalized Face Restoration with Shared-Image Attention	Howard Zhang et.al.	2412.06753	null
2024-12-09	ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities	Adhiraj Ghosh et.al.	2412.06745	null
2024-12-10	ContRail: A Framework for Realistic Railway Image Synthesis using ControlNet	Andrei-Robert Alexandrescu et.al.	2412.06742	null
2024-12-09	Take Fake as Real: Realistic-like Robust Black-box Adversarial Attack to Evade AIGC Detection	Caiyun Xie et.al.	2412.06727	link
2024-12-09	You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale	Baorui Ma et.al.	2412.06699	link
2024-12-09	Gen-3Diffusion: Realistic Image-to-3D Generation via 2D & 3D Diffusion Synergy	Yuxuan Xue et.al.	2412.06698	null
2024-12-09	Diff5T: Benchmarking Human Brain Diffusion MRI with an Extensive 5.0 Tesla K-Space and Spatial Dataset	Shanshan Wang et.al.	2412.06666	null
2024-12-09	Efficiency Meets Fidelity: A Novel Quantization Framework for Stable Diffusion	Shuaiting Li et.al.	2412.06661	null
2024-12-09	MVReward: Better Aligning and Evaluating Multi-View Diffusion Models with Human Preferences	Weitao Wang et.al.	2412.06614	null
2024-12-09	Augmented reality for upper limb rehabilitation: real-time kinematic feedback with HoloLens 2	Beatrice Luciani et.al.	2412.06596	null
2024-12-09	EmoSpeech: A Corpus of Emotionally Rich and Contextually Detailed Speech Annotations	Weizhen Bian et.al.	2412.06581	null
2024-12-06	Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model	Lening Wang et.al.	2412.05280	link
2024-12-06	Perturb-and-Revise: Flexible 3D Editing with Generative Trajectories	Susung Hong et.al.	2412.05279	null
2024-12-06	Birth and Death of a Rose	Chen Geng et.al.	2412.05278	null
2024-12-06	MotionFlow: Attention-Driven Motion Transfer in Video Diffusion Models	Tuna Han Salih Meral et.al.	2412.05275	null
2024-12-06	Go-or-Grow Models in Biology: a Monster on a Leash	R. Thiessen et.al.	2412.05191	null
2024-12-06	Privacy Drift: Evolving Privacy Concerns in Incremental Learning	Sayyed Farid Ahamed et.al.	2412.05183	null
2024-12-06	DNF: Unconditional 4D Generation with Dictionary-based Neural Fields	Xinyi Zhang et.al.	2412.05161	null
2024-12-06	A text-to-tabular approach to generate synthetic patient data using LLMs	Margaux Tornqvist et.al.	2412.05153	link
2024-12-06	LoRA.rar: Learning to Merge LoRAs via Hypernetworks for Subject-Style Conditioned Image Generation	Donald Shenaj et.al.	2412.05148	null
2024-12-06	How to Squeeze An Explanation Out of Your Model	Tiago Roxo et.al.	2412.05134	null
2024-12-06	Probabilistic Galaxy Field Generation with Diffusion Models	Tanner Sether et.al.	2412.05131	null
2024-12-06	The Silent Prompt: Initial Noise as Implicit Guidance for Goal-Driven Image Generation	Ruoyu Wang et.al.	2412.05101	null
2024-12-06	Reconstructing Quantitative Cerebral Perfusion Images Directly From Measured Sinogram Data Acquired Using C-arm Cone-Beam CT	Haotian Zhao et.al.	2412.05084	null
2024-12-06	ReF-LDM: A Latent Diffusion Model for Reference-based Face Image Restoration	Chi-Wei Hsiao et.al.	2412.05043	null
2024-12-06	Get It Right: Improving Comprehensibility with Adaptable Speech Expression of a Humanoid Service Robot	Thomas Sievers et.al.	2412.05022	null
2024-12-05	PaintScene4D: Consistent 4D Scene Generation from Text Prompts	Vinayak Gupta et.al.	2412.04471	null
2024-12-05	LayerFusion: Harmonized Multi-Layer Text-to-Image Generation with Generative Priors	Yusuf Dalva et.al.	2412.04460	null
2024-12-05	Four-Plane Factorized Video Autoencoders	Mohammed Suhail et.al.	2412.04452	null
2024-12-05	MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation	Longtao Zheng et.al.	2412.04448	null
2024-12-05	DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models	Yizhuo Li et.al.	2412.04446	null
2024-12-05	Learning Artistic Signatures: Symmetry Discovery and Style Transfer	Emma Finn et.al.	2412.04441	null
2024-12-05	GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration	Kaiyi Huang et.al.	2412.04440	null
2024-12-05	Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation	Yuying Ge et.al.	2412.04432	link
2024-12-05	Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis	Jian Han et.al.	2412.04431	link
2024-12-05	Reversible molecular simulation for training classical and machine learning force fields	Joe G Greener et.al.	2412.04374	link
2024-12-05	Machine Theory of Mind for Autonomous Cyber-Defence	Luke Swaby et.al.	2412.04367	null
2024-12-05	ActFusion: a Unified Diffusion Model for Action Segmentation and Anticipation	Dayoung Gong et.al.	2412.04353	null
2024-12-05	RMD: A Simple Baseline for More General Human Motion Generation via Training-free Retrieval-Augmented Motion Diffuse	Zhouyingcheng Liao et.al.	2412.04343	null
2024-12-05	Likelihood-Scheduled Score-Based Generative Modeling for Fully 3D PET Image Reconstruction	George Webber et.al.	2412.04339	null
2024-12-05	Multi-Subject Image Synthesis as a Generative Prior for Single-Subject PET Image Reconstruction	George Webber et.al.	2412.04324	null
2024-12-04	Navigation World Models	Amir Bar et.al.	2412.03572	null
2024-12-04	MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation	Zehuan Huang et.al.	2412.03558	null
2024-12-04	NODE-AdvGAN: Improving the transferability and perceptual similarity of adversarial examples by dynamic-system-driven adversarial generative model	Xinheng Xie et.al.	2412.03539	null
2024-12-04	NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images	Lingen Li et.al.	2412.03517	null
2024-12-04	Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion	Shengyuan Zhang et.al.	2412.03515	link
2024-12-04	Data Fusion of Semantic and Depth Information in the Context of Object Detection	Md Abu Yusuf et.al.	2412.03490	null
2024-12-04	Flow Matching with General Discrete Paths: A Kinetic-Optimal Perspective	Neta Shaul et.al.	2412.03487	null
2024-12-04	Pre-trained Multiple Latent Variable Generative Models are good defenders against Adversarial Attacks	Dario Serez et.al.	2412.03453	link
2024-12-04	CleanDIFT: Diffusion Features without Noise	Nick Stracke et.al.	2412.03439	link
2024-12-04	SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model	Yan Li et.al.	2412.03430	null
2024-12-04	Skel3D: Skeleton Guided Novel View Synthesis	Aron Fóthi et.al.	2412.03407	null
2024-12-04	Identifiability implies consistency of MLE in partially observed diffusions on a torus	Ibrahim Ekren et.al.	2412.03380	null
2024-12-04	TASR: Timestep-Aware Diffusion Model for Image Super-Resolution	Qinwei Lin et.al.	2412.03355	link
2024-12-04	DIVE: Taming DINO for Subject-Driven Video Editing	Yi Huang et.al.	2412.03347	null
2024-12-04	Geometry-guided Cross-view Diffusion for One-to-many Cross-view Image Synthesis	Tao Jun Lin et.al.	2412.03315	null
2024-12-03	Motion Prompting: Controlling Video Generation with Motion Trajectories	Daniel Geng et.al.	2412.02700	null
2024-12-03	Diffusion-based Visual Anagram as Multi-task Learning	Zhiyuan Xu et.al.	2412.02693	link
2024-12-03	FoundHand: Large-Scale Domain-Specific Learning for Controllable Hand Image Generation	Kefan Chen et.al.	2412.02690	null
2024-12-04	SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance	Viet Nguyen et.al.	2412.02687	null
2024-12-03	AniGS: Animatable Gaussian Avatar from a Single Image with Inconsistent Gaussian Reconstruction	Lingteng Qiu et.al.	2412.02684	null
2024-12-03	Sharp-It: A Multi-view to Multi-view Diffusion Model for 3D Synthesis and Manipulation	Yiftach Edelstein et.al.	2412.02631	null
2024-12-03	The effect of priors on Learning with Restricted Boltzmann Machines	Gianluca Manzan et.al.	2412.02623	null
2024-12-03	ComPair-2: A Next Generation Medium Energy Gamma-ray Telescope Prototype	Regina Caputo et.al.	2412.02562	null
2024-12-03	The Two-Center Problem of Uncertain Points on Cactus Graphs	Haitao Xu et.al.	2412.02559	null
2024-12-03	ShadowHack: Hacking Shadows via Luminance-Color Divide and Conquer	Jin Hu et.al.	2412.02545	link
2024-12-03	Unveiling Concept Attribution in Diffusion Models	Quang H. Nguyen et.al.	2412.02542	null
2024-12-03	LLMForecaster: Improving Seasonal Event Forecasts with Unstructured Textual Data	Hanyu Zhang et.al.	2412.02525	null
2024-12-03	GerPS-Compare: Comparing NER methods for legal norm analysis	Sarah T. Bachinger et.al.	2412.02427	null
2024-12-03	It Takes Two: Real-time Co-Speech Two-person's Interaction Generation via Reactive Auto-regressive Diffusion Model	Mingyi Shi et.al.	2412.02419	null
2024-12-03	A Multi-Agent Framework for Extensible Structured Text Generation in PLCs	Donghao Yang et.al.	2412.02410	null
2024-11-29	Nanostructured micrometric-pore membranes for nanofiltration: Micrometric geometry may optimize performance, energy efficiency and operational lifetime	J. C. Verde et.al.	2411.19900	null
2024-11-29	Input-Output Optics as a Causal Time Series Mapping: A Generative Machine Learning Solution	Abhijit Sen et.al.	2411.19897	null
2024-11-29	MoTe: Learning Motion-Text Diffusion Model for Multiple Generation Tasks	Yiming Wu et.al.	2411.19786	null
2024-11-29	Riemannian Denoising Score Matching for Molecular Structure Optimization with Accurate Energy	Jeheon Woo et.al.	2411.19769	null
2024-11-29	JetFormer: An Autoregressive Generative Model of Raw Images and Text	Michael Tschannen et.al.	2411.19722	null
2024-11-29	Inverse Design of Mechanical Metamaterials Using a Point-Cloud-Based Deep Generative Model	Seungwook Hong et.al.	2411.19681	null
2024-11-29	TexGaussian: Generating High-quality PBR Material via Octree-based 3D Gaussian Splatting	Bojun Xiong et.al.	2411.19654	null
2024-11-29	Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing	Wenyi Mo et.al.	2411.19652	link
2024-11-29	Enhancing Security in Third-Party Library Reuse -- Comprehensive Detection of 1-day Vulnerability through Code Patch Analysis	Shangzhi Xu et.al.	2411.19648	null
2024-11-29	Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings	Qiong Wu et.al.	2411.19628	link
2024-11-29	Unimib Assistant: designing a student-friendly RAG-based chatbot for all their needs	Chiara Antico et.al.	2411.19554	null
2024-11-29	Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook	Florinel-Alin Croitoru et.al.	2411.19537	link
2024-11-29	Quantized Delta Weight Is Safety Keeper	Yule Liu et.al.	2411.19530	null
2024-12-02	DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding	Jungbin Cho et.al.	2411.19527	null
2024-11-29	Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis	Tianqi Li et.al.	2411.19509	null
2024-11-27	Textured Gaussians for Enhanced 3D Scene Appearance Modeling	Brian Chao et.al.	2411.18625	null
2024-11-27	GeneMAN: Generalizable Single-Image 3D Human Reconstruction from Multi-Source Human Data	Wentao Wang et.al.	2411.18624	null
2024-11-27	Diffusion Self-Distillation for Zero-Shot Customized Image Generation	Shengqu Cai et.al.	2411.18616	null
2024-11-27	CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models	Rundi Wu et.al.	2411.18613	null
2024-11-27	Evaluating and Improving the Effectiveness of Synthetic Chest X-Rays for Medical Image Analysis	Eva Prakash et.al.	2411.18602	null
2024-11-27	Bit symmetry entails the symmetry of the quantum transition probability	Gerd Niestegge et.al.	2411.18589	null
2024-11-27	Building Confidence in Deep Generative Protein Design	Tianyuan Zheng et.al.	2411.18568	link
2024-11-27	High-throughput antibody screening with high-quality factor nanophotonics and bioprinting	Sajjad Abdollahramezani et.al.	2411.18557	null
2024-11-27	FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion	Haosen Yang et.al.	2411.18552	null
2024-11-28	Enhancing weed detection performance by means of GenAI-based image augmentation	Sourav Modak et.al.	2411.18513	null
2024-11-27	GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation	Pengfei Zhou et.al.	2411.18499	null
2024-11-27	Synthetic ECG Generation for Data Augmentation and Transfer Learning in Arrhythmia Classification	José Fernando Núñez et.al.	2411.18456	null
2024-11-27	Is my Meeting Summary Good? Estimating Quality with a Multi-LLM Evaluator	Frederic Kirstein et.al.	2411.18444	null
2024-11-27	Learning the Evolution of Physical Structure of Galaxies via Diffusion Models	Andrew Lizarraga et.al.	2411.18440	link
2024-11-27	Search for heavy scalar or pseudoscalar states in $\mathrm{t \bar{t}}$ events at CMS	Laurids Jeppe et.al.	2411.18414	null
2024-11-27	StableAnimator: High-Quality Identity-Preserving Human Image Animation	Shuyuan Tu et.al.	2411.17697	link
2024-11-26	ScribbleLight: Single Image Indoor Relighting with Scribbles	Jun Myeong Choi et.al.	2411.17696	null
2024-11-26	Visatronic: A Multimodal Decoder-Only Model for Speech Synthesis	Akshita Gupta et.al.	2411.17690	null
2024-11-26	GenDeg: Diffusion-Based Degradation Synthesis for Generalizable All-in-One Image Restoration	Sudarshan Rajagopalan et.al.	2411.17687	null
2024-11-26	Semi-analytical model for the calculation of solar radiation pressure and its effects on a LEO satellite with predicting the change in position vectors using machine learning techniques	Pranava Seth et.al.	2411.17626	null
2024-11-26	Accelerating Vision Diffusion Transformers with Skip Branches	Guanjie Chen et.al.	2411.17616	link
2024-11-26	Mixed-State Quantum Denoising Diffusion Probabilistic Model	Gino Kwun et.al.	2411.17608	null
2024-11-26	Making History Readable	Bipasha Banerjee et.al.	2411.17600	null
2024-11-26	VideoDirector: Precise Video Editing via Text-to-Video Models	Yukun Wang et.al.	2411.17592	null
2024-11-26	Rapid Deployment of Domain-specific Hyperspectral Image Processors with Application to Autonomous Driving	Jon Gutiérrez-Zaballa et.al.	2411.17543	null
2024-11-26	Metaverse Innovation Canvas: A Tool for Extended Reality Product/Service Development	Amir Reza Asadi et.al.	2411.17541	null
2024-11-26	IMPROVE: Improving Medical Plausibility without Reliance on Human Validation -- An Enhanced Prototype-Guided Diffusion Framework	Anurag Shandilya et.al.	2411.17535	null
2024-11-26	FTMoMamba: Motion Generation with Frequency and Text State Space Models	Chengjian Li et.al.	2411.17532	null
2024-11-26	Exact and Heuristic Approaches for the Covering Tour Location Routing Problem	Andreas Hagn et.al.	2411.17510	link
2024-11-26	WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model	Zongjian Li et.al.	2411.17459	link
2024-11-25	Generative Omnimatte: Learning to Decompose Video into Layers	Yao-Chih Lee et.al.	2411.16683	null
2024-11-25	Diffusion Features for Zero-Shot 6DoF Object Pose Estimation	Bernd Von Gimborn et.al.	2411.16668	null
2024-11-25	DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation	Zun Wang et.al.	2411.16657	null
2024-11-25	Exploring Discrete Flow Matching for 3D De Novo Molecule Generation	Ian Dunn et.al.	2411.16644	link
2024-11-25	LegoPET: Hierarchical Feature Guided Conditional Diffusion for PET Image Reconstruction	Yiran Sun et.al.	2411.16629	null
2024-11-25	Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models	Ronghuan Wu et.al.	2411.16602	null
2024-11-25	Unlocking The Potential of Adaptive Attacks on Diffusion-Based Purification	Andre Kassis et.al.	2411.16598	link
2024-11-25	Rethinking Diffusion for Text-Driven Human Motion Generation	Zichong Meng et.al.	2411.16575	null
2024-11-25	Representation Collapsing Problems in Vector Quantization	Wenhao Zhao et.al.	2411.16550	null
2024-11-25	ADOBI: Adaptive Diffusion Bridge For Blind Inverse Problems with Application to MRI Reconstruction	Yuyang Hu et.al.	2411.16535	null
2024-11-25	PriorPath: Coarse-To-Fine Approach for Controlled De-Novo Pathology Semantic Masks Generation	Nati Daniel et.al.	2411.16515	null
2024-11-25	Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis	Boming Miao et.al.	2411.16503	null
2024-11-25	Multi-Resolution Generative Modeling of Human Motion from Limited Data	David Eduardo Moreno-Villamarín et.al.	2411.16498	null
2024-11-25	Learning by Analogy: Enhancing Few-Shot Prompting for Math Word Problem Solving with Computational Graph-Based Retrieval	Xiaocong Yang et.al.	2411.16454	null
2024-11-25	Model-based reinforcement corrosion prediction: Continuous calibration with Bayesian optimization and corrosion wire sensor data	A. Potnis et.al.	2411.16447	null
2024-11-22	DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving	Bencheng Liao et.al.	2411.15139	link
2024-11-22	Material Anything: Generating Materials for Any 3D Object via Diffusion	Xin Huang et.al.	2411.15138	null
2024-11-22	VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement	Daeun Lee et.al.	2411.15115	null
2024-11-22	RE-Bench: Evaluating frontier AI R&D capabilities of language model agents against human experts	Hjalmar Wijk et.al.	2411.15114	link
2024-11-22	Efficient Pruning of Text-to-Image Models: Insights from Pruning Stable Diffusion	Samarth N Ramesh et.al.	2411.15113	null
2024-11-22	Leapfrog Latent Consistency Model (LLCM) for Medical Images Generation	Lakshmikar R. Polamreddy et.al.	2411.15084	link
2024-11-22	Towards Speaker Identification with Minimal Dataset and Constrained Resources using 1D-Convolution Neural Network	Irfan Nafiz Shahan et.al.	2411.15082	link
2024-11-22	Empowering Clients: Transformation of Design Processes Due to Generative AI	Johannes Schneider et.al.	2411.15061	null
2024-11-22	The 1D nonlocal Fisher-KPP equation with a top hat kernel. Part 3. The effect of perturbations in the kernel	David John Needham et.al.	2411.15054	null
2024-11-22	FloAt: Flow Warping of Self-Attention for Clothing Animation Generation	Swasti Shreya Mishra et.al.	2411.15028	null
2024-11-22	Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation	Huy Le et.al.	2411.14913	null
2024-11-22	Dynamically Encircled Higher-order Exceptional Points in an Optical Fiber	Arpan Roy et.al.	2411.14874	null
2024-11-22	Prioritize Denoising Steps on Diffusion Model Preference Alignment via Explicit Denoised Distribution Estimation	Dingyuan Shi et.al.	2411.14871	null
2024-11-22	Latent Schrodinger Bridge: Prompting Latent Diffusion for Fast Unpaired Image-to-Image Translation	Jeongsol Kim et.al.	2411.14863	null
2024-11-22	Style-Friendly SNR Sampler for Style-Driven Generation	Jooyoung Choi et.al.	2411.14793	null
2024-11-21	Stable Flow: Vital Layers for Training-Free Image Editing	Omri Avrahami et.al.	2411.14430	null
2024-11-21	Transformer-based Heuristic for Advanced Air Mobility Planning	Jun Xiang et.al.	2411.14427	null
2024-11-21	A Python-Based Approach to Sputter Deposition Simulations in Combinatorial Materials Science	Felix Thelen et.al.	2411.14413	null
2024-11-21	Multi-Agent Environments for Vehicle Routing Problems	Ricardo Gama et.al.	2411.14411	link
2024-11-21	Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation	Yuanhao Cai et.al.	2411.14384	null
2024-11-21	CoNFiLD-inlet: Synthetic Turbulence Inflow Using Generative Latent Diffusion Models with Neural Fields	Xin-Yang Liu et.al.	2411.14378	null
2024-11-21	Enhancing Medical Image Segmentation with Deep Learning and Diffusion Models	Houze Liu et.al.	2411.14353	null
2024-11-21	DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding	Tianhe Ren et.al.	2411.14347	link
2024-11-21	Lower Dimensional Spherical Representation of Medium Voltage Load Profiles for Visualization, Outlier Detection, and Generative Modelling	Edgar Mauricio Salazar Duque et.al.	2411.14346	null
2024-11-21	StereoCrafter-Zero: Zero-Shot Stereo Video Generation with Noisy Restart	Jian Shi et.al.	2411.14295	null
2024-11-21	Efficient Aspect-Based Summarization of Climate Change Reports with Small Language Models	Iacopo Ghinassi et.al.	2411.14272	link
2024-11-21	Guided MRI Reconstruction via Schrödinger Bridge	Yue Wang et.al.	2411.14269	null
2024-11-21	Regional Attention for Shadow Removal	Hengxing Liu et.al.	2411.14201	link
2024-11-21	TaQ-DiT: Time-aware Quantization for Diffusion Transformers	Xinyan Liu et.al.	2411.14172	null
2024-11-21	Creating a Formally Verified Neural Network for Autonomous Navigation: An Experience Report	Syed Ali Asadullah Bukhari et.al.	2411.14163	link
2024-11-20	REDUCIO! Generating 1024 $\times$ 1024 Video within 16 Seconds using Extremely Compressed Motion Latents	Rui Tian et.al.	2411.13552	link
2024-11-20	Identity Preserving 3D Head Stylization with Multiview Score Distillation	Bahri Batuhan Bilecen et.al.	2411.13536	null
2024-11-20	VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models	Ziqi Huang et.al.	2411.13503	link
2024-11-20	LIMBA: An Open-Source Framework for the Preservation and Valorization of Low-Resource Languages using Generative Models	Salvatore Mario Carta et.al.	2411.13453	null
2024-11-20	Heuristically Adaptive Diffusion-Model Evolutionary Strategy	Benedikt Hartl et.al.	2411.13420	null
2024-11-20	Energy-based generative models for monoclonal antibodies	Paul Pereira et.al.	2411.13390	link
2024-11-20	Small and Close-In Planets are Uncommon around A-type Stars	Steven Giacalone et.al.	2411.13363	null
2024-11-20	Vertical Validation: Evaluating Implicit Generative Models for Graphs on Thin Support Regions	Mai Elkady et.al.	2411.13358	null
2024-11-20	A CSI Feedback Framework based on Transmitting the Important Values and Generating the Others	Zhilin Du et.al.	2411.13298	null
2024-11-21	Structure-Based Molecule Optimization via Gradient-Guided Bayesian Update	Keyue Qiu et.al.	2411.13280	null
2024-11-20	XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation	Ziyi Wang et.al.	2411.13243	link
2024-11-20	BIPro: Zero-shot Chinese Poem Generation via Block Inverse Prompting Constrained Generation Framework	Xu Zou et.al.	2411.13237	null
2024-11-20	Building music with Lego bricks and Raspberry Pi	Ana M. Barbancho et.al.	2411.13224	null
2024-11-20	A computational framework for integrating Predictive processes with evidence Accumulation Models (PAM)	Antonino Visalli et.al.	2411.13203	link
2024-11-20	OpenMS WebApps: Building User-Friendly Solutions for MS Analysis	Tom David Müller et.al.	2411.13189	null
2024-11-19	Enhancing Multi-Class Disease Classification: Neoplasms, Cardiovascular, Nervous System, and Digestive Disorders Using Advanced LLMs	Ahmed Akib Jawad Karim et.al.	2411.12712	null
2024-11-19	OrigamiPlot: An R Package and Shiny Web App Enhanced Visualizations for Multivariate Data	Yiwen Lu et.al.	2411.12674	null
2024-11-19	Auto-Evaluation with Few Labels through Post-hoc Regression	Benjamin Eyre et.al.	2411.12665	null
2024-11-19	PoM: Efficient Image and Video Generation with the Polynomial Mixer	David Picard et.al.	2411.12663	link
2024-11-19	Optimizing Airline Reservation Systems with Edge-Enabled Microservices: A Framework for Real-Time Data Processing and Enhanced User Responsiveness	Biman Barua et.al.	2411.12650	null
2024-11-19	DLBacktrace: A Model Agnostic Explainability for any Deep Learning Models	Vinay Kumar Sankarapu et.al.	2411.12643	link
2024-11-19	Improving Controllability and Editability for Pretrained Text-to-Music Generation Models	Yixiao Zhang et.al.	2411.12641	null
2024-11-19	Universal programmable waveguide arrays	Akram Youssry et.al.	2411.12610	null
2024-11-19	Whisper Finetuning on Nepali Language	Sanjay Rijal et.al.	2411.12587	null
2024-11-19	Predicting Customer Satisfaction by Replicating the Survey Response Distribution	Etienne Manderscheid et.al.	2411.12539	null
2024-11-19	Data Pruning in Generative Diffusion Models	Rania Briq et.al.	2411.12523	null
2024-11-19	Probe-Me-Not: Protecting Pre-trained Encoders from Malicious Probing	Ruyi Ding et.al.	2411.12508	null
2024-11-19	Empirical Privacy Evaluations of Generative and Predictive Machine Learning Models -- A review and challenges for practice	Flavio Hafner et.al.	2411.12451	null
2024-11-19	Frequency-Aware Guidance for Blind Image Restoration via Diffusion Models	Jun Xiao et.al.	2411.12450	null
2024-11-19	A general modeling and simulation framework for dynamic vehicle routing	Markó Horváth et.al.	2411.12406	link
2024-11-18	QARM: Quantitative Alignment Multi-Modal Recommendation at Kuaishou	Xinchen Luo et.al.	2411.11739	null
2024-11-18	Aligning Few-Step Diffusion Models with Dense Reward Difference Learning	Ziyi Zhang et.al.	2411.11727	link
2024-11-18	Multiscale nonlinear integration drives accurate encoding of input information	Giorgio Nicoletti et.al.	2411.11710	null
2024-11-18	Robust Reinforcement Learning under Diffusion Models for Data with Jumps	Chenyang Jiang et.al.	2411.11697	null
2024-11-18	Active droplets controlled by enzymatic reactions	Jacques Fries et.al.	2411.11696	null
2024-11-18	Do Captioning Metrics Reflect Music Semantic Alignment?	Jinwoo Lee et.al.	2411.11692	null
2024-11-18	Conceptwm: A Diffusion Model Watermark for Concept Protection	Liangqi Lei et.al.	2411.11688	null
2024-11-19	GNN-Based Code Annotation Logic for Establishing Security Boundaries in C Code	Varun Gadey et.al.	2411.11567	null
2024-11-19	Cascaded Diffusion Models for 2D and 3D Microscopy Image Synthesis to Enhance Cell Segmentation	Rüveyda Yilmaz et.al.	2411.11515	null
2024-11-18	Collaborative Contrastive Network for Click-Through Rate Prediction	Chen Gao et.al.	2411.11508	null
2024-11-18	LaVin-DiT: Large Vision Diffusion Transformer	Zhaoqing Wang et.al.	2411.11505	null
2024-11-18	Alien Recombination: Exploring Concept Blends Beyond Human Cognitive Availability in Visual Art	Alejandro Hernandez et.al.	2411.11494	null
2024-11-18	MVLight: Relightable Text-to-3D Generation via Light-conditioned Multi-View Diffusion	Dongseok Shim et.al.	2411.11475	null
2024-11-18	GLDesigner: Leveraging Multi-Modal LLMs as Designer for Enhanced Aesthetic Text Glyph Layouts	Junwen He et.al.	2411.11435	null
2024-11-18	CLUE-MARK: Watermarking Diffusion Models using CLWE	Kareem Shehata et.al.	2411.11434	null
2024-11-15	M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation	Sucheng Ren et.al.	2411.10433	link
2024-11-15	Mitigating Parameter Degeneracy using Joint Conditional Diffusion Model for WECC Composite Load Model in Power Systems	Feiqin Zhu et.al.	2411.10431	null
2024-11-15	Multiscale Dubuc: A New Similarity Measure for Time Series	Mahsa Khazaei et.al.	2411.10418	link
2024-11-15	Experimental generation of extreme electron beams for advanced accelerator applications	Claudio Emma et.al.	2411.10413	null
2024-11-15	How to Build a Quantum Supercomputer: Scaling Challenges and Opportunities	Masoud Mohseni et.al.	2411.10406	null
2024-11-15	Nonlinearity-Driven Morphing and Control of Topological Modes in Non-Hermitian Systems	Zhao-Fan Cai et.al.	2411.10398	null
2024-11-15	Towards High-Fidelity 3D Portrait Generation with Rich Details by Cross-View Prior-Aware Diffusion	Haoran Wei et.al.	2411.10369	null
2024-11-15	Safe Text-to-Image Generation: Simply Sanitize the Prompt Embedding	Huming Qiu et.al.	2411.10329	null
2024-11-15	Probabilistic Prior Driven Attention Mechanism Based on Diffusion Model for Imaging Through Atmospheric Turbulence	Guodong Sun et.al.	2411.10321	null
2024-11-15	Assortment Optimization under the Multinomial Logit Model with Covering Constraints	Omar El Housni et.al.	2411.10310	null
2024-11-15	Modification Takes Courage: Seamless Image Stitching via Reference-Driven Inpainting	Ziqi Xie et.al.	2411.10309	link
2024-11-15	MDHP-Net: Detecting Injection Attacks on In-vehicle Network using Multi-Dimensional Hawkes Process and Temporal Model	Qi Liu et.al.	2411.10258	null
2024-11-15	The Unreasonable Effectiveness of Guidance for Diffusion Models	Tim Kaiser et.al.	2411.10257	null
2024-11-15	Smooth transport map via diffusion process	Arthur Stéphanovitch et.al.	2411.10235	null
2024-11-15	ColorEdit: Training-free Image-Guided Color editing with diffusion model	Xingxi Yin et.al.	2411.10232	null
2024-11-14	A Bayesian Optimization Approach to Machine Translation Reranking	Julius Cheng et.al.	2411.09694	null
2024-11-14	SimTube: Generating Simulated Video Comments through Multimodal AI and User Personas	Yu-Kai Hung et.al.	2411.09577	null
2024-11-14	Golden Noise for Diffusion Models: A Learning Framework	Zikai Zhou et.al.	2411.09502	null
2024-11-14	Sparse Bayesian Generative Modeling for Compressive Sensing	Benedikt Böck et.al.	2411.09483	link
2024-11-14	DiffRoad: Realistic and Diverse Road Scenario Generation for Autonomous Vehicle Testing	Junjie Zhou et.al.	2411.09451	null
2024-11-14	Image Regeneration: Evaluating Text-to-Image Model via Generating Identical Image with Multimodal Large Language Models	Chutian Meng et.al.	2411.09449	null
2024-11-14	A survey of probabilistic generative frameworks for molecular simulations	Richard John et.al.	2411.09388	link
2024-11-14	Multi-scale Generative Modeling for Fast Sampling	Xiongye Xiao et.al.	2411.09356	null
2024-11-14	ParaLBench: A Large-Scale Benchmark for Computational Paralinguistics over Acoustic Foundation Models	Zixing Zhang et.al.	2411.09349	null
2024-11-15	Approximate Probabilistic Inference for Time-Series Data: A Robust Latent Gaussian Model With Temporal Awareness	Anton Johansson et.al.	2411.09312	null
2024-11-14	EEG-Based Speech Decoding: A Novel Approach Using Multi-Kernel Ensemble Diffusion Models	Soowon Kim et.al.	2411.09302	null
2024-11-14	LES-Talker: Fine-Grained Emotion Editing for Talking Head Generation in Linear Emotion Space	Guanwen Feng et.al.	2411.09268	null
2024-11-14	Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey	Xuannan Liu et.al.	2411.09259	link
2024-11-14	RibCageImp: A Deep Learning Framework for 3D Ribcage Implant Generation	Gyanendra Chaubey et.al.	2411.09204	null
2024-11-14	Improvement and Implementation of a Speech Emotion Recognition Model Based on Dual-Layer LSTM	Xiaoran Yang et.al.	2411.09189	null
2024-11-13	4D Gaussian Splatting in the Wild with Uncertainty-Aware Regularization	Mijeong Kim et.al.	2411.08879	null
2024-11-13	A generalized software framework for consolidation of radiotherapy planning and delivery data from diverse data sources	Yasin Abdulkadir et.al.	2411.08876	null
2024-11-13	Offline Adaptation of Quadruped Locomotion using Diffusion Models	Reece O'Mahoney et.al.	2411.08832	null
2024-11-13	SANDWICH: Towards an Offline, Differentiable, Fully-Trainable Wireless Neural Ray-Tracing Surrogate	Yifei Jin et.al.	2411.08767	null
2024-11-13	Analyst Reports and Stock Performance: Evidence from the Chinese Market	Rui Liu et.al.	2411.08726	null
2024-11-14	Reducing ADC Front-end Costs During Training of On-sensor Printed Multilayer Perceptrons	Florentia Afentaki et.al.	2411.08674	null
2024-11-13	Joint Model Caching and Resource Allocation in Generative AI-Enabled Wireless Edge Networks	Zhang Liu et.al.	2411.08672	null
2024-11-13	Toward Human Understanding with Controllable Synthesis	Hanz Cuevas-Velasquez et.al.	2411.08663	null
2024-11-13	The Galactica database: an open, generic and versatile tool for the dissemination of simulation data in astrophysics	Damien Chapon et.al.	2411.08647	null
2024-11-13	Towards More Accurate Fake Detection on Images Generated from Advanced Generative and Neural Rendering Models	Chengdong Dong et.al.	2411.08642	null
2024-11-13	Deep Generative Demand Learning for Newsvendor and Pricing	Shijin Gong et.al.	2411.08631	null
2024-11-13	LG-Gaze: Learning Geometry-aware Continuous Prompts for Language-Guided Gaze Estimation	Pengwei Yin et.al.	2411.08606	null
2024-11-13	CorrSynth -- A Correlated Sampling Method for Diverse Dataset Generation from LLMs	Suhas S Kowshik et.al.	2411.08553	null
2024-11-13	Explainers' Mental Representations of Explainees' Needs in Everyday Explanations	Michael Erol Schaffer et.al.	2411.08514	null
2024-11-13	HyperFace: Generating Synthetic Face Recognition Datasets by Exploring Face Embedding Hypersphere	Hatef Otroshi Shahreza et.al.	2411.08470	null
2024-11-12	Scaling Properties of Diffusion Models for Perceptual Tasks	Rahul Ravishankar et.al.	2411.08034	null
2024-11-12	GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation	Yushi Lan et.al.	2411.08033	null
2024-11-12	Wavelet Latent Diffusion (Wala): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings	Aditya Sanghi et.al.	2411.08017	link
2024-11-12	JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation	Yiyang Ma et.al.	2411.07975	link
2024-11-12	Diverse capability and scaling of diffusion and auto-regressive models when learning abstract rules	Binxu Wang et.al.	2411.07873	null
2024-11-12	Trustful LLMs: Customizing and Grounding Text Generation with Knowledge Bases and Dual Decoders	Xiaofeng Zhu et.al.	2411.07870	null
2024-11-12	CDXFormer: Boosting Remote Sensing Change Detection with Extended Long Short-Term Memory	Zhenkai Wu et.al.	2411.07863	link
2024-11-12	Sparsity-Aware Optimization of In-Memory Bayesian Binary Neural Network Accelerators	Prabodh Katti et.al.	2411.07842	null
2024-11-12	Novel View Synthesis with Pixel-Space Diffusion Models	Noam Elata et.al.	2411.07765	null
2024-11-12	Nanosecond nanothermometry in an electron microscope	Florian Castioni et.al.	2411.07764	null
2024-11-12	LapGSR: Laplacian Reconstructive Network for Guided Thermal Super-Resolution	Aditya Kasliwal et.al.	2411.07750	null
2024-11-12	The relationship between general equilibrium models with infinite-lived agents and overlapping generations models, and some applications	Ngoc-Sang Pham et.al.	2411.07674	null
2024-11-12	Evaluating the Generation of Spatial Relations in Text and Image Generative Models	Shang Hong Sim et.al.	2411.07664	null
2024-11-12	Leveraging Previous Steps: A Training-free Fast Solver for Flow Diffusion	Kaiyu Song et.al.	2411.07627	null
2024-11-12	Unraveling the Connections between Flow Matching and Diffusion Probabilistic Models in Training-free Conditional Generation	Kaiyu Song et.al.	2411.07625	null
2024-11-11	Score-based generative diffusion with "active" correlated noise sources	Alexandra Lamtyugina et.al.	2411.07233	null
2024-11-12	Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models	Yoad Tewel et.al.	2411.07232	null
2024-11-11	Learning from Limited and Imperfect Data	Harsh Rangwani et.al.	2411.07229	null
2024-11-11	TempCharBERT: Keystroke Dynamics for Continuous Access Control Based on Pre-trained Language Models	Matheus Simão et.al.	2411.07224	null
2024-11-11	DLCR: A Generative Data Expansion Framework via Diffusion for Clothes-Changing Person Re-ID	Nyle Siddiqui et.al.	2411.07205	link
2024-11-11	Crossover from inhomogeneous to homogeneous response of a resonantly driven hBN quantum emitter	Domitille Gérard et.al.	2411.07202	null
2024-11-11	OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision	Cong Wei et.al.	2411.07199	null
2024-11-11	More Expressive Attention with Negative Weights	Ang Lv et.al.	2411.07176	link
2024-11-11	Edify 3D: Scalable High-Quality 3D Asset Generation	NVIDIA et.al.	2411.07135	null
2024-11-11	Benchmarking LLMs' Judgments with No Gold Standard	Shengwei Xu et.al.	2411.07127	link
2024-11-11	Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models	NVIDIA et.al.	2411.07126	null
2024-11-11	Decoding Visual Experience and Mapping Semantics through Whole-Brain Analysis Using fMRI Foundation Models	Yanchen Wang et.al.	2411.07121	link
2024-11-11	Scaling Mesh Generation via Compressive Tokenization	Haohan Weng et.al.	2411.07025	link
2024-11-11	An Electrocardiogram Monitoring Device Based on STM32	Wenqi Guan et.al.	2411.06962	null
2024-11-11	Generative Feature Training of Thin 2-Layer Networks	Johannes Hertrich et.al.	2411.06848	link
2024-11-08	StdGEN: Semantic-Decomposed 3D Character Generation from Single Images	Yuze He et.al.	2411.05738	null
2024-11-08	Image2Text2Image: A Novel Framework for Label-Free Evaluation of Image-to-Text Generation with Text-to-Image Diffusion Models	Jia-Hong Huang et.al.	2411.05706	null
2024-11-08	Improving Molecular Graph Generation with Flow Matching and Optimal Transport	Xiaoyang Hou et.al.	2411.05676	null
2024-11-08	Towards Lifelong Few-Shot Customization of Text-to-Image Diffusion	Nan Song et.al.	2411.05544	null
2024-11-08	Improving image synthesis with diffusion-negative sampling	Alakh Desai et.al.	2411.05473	null
2024-11-08	Bridging the Gap between Learning and Inference for Diffusion-Based Molecule Generation	Peidong Liu et.al.	2411.05472	link
2024-11-08	IntellBot: Retrieval Augmented LLM Chatbot for Cyber Threat Knowledge Delivery	Dincy R. Arikkat et.al.	2411.05442	null
2024-11-08	RED: Residual Estimation Diffusion for Low-Dose PET Sinogram Reconstruction	Xingyu Ai et.al.	2411.05354	null
2024-11-08	Electro-diffusive modeling and the role of spine geometry on action potential propagation in neurons	Rahul Gulati et.al.	2411.05329	null
2024-11-08	Social balance in directed networks	Bingjie Hao et.al.	2411.05327	null
2024-11-08	SeqRFM: Fast RFM Analysis in Sequence Data	Yanxin Zheng et.al.	2411.05317	link
2024-11-08	Differentiable Calibration of Inexact Stochastic Simulation Models via Kernel Score Minimization	Ziwei Su et.al.	2411.05315	null
2024-11-08	A Real-time Face Mask Detection and Social Distancing System for COVID-19 using Attention-InceptionV3 Model	Abdullah Al Asif et.al.	2411.05312	null
2024-11-08	Adaptive Whole-Body PET Image Denoising Using 3D Diffusion Models with ControlNet	Boxiao Yu et.al.	2411.05302	null
2024-11-08	GPT Semantic Cache: Reducing LLM Costs and Latency via Semantic Embedding Caching	Sajal Regmi et.al.	2411.05276	null
2024-11-07	SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models	Muyang Li et.al.	2411.05007	link
2024-11-07	ProEdit: Simple Progression is All You Need for High-Quality 3D Scene Editing	Jun-Kun Chen et.al.	2411.05006	null
2024-11-07	Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models	Shuhong Zheng et.al.	2411.05005	null
2024-11-07	ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning	David Junhao Zhang et.al.	2411.05003	null
2024-11-07	SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation	Koichi Namekata et.al.	2411.04989	null
2024-11-07	Few-Shot Task Learning through Inverse Generative Modeling	Aviv Netanyahu et.al.	2411.04987	null
2024-11-07	How fast does the WallGo? A package for computing wall velocities in first-order phase transitions	Andreas Ekstedt et.al.	2411.04970	link
2024-11-07	VAIR: Visuo-Acoustic Implicit Representations for Low-Cost, Multi-Modal Transparent Surface Reconstruction in Indoor Scenes	Advaith V. Sethuraman et.al.	2411.04963	null
2024-11-07	Uncovering Hidden Subspaces in Video Diffusion Models Using Re-Identification	Mischa Dombrowski et.al.	2411.04956	null
2024-11-07	Fed-LDR: Federated Local Data-infused Graph Creation with Node-centric Model Refinement	Jiechao Gao et.al.	2411.04936	null
2024-11-07	DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion	Wenqiang Sun et.al.	2411.04928	null
2024-11-07	StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration	Panwen Hu et.al.	2411.04925	null
2024-11-07	Stem-OB: Generalizable Visual Imitation Learning with Stem-Like Convergent Observation through Diffusion Inversion	Kaizhe Hu et.al.	2411.04919	link
2024-11-07	GASE: Generatively Augmented Sentence Encoding	Manuel Frank et.al.	2411.04914	null
2024-11-07	Controlling Human Shape and Pose in Text-to-Image Diffusion Models via Domain Adaptation	Benito Buchheim et.al.	2411.04724	null
2024-11-06	Community Forensics: Using Thousands of Generators to Train Fake Image Detectors	Jeongsoo Park et.al.	2411.04125	null
2024-11-06	Stepping Forward on the Last Mile	Chen Feng et.al.	2411.04036	null
2024-11-06	Prototyping O-RAN Enabled UAV Experimentation for the AERPAW Testbed	Joshua Moore et.al.	2411.04027	null
2024-11-06	Object-Centric Dexterous Manipulation from Human Motion Data	Yuanpei Chen et.al.	2411.04005	null
2024-11-06	Synomaly Noise and Multi-Stage Diffusion: A Novel Approach for Unsupervised Anomaly Detection in Ultrasound Imaging	Yuan Bi et.al.	2411.04004	null
2024-11-06	ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy	Chenrui Tie et.al.	2411.03990	null
2024-11-06	ReEdit: Multimodal Exemplar-Based Image Editing with Diffusion Models	Ashutosh Srivastava et.al.	2411.03982	null
2024-11-06	Customized Multiple Clustering via Multi-Modal Subspace Proxy Learning	Jiawei Yao et.al.	2411.03978	link
2024-11-06	Bayesian algorithmic perfumery: A Hierarchical Relevance Vector Machine for the Estimation of Personalized Fragrance Preferences based on Three Sensory Layers and Jungian Personality Archetypes	Rolando Gonzales Martinez et.al.	2411.03965	null
2024-11-06	Long-Form Text-to-Music Generation with Adaptive Prompts: A Case of Study in Tabletop Role-Playing Games Soundtracks	Felipe Marra et.al.	2411.03948	link
2024-11-06	Can Custom Models Learn In-Context? An Exploration of Hybrid Architecture Performance on In-Context Learning Tasks	Ryan Campbell et.al.	2411.03945	link
2024-11-06	GUIDE-VAE: Advancing Data Generation with User Information and Pattern Dictionaries	Kutay Bölat et.al.	2411.03936	null
2024-11-06	Large Generative Model-assisted Talking-face Semantic Communication System	Feibo Jiang et.al.	2411.03876	null
2024-11-06	ROBIN: Robust and Invisible Watermarks for Diffusion Models with Adversarial Optimization	Huayang Huang et.al.	2411.03862	link
2024-11-06	Sub-DM: Subspace Diffusion Model with Orthogonal Decomposition for MRI Reconstruction	Yu Guan et.al.	2411.03758	null
2024-11-05	MME-Finance: A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning	Ziliang Gan et.al.	2411.03314	null
2024-11-05	LLMs for Domain Generation Algorithm Detection	Reynier Leyva La O et.al.	2411.03307	null
2024-11-05	DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models	Ying Zhou et.al.	2411.03250	null
2024-11-05	On Improved Conditioning Mechanisms and Pre-training Strategies for Diffusion Models	Tariq Berrada Ifriqi et.al.	2411.03177	null
2024-11-05	Unleashing the power of novel conditional generative approaches for new materials discovery	Lev Novitskiy et.al.	2411.03156	link
2024-11-05	Local Lesion Generation is Effective for Capsule Endoscopy Image Data Augmentation in a Limited Data Setting	Adrian B. Chłopowiec et.al.	2411.03098	null
2024-11-05	Gradient-Guided Conditional Diffusion Models for Private Image Reconstruction: Analyzing Adversarial Impacts of Differential Privacy and Denoising	Tao Huang et.al.	2411.03053	null
2024-11-05	GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details	Zhongjin Luo et.al.	2411.03047	null
2024-11-05	Speaker Emotion Recognition: Leveraging Self-Supervised Models for Feature Extraction Using Wav2Vec2 and HuBERT	Pourya Jafarzadeh et.al.	2411.02964	null
2024-11-05	IMUDiffusion: A Diffusion Model for Multivariate Time Series Synthetisation for Inertial Motion Capturing Systems	Heiko Oppel et.al.	2411.02954	null
2024-11-05	LDPM: Towards undersampled MRI reconstruction with MR-VAE and Latent Diffusion Prior	Xingjian Tang et.al.	2411.02951	null
2024-11-05	A scalable generative model for dynamical system reconstruction from neuroimaging data	Eric Volkmann et.al.	2411.02949	link
2024-11-05	Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey	Ao Fu et.al.	2411.02914	null
2024-11-05	The Unreasonable Effectiveness of LLMs for Query Optimization	Peter Akioyamen et.al.	2411.02862	link
2024-11-05	ADOPT: Modified Adam Can Converge with Any $\beta_2$ with the Optimal Rate	Shohei Taniguchi et.al.	2411.02853	link
2024-11-04	Training-free Regional Prompting for Diffusion Transformers	Anthony Chen et.al.	2411.02395	link
2024-11-04	How Far is Video Generation from World Model: A Physical Law Perspective	Bingyi Kang et.al.	2411.02385	null
2024-11-04	Virgo Filaments IV: Using WISE to Measure the Modification of Star-Forming Disks in the Extended Regions Around the Virgo Cluster	Kim Conger et.al.	2411.02352	null
2024-11-04	Diffusion-based Generative Multicasting with Intent-aware Semantic Decomposition	Xinkai Liu et.al.	2411.02334	null
2024-11-05	PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance	Ruyang Liu et.al.	2411.02327	link
2024-11-04	LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation	Mufei Li et.al.	2411.02322	link
2024-11-04	CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments	Kung-Hsiang Huang et.al.	2411.02305	link
2024-11-04	Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation	Xianghui Yang et.al.	2411.02293	null
2024-11-04	Counterfactual Explanations via Riemannian Latent Space Traversal	Paraskevas Pegios et.al.	2411.02259	null
2024-11-04	FewViewGS: Gaussian Splatting with Few View Matching and Multi-stage Training	Ruihong Yin et.al.	2411.02229	null
2024-11-04	Recursive Learning of Asymptotic Variational Objectives	Alessandro Mastrototaro et.al.	2411.02217	null
2024-11-04	Digi2Real: Bridging the Realism Gap in Synthetic Data Face Recognition via Foundation Models	Anjith George et.al.	2411.02188	null
2024-11-04	Touch-to-Touch Translation -- Learning the Mapping Between Heterogeneous Tactile Sensing Technologies	Francesco Grella et.al.	2411.02187	null
2024-11-04	CleAR: Robust Context-Guided Generative Lighting Estimation for Mobile Augmented Reality	Yiqin Zhao et.al.	2411.02179	null
2024-11-04	CryptoEL: A Novel Experiential Learning Tool for Enhancing K-12 Cryptography Education	Pranathi Rayavaram et.al.	2411.02143	null
2024-10-31	Bridging Geometric States via Geometric Diffusion Bridge	Shengjie Luo et.al.	2410.24220	null
2024-10-31	Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning	Penghui Ruan et.al.	2410.24219	link
2024-10-31	DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion	Weicai Ye et.al.	2410.24203	link
2024-10-31	Multi-Attribute Linguistic Tuning for Controlled Paraphrase Generation	Mohamed Elgaar et.al.	2410.24199	null
2024-10-31	Generative modelling for mass-mapping with fast uncertainty quantification	Jessica J. Whitney et.al.	2410.24197	link
2024-10-31	AR-Pro: Counterfactual Explanations for Anomaly Repair with Formal Properties	Xiayan Ji et.al.	2410.24178	null
2024-10-31	Redefining in Dictionary: Towards an Enhanced Semantic Understanding of Creative Generation	Fu Feng et.al.	2410.24160	null
2024-10-31	Scaling Concept With Text-Guided Diffusion Models	Chao Huang et.al.	2410.24151	null
2024-10-31	Repository-Level Compositional Code Translation and Validation	Ali Reza Ibrahimzada et.al.	2410.24117	link
2024-10-31	Extended electrochemical monitoring of biomolecular binding using commercially available, reusable electrodes in microliter volumes	Jeremy Mendez et.al.	2410.24110	null
2024-10-31	Sparsh: Self-supervised touch representations for vision-based tactile sensing	Carolina Higuera et.al.	2410.24090	null
2024-10-31	Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure	Xiang Li et.al.	2410.24060	link
2024-10-31	TPC: Test-time Procrustes Calibration for Diffusion-based Human Image Animation	Sunjae Yoon et.al.	2410.24037	null
2024-10-31	Unveiling Synthetic Faces: How Synthetic Datasets Can Expose Real Identities	Hatef Otroshi Shahreza et.al.	2410.24015	null
2024-10-31	DiffPAD: Denoising Diffusion-based Adversarial Patch Decontamination	Jia Fu et.al.	2410.24006	link
2024-10-30	ReferEverything: Towards Segmenting Everything We Can Speak of in Videos	Anurag Bagchi et.al.	2410.23287	null
2024-10-30	Provable acceleration for diffusion models under minimal assumptions	Gen Li et.al.	2410.23285	null
2024-10-30	RelationBooth: Towards Relation-Aware Customized Object Generation	Qingyu Shi et.al.	2410.23280	null
2024-10-30	SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation	Yining Hong et.al.	2410.23277	null
2024-10-30	Multi-student Diffusion Distillation for Better One-step Generators	Yanke Song et.al.	2410.23274	null
2024-10-30	ReaWristic: Remote Touch Sensation to Fingers from a Wristband via Visually Augmented Electro-Tactile Feedback	Yudai Tanaka et.al.	2410.23193	null
2024-10-30	Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning	Keqin Bao et.al.	2410.23136	link
2024-10-30	Educating for Hardware Specialization in the Chiplet Era: A Path for the HPC Community	Kazutomo Yoshii et.al.	2410.23127	null
2024-10-30	CausalDiff: Causality-Inspired Disentanglement via Diffusion Model for Adversarial Defense	Mingkun Zhang et.al.	2410.23091	link
2024-10-30	General Bayesian quantile regression for counts via generative modeling	Yuta Yamauchi et.al.	2410.23081	null
2024-10-30	Controlling Language and Diffusion Models by Transporting Activations	Pau Rodriguez et.al.	2410.23054	link
2024-10-30	Dispersion kinks from electronic correlations in an unconventional iron-based superconductor	Ming-Hua Chang et.al.	2410.23044	null
2024-10-30	Improving Musical Accompaniment Co-creation via Diffusion Transformers	Javier Nistal et.al.	2410.23005	null
2024-10-30	DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes	Jialiang Zhang et.al.	2410.23004	null
2024-10-30	LumiSculpt: A Consistency Lighting Control Network for Video Generation	Yuxin Zhang et.al.	2410.22979	null
2024-10-29	CaStL: Constraints as Specifications through LLM Translation for Long-Horizon Task and Motion Planning	Weihang Guo et.al.	2410.22225	null
2024-10-29	A Gaussian Process Generative Model for QCD Equation of State	Jiaxuan Gong et.al.	2410.22160	null
2024-10-29	Capacity Control is an Effective Memorization Mitigation Mechanism in Text-Conditional Diffusion Models	Raman Dutt et.al.	2410.22149	link
2024-10-29	AmpleGCG-Plus: A Strong Generative Model of Adversarial Suffixes to Jailbreak LLMs with Higher Success Rates in Fewer Attempts	Vishal Kumar et.al.	2410.22143	null
2024-10-29	Infrared photometry with InGaAs detectors: First light with SPECULOOS	Peter P. Pedersen et.al.	2410.22140	link
2024-10-29	SimRec: Mitigating the Cold-Start Problem in Sequential Recommendation by Integrating Item Similarity	Shaked Brody et.al.	2410.22136	link
2024-10-29	Protecting Privacy in Multimodal Large Language Models with MLLMU-Bench	Zheyuan Liu et.al.	2410.22108	link
2024-10-29	Variational inference for pile-up removal at hadron colliders with diffusion models	Malte Algren et.al.	2410.22074	null
2024-10-29	PACA: Perspective-Aware Cross-Attention Representation for Zero-Shot Scene Rearrangement	Shutong Jin et.al.	2410.22059	null
2024-10-29	Dual Conditional Diffusion Models for Sequential Recommendation	Hongtao Huang et.al.	2410.21967	null
2024-10-29	PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference	Kendong Liu et.al.	2410.21966	null
2024-10-29	CT to PET Translation: A Large-scale Dataset and Domain-Knowledge-Guided Diffusion Approach	Dac Thai Nguyen et.al.	2410.21932	link
2024-10-29	Guided Diffusion-based Counterfactual Augmentation for Robust Session-based Recommendation	Muskan Gupta et.al.	2410.21892	null
2024-10-29	On the study of the limit cycles for a class of population models with time-varying factors	Renhao Tian et.al.	2410.21848	null
2024-10-29	Diffusion as Reasoning: Enhancing Object Goal Navigation with LLM-Biased Diffusion Model	Yiming Ji et.al.	2410.21842	null
2024-10-28	On Inductive Biases That Enable Generalization of Diffusion Transformers	Jie An et.al.	2410.21273	link
2024-10-28	EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation	Shih-Yang Liu et.al.	2410.21271	null
2024-10-28	LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior	Hanyu Wang et.al.	2410.21264	null
2024-10-28	One-Step Diffusion Policy: Fast Visuomotor Policies via Diffusion Distillation	Zhendong Wang et.al.	2410.21257	null
2024-10-28	On learning higher-order cumulants in diffusion models	Gert Aarts et.al.	2410.21212	null
2024-10-28	The VSPEC Collection: A suite of utilities to model spectroscopic phase curves of 3D exoplanet atmospheres in the presence of stellar variability	Ted M Johnson et.al.	2410.21190	null
2024-10-28	Trajectory Flow Matching with Applications to Clinical Time Series Modeling	Xi Zhang et.al.	2410.21154	link
2024-10-28	Synthetica: Large Scale Synthetic Data for Robot Perception	Ritvik Singh et.al.	2410.21153	null
2024-10-28	Extrapolating Prospective Glaucoma Fundus Images through Diffusion Model in Irregular Longitudinal Sequences	Zhihao Zhao et.al.	2410.21130	null
2024-10-28	Shallow Diffuse: Robust and Invisible Watermarking through Low-Dimensional Subspaces in Diffusion Models	Wenda Li et.al.	2410.21088	link
2024-10-28	Federated Time Series Generation on Feature and Temporally Misaligned Data	Chenrui Fan et.al.	2410.21072	null
2024-10-28	Kandinsky 3: Text-to-Image Synthesis for Multifunctional Generative Framework	Vladimir Arkhipkin et.al.	2410.21061	link
2024-10-28	Beyond Autoregression: Fast LLMs via Self-Distillation Through Time	Justin Deschenaux et.al.	2410.21035	link
2024-10-29	EEG-Driven 3D Object Reconstruction with Color Consistency and Diffusion Prior	Xin Xiang et.al.	2410.20981	null
2024-10-28	MovieCharacter: A Tuning-Free Framework for Controllable Character Video Synthesis	Di Qiu et.al.	2410.20974	null
2024-10-25	Model merging with SVD to tie the Knots	George Stoica et.al.	2410.19735	link
2024-10-25	Adversarial Environment Design via Regret-Guided Diffusion Models	Hojun Chung et.al.	2410.19715	null
2024-10-25	Perception, Control and Hardware for In-Hand Slip-Aware Object Manipulation with Parallel Grippers	Gabriel Arslan Waltersson et.al.	2410.19660	null
2024-10-25	DiffGS: Functional Gaussian Splatting Diffusion	Junsheng Zhou et.al.	2410.19657	null
2024-10-25	VARS: Vision-based Assessment of Risk in Security Systems	Pranav Gupta et.al.	2410.19642	null
2024-10-25	Diffusion models for lattice gauge field simulations	Qianteng Zhu et.al.	2410.19602	null
2024-10-25	Energy Efficient Dual Designs of FeFET-Based Analog In-Memory Computing with Inherent Shift-Add Capability	Zeyu Yang et.al.	2410.19593	null
2024-10-25	Hybrid Memetic Search for Electric Vehicle Routing with Time Windows, Simultaneous Pickup-Delivery, and Partial Recharges	Zubin Zheng et.al.	2410.19580	null
2024-10-25	Utilizing Image Transforms and Diffusion Models for Generative Modeling of Short and Long Time Series	Ilan Naiman et.al.	2410.19538	null
2024-10-25	Ensemble Data Assimilation for Particle-based Methods	Marius Duvillard et.al.	2410.19525	null
2024-10-25	Marked Temporal Bayesian Flow Point Processes	Hui Chen et.al.	2410.19512	null
2024-10-25	EDGE: Enhanced Grounded GUI Understanding with Enriched Multi-Granularity Synthetic Data	Xuetian Chen et.al.	2410.19461	null
2024-10-28	NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction	Zixuan Gong et.al.	2410.19452	link
2024-10-25	Learned Reference-based Diffusion Sampling for multi-modal distributions	Maxence Noble et.al.	2410.19449	null
2024-10-25	Generative Diffusion Models for Sequential Recommendations	Sharare Zolghadr et.al.	2410.19429	null
2024-10-24	Framer: Interactive Frame Interpolation	Wen Wang et.al.	2410.18978	null
2024-10-24	MotionCLR: Motion Generation and Training-free Editing via Understanding Attention Mechanisms	Ling-Hao Chen et.al.	2410.18977	null
2024-10-24	Unbounded: A Generative Infinite Game of Character Life Simulation	Jialu Li et.al.	2410.18975	null
2024-10-24	3D-Adapter: Geometry-Consistent Multi-View Diffusion for High-Quality 3D Generation	Hansheng Chen et.al.	2410.18974	link
2024-10-24	On the Crucial Role of Initialization for Matrix Factorization	Bingcong Li et.al.	2410.18965	null
2024-10-24	Stable Consistency Tuning: Understanding and Improving Consistency Models	Fu-Yun Wang et.al.	2410.18958	link
2024-10-24	Generation of synthetic financial time series by diffusion models	Tomonori Takahashi et.al.	2410.18897	null
2024-10-24	Diff-Instruct++: Training One-step Text-to-image Generator Model to Align with Human Preferences	Weijian Luo et.al.	2410.18881	null
2024-10-24	The Cat and Mouse Game: The Ongoing Arms Race Between Diffusion Models and Detection Methods	Linda Laurier et.al.	2410.18866	null
2024-10-24	From Efficiency to Equity: Measuring Fairness in Preference Learning	Shreeyash Gowaikar et.al.	2410.18841	null
2024-10-24	From English-Centric to Effective Bilingual: LLMs with Custom Tokenizers for Underrepresented Languages	Artur Kiulian et.al.	2410.18836	null
2024-10-24	Multi-Scale Diffusion: Enhancing Spatial Layout in High-Resolution Panoramic Image Generation	Xiaoyu Zhang et.al.	2410.18830	null
2024-10-24	Towards Visual Text Design Transfer Across Languages	Yejin Choi et.al.	2410.18823	null
2024-10-24	Fast constrained sampling in pre-trained diffusion models	Alexandros Graikos et.al.	2410.18804	null
2024-10-24	Large Generative AI Models meet Open Networks for 6G: Integration, Platform, and Monetization	Peizheng Li et.al.	2410.18790	null
2024-10-23	DynamicCity: Large-Scale LiDAR Generation from Dynamic Scenes	Hengwei Bian et.al.	2410.18084	null
2024-10-23	Prioritized Generative Replay	Renhao Wang et.al.	2410.18082	null
2024-10-23	WorldSimBench: Towards Video Generation Models as World Simulators	Yiran Qin et.al.	2410.18072	null
2024-10-23	TP-Eval: Tap Multimodal LLMs' Potential in Evaluation by Customizing Prompts	Yuxuan Xie et.al.	2410.18071	null
2024-10-23	Training Free Guided Flow Matching with Optimal Control	Luran Wang et.al.	2410.18070	null
2024-10-23	Spectrally shaped THz pulses from tapered dielectric waveguides	Karel Peetermans et.al.	2410.17975	null
2024-10-23	Optical Generative Models	Shiqi Chen et.al.	2410.17970	null
2024-10-23	A Wavelet Diffusion GAN for Image Super-Resolution	Lorenzo Aloisi et.al.	2410.17966	null
2024-10-23	Addressing Asynchronicity in Clinical Multimodal Fusion via Individualized Chest X-ray Generation	Wenfang Yao et.al.	2410.17918	link
2024-10-23	regAL: Python Package for Active Learning of Regression Problems	Elizaveta Surzhikova et.al.	2410.17917	null
2024-10-23	Scaling Diffusion Language Models via Adaptation from Autoregressive Models	Shansan Gong et.al.	2410.17891	link
2024-10-23	Non-intrusive Speech Quality Assessment with Diffusion Models Trained on Clean Speech	Danilo de Oliveira et.al.	2410.17834	null
2024-10-23	PGDiffSeg: Prior-Guided Denoising Diffusion Model with Parameter-Shared Attention for Breast Cancer Segmentation	Feiyan Feng et.al.	2410.17812	null
2024-10-23	GenUDC: High Quality 3D Mesh Generation with Unsigned Dual Contouring Representation	Ruowei Wang et.al.	2410.17802	link
2024-10-23	Regularized autoregressive modeling and its application to audio signal declipping	Ondřej Mokrý et.al.	2410.17790	link
2024-10-22	Large Language Models Empowered Personalized Web Agents	Hongru Cai et.al.	2410.17236	null
2024-10-22	Creativity in AI: Progresses and Challenges	Mete Ismayilzada et.al.	2410.17218	null
2024-10-22	Audio-to-Score Conversion Model Based on Whisper methodology	Hongyao Zhang et.al.	2410.17209	null
2024-10-22	Reinforcement learning on structure-conditioned categorical diffusion for protein inverse folding	Yasha Ektefaie et.al.	2410.17173	link
2024-10-22	Performance of the CMS high-level trigger during LHC Run 2	CMS Collaboration et.al.	2410.17038	null
2024-10-22	Hybrid Generative AI for De Novo Design of Co-Crystals with Enhanced Tabletability	Nina Gubina et.al.	2410.17005	link
2024-10-22	DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization	Haowei Zhu et.al.	2410.16942	null
2024-10-22	Hierarchical Clustering for Conditional Diffusion in Image Generation	Jorge da Silva Goncalves et.al.	2410.16910	link
2024-10-22	Bayes without Underfitting: Fully Correlated Deep Learning Posteriors via Alternating Projections	Marco Miani et.al.	2410.16901	null
2024-10-22	VistaDream: Sampling multiview consistent images for single-view scene reconstruction	Haiping Wang et.al.	2410.16892	null
2024-10-22	CK4Gen: A Knowledge Distillation Framework for Generating High-Utility Synthetic Survival Datasets in Healthcare	Nicholas I-Hsien Kuo et.al.	2410.16872	null
2024-10-22	MPDS: A Movie Posters Dataset for Image Generation with Diffusion Model	Meng Xu et.al.	2410.16840	null
2024-10-22	Bridging Search and Recommendation in Generative Retrieval: Does One Task Help the Other?	Gustavo Penha et.al.	2410.16823	null
2024-10-22	Evaluating the Effectiveness of Attack-Agnostic Features for Morphing Attack Detection	Laurent Colbois et.al.	2410.16802	link
2024-10-22	One-Step Diffusion Distillation through Score Implicit Matching	Weijian Luo et.al.	2410.16794	link
2024-10-21	MvDrag3D: Drag-based Creative 3D Editing via Multi-view Generation-Reconstruction Priors	Honghua Chen et.al.	2410.16272	null
2024-10-21	Agent-to-Sim: Learning Interactive Behavior Models from Casual Longitudinal Videos	Gengshan Yang et.al.	2410.16259	null
2024-10-21	Distribution Learning with Valid Outputs Beyond the Worst-Case	Nick Rittler et.al.	2410.16253	null
2024-10-21	Building A Coding Assistant via the Retrieval-Augmented Language Model	Xinze Li et.al.	2410.16229	link
2024-10-21	CiteClick: A Browser Extension for Real-Time Scholar Citation Tracking	Nishat Raihan et.al.	2410.16211	null
2024-10-21	A Framework for Evaluating Predictive Models Using Synthetic Image Covariates and Longitudinal Data	Simon Deltadahl et.al.	2410.16177	null
2024-10-22	Warped Diffusion: Solving Video Inverse Problems with Image Diffusion Models	Giannis Daras et.al.	2410.16152	null
2024-10-21	Modelling Structured Data Learning with Restricted Boltzmann Machines in the Teacher-Student Setting	Robin Thériault et.al.	2410.16150	null
2024-10-21	SeaDAG: Semi-autoregressive Diffusion for Conditional Directed Acyclic Graph Generation	Xinyi Zhou et.al.	2410.16119	null
2024-10-21	Critical Example Mining for Vehicle Trajectory Prediction using Flow-based Generative Models	Zhezhang Ding et.al.	2410.16083	null
2024-10-21	Continuous Speech Synthesis using per-token Latent Diffusion	Arnon Turetzky et.al.	2410.16048	null
2024-10-21	Some generalizations of the convective model of jet generation	S. N. Artekha et.al.	2410.16035	null
2024-10-21	ComPO: Community Preferences for Language Model Personalization	Sachin Kumar et.al.	2410.16027	null
2024-10-21	Massimo: Public Queue Monitoring and Management using Mass-Spring Model	Abhijeet Kumar et.al.	2410.16012	null
2024-10-21	AI-Driven Innovations in Modern Cloud Computing	Animesh Kumar et.al.	2410.15960	null
2024-10-18	BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities	Shaozhe Hao et.al.	2410.14672	link
2024-10-18	How Does Data Diversity Shape the Weight Landscape of Neural Networks?	Yang Ba et.al.	2410.14602	null
2024-10-18	Bayesian Multi-wavelength Imaging of the LMC SN1987A with SRG/eROSITA	Vincent Eberle et.al.	2410.14599	null
2024-10-18	Neuro-Symbolic Traders: Assessing the Wisdom of AI Crowds in Markets	Namid R. Stillman et.al.	2410.14587	null
2024-10-18	Reimagining partial thickness keratoplasty: An eye mountable robot for autonomous big bubble needle insertion	Y. Wang et.al.	2410.14577	null
2024-10-18	Multi-modal Pose Diffuser: A Multimodal Generative Conditional Pose Prior	Calvin-Khang Ta et.al.	2410.14540	null
2024-10-18	Blockchain-Based Trust and Transparency in Airline Reservation Systems using Microservices Architecture	Biman Barua et.al.	2410.14518	null
2024-10-18	LEAD: Latent Realignment for Human Motion Diffusion	Nefeli Andreou et.al.	2410.14508	null
2024-10-18	Reinforcement Learning in Non-Markov Market-Making	Luca Lalor et.al.	2410.14504	null
2024-10-18	Data-driven topology design with persistent homology for enhancing population diversity	Taisei Kii et.al.	2410.14496	null
2024-10-18	ANT: Adaptive Noise Schedule for Time Series Diffusion Models	Seunghan Lee et.al.	2410.14488	link
2024-10-21	CaTs and DAGs: Integrating Directed Acyclic Graphs with Transformers and Fully-Connected Neural Networks for Causally Constrained Predictions	Matthew J. Vowels et.al.	2410.14485	link
2024-10-18	DRL Optimization Trajectory Generation via Wireless Network Intent-Guided Diffusion Models for Optimizing Resource Allocation	Junjie Wu et.al.	2410.14481	null
2024-10-18	Flow-based Sampling for Entanglement Entropy and the Machine Learning of Defects	Andrea Bulgarelli et.al.	2410.14466	null
2024-10-18	FashionR2R: Texture-preserving Rendered-to-Real Image Translation with Diffusion Models	Rui Hu et.al.	2410.14429	null
2024-10-17	Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens	Lijie Fan et.al.	2410.13863	null
2024-10-17	Diffusing States and Matching Scores: A New Framework for Imitation Learning	Runzhe Wu et.al.	2410.13855	link
2024-10-17	Influence Functions for Scalable Data Attribution in Diffusion Models	Bruno Mlodozeniec et.al.	2410.13850	null
2024-10-17	VidPanos: Generative Panoramic Videos from Casual Panning Videos	Jingwei Ma et.al.	2410.13832	null
2024-10-17	DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control	Yujie Wei et.al.	2410.13830	null
2024-10-17	Deep Generative Models Unveil Patterns in Medical Images Through Vision-Language Conditioning	Xiaodan Xing et.al.	2410.13823	link
2024-10-17	ConsisSR: Delving Deep into Consistency in Diffusion-based Image Super-Resolution	Junhao Gu et.al.	2410.13807	null
2024-10-17	Probing the Latent Hierarchical Structure of Data via Diffusion Models	Antonio Sclocchi et.al.	2410.13770	null
2024-10-17	Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers	Yuchen Liang et.al.	2410.13746	null
2024-10-17	Improved Convergence Rate for Diffusion Probabilistic Models	Gen Li et.al.	2410.13738	null
2024-10-17	Optimizing Probabilistic Conformal Prediction with Vectorized Non-Conformity Scores	Minxing Zheng et.al.	2410.13735	null
2024-10-18	DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation	Hanbo Cheng et.al.	2410.13726	link
2024-10-17	Movie Gen: A Cast of Media Foundation Models	Adam Polyak et.al.	2410.13720	link
2024-10-18	Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion	Yijun Liang et.al.	2410.13674	link
2024-10-17	Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design	Chenyu Wang et.al.	2410.13643	link
2024-10-16	Geometry-Aware Generative Autoencoders for Warped Riemannian Metric Learning and Generative Modeling on Data Manifolds	Xingzhi Sun et.al.	2410.12779	null
2024-10-16	Meta-Unlearning on Diffusion Models: Preventing Relearning Unlearned Concepts	Hongcheng Gao et.al.	2410.12777	link
2024-10-16	SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation	Jaehong Yoon et.al.	2410.12761	null
2024-10-16	Signature of Vertical Mixing in Hydrogen-dominated Exoplanet Atmospheres	Vikas Soni et.al.	2410.12737	null
2024-10-16	Counterfactual Generative Modeling with Variational Causal Inference	Yulun Wu et.al.	2410.12730	null
2024-10-16	FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression	Zhenheng Tang et.al.	2410.12707	null
2024-10-16	Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization	Xingqi Wang et.al.	2410.12700	link
2024-10-16	AdaptiveDrag: Semantic-Driven Dragging on Diffusion-Based Image Editing	DuoSheng Chen et.al.	2410.12696	null
2024-10-16	3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation	Dewei Zhou et.al.	2410.12669	null
2024-10-16	Towards Designing Scalable Quantum-Enhanced Generative Networks for Neutrino Physics Experiments with Liquid Argon Time Projection Chambers	Andrea Delgado et.al.	2410.12650	null
2024-10-16	A Robo-Advisor System: expected utility modeling via pairwise comparisons	Bo Chen et.al.	2410.12570	null
2024-10-16	One Step Diffusion via Shortcut Models	Kevin Frans et.al.	2410.12557	link
2024-10-16	Disentangling data distribution for Federated Learning	Xinyuan Zhao et.al.	2410.12530	null
2024-10-16	Shaping a Stabilized Video by Mitigating Unintended Changes for Concept-Augmented Video Editing	Mingce Guo et.al.	2410.12526	null
2024-10-16	MING: A Functional Approach to Learning Molecular Generative Models	Van Khoa Nguyen et.al.	2410.12522	null
2024-10-15	High-Resolution Frame Interpolation with Patch-based Cascaded Diffusion	Junhwa Hur et.al.	2410.11838	null
2024-10-15	On the Effectiveness of Dataset Alignment for Fake Image Detection	Anirudh Sundara Rajan et.al.	2410.11835	null
2024-10-15	Bayesian Experimental Design via Contrastive Diffusions	Jacopo Iollo et.al.	2410.11826	link
2024-10-15	KITTEN: A Knowledge-Intensive Evaluation of Image Generation on Visual Entities	Hsin-Ping Huang et.al.	2410.11824	null
2024-10-15	Improving Long-Text Alignment for Text-to-Image Diffusion Models	Luping Liu et.al.	2410.11817	link
2024-10-15	SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing	Zhiyuan Zhang et.al.	2410.11815	null
2024-10-16	Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices	Zhiyuan Ma et.al.	2410.11795	null
2024-10-15	G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks	Guibin Zhang et.al.	2410.11782	null
2024-10-15	Technical Report of 1:10 Scale Autonomous Vehicle Robot	Amirhossein Kheiri Holighi et.al.	2410.11746	null
2024-10-15	Probabilistic Principles for Biophysics and Neuroscience: Entropy Production, Bayesian Mechanics & the Free-Energy Principle	Lancelot Da Costa et.al.	2410.11735	null
2024-10-15	Patch-Based Diffusion Models Beat Whole-Image Models for Mismatched Distribution Inverse Problems	Jason Hu et.al.	2410.11730	null
2024-10-15	Parameter estimation of structural dynamics with neural operators enabled surrogate modeling	Mingyuan Zhou et.al.	2410.11712	null
2024-10-15	Findings of the WMT 2024 Shared Task on Chat Translation	Wafaa Mohammed et.al.	2410.11624	null
2024-10-15	DeformPAM: Data-Efficient Learning for Long-horizon Deformable Object Manipulation via Preference-based Action Alignment	Wendi Chen et.al.	2410.11584	link
2024-10-15	A Data-Driven Aggressive Autonomous Racing Framework Utilizing Local Trajectory Planning with Velocity Prediction	Zhouheng Li et.al.	2410.11570	link
2024-10-14	Tex4D: Zero-shot 4D Scene Texturing with Video Diffusion Models	Jingzhi Bao et.al.	2410.10821	link
2024-10-15	TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models	Mu Cai et.al.	2410.10818	link
2024-10-14	LVD-2M: A Long-take Video Dataset with Temporally Dense Captions	Tianwei Xiong et.al.	2410.10816	link
2024-10-14	Depth Any Video with Scalable Synthetic Data	Honghui Yang et.al.	2410.10815	link
2024-10-14	HART: Efficient Visual Generation with Hybrid Autoregressive Transformer	Haotian Tang et.al.	2410.10812	link
2024-10-14	TrajDiffuse: A Conditional Diffusion Model for Environment-Aware Trajectory Prediction	Qingze et.al.	2410.10804	link
2024-10-14	Boosting Camera Motion Control for Video Diffusion Transformers	Soon Yau Cheong et.al.	2410.10802	null
2024-10-14	Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations	Litu Rout et.al.	2410.10792	null
2024-10-14	ControlMM: Controllable Masked Motion Generation	Ekkasit Pinyoanuntapong et.al.	2410.10780	null
2024-10-14	Adaptive Diffusion Terrain Generator for Autonomous Uneven Terrain Navigation	Youwei Yu et.al.	2410.10766	null
2024-10-14	DragEntity: Trajectory Guided Video Generation using Entity and Positional Relationships	Zhang Wan et.al.	2410.10751	null
2024-10-14	CosForce: A Force-Based General Model for Simulating Pedestrian Anticipation and Reaction Mechanisms	Jinghui Wang et.al.	2410.10746	null
2024-10-14	FlexGen: Flexible Multi-View Generation from Text and Image Inputs	Xinli Xu et.al.	2410.10745	null
2024-10-14	Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models	Junyu Chen et.al.	2410.10733	link
2024-10-14	Large Language Models Are Active Critics in NLG Evaluation	Shuying Xu et.al.	2410.10724	null
2024-10-11	SceneCraft: Layout-Guided 3D Scene Generation	Xiuyu Yang et.al.	2410.09049	link
2024-10-11	Linear Convergence of Diffusion Models Under the Manifold Hypothesis	Peter Potaptchik et.al.	2410.09046	null
2024-10-11	PEAR: A Robust and Flexible Automation Framework for Ptychography Enabled by Multiple Large Language Model Agents	Xiangyu Yin et.al.	2410.09034	link
2024-10-11	Semantic Score Distillation Sampling for Compositional Text-to-3D Generation	Ling Yang et.al.	2410.09009	link
2024-10-11	WaveDiffusion: Exploring Full Waveform Inversion via Joint Diffusion in the Latent Space	Hanchen Wang et.al.	2410.09002	null
2024-10-11	Maximizing the Potential of Synthetic Data: Insights from Random Matrix Theory	Aymane El Firdoussi et.al.	2410.08942	null
2024-10-11	DiffPO: A causal diffusion model for learning distributions of potential outcomes	Yuchen Ma et.al.	2410.08924	null
2024-10-11	An End-to-End Deep Learning Method for Solving Nonlocal Allen-Cahn and Cahn-Hilliard Phase-Field Models	Yuwei Geng et.al.	2410.08914	null
2024-10-11	Conditional Generative Models for Contrast-Enhanced Synthesis of T1w and T1 Maps in Brain MRI	Moritz Piening et.al.	2410.08894	link
2024-10-11	MATCH: Model-Aware TVM-based Compilation for Heterogeneous Edge Devices	Mohamed Amine Hamdi et.al.	2410.08855	link
2024-10-14	LIME-Eval: Rethinking Low-light Image Enhancement Evaluation via Object Detection	Mingjia Li et.al.	2410.08810	link
2024-10-11	Bad Neighbors: On Understanding VPN Provider Networks	Teemu Rytilahti et.al.	2410.08737	link
2024-10-11	5G as Enabler for Industrie 4.0 Use Cases: Challenges and Concepts	M. Gundall et.al.	2410.08726	null
2024-10-11	Investigating Human-Computer Interaction and Visual Comprehension in Text Generation Process of Natural Language Generation Models	Yunchao Wang et.al.	2410.08723	null
2024-10-11	Impact of Surface Reflections in Maritime Obstacle Detection	Samed Yalçın et.al.	2410.08713	link
2024-10-10	LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts	Anh-Quan Cao et.al.	2410.08211	null
2024-10-10	DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models	Xiaoxiao He et.al.	2410.08207	null
2024-10-10	HybridBooth: Hybrid Prompt Inversion for Efficient Subject-Driven Generation	Shanyan Guan et.al.	2410.08192	null
2024-10-10	DifFRelight: Diffusion-Based Facial Performance Relighting	Mingming He et.al.	2410.08188	null
2024-10-10	RGM: Reconstructing High-fidelity 3D Car Assets with Relightable 3D-GS Generative Model from a Single Image	Xiaoxue Chen et.al.	2410.08181	null
2024-10-10	ZeroComp: Zero-shot Object Compositing from Image Intrinsics via Diffusion	Zitian Zhang et.al.	2410.08168	null
2024-10-10	DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation	Jiatao Gu et.al.	2410.08159	null
2024-10-10	Progressive Autoregressive Video Diffusion Models	Desai Xie et.al.	2410.08151	link
2024-10-10	Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction	Jarrid Rector-Brooks et.al.	2410.08134	null
2024-10-10	Robust AI-Generated Text Detection by Restricted Embeddings	Kristian Kuznetsov et.al.	2410.08113	link
2024-10-10	LiPO: LiDAR Inertial Odometry for ICP Comparison	Darwin Mick et.al.	2410.08097	null
2024-10-10	Unstable Unlearning: The Hidden Risk of Concept Resurgence in Diffusion Models	Vinith M. Suriyakumar et.al.	2410.08074	null
2024-10-10	Reversible Decoupling Network for Single Image Reflection Removal	Hao Zhao et.al.	2410.08063	link
2024-10-10	A Target-Aware Analysis of Data Augmentation for Hate Speech Detection	Camilla Casula et.al.	2410.08053	null
2024-10-10	LADIMO: Face Morph Generation through Biometric Template Inversion with Latent Diffusion	Marcel Grimmer et.al.	2410.07988	link
2024-10-09	IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation	Xinchen Zhang et.al.	2410.07171	link
2024-10-09	Sylber: Syllabic Embedding Representation of Speech from Raw Audio	Cheol Jun Cho et.al.	2410.07168	link
2024-10-09	AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation	Yukang Cao et.al.	2410.07164	null
2024-10-09	InstructG2I: Synthesizing Images from Multimodal Attributed Graphs	Bowen Jin et.al.	2410.07157	link
2024-10-09	Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis	Bohan Zeng et.al.	2410.07155	link
2024-10-10	EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models	Rui Zhao et.al.	2410.07133	link
2024-10-09	Personalized Visual Instruction Tuning	Renjie Pi et.al.	2410.07113	link
2024-10-09	A Gentle Introduction and Tutorial on Deep Generative Models in Transportation Research	Seongjin Choi et.al.	2410.07066	link
2024-10-09	Efficient Distribution Matching of Representations via Noise-Injected Deep InfoMax	Ivan Butakov et.al.	2410.06993	null
2024-10-09	Diffusion Density Estimators	Akhil Premkumar et.al.	2410.06986	null
2024-10-09	Jointly Generating Multi-view Consistent PBR Textures using Collaborative Control	Shimon Vainer et.al.	2410.06985	null
2024-10-09	Structure-Centric Robust Monocular Depth Estimation via Knowledge Distillation	Runze Chen et.al.	2410.06982	null
2024-10-09	Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think	Sihyun Yu et.al.	2410.06940	link
2024-10-09	VEC-Sim: A Simulation Platform for Evaluating Service Caching and Computation Offloading Policies in Vehicular Edge Networks	Fan Wu et.al.	2410.06934	null
2024-10-09	Generative Model for Less-Resourced Language with 1 billion parameters	Domen Vreš et.al.	2410.06898	null
2024-10-07	DART: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control	Kaifeng Zhao et.al.	2410.05260	null
2024-10-07	GS-VTON: Controllable 3D Virtual Try-on with Gaussian Splatting	Yukang Cao et.al.	2410.05259	null
2024-10-07	SePPO: Semi-Policy Preference Optimization for Diffusion Alignment	Daoan Zhang et.al.	2410.05255	link
2024-10-07	DiffuseReg: Denoising Diffusion Model for Obtaining Deformation Fields in Unsupervised Deformable Image Registration	Yongtai Zhuo et.al.	2410.05234	link
2024-10-07	Density estimation with LLMs: a geometric investigation of in-context learning trajectories	Toni J. B. Liu et.al.	2410.05218	null
2024-10-07	Avoiding Deadlocks via Weak Deadlock Sets	Gianpaolo Oriolo et.al.	2410.05175	null
2024-10-07	Presto! Distilling Steps and Layers for Accelerating Music Generation	Zachary Novack et.al.	2410.05167	null
2024-10-08	A Simulation-Free Deep Learning Approach to Stochastic Optimal Control	Mengjian Hua et.al.	2410.05163	null
2024-10-07	Smart Jamming Attack and Mitigation on Deep Transfer Reinforcement Learning Enabled Resource Allocation for Network Slicing	Shavbo Salehi et.al.	2410.05153	null
2024-10-07	Leveraging Multimodal Diffusion Models to Accelerate Imaging with Side Information	Timofey Efimov et.al.	2410.05143	null
2024-10-07	Agnostic Smoothed Online Learning	Moïse Blanchard et.al.	2410.05124	null
2024-10-07	Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning	Ayano Hiranaka et.al.	2410.05116	null
2024-10-07	Synthetic Generation of Dermatoscopic Images with GAN and Closed-Form Factorization	Rohan Reddy Mekala et.al.	2410.05114	null
2024-10-07	Hyper-Representations: Learning from Populations of Neural Networks	Konstantin Schürholt et.al.	2410.05107	link
2024-10-07	DreamSat: Towards a General 3D Model for Novel View Synthesis of Space Objects	Nidhi Mathihalli et.al.	2410.05097	link
2024-10-04	Estimating Body and Hand Motion in an Ego-sensed World	Brent Yi et.al.	2410.03665	null
2024-10-04	Enhance Reasoning by Learning from Mistakes: Peer-Review Knowledge Distillation from Multiple Large Language Models	Zhuochun Li et.al.	2410.03663	null
2024-10-04	Geometric Representation Condition Improves Equivariant Molecule Generation	Zian Li et.al.	2410.03655	null
2024-10-04	Aligning LLMs with Individual Preferences via Interaction	Shujin Wu et.al.	2410.03642	link
2024-10-04	Real-World Benchmarks Make Membership Inference Attacks Fail on Diffusion Models	Chumeng Liang et.al.	2410.03640	link
2024-10-04	Conditional Enzyme Generation Using Protein Language Models with Adapters	Jason Yang et.al.	2410.03634	null
2024-10-04	How Discrete and Continuous Diffusion Meet: Comprehensive Analysis of Discrete Diffusion Models via a Stochastic Integral Framework	Yinuo Ren et.al.	2410.03601	null
2024-10-04	Teaching Transformers Modular Arithmetic at Scale	Eshika Saxena et.al.	2410.03569	null
2024-10-04	Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features	Benyuan Meng et.al.	2410.03558	link
2024-10-04	Loading Ceramics: Visualising Possibilities of Robotics in Ceramics	Varvara Guljajeva et.al.	2410.03550	null
2024-10-04	NRGBoost: Energy-Based Generative Boosted Trees	João Bravo et.al.	2410.03535	null
2024-10-04	Generative Artificial Intelligence for Navigating Synthesizable Chemical Space	Wenhao Gao et.al.	2410.03494	link
2024-10-04	SeBS-Flow: Benchmarking Serverless Cloud Function Workflows	Larissa Schmid et.al.	2410.03480	null
2024-10-04	Formalizing MLTL Formula Progression in Isabelle/HOL	Katherine Kosaian et.al.	2410.03465	null
2024-10-04	Diffusion State-Guided Projected Gradient for Inverse Problems	Rayhan Zirvi et.al.	2410.03463	null
2024-10-03	SIEVE: General Purpose Data Filtering System Matching GPT-4o Accuracy at 1% the Cost	Jifan Zhang et.al.	2410.02755	null
2024-10-03	CriSPO: Multi-Aspect Critique-Suggestion-guided Automatic Prompt Optimization for Text Generation	Han He et.al.	2410.02748	null
2024-10-03	Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization	Lei Xu et.al.	2410.02741	link
2024-10-03	Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models	Zhengfeng Lai et.al.	2410.02740	null
2024-10-03	Custom Non-Linear Model Predictive Control for Obstacle Avoidance in Indoor and Outdoor Environments	Lara Laban et.al.	2410.02732	link
2024-10-03	A Photonic Parameter-shift Rule: Enabling Gradient Computation for Photonic Quantum Computers	Axel Pappalardo et.al.	2410.02726	null
2024-10-03	AlzhiNet: Traversing from 2DCNN to 3DCNN, Towards Early Detection and Diagnosis of Alzheimer's Disease	Romoke Grace Akindele et.al.	2410.02714	null
2024-10-03	SteerDiff: Steering towards Safe Text-to-Image Diffusion Models	Hongxiang Zhang et.al.	2410.02710	null
2024-10-03	ControlAR: Controllable Image Generation with Autoregressive Models	Zongming Li et.al.	2410.02705	link
2024-10-03	User-centric Immersive Communications in 6G: A Data-oriented Approach via Digital Twin	Conghao Zhou et.al.	2410.02688	null
2024-10-03	GUD: Generation with Unified Diffusion	Mathis Gerdes et.al.	2410.02667	null
2024-10-03	Grounded Answers for Multi-agent Decision-making Problem through Generative World Model	Zeyang Liu et.al.	2410.02664	null
2024-10-03	Scalable Simulation-free Entropic Unbalanced Optimal Transport	Jaemoo Choi et.al.	2410.02656	null
2024-10-03	Measuring and Improving Persuasiveness of Generative Models	Somesh Singh et.al.	2410.02653	null
2024-10-03	Efficient calibration of the shifted square-root diffusion model to credit default swap spreads using asymptotic approximations	Ankush Agarwal et.al.	2410.02645	null
2024-10-02	FabricDiffusion: High-Fidelity Texture Transfer for 3D Garments Generation from In-The-Wild Clothing Images	Cheng Zhang et.al.	2410.01801	null
2024-10-02	Bellman Diffusion: Generative Modeling as Learning a Linear Operator in the Distribution Space	Yangming Li et.al.	2410.01796	null
2024-10-02	Dynamical-generative downscaling of climate model ensembles	Ignacio Lopez-Gomez et.al.	2410.01776	null
2024-10-02	Towards deep learning sequence-structure co-generation for protein design	Chentong Wang et.al.	2410.01773	null
2024-10-02	ImageFolder: Autoregressive Image Generation with Folded Tokens	Xiang Li et.al.	2410.01756	link
2024-10-02	AssessITS: Integrating procedural guidelines and practical evaluation metrics for organizational IT and Cybersecurity risk assessment	Mir Mehedi Rahman et.al.	2410.01750	null
2024-10-02	VitaGlyph: Vitalizing Artistic Typography with Flexible Dual-branch Diffusion Models	Kailai Feng et.al.	2410.01738	link
2024-10-02	HarmoniCa: Harmonizing Training and Inference for Better Feature Cache in Diffusion Transformer Acceleration	Yushi Huang et.al.	2410.01723	null
2024-10-02	Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective	Zeyu Gan et.al.	2410.01720	link
2024-10-02	COMUNI: Decomposing Common and Unique Video Signals for Diffusion-based Video Generation	Mingzhen Sun et.al.	2410.01718	null
2024-10-02	A Mathematics-Inspired Learning-to-Optimize Framework for Decentralized Optimization	Yutong He et.al.	2410.01700	null
2024-10-02	Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding	Yao Teng et.al.	2410.01699	link
2024-10-02	Lossy Semantic Communication for the Logical Deduction of the State of the World	Ahmet Faruk Saz et.al.	2410.01676	link
2024-10-02	Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering	Klaus-Rudolf Kladny et.al.	2410.01660	null
2024-10-02	On The Adaptation of Unlimiformer for Decoder-Only Transformers	Kian Ahrabian et.al.	2410.01637	null
2024-09-30	SpaceMesh: A Continuous Representation for Learning Manifold Surface Meshes	Tianchang Shen et.al.	2409.20562	null
2024-09-30	Annealing Flow Generative Model Towards Sampling High-Dimensional and Multi-Modal Distributions	Dongze Wu et.al.	2409.20547	link
2024-09-30	A Compact Quantum Random Number Generator Based on Balanced Detection of Shot Noise	Jaideep Singh et.al.	2409.20515	null
2024-09-30	NUTRIVISION: A System for Automatic Diet Management in Smart Healthcare	Madhumita Veeramreddy et.al.	2409.20508	null
2024-09-30	COLLAGE: Collaborative Human-Agent Interaction Generation using Hierarchical Latent Diffusion and Language Models	Divyanshu Daiya et.al.	2409.20502	null
2024-09-30	FreeMask: Rethinking the Importance of Attention Masks for Zero-Shot Video Editing	Lingling Cai et.al.	2409.20500	null
2024-09-30	All-optical autoencoder machine learning framework using diffractive processors	Peijie Feng et.al.	2409.20346	null
2024-09-30	Devil is in Details: Locality-Aware 3D Abdominal CT Volume Generation for Self-Supervised Organ Segmentation	Yuran Wang et.al.	2409.20332	null
2024-09-30	UIR-LoRA: Achieving Universal Image Restoration through Multiple Low-Rank Adaptation	Cheng Zhang et.al.	2409.20197	link
2024-09-30	Ensemble Kalman Diffusion Guidance: A Derivative-free Method for Inverse Problems	Hongkai Zheng et.al.	2409.20175	null
2024-09-30	Erase, then Redraw: A Novel Data Augmentation Approach for Free Space Detection Using Diffusion Model	Fulong Ma et.al.	2409.20164	null
2024-09-30	Conditional Diffusion Models are Minimax-Optimal and Manifold-Adaptive for Conditional Distribution Estimation	Rong Tang et.al.	2409.20124	null
2024-09-30	Training a Computer Vision Model for Commercial Bakeries with Primarily Synthetic Images	Thomas H. Schmitt et.al.	2409.20122	null
2024-09-30	Reaction-diffusion model for a population structured in phenotype and space I -- Criterion for persistence	Nathanaël Boutillon et.al.	2409.20118	null
2024-09-30	Near-Field Coupling Coil System: A Novel Radiofrequency Coil Solution for MRI	Zhiguang Mo et.al.	2409.20095	null
2024-09-27	$O(d/T)$ Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions	Gen Li et.al.	2409.18959	null
2024-09-27	ReviveDiff: A Universal Diffusion Model for Restoring Images in Adverse Weather Conditions	Wenfeng Huang et.al.	2409.18932	null
2024-09-27	Unsupervised Low-light Image Enhancement with Lookup Tables and Diffusion Priors	Yunlong Lin et.al.	2409.18899	null
2024-09-27	Detecting Dataset Abuse in Fine-Tuning Stable Diffusion Models for Text-to-Image Synthesis	Songrui Wang et.al.	2409.18897	null
2024-09-27	HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models	Yu Zhou et.al.	2409.18893	null
2024-09-27	Explainable Artifacts for Synthetic Western Blot Source Attribution	João Phillipe Cardenuto et.al.	2409.18881	link
2024-09-27	Emu3: Next-Token Prediction is All You Need	Xinlong Wang et.al.	2409.18869	null
2024-09-27	Challenges of Generating Structurally Diverse Graphs	Fedor Velikonivtsev et.al.	2409.18859	link
2024-09-27	Moldable Development Patterns	Oscar Nierstrasz et.al.	2409.18811	null
2024-09-27	Convergence of Diffusion Models Under the Manifold Hypothesis in High-Dimensions	Iskander Azangulov et.al.	2409.18804	null
2024-09-27	Student-Oriented Teacher Knowledge Refinement for Knowledge Distillation	Chaomin Shen et.al.	2409.18785	null
2024-09-27	Geometric deep learning for galaxy-halo connection: a case study for galaxy intrinsic alignments	Yesukhei Jagvaral et.al.	2409.18761	null
2024-09-27	Cottention: Linear Transformers With Cosine Attention	Gabriel Mongaras et.al.	2409.18747	link
2024-09-27	Read Over the Lines: Attacking LLMs and Toxicity Detection Systems with ASCII Art to Mask Profanity	Sergey Berezin et.al.	2409.18708	link
2024-09-27	MG-Net: Learn to Customize QAOA with Circuit Depth Awareness	Yang Qian et.al.	2409.18692	link
2024-09-26	FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner	Wenliang Zhao et.al.	2409.18128	link
2024-09-26	Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction	Jing He et.al.	2409.18124	null
2024-09-26	EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation	Jiaxiang Tang et.al.	2409.18114	null
2024-09-26	MALPOLON: A Framework for Deep Species Distribution Modeling	Theo Larcher et.al.	2409.18102	link
2024-09-26	StackGen: Generating Stable Structures from Silhouettes via Diffusion	Luzhe Sun et.al.	2409.18098	null
2024-09-26	DiffSSC: Semantic LiDAR Scan Completion using Denoising Diffusion Probabilistic Models	Helin Cao et.al.	2409.18092	null
2024-09-26	Stable Video Portraits	Mirela Ostrek et.al.	2409.18083	null
2024-09-26	LightAvatar: Efficient Head Avatar as Dynamic Neural Light Field	Huan Wang et.al.	2409.18057	link
2024-09-26	Automated Detection and Analysis of Power Words in Persuasive Text Using Natural Language Processing	Sahil Garje et.al.	2409.18033	null
2024-09-26	PhoCoLens: Photorealistic and Consistent Reconstruction in Lensless Imaging	Xin Cai et.al.	2409.17996	null
2024-09-26	Joint Localization and Planning using Diffusion	L. Lao Beyer et.al.	2409.17995	null
2024-09-26	Manufacturing, processing, applications, and advancements of Fe-based shape memory alloys	Anwar Algamal et.al.	2409.17973	null
2024-09-26	CNCA: Toward Customizable and Natural Generation of Adversarial Camouflage for Vehicle Detectors	Linye Lyu et.al.	2409.17963	null
2024-09-26	Relativistic diffusion model for hadron production in p-Pb collisions at the LHC	Philipp Schulz et.al.	2409.17960	null
2024-09-26	Perturb, Attend, Detect and Localize (PADL): Robust Proactive Image Defense	Filippo Bartolucci et.al.	2409.17941	null
2024-09-25	DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion	Yukun Huang et.al.	2409.17145	link
2024-09-25	Language-oriented Semantic Communication for Image Transmission with Fine-Tuned Diffusion Model	Xinfeng Wei et.al.	2409.17104	null
2024-09-25	Accumulator-Aware Post-Training Quantization	Ian Colbert et.al.	2409.17092	null
2024-09-25	Ctrl-GenAug: Controllable Generative Augmentation for Medical Sequence Classification	Xinrui Zhou et.al.	2409.17091	null
2024-09-25	Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors	Aiping Zhang et.al.	2409.17058	link
2024-09-25	ControlCity: A Multimodal Diffusion Model Based Approach for Accurate Geospatial Data Generation and Urban Morphology Analysis	Fangshuo Zhou et.al.	2409.17049	link
2024-09-25	GeoBiked: A Dataset with Geometric Features and Automated Labeling Techniques to Enable Deep Generative Models in Engineering Design	Phillip Mueller et.al.	2409.17045	null
2024-09-25	CNN Mixture-of-Depths	Rinor Cakaj et.al.	2409.17016	null
2024-09-25	Single Image, Any Face: Generalisable 3D Face Generation	Wenqing Wang et.al.	2409.16990	null
2024-09-25	Dynamic Obstacle Avoidance through Uncertainty-Based Adaptive Planning with Diffusion	Vineet Punyamoorty et.al.	2409.16950	null
2024-09-25	DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling	Kyuheon Jung et.al.	2409.16949	link
2024-09-25	Divergence asymmetry and connected components in a general duplication-divergence graph model	Dario Borrelli et.al.	2409.16943	null
2024-09-25	Generative Object Insertion in Gaussian Splatting with a Multi-View Diffusion Model	Hongliang Zhong et.al.	2409.16938	link
2024-09-25	Linking in Style: Understanding learned features in deep learning models	Maren H. Wehrheim et.al.	2409.16865	link
2024-09-25	A Versatile and Differentiable Hand-Object Interaction Representation	Théo Morales et.al.	2409.16855	null
2024-09-18	Massively Multi-Person 3D Human Motion Forecasting with Scene Context	Felix B Mueller et.al.	2409.12189	link
2024-09-18	MoRAG -- Multi-Fusion Retrieval Augmented Generation for Human Motion	Kalakonda Sai Shashank et.al.	2409.12140	null
2024-09-24	Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models	Sijing Chen et.al.	2409.12139	null
2024-09-18	Brain-Streams: fMRI-to-Image Reconstruction with Multi-modal Guidance	Jaehoon Joo et.al.	2409.12099	null
2024-09-19	Skill matching at scale: freelancer-project alignment for efficient multilingual candidate retrieval	Warren Jouanneau et.al.	2409.12097	null
2024-09-18	Design of Ligand-Binding Proteins with Atomic Flow Matching	Junqi Liu et.al.	2409.12080	null
2024-09-18	Denoising diffusion models for high-resolution microscopy image restoration	Pamela Osuna-Vargas et.al.	2409.12078	null
2024-09-19	Using Large Language Models to Generate Clinical Trial Tables and Figures	Yumeng Yang et.al.	2409.12046	null
2024-09-18	LEMON: Localized Editing with Mesh Optimization and Neural Shaders	Furkan Mert Algan et.al.	2409.12024	null
2024-09-18	Promise and Peril of Collaborative Code Generation Models: Balancing Effectiveness and Memorization	Zhi Chen et.al.	2409.12020	null
2024-09-18	Towards Global Localization using Multi-Modal Object-Instance Re-Identification	Aneesh Chavan et.al.	2409.12002	link
2024-09-18	Tracking Any Point with Frame-Event Fusion Network at High Frame Rate	Jiaxiong Liu et.al.	2409.11953	null
2024-09-18	Generation of Complex 3D Human Motion by Temporal and Spatial Composition of Diffusion Models	Lorenzo Mandelli et.al.	2409.11920	null
2024-09-18	AlignBot: Aligning VLM-powered Customized Task Planning with User Reminders Through Fine-Tuning for Household Robots	Zhaxizhuoma et.al.	2409.11905	null
2024-09-18	Finding the Subjective Truth: Collecting 2 Million Votes for Comprehensive Gen-AI Model Evaluation	Dimitrios Christodoulou et.al.	2409.11904	null
2024-09-17	Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion	Zhenwei Wang et.al.	2409.11406	null
2024-09-17	Teaching dark matter simulations to speak the halo language	Shivam Pandey et.al.	2409.11401	link
2024-09-17	Ultrasound Image Enhancement with the Variance of Diffusion Models	Yuxin Zhang et.al.	2409.11380	link
2024-09-17	OSV: One Step is Enough for High-Quality Image to Video Generation	Xiaofeng Mao et.al.	2409.11367	null
2024-09-17	Ping! Your Food is Ready: Comparing Different Notification Techniques in 3D AR Cooking Environment	Aditya Raikwar et.al.	2409.11357	null
2024-09-17	Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think	Gonzalo Martin Garcia et.al.	2409.11355	link
2024-09-17	OmniGen: Unified Image Generation	Shitao Xiao et.al.	2409.11340	link
2024-09-17	fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction	Jianxiong Gao et.al.	2409.11315	null
2024-09-17	SpMis: An Investigation of Synthetic Spoken Misinformation Detection	Peizhuo Liu et.al.	2409.11308	null
2024-09-17	Measurement of top-quark pair production in association with charm quarks in proton-proton collisions at $\sqrt{s}=13$ TeV with the ATLAS detector	ATLAS Collaboration et.al.	2409.11305	null
2024-09-17	NirvaWave: An Accurate and Efficient Near Field Wave Propagation Simulator for 6G and Beyond	Vahid Yazdnian et.al.	2409.11293	link
2024-09-17	DroneDiffusion: Robust Quadrotor Dynamics Learning with Diffusion Models	Avirup Das et.al.	2409.11292	null
2024-09-17	Neural Networks for Vehicle Routing Problem	László Kovács et.al.	2409.11290	null
2024-09-17	Attacking Slicing Network via Side-channel Reinforcement Learning Attack	Wei Shao et.al.	2409.11258	null
2024-09-17	Learning Source Disentanglement in Neural Audio Codec	Xiaoyu Bie et.al.	2409.11228	null
2024-09-16	Pennsieve - A Collaborative Platform for Translational Neuroscience and Beyond	Zack Goldblum et.al.	2409.10509	null
2024-09-16	Torres funerarias chullpa en el valle del río Lauca: un primer análisis arqueoastronómico	Alejandro Gangui et.al.	2409.10497	null
2024-09-16	Incorporating Classifier-Free Guidance in Diffusion Model-Based Recommendation	Noah Buchanan et.al.	2409.10494	null
2024-09-16	SimInversion: A Simple Framework for Inversion-Based Text-to-Image Editing	Qi Qian et.al.	2409.10476	null
2024-09-16	MacDiff: Unified Skeleton Modeling with Masked Conditional Diffusion	Lehong Wu et.al.	2409.10473	null
2024-09-16	Signed Graph Autoencoder for Explainable and Polarization-Aware Network Embeddings	Nikolaos Nakis et.al.	2409.10452	null
2024-09-16	Mamba-ST: State Space Model for Efficient Style Transfer	Filippo Botti et.al.	2409.10385	link
2024-09-16	2D or not 2D: How Does the Dimensionality of Gesture Representation Affect 3D Co-Speech Gesture Generation?	Téo Guichoux et.al.	2409.10357	null
2024-09-16	Taming Diffusion Models for Image Restoration: A Review	Ziwei Luo et.al.	2409.10353	null
2024-09-16	MEGS: Morphological Evaluation of Galactic Structure	Ufuk Çakır et.al.	2409.10346	link
2024-09-16	VAE-QWGAN: Improving Quantum GANs for High Resolution Image Generation	Aaron Mark Thomas et.al.	2409.10339	null
2024-09-16	Research and Design of a Financial Intelligent Risk Control Platform Based on Big Data Analysis and Deep Machine Learning	Shuochen Bi et.al.	2409.10331	null
2024-09-16	Fairness, not Emotion, Drives Socioeconomic Decision Making	Rudra Mukhopadhyay et.al.	2409.10322	null
2024-09-16	On Synthetic Texture Datasets: Challenges, Creation, and Curation	Blaine Hoak et.al.	2409.10297	null
2024-09-16	DreamHead: Learning Spatial-Temporal Correspondence via Hierarchical Diffusion for Audio-driven Talking Head Synthesis	Fa-Ting Hong et.al.	2409.10281	null
2024-09-13	Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation	Qingwen Bu et.al.	2409.09016	link
2024-09-13	A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis	Yohan Poirier-Ginter et.al.	2409.08947	null
2024-09-13	Emerging Reliance Behaviors in Human-AI Text Generation: Hallucinations, Data Quality Assessment, and Cognitive Forcing Functions	Zahra Ashktorab et.al.	2409.08937	null
2024-09-13	Latent Space Score-based Diffusion Model for Probabilistic Multivariate Time Series Imputation	Guojun Liang et.al.	2409.08917	link
2024-09-13	Gaussian is All You Need: A Unified Framework for Solving Inverse Problems via Diffusion Posterior Sampling	Nebiyou Yismaw et.al.	2409.08906	null
2024-09-13	Adjoint Matching: Fine-tuning Flow and Diffusion Generative Models with Memoryless Stochastic Optimal Control	Carles Domingo-Enrich et.al.	2409.08861	null
2024-09-13	The Line-Based Dial-a-Ride Problem	Kendra Reiter et.al.	2409.08860	link
2024-09-13	InstantDrag: Improving Interactivity in Drag-based Image Editing	Joonghyuk Shin et.al.	2409.08857	null
2024-09-13	DX2CT: Diffusion Model for 3D CT Reconstruction from Bi or Mono-planar 2D X-ray(s)	Yun Su Jeong et.al.	2409.08850	null
2024-09-13	Development of a Compton Imager Setup	Anuraag Arya et.al.	2409.08822	null
2024-09-13	LLaQo: Towards a Query-Based Coach in Expressive Music Performance Assessment	Huan Zhang et.al.	2409.08795	link
2024-09-13	What You Say = What You Want? Teaching Humans to Articulate Requirements for LLMs	Qianou Ma et.al.	2409.08775	link
2024-09-13	A Hybrid Meta-Learning and Multi-Armed Bandit Approach for Context-Specific Multi-Objective Recommendation Optimization	Tiago Cunha et.al.	2409.08752	null
2024-09-13	Adaptive Sampling for Continuous Group Equivariant Neural Networks	Berfin Inal et.al.	2409.08741	null
2024-09-13	DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset	Jiawei Du et.al.	2409.08731	link
2024-09-12	DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors	Thomas Hanwen Zhu et.al.	2409.08278	null
2024-09-12	Hand-Object Interaction Pretraining from Videos	Himanshu Gaurav Singh et.al.	2409.08273	null
2024-09-12	Click2Mask: Local Editing with Dynamic Mask Generation	Omer Regev et.al.	2409.08272	null
2024-09-12	DreamBeast: Distilling 3D Fantastical Animals with Part-Aware Knowledge Transfer	Runjia Li et.al.	2409.08271	null
2024-09-12	Touch2Touch: Cross-Modal Tactile Generation for Object Manipulation	Samanta Rodriguez et.al.	2409.08269	null
2024-09-12	Improving Text-guided Object Inpainting with Semantic Pre-inpainting	Yifu Chen et.al.	2409.08260	link
2024-09-12	Improving Virtual Try-On with Garment-focused Diffusion Models	Siqi Wan et.al.	2409.08258	null
2024-09-12	LoRID: Low-Rank Iterative Diffusion for Adversarial Purification	Geigh Zollicoffer et.al.	2409.08255	null
2024-09-12	Dynamic Prompting of Frozen Text-to-Image Diffusion Models for Panoptic Narrative Grounding	Hongyu Li et.al.	2409.08251	null
2024-09-12	IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation	Yinwei Wu et.al.	2409.08240	null
2024-09-12	Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources	Alisia Lupidi et.al.	2409.08239	null
2024-09-12	LT3SD: Latent Trees for 3D Scene Diffusion	Quan Meng et.al.	2409.08215	null
2024-09-12	VI3DRM: Towards meticulous 3D Reconstruction from Sparse Views via Photo-Realistic Novel View Synthesis	Hao Chen et.al.	2409.08207	null
2024-09-12	High-Frequency Anti-DreamBooth: Robust Defense Against Image Synthesis	Takuto Onikubo et.al.	2409.08167	link
2024-09-12	MagicStyle: Portrait Stylization Based on Reference Image	Zhaoli Deng et.al.	2409.08156	null
2024-09-11	DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation	Haibo Yang et.al.	2409.07454	null
2024-09-11	Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models	Haibo Yang et.al.	2409.07452	link
2024-09-11	FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process	Yang Luo et.al.	2409.07451	null
2024-09-11	Efficient One-Step Diffusion Refinement for Snapshot Compressive Imaging	Yunzhen Wang et.al.	2409.07417	null
2024-09-11	Extracting TCPIP Headers at High Speed for the Anonymized Network Traffic Graph Challenge	Zhaoyang Han et.al.	2409.07374	null
2024-09-11	Awaking the Slides: A Tuning-free and Knowledge-regulated AI Tutoring System via Language Model Coordination	Daniel Zhang-Li et.al.	2409.07372	null
2024-09-11	Event-based Mosaicing Bundle Adjustment	Shuang Guo et.al.	2409.07365	link
2024-09-11	Training-Free Guidance for Discrete Diffusion Models for Molecular Generation	Thomas J. Kerby et.al.	2409.07359	null
2024-09-11	Learning Robotic Manipulation Policies from Point Clouds with Conditional Flow Matching	Eugenio Chisari et.al.	2409.07343	null
2024-09-11	Efficient and Unbiased Sampling of Boltzmann Distributions via Consistency Models	Fengzhe Zhang et.al.	2409.07323	null
2024-09-11	Optimizing Neural Network Performance and Interpretability with Diophantine Equation Encoding	Ronald Katende et.al.	2409.07310	null
2024-09-11	Exploring User-level Gradient Inversion with a Diffusion Prior	Zhuohang Li et.al.	2409.07291	null
2024-09-11	CCFExp: Facial Image Synthesis with Cycle Cross-Fusion Diffusion Model for Facial Paralysis Individuals	Weixiang Gao et.al.	2409.07271	link
2024-09-11	Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models	Sanoojan Baliah et.al.	2409.07269	link
2024-09-11	EMOdiffhead: Continuously Emotional Control in Talking Head Generation via Diffusion	Jian Zhang et.al.	2409.07255	null
2024-09-10	Technical Report of Mobile Manipulator Robot for Industrial Environments	Erfan Amoozad Khalili et.al.	2409.06693	null
2024-09-10	SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation	Teng Hu et.al.	2409.06633	null
2024-09-10	MVGaussian: High-Fidelity text-to-3D Content Generation with Multi-View Guidance and Surface Densification	Phu Pham et.al.	2409.06620	null
2024-09-10	A Primer on Variational Inference for Physics-Informed Deep Generative Modelling	Alex Glyn-Davies et.al.	2409.06560	null
2024-09-10	From LIMA to DeepLIMA: following a new path of interoperability	Victor Bocharov et.al.	2409.06550	null
2024-09-10	Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models	Xin Jing et.al.	2409.06451	null
2024-09-10	Prompt2Fashion: An automatically generated fashion dataset	Georgia Argyro et.al.	2409.06442	link
2024-09-10	Fast nonparametric inference of network backbones for graph sparsification	Alec Kirkley et.al.	2409.06417	link
2024-09-10	Distilling Generative-Discriminative Representations for Very Low-Resolution Face Recognition	Junzheng Zhang et.al.	2409.06371	null
2024-09-10	What happens to diffusion model likelihood when your model is conditional?	Mattias Cross et.al.	2409.06364	null
2024-09-10	DiffQRCoder: Diffusion-based Aesthetic QR Code Generation with Scanning Robustness Guided Iterative Refinement	Jia-Wei Liao et.al.	2409.06355	null
2024-09-10	Improving Conditional Level Generation using Automated Validation in Match-3 Games	Monica Villanueva Aylagas et.al.	2409.06349	null
2024-09-10	Foragax: An Agent Based Modelling framework based on JAX	Siddharth Chaturvedi et.al.	2409.06345	link
2024-09-10	G3PT: Unleash the power of Autoregressive Modeling in 3D Generation via Cross-scale Querying Transformer	Jinzhi Zhang et.al.	2409.06322	null
2024-09-10	Learning Augmentation Policies from A Model Zoo for Time Series Forecasting	Haochen Yuan et.al.	2409.06282	null
2024-09-09	Fast Generation of Custom Floating-Point Spatial Filters on FPGAs	Nelson Campos et.al.	2409.05837	null
2024-09-09	Enhancing Preference-based Linear Bandits via Human Response Time	Shen Li et.al.	2409.05798	null
2024-09-09	Predicting Critical Heat Flux with Uncertainty Quantification and Domain Generalization Using Conditional Variational Autoencoders and Deep Neural Networks	Farah Alsafadi et.al.	2409.05790	null
2024-09-09	Vector Quantized Diffusion Model Based Speech Bandwidth Extension	Yuan Fang et.al.	2409.05784	null
2024-09-09	AS-Speech: Adaptive Style For Speech Synthesis	Zhipeng Li et.al.	2409.05730	null
2024-09-09	pFedGPA: Diffusion-based Generative Parameter Aggregation for Personalized Federated Learning	Jiahao Lai et.al.	2409.05701	null
2024-09-09	Citizen-Led Personalization of User Interfaces: Investigating How People Customize Interfaces for Themselves and Others	Sérgio Alves et.al.	2409.05696	null
2024-09-09	Unlearning or Concealment? A Critical Analysis and Evaluation Metrics for Unlearning in Diffusion Models	Aakash Sen Sharma et.al.	2409.05668	null
2024-09-09	Forward KL Regularized Preference Optimization for Aligning Diffusion Policies	Zhao Shan et.al.	2409.05622	null
2024-09-09	CustomContrast: A Multilevel Contrastive Perspective For Subject-Driven Text-to-Image Customization	Nan Chen et.al.	2409.05606	null
2024-09-09	Latent 3D Brain MRI Counterfactual	Wei Peng et.al.	2409.05585	null
2024-09-09	Spatially-Aware Speaker for Vision-and-Language Navigation Instruction Generation	Muraleekrishna Gopinathan et.al.	2409.05583	link
2024-09-09	Design and Implementation of TAO DAQ System	Shuihan Zhang et.al.	2409.05522	null
2024-09-09	A Taxonomy of Miscompressions: Preparing Image Forensics for Neural Compression	Nora Hofer et.al.	2409.05490	null
2024-09-09	DriveScape: Towards High-Resolution Controllable Multi-View Driving Video Generation	Wei Wu et.al.	2409.05463	null
2024-09-06	VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation	Yecheng Wu et.al.	2409.04429	link
2024-09-06	Exploring Foundation Models for Synthetic Medical Imaging: A Study on Chest X-Rays and Fine-Tuning Techniques	Davide Clode da Silva et.al.	2409.04424	null
2024-09-06	Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation	Zhuoyan Luo et.al.	2409.04410	null
2024-09-06	Enhancing Skin Lesion Diagnosis with Ensemble Learning	Xiaoyi Liu et.al.	2409.04381	null
2024-09-06	How Fair is Your Diffusion Recommender Model?	Daniele Malitesta et.al.	2409.04339	null
2024-09-06	Random effects estimation in a fractional diffusion model based on continuous observations	Nesrine Chebli et.al.	2409.04331	null
2024-09-06	Advancing Automated Knowledge Transfer in Evolutionary Multitasking via Large Language Models	Yuxiao Huang et.al.	2409.04270	null
2024-09-06	An overview of domain-specific foundation model: key technologies, applications and challenges	Haolong Chen et.al.	2409.04267	null
2024-09-06	UniDet3D: Multi-dataset Indoor 3D Object Detection	Maksim Kolodiazhnyi et.al.	2409.04234	link
2024-09-06	Generative Modelling via Quantile Regression	Johannes Schmidt-Hieber et.al.	2409.04231	null
2024-09-06	Breaking the Brownian Barrier: Models and Manifestations of Molecular Diffusion in Complex Fluids	Harish Srinivasan et.al.	2409.04199	null
2024-09-06	GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers	Lorenza Prospero et.al.	2409.04196	null
2024-09-06	Subsampling of Correlated Graph Signals	Rishabh Ravi et.al.	2409.04107	null
2024-09-06	Estimation of service value parameters for a queue with unobserved balking	Daniel Podorojnyi et.al.	2409.04090	null
2024-09-06	D4: Text-guided diffusion model-based domain adaptive data augmentation for vineyard shoot detection	Kentaro Hirahara et.al.	2409.04060	null
2024-09-05	Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding	Yunze Man et.al.	2409.03757	link
2024-09-05	WildVis: Open Source Visualizer for Million-Scale Chat Logs in the Wild	Yuntian Deng et.al.	2409.03753	null
2024-09-05	ArtiFade: Learning to Generate High-quality Subject from Blemished Images	Shuya Yang et.al.	2409.03745	null
2024-09-06	RAG based Question-Answering for Contextual Response Prediction System	Sriram Veturi et.al.	2409.03708	null
2024-09-05	RealisHuman: A Two-Stage Approach for Refining Malformed Human Parts in Generated Images	Benzhi Wang et.al.	2409.03644	link
2024-09-05	DiffEVC: Any-to-Any Emotion Voice Conversion with Expressive Guidance	Hsing-Hang Chou et.al.	2409.03636	null
2024-09-05	Generalizing Linear Graphs and Bond Graph Models with Hetero-functional Graphs for System-of-Systems Engineering Applications	Ehsanoddin Ghorbanichemazkati et.al.	2409.03630	null
2024-09-05	TCDiff: Triple Condition Diffusion Model with 3D Constraints for Stylizing Synthetic Faces	Bernardo Biesseck et.al.	2409.03600	link
2024-09-05	DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture	Qianlong Xiang et.al.	2409.03550	null
2024-09-05	Euclid preparation. Simulations and nonlinearities beyond $\Lambda$CDM. 2. Results from non-standard simulations	Euclid Collaboration et.al.	2409.03523	null
2024-09-05	Blended Latent Diffusion under Attention Control for Real-World Video Editing	Deyin Liu et.al.	2409.03514	null
2024-09-05	Physical Modelling of Piano Sound	Haifan Xie et.al.	2409.03481	null
2024-09-05	Data-free Distillation with Degradation-prompt Diffusion for Multi-weather Image Restoration	Pei Wang et.al.	2409.03455	null
2024-09-05	Rx Strategist: Prescription Verification using LLM Agents System	Phuc Phan Van et.al.	2409.03440	null
2024-09-05	KiloBot: A Programming Language for Deploying Perception-Guided Industrial Manipulators at Scale	Wei Gao et.al.	2409.03439	null
2024-09-04	HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts	Xinyu Liu et.al.	2409.02919	link
2024-09-04	Latent Watermarking of Audio Generative Models	Robin San Roman et.al.	2409.02915	null
2024-09-04	Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling	Kaiwen Zheng et.al.	2409.02908	null
2024-09-04	Configurable Foundation Models: Building LLMs from a Modular Perspective	Chaojun Xiao et.al.	2409.02877	null
2024-09-04	Look Into the LITE in Deep Learning for Time Series Classification	Ali Ismail-Fawaz et.al.	2409.02869	link
2024-09-04	Building a Scalable, Effective, and Steerable Search and Ranking Platform	Marjan Celikik et.al.	2409.02856	null
2024-09-04	Human-VDM: Learning Single-Image 3D Human Gaussian Splatting from Video Diffusion Models	Zhibin Liu et.al.	2409.02851	link
2024-09-04	Anomaly Detection in Offshore Open Radio Access Network Using Long Short-Term Memory Models on a Novel Artificial Intelligence-Driven Cloud-Native Data Platform	Abdelrahim Ahmad et.al.	2409.02849	null
2024-09-04	Multi-Track MusicLDM: Towards Versatile Music Generation with Latent Diffusion Model	Tornike Karchkhadze et.al.	2409.02845	null
2024-09-04	SNNAX -- Spiking Neural Networks in JAX	Jamie Lohoff et.al.	2409.02842	null
2024-09-04	Experimental Framework for Generating Reliable Ground Truth for Laryngeal Spatial Segmentation Tasks	Hamzeh Ghasemzadeh et.al.	2409.02809	null
2024-09-04	Creating a Gen-AI based Track and Trace Assistant MVP (SuperTracy) for PostNL	Mohammad Reshadati et.al.	2409.02711	null
2024-09-04	Rethinking HTG Evaluation: Bridging Generation and Recognition	Konstantina Nikolaidou et.al.	2409.02683	link
2024-09-04	Introduction to Machine Learning	Laurent Younes et.al.	2409.02668	null
2024-09-04	Creating Domain-Specific Translation Memories for Machine Translation Fine-tuning: The TRENCARD Bilingual Cardiology Corpus	Gokhan Dogru et.al.	2409.02667	null
2024-08-30	Generative AI Enables Medical Image Segmentation in Ultra Low-Data Regimes	Li Zhang et.al.	2408.17421	link
2024-08-30	Assessing Generative Language Models in Classification Tasks: Performance and Self-Evaluation Capabilities in the Environmental and Climate Change Domain	Francesca Grasso et.al.	2408.17362	link
2024-08-30	Subspace Diffusion Posterior Sampling for Travel-Time Tomography	Xiang Cao et.al.	2408.17333	null
2024-08-30	Structuring a Training Strategy to Robustify Perception Models with Realistic Image Augmentations	Ahmed Hammam et.al.	2408.17311	null
2024-08-30	Leveraging Deep Generative Model For Computational Protein Design And Optimization	Boqiao Lai et.al.	2408.17241	null
2024-08-30	Towards Symbolic XAI -- Explanation Through Human Understandable Logical Relationships Between Features	Thomas Schnake et.al.	2408.17198	null
2024-09-02	Leveraging Blockchain and ANFIS for Optimal Supply Chain Management	Amirfarhad Farhadi et.al.	2408.17161	null
2024-08-30	Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning	Xiaoye Qu et.al.	2408.17150	link
2024-08-30	Flow Matching for Optimal Reaction Coordinates of Biomolecular System	Mingyuan Zhang et.al.	2408.17139	link
2024-08-30	Temporal and Interactive Modeling for Efficient Human-Human Motion Generation	Yabiao Wang et.al.	2408.17135	null
2024-09-02	RISSOLE: Parameter-efficient Diffusion Models via Block-wise Generation and Retrieval-Guidance	Avideep Mukherjee et.al.	2408.17095	null
2024-08-30	FissionVAE: Federated Non-IID Image Generation with Latent Space and Decoder Decomposition	Chen Hu et.al.	2408.17090	link
2024-08-30	Approximately Invertible Neural Network for Learned Image Compression	Yanbo Gao et.al.	2408.17073	null
2024-09-02	Instant Adversarial Purification with Adversarial Consistency Distillation	Chun Tong Lei et.al.	2408.17064	null
2024-08-30	Text-to-Image Generation Via Energy-Based CLIP	Roy Ganz et.al.	2408.17046	null
2024-08-29	ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model	Fangfu Liu et.al.	2408.16767	null
2024-08-29	CSGO: Content-Style Composition in Text-to-Image Generation	Peng Xing et.al.	2408.16766	null
2024-08-29	A Score-Based Density Formula, with Applications in Diffusion Generative Models	Gen Li et.al.	2408.16765	null
2024-08-29	UV-free Texture Generation with Denoising and Geodesic Heat Diffusions	Simone Foti et.al.	2408.16762	link
2024-08-29	One-Shot Learning Meets Depth Diffusion in Multi-Object Videos	Anisha Jain et.al.	2408.16704	null
2024-08-29	VMC: A Grammar for Visualizing Statistical Model Checks	Ziyang Guo et.al.	2408.16702	null
2024-08-29	GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models	Moreno D'Incà et.al.	2408.16700	link
2024-08-29	Optimization Models for the Quadratic Traveling Salesperson Problem	Yuxiao Chen et.al.	2408.16680	null
2024-08-29	DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving	Yongjie Fu et.al.	2408.16647	null
2024-08-29	RLCP: A Reinforcement Learning-based Copyright Protection Method for Text-to-Image Diffusion Model	Zhuan Shi et.al.	2408.16634	null
2024-08-28	TEDRA: Text-based Editing of Dynamic and Photoreal Actors	Basavaraj Sunagad et.al.	2408.15995	null
2024-08-28	Distribution Backtracking Builds A Faster Convergence Trajectory for One-step Diffusion Distillation	Shengyuan Zhang et.al.	2408.15991	link
2024-08-28	Thoughtseeds: Evolutionary Priors, Nested Markov Blankets, and the Emergence of Embodied Cognition	Prakash Chandra Kavi et.al.	2408.15982	null
2024-08-28	Stability of Primal-Dual Gradient Flow Dynamics for Multi-Block Convex Optimization Problems	Ibrahim K. Ozaslan et.al.	2408.15969	null
2024-08-28	MetaGFN: Exploring Distant Modes with Adapted Metadynamics for Continuous GFlowNets	Dominic Phillips et.al.	2408.15905	null
2024-08-28	Gen-Swarms: Adapting Deep Generative Models to Swarms of Drones	Carlos Plou et.al.	2408.15899	null
2024-08-28	Airfoil Diffusion: Denoising Diffusion Model For Conditional Airfoil Generation	Reid Graves et.al.	2408.15898	link
2024-08-28	Disentangled Diffusion Autoencoder for Harmonization of Multi-site Neuroimaging Data	Ayodeji Ijishakin et.al.	2408.15890	null
2024-08-29	Recent Decade's Power Outage Data Reveals the Increasing Vulnerability of U.S. Power Infrastructure	Bo Li et.al.	2408.15882	null
2024-08-28	GenDDS: Generating Diverse Driving Video Scenarios with Prompt-to-Video Generative Model	Yongjie Fu et.al.	2408.15868	null
2024-08-27	GenRec: Unifying Video Generation and Recognition with Diffusion Models	Zejia Weng et.al.	2408.15241	link
2024-08-27	Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation	Xiaojuan Wang et.al.	2408.15239	null
2024-08-27	Simulation of Stochastic Discrete Dislocation Dynamics in Ductile Vs Brittle Materials	Santosh Chhetri et.al.	2408.15157	null
2024-08-27	How transformers learn structured data: insights from hierarchical filtering	Jerome Garnier-Brun et.al.	2408.15138	link
2024-08-27	DIFR3CT: Latent Diffusion for Probabilistic 3D CT Reconstruction from Few Planar X-Rays	Yiran Sun et.al.	2408.15118	link
2024-08-27	Data-Driven Nonlinear Deformation Design of 3D-Printable Shells	Samuel Silverman et.al.	2408.15097	link
2024-08-27	Constrained Diffusion Models via Dual Training	Shervin Khalafi et.al.	2408.15094	null
2024-08-27	LN-Gen: Rectal Lymph Nodes Generation via Anatomical Features	Weidong Guo et.al.	2408.14977	null
2024-08-27	MegActor-$\Sigma$: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer	Shurong Yang et.al.	2408.14975	null
2024-08-27	Integrated Bundling and Pricing of Unique Items	Maxime Bouscary et.al.	2408.14913	null
2024-08-26	K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences	Zhikai Li et.al.	2408.14468	null
2024-08-26	Uncovering Knowledge Gaps in Radiology Report Generation Models through Knowledge Graphs	Xiaoman Zhang et.al.	2408.14397	link
2024-08-26	Reprogramming Foundational Large Language Models (LLMs) for Enterprise Adoption for Spatio-Temporal Forecasting Applications: Unveiling a New Era in Copilot-Guided Cross-Modal Time Series Representation Learning	Sakhinana Sagar Srinivas et.al.	2408.14387	null
2024-08-26	GR-MG: Leveraging Partially Annotated Data via Multi-Modal Goal Conditioned Policy	Peiyan Li et.al.	2408.14368	link
2024-08-27	Foundation Models for Music: A Survey	Yinghao Ma et.al.	2408.14340	link
2024-08-26	Automated Machine Learning in Insurance	Panyi Dong et.al.	2408.14331	link
2024-08-26	LLM-3D Print: Large Language Models To Monitor and Control 3D Printing	Yayati Jadhav et.al.	2408.14307	null
2024-08-26	Learning Local Pattern Modularization for Point Cloud Reconstruction from Unseen Classes	Chao Chen et.al.	2408.14279	null
2024-08-26	Towards Synthetic Trace Generation of Modeling Operations using In-Context Learning Approach	Vittoriano Muttillo et.al.	2408.14259	null
2024-08-27	Text3DAug -- Prompted Instance Augmentation for LiDAR Perception	Laurenz Reichardt et.al.	2408.14253	link
2024-08-23	How Diffusion Models Learn to Factorize and Compose	Qiyao Liang et.al.	2408.13256	null
2024-08-23	Foundational Model for Electron Micrograph Analysis: Instruction-Tuning Small-Scale Language-and-Vision Assistant for Enterprise Adoption	Sakhinana Sagar Srinivas et.al.	2408.13248	null
2024-08-23	CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities	Tao Wu et.al.	2408.13239	null
2024-08-23	Social Welfare Maximization for Federated Learning with Network Effects	Xiang Li et.al.	2408.13223	null
2024-08-23	Instruct-DeBERTa: A Hybrid Approach for Aspect-based Sentiment Analysis on Textual Reviews	Dineth Jayakody et.al.	2408.13202	null
2024-08-23	IFH: a Diffusion Framework for Flexible Design of Graph Generative Models	Samuel Cognolato et.al.	2408.13194	link
2024-08-23	Deep Learning for Lung Disease Classification Using Transfer Learning and a Customized CNN Architecture with Attention	Xiaoyi Liu et.al.	2408.13180	null
2024-08-26	Focus on Neighbors and Know the Whole: Towards Consistent Dense Multiview Text-to-Image Generator for 3D Creation	Bonan Li et.al.	2408.13149	null
2024-08-23	Diffusion-based Episodes Augmentation for Offline Multi-Agent Reinforcement Learning	Jihwan Oh et.al.	2408.13092	null
2024-08-23	General Intelligent Imaging and Uncertainty Quantification by Deterministic Diffusion Model	Weiru Fan et.al.	2408.13061	null
2024-08-22	xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations	Can Qin et.al.	2408.12590	null
2024-08-22	ssProp: Energy-Efficient Training for Convolutional Neural Networks with Scheduled Sparse Back Propagation	Lujia Zhong et.al.	2408.12561	link
2024-08-22	Show-o: One Single Transformer to Unify Multimodal Understanding and Generation	Jinheng Xie et.al.	2408.12528	null
2024-08-22	FlexEdit: Marrying Free-Shape Masks to VLLM for Flexible Image Editing	Jue Wang et.al.	2408.12429	link
2024-08-22	Enhanced Infield Agriculture with Interpretable Machine Learning Approaches for Crop Classification	Sudi Murindanyi et.al.	2408.12426	null
2024-08-22	4D Diffusion for Dynamic Protein Structure Prediction with Reference Guided Motion Alignment	Kaihui Cheng et.al.	2408.12419	null
2024-08-22	CODE: Confident Ordinary Differential Editing	Bastien van Delft et.al.	2408.12418	link
2024-08-22	Dynamic PDB: A New Dataset and a SE(3) Model Extension by Integrating Dynamic Behaviors and Physical Properties in Protein Structures	Ce Liu et.al.	2408.12413	null
2024-08-22	A Stable Polygamy Approach to Spectrum Access with Channel Reuse	Dan Ben Ami et.al.	2408.12402	null
2024-08-22	Multi-Style Facial Sketch Synthesis through Masked Generative Modeling	Bowen Sun et.al.	2408.12400	null
2024-08-21	Pixel Is Not A Barrier: An Effective Evasion Attack for Pixel-Domain Diffusion Models	Chun-Yen Shih et.al.	2408.11810	null
2024-08-21	ACE: A Cross-Platform Visual-Exoskeletons System for Low-Cost Dexterous Teleoperation	Shiqi Yang et.al.	2408.11805	null
2024-08-21	DreamFactory: Pioneering Multi-Scene Long Video Generation with a Multi-Agent Framework	Zhifei Xie et.al.	2408.11788	null
2024-08-21	Timeline and Boundary Guided Diffusion Network for Video Shadow Detection	Haipeng Zhou et.al.	2408.11785	link
2024-08-21	Sum of Squares Circuits	Lorenzo Loconte et.al.	2408.11778	null
2024-08-21	Leveraging Fine-Tuned Retrieval-Augmented Generation with Long-Context Support: For 3GPP Standards	Omar Erak et.al.	2408.11775	link
2024-08-21	D-RMGPT: Robot-assisted collaborative tasks driven by large multimodal models	M. Forlini et.al.	2408.11761	null
2024-08-21	JieHua Paintings Style Feature Extracting Model using Stable Diffusion with ControlNet	Yujia Gu et.al.	2408.11744	null
2024-08-21	Enhancing Cross-Modal Medical Image Segmentation through Compositionality	Aniek Eijpe et.al.	2408.11733	link
2024-08-21	AI-assisted Automated Short Answer Grading of Handwritten University Level Mathematics Exams	Tianyi Liu et.al.	2408.11728	null
2024-08-20	Reconciling Methodological Paradigms: Employing Large Language Models as Novice Qualitative Research Assistants in Talent Management Research	Sreyoshi Bhaduri et.al.	2408.11043	null
2024-08-20	Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model	Chunting Zhou et.al.	2408.11039	null
2024-08-20	Full Detector Simulation of a Projective Dual-Readout Segmented Crystal Electromagnetic Calorimeter with Precision Timing	Wonyong Chung et.al.	2408.11027	null
2024-08-20	MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning	Haoning Wu et.al.	2408.11001	link
2024-08-20	GreediRIS: Scalable Influence Maximization using Distributed Streaming Maximum Cover	Reet Barik et.al.	2408.10982	null
2024-08-21	Assortment Optimization Under History-Dependent Effects	Taotao He et.al.	2408.10967	null
2024-08-20	Kilometer-Scale Convection Allowing Model Emulation using Generative Diffusion Modeling	Jaideep Pathak et.al.	2408.10958	null
2024-08-20	SysBench: Can Large Language Models Follow System Messages?	Yanzhao Qin et.al.	2408.10943	link
2024-08-20	A Closer Look at Data Augmentation Strategies for Finetuning-Based Low/Few-Shot Object Detection	Vladislav Li et.al.	2408.10940	null
2024-08-20	Large Point-to-Gaussian Model for Image-to-3D Generation	Longfei Lu et.al.	2408.10935	null
2024-08-19	MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model	Minghua Liu et.al.	2408.10198	null
2024-08-19	SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views	Chao Xu et.al.	2408.10195	null
2024-08-19	Customizing Language Models with Instance-wise LoRA for Sequential Recommendation	Xiaoyu Kong et.al.	2408.10159	link
2024-08-19	Advancing Voice Cloning for Nepali: Leveraging Transfer Learning in a Low-Resource Language	Manjil Karki et.al.	2408.10128	null
2024-08-19	Learning Precise Affordances from Egocentric Videos for Robotic Manipulation	Gen Li et.al.	2408.10123	null
2024-08-19	Convert and Speak: Zero-shot Accent Conversion with Minimum Supervision	Zhijun Jia et.al.	2408.10096	null
2024-08-19	Stacked Intelligent Metasurfaces for Integrated Sensing and Communications	Haoxian Niu et.al.	2408.10043	null
2024-08-19	General Impedance Modeling for Modular Multilevel Converter with Grid-forming and Grid-following Control	Chu Sun et.al.	2408.10017	null
2024-08-19	Uniting contrastive and generative learning for event sequences models	Aleksandr Yugay et.al.	2408.09995	null
2024-08-19	Multi-layer diffusion model of photovoltaic installations	Tomasz Weron et.al.	2408.09904	null
2024-08-16	Automated High-throughput Organic Crystal Structure Prediction via Population-based Sampling	Qiang Zhu et.al.	2408.08843	link
2024-08-16	PFDiff: Training-free Acceleration of Diffusion Models through the Gradient Guidance of Past and Future	Guangyi Wang et.al.	2408.08822	null
2024-08-16	A Unified Automata-Theoretic Approach to LTLf Modulo Theories (Extended Version)	Marco Faella et.al.	2408.08817	null
2024-08-16	EmoDynamiX: Emotional Support Dialogue Strategy Prediction by Modelling MiXed Emotions and Discourse Dynamics	Chenwei Wan et.al.	2408.08782	link
2024-08-16	Comparative Analysis of Generative Models: Enhancing Image Synthesis with VAEs, GANs, and Stable Diffusion	Sanchayan Vivekananthan et.al.	2408.08751	null
2024-08-16	The Blessing of Strategic Customers in Personalized Pricing	Zhi Chen et.al.	2408.08738	null
2024-08-16	ChatZero: Zero-shot Cross-Lingual Dialogue Generation via Pseudo-Target Language	Yongkang Liu et.al.	2408.08724	null
2024-08-16	An End-to-End Model for Photo-Sharing Multi-modal Dialogue Generation	Peiming Guo et.al.	2408.08650	null
2024-08-16	Modeling the Neonatal Brain Development Using Implicit Neural Representations	Florentin Bieder et.al.	2408.08647	link
2024-08-16	Sampling effects on Lasso estimation of drift functions in high-dimensional diffusion processes	Chiara Amorino et.al.	2408.08638	null
2024-08-15	Understanding the Local Geometry of Generative Model Manifolds	Ahmed Imtiaz Humayun et.al.	2408.08307	null
2024-08-15	Accelerated Image-Aware Generative Diffusion Modeling	Tanmay Asthana et.al.	2408.08306	null
2024-08-15	Marker or Markerless? Mode-Switchable Optical Tactile Sensing for Diverse Robot Tasks	Ni Ou et.al.	2408.08276	null
2024-08-15	mhGPT: A Lightweight Generative Pre-Trained Transformer for Mental Health Text Analysis	Dae-young Kim et.al.	2408.08261	null
2024-08-15	Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding	Xiner Li et.al.	2408.08252	link
2024-08-15	Picosecond laser pulses for quantum dot-microcavity based single photon generation by cascaded electro-optic modulation of a narrow-linewidth laser	Mio Poortvliet et.al.	2408.08213	null
2024-08-15	Not Every Image is Worth a Thousand Words: Quantifying Originality in Stable Diffusion	Adi Haviv et.al.	2408.08184	null
2024-08-15	Impact of Comprehensive Data Preprocessing on Predictive Modelling of COVID-19 Mortality	Sangita Das et.al.	2408.08142	link
2024-08-15	Decoding Memes: A Comparative Study of Machine Learning Models for Template Identification	Levente Murgás et.al.	2408.08126	link
2024-08-15	When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding	Pingping Zhang et.al.	2408.08093	null
2024-08-14	Detecting Near-Duplicate Face Images	Sudipta Banerjee et.al.	2408.07689	link
2024-08-14	Composing Automatic Differentiation with Custom Derivatives of Higher-Order Functions	Sam Estep et.al.	2408.07683	null
2024-08-14	Drug Discovery SMILES-to-Pharmacokinetics Diffusion Models with Deep Molecular Understanding	Bing Hu et.al.	2408.07636	null
2024-08-14	Anisotropic Diffusion Model of Communication in 2D Biofilm	Yanahan Paramalingam et.al.	2408.07626	null
2024-08-14	Neural Quantum States and Peaked Molecular Wave Functions: Curse or Blessing?	Aleksei Malyshev et.al.	2408.07625	null
2024-08-14	MatterGPT: A Generative Transformer for Multi-Property Inverse Design of Solid-State Materials	Yan Chen et.al.	2408.07608	null
2024-08-14	PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation	Sang-Hoon Lee et.al.	2408.07547	link
2024-08-14	New Curriculum, New Chance -- Retrieval Augmented Generation for Lesson Planning in Ugandan Secondary Schools. Prototype Quality Evaluation	Simon Kloker et.al.	2408.07542	null
2024-08-14	DifuzCam: Replacing Camera Lens with a Mask and a Diffusion Model	Erez Yosef et.al.	2408.07541	null
2024-08-14	Towards Real-time Video Compressive Sensing on Mobile Devices	Miao Cao et.al.	2408.07530	link
2024-08-13	Imagen 3	Imagen-Team-Google et.al.	2408.07009	null
2024-08-13	Low-Bitwidth Floating Point Quantization for Efficient High-Quality Diffusion Models	Cheng Chen et.al.	2408.06995	null
2024-08-13	DCMSA: Multi-Head Self-Attention Mechanism Based on Deformable Convolution For Seismic Data Denoising	Wang Mingwei et.al.	2408.06963	null
2024-08-13	Neural Speech and Audio Coding	Minje Kim et.al.	2408.06954	null
2024-08-13	Diffusion Model for Slate Recommendation	Federico Tomasi et.al.	2408.06883	null
2024-08-13	Efficient Search for Customized Activation Functions with Gradient Descent	Lukas Strack et.al.	2408.06820	link
2024-08-13	Enhancing Diabetic Retinopathy Diagnosis: A Lightweight CNN Architecture for Efficient Exudate Detection in Retinal Fundus Images	Mujadded Al Rabbani Alif et.al.	2408.06784	null
2024-08-13	Improving Synthetic Image Detection Towards Generalization: An Image Transformation Perspective	Ouxiang Li et.al.	2408.06741	link
2024-08-13	DiffLoRA: Generating Personalized Low-Rank Adaptation Weights with Diffusion	Yujia Wu et.al.	2408.06740	null
2024-08-13	Multimodal Analysis of White Blood Cell Differentiation in Acute Myeloid Leukemia Patients using a β-Variational Autoencoder	Gizem Mert et.al.	2408.06720	null
2024-08-12	The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery	Chris Lu et.al.	2408.06292	link
2024-08-12	Open-Source Molecular Processing Pipeline for Generating Molecules	Shreyas V et.al.	2408.06261	null
2024-08-12	3D Reconstruction of Protein Structures from Multi-view AFM Images using Neural Radiance Fields (NeRFs)	Jaydeep Rade et.al.	2408.06244	null
2024-08-12	Cislunar Constellation Design for Space Situational Awareness with Time-Expanded Facility Location Problem	Yuri Shimane et.al.	2408.06238	null
2024-08-12	Novel View Synthesis from a Single Image with Pretrained Diffusion Guidance	Taewon Kang et.al.	2408.06157	null
2024-08-12	LipidBERT: A Lipid Language Model Pre-trained on METiS de novo Lipid Library	Tianhao Yu et.al.	2408.06150	null
2024-08-12	Efficient and Scalable Point Cloud Generation with Sparse Point-Voxel Diffusion Models	Ioannis Romanelis et.al.	2408.06145	link
2024-08-12	Med42-v2: A Suite of Clinical LLMs	Clément Christophe et.al.	2408.06142	null
2024-08-12	Five Pitfalls When Assessing Synthetic Medical Images with Reference Metrics	Melanie Dohmen et.al.	2408.06075	null
2024-08-12	CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer	Zhuoyi Yang et.al.	2408.06072	link
2024-08-09	Multi-Garment Customized Model Generation	Yichen Liu et.al.	2408.05206	null
2024-08-09	TaSL: Task Skill Localization and Consolidation for Language Model Continual Learning	Yujie Feng et.al.	2408.05200	link
2024-08-09	Cell Morphology-Guided Small Molecule Generation with GFlowNets	Stephen Zhewen Lu et.al.	2408.05196	link
2024-08-09	Lithography-free patterning of chalcogenide materials for integrated photonic devices	Zhen Hu et.al.	2408.05099	null
2024-08-09	Social contagion under hybrid interactions	Xincheng Shu et.al.	2408.05050	null
2024-08-09	Infrared Beam-shaping on Demand via Tailored Geometric Phase Metasurfaces employing the Plasmonic Phase-Change Material In3SbTe2	Lukas Conrads et.al.	2408.05044	null
2024-08-09	Collaborative Static-Dynamic Teaching: A Semi-Supervised Framework for Stripe-Like Space Target Detection	Zijian Zhu et.al.	2408.05029	null
2024-08-09	Retrieval-augmented code completion for local projects using large language models	Marko Hostnik et.al.	2408.05026	null
2024-08-09	DreamCouple: Exploring High Quality Text-to-3D Generation Via Rectified Flow	Hangyu Li et.al.	2408.05008	null
2024-08-09	Pay Attention To Mean Fields For Point Cloud Generation	Benno Käch et.al.	2408.04997	link
2024-08-08	Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics	Ruining Li et.al.	2408.04631	null
2024-08-08	Transformer Explainer: Interactive Learning of Text-Generative Models	Aeree Cho et.al.	2408.04619	null
2024-08-08	Sketch2Scene: Automatic Generation of Interactive 3D Game Scenes from User's Casual Sketches	Yongzhi Xu et.al.	2408.04567	null
2024-08-08	Bias-Aware Low-Rank Adaptation: Mitigating Catastrophic Inheritance of Large Language Models	Yupeng Chang et.al.	2408.04556	link
2024-08-08	On the Asymptotic Convergence of Subgraph Generated Models	Xinchen Xu et.al.	2408.04541	null
2024-08-08	AExGym: Benchmarks and Environments for Adaptive Experimentation	Jimmy Wang et.al.	2408.04531	null
2024-08-08	NFDI4Health workflow and service for synthetic data generation, assessment and risk management	Sobhan Moazemi et.al.	2408.04478	null
2024-08-08	Deep Generative Models in Robotics: A Survey on Learning from Multimodal Demonstrations	Julen Urain et.al.	2408.04380	null
2024-08-08	Making sense of AI systems development	Mateusz Dolata et.al.	2408.04311	null
2024-08-08	AI-Driven Chatbot for Intrusion Detection in Edge Networks: Enhancing Cybersecurity with Ethical User Consent	Mugheez Asif et.al.	2408.04281	null
2024-08-07	Prospects for using drones to test formation-flying CubeSat concepts, and other astronomical applications	John D. Monnier et.al.	2408.03911	null
2024-08-07	Hate Speech Detection and Classification in Amharic Text with Deep Learning	Samuel Minale Gashe et.al.	2408.03849	null
2024-08-07	WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models	Prannaya Gupta et.al.	2408.03837	link
2024-08-07	A broken duet: multistable dynamics of dyadic interactions	Johan Medrano et.al.	2408.03809	link
2024-08-07	Navigating the Human Maze: Real-Time Robot Pathfinding with Generative Imitation Learning	Martin Moder et.al.	2408.03807	link
2024-08-07	Data Generation Scheme for Thermal Modality with Edge-Guided Adversarial Conditional Diffusion Model	Guoqing Zhu et.al.	2408.03748	link
2024-08-07	Local Topology Measures of Contextual Language Model Latent Spaces With Applications to Dialogue Term Extraction	Benjamin Matthias Ruppik et.al.	2408.03706	null
2024-08-07	Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling	Zilyu Ye et.al.	2408.03695	link
2024-08-07	Unsupervised Detection of Fetal Brain Anomalies using Denoising Diffusion Models	Markus Ditlev Sjøgren Olsen et.al.	2408.03654	null
2024-08-07	Goal-oriented Semantic Communication for the Metaverse Application	Zhe Wang et.al.	2408.03646	null
2024-08-06	MDT-A2G: Exploring Masked Diffusion Transformers for Co-Speech Gesture Generation	Xiaofeng Mao et.al.	2408.03312	null
2024-08-06	IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning using Instruct Prompts	Ciara Rowles et.al.	2408.03209	null
2024-08-06	Personalizing Federated Instrument Segmentation with Visual Trait Priors in Robotic Surgery	Jialang Xu et.al.	2408.03208	null
2024-08-06	An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion	Xingguang Yan et.al.	2408.03178	null
2024-08-06	Iterative CT Reconstruction via Latent Variable Optimization of Shallow Diffusion Models	Sho Ozaki et.al.	2408.03156	null
2024-08-06	Enhancing Twitter Bot Detection via Multimodal Invariant Representations	Jibing Gong et.al.	2408.03096	null
2024-08-06	Analysis of Argument Structure Constructions in a Deep Recurrent Language Model	Pegah Ramezani et.al.	2408.03062	null
2024-08-06	OpenOmni: A Collaborative Open Source Tool for Building Future-Ready Multimodal Conversational Agents	Qiang Sun et.al.	2408.03047	link
2024-08-06	Targeted Visual Prompting for Medical Visual Question Answering	Sergio Tascon-Morales et.al.	2408.03043	link
2024-08-06	Training-Free Condition Video Diffusion Models for single frame Spatial-Semantic Echocardiogram Synthesis	Van Phi Nguyen et.al.	2408.03035	link
2024-08-05	Command-line Obfuscation Detection using Small Language Models	Vojtech Outrata et.al.	2408.02637	null
2024-08-05	VidGen-1M: A Large-Scale Dataset for Text-to-video Generation	Zhiyu Tan et.al.	2408.02629	null
2024-08-05	YOWOv3: An Efficient and Generalized Framework for Human Action Detection and Recognition	Duc Manh Nguyen Dang et.al.	2408.02623	link
2024-08-05	LaMamba-Diff: Linear-Time High-Fidelity Diffusion Models Based on Local Attention and Mamba	Yunxiang Fu et.al.	2408.02615	link
2024-08-05	MetaParticles: Computationally engineered nanomaterials with tunable and responsive properties	Massimiliano Paesani et.al.	2408.02564	null
2024-08-05	Fairness and Bias Mitigation in Computer Vision: A Survey	Sepehr Dehdashtian et.al.	2408.02464	null
2024-08-05	TGS: Trajectory Generation and Selection using Vision Language Models in Mapless Outdoor Environments	Daeun Song et.al.	2408.02454	null
2024-08-05	Why Are My Prompts Leaked? Unraveling Prompt Extraction Threats in Customized Large Language Models	Zi Liang et.al.	2408.02416	link
2024-08-05	Multi-weather Cross-view Geo-localization Using Denoising Diffusion Models	Tongtong Feng et.al.	2408.02408	null
2024-08-05	A Few-Shot Approach for Relation Extraction Domain Adaptation using Large Language Models	Vanni Zavarella et.al.	2408.02377	null
2024-08-02	Conditional LoRA Parameter Generation	Xiaolong Jin et.al.	2408.01415	null
2024-08-02	Autoencoders in Function Space	Justin Bunker et.al.	2408.01362	link
2024-08-02	MCGMark: An Encodable and Robust Online Watermark for LLM-Generated Malicious Code	Kaiwen Ning et.al.	2408.01354	link
2024-08-02	TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling	Dong Huo et.al.	2408.01291	null
2024-08-02	A General Framework to Boost 3D GS Initialization for Text-to-3D Generation by Lexical Richness	Lutao Jiang et.al.	2408.01269	null
2024-08-02	Exchange control in a MOS double quantum dot made using a 300 mm wafer process	Jacob F. Chittock-Wood et.al.	2408.01241	null
2024-08-02	CLIP4Sketch: Enhancing Sketch to Mugshot Matching through Dataset Augmentation using Diffusion Models	Kushal Kumar Jain et.al.	2408.01233	null
2024-08-02	Reality Fusion: Robust Real-time Immersive Mobile Robot Teleoperation with Volumetric Visual Data Fusion	Ke Li et.al.	2408.01225	link
2024-08-02	PSP-GEN: Stochastic inversion of the Process-Structure-Property chain in materials design through deep, generative probabilistic modeling	Yaohua Zang et.al.	2408.01114	null
2024-08-02	Six Dragons Fly Again: Reviving 15th-Century Korean Court Music with Transformers and Novel Encoding	Danbinaerin Han et.al.	2408.01096	link
2024-08-01	Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation	Yixiao Wang et.al.	2408.00766	null
2024-08-01	Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention	Susung Hong et.al.	2408.00760	link
2024-08-01	DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency	Jovan Stojkovic et.al.	2408.00741	null
2024-08-01	TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models	Gilad Deutch et.al.	2408.00735	null
2024-08-01	A Natural Language Processing Framework for Hotel Recommendation Based on Users' Text Reviews	Lavrentia Aravani et.al.	2408.00716	null
2024-08-02	Reinforcement Learning applied to Insurance Portfolio Pursuit	Edward James Young et.al.	2408.00713	link
2024-08-01	MotionFix: Text-Driven 3D Human Motion Editing	Nikos Athanasiou et.al.	2408.00712	null
2024-08-01	Synthetic dual image generation for reduction of labeling efforts in semantic segmentation of micrographs with a customized metric function	Matias Oscar Volman Stern et.al.	2408.00707	null
2024-08-01	AutoM3L: An Automated Multimodal Machine Learning Framework with Large Language Models	Daqin Luo et.al.	2408.00665	link
2024-08-01	Privacy-preserving datasets by capturing feature distributions with Conditional VAEs	Francesco Di Salvo et.al.	2408.00639	link
2024-07-31	Detecting, Explaining, and Mitigating Memorization in Diffusion Models	Yuxin Wen et.al.	2407.21720	link
2024-07-31	Tora: Trajectory-oriented Diffusion Transformer for Video Generation	Zhenghao Zhang et.al.	2407.21705	link
2024-07-31	Generative Diffusion Model for Seismic Imaging Improvement of Sparsely Acquired Data and Uncertainty Quantification	Xingchen Shi et.al.	2407.21683	null
2024-07-31	Quality Control for Radiology Report Generation Models via Auxiliary Auditing Components	Hermione Warr et.al.	2407.21638	null
2024-07-31	LLM-for-X: Application-agnostic Integration of Large Language Models to Support Personal Writing Workflows	Lukas Teufelberger et.al.	2407.21593	null
2024-07-31	Long-term investment and energy procurement risk management under uncertainty for an electrolytic green hydrogen producer	Owen Palmer et.al.	2407.21574	null
2024-07-31	Conditioned Prompt-Optimization for Continual Deepfake Detection	Francesco Laiti et.al.	2407.21554	link
2024-07-31	CXSimulator: A User Behavior Simulation using LLM Embeddings for Web-Marketing Campaign Assessment	Akira Kasuga et.al.	2407.21553	null
2024-07-31	Explainable and Controllable Motion Curve Guided Cardiac Ultrasound Video Generation	Junxuan Yu et.al.	2407.21490	null
2024-07-31	Maverick: Efficient and Accurate Coreference Resolution Defying Recent Trends	Giuliano Martinelli et.al.	2407.21489	link
2024-07-30	Matting by Generation	Zhixiang Wang et.al.	2407.21017	null
2024-07-30	Add-SD: Rational Generation without Manual Reference	Lingfeng Yang et.al.	2407.21016	link
2024-07-30	Integrating Agent-Based and Compartmental Models for Infectious Disease Modeling: A Novel Hybrid Approach	Inan Bostanci et.al.	2407.20993	null
2024-07-30	MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions	Xiaowei Chi et.al.	2407.20962	link
2024-07-30	Mitigating calibration errors from mutual coupling with time-domain filtering of 21 cm cosmological radio observations	N. Charles et.al.	2407.20923	null
2024-07-30	Impact of Geographical Separation on Spectrum Sharing Markets	Kangle Mu et.al.	2407.20909	null
2024-07-30	Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering	Yanpeng Zhao et.al.	2407.20908	link
2024-07-30	Vulnerabilities in AI-generated Image Detection: The Challenge of Adversarial Attacks	Yunfeng Diao et.al.	2407.20836	null
2024-07-30	Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning	Norman Di Palo et.al.	2407.20798	null
2024-07-30	SynthVLM: High-Efficiency and High-Quality Synthetic Data for Vision Language Models	Zheng Liu et.al.	2407.20756	link
2024-07-29	Specify and Edit: Overcoming Ambiguity in Text-Based Image Editing	Ekaterina Iakovleva et.al.	2407.20232	null
2024-07-29	LatentArtiFusion: An Effective and Efficient Histological Artifacts Restoration Framework	Zhenqi He et.al.	2407.20172	link
2024-07-29	Diffusion Feedback Helps CLIP See Better	Wenxuan Wang et.al.	2407.20171	link
2024-07-29	DDAP: Dual-Domain Anti-Personalization against Text-to-Image Diffusion Models	Jing Yang et.al.	2407.20141	null
2024-07-29	Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement Learning	Liyuan Mao et.al.	2407.20109	null
2024-07-29	On the significance of parameters and the projective level in the Choice and Collection axioms	Vladimir Kanovei et.al.	2407.20098	null
2024-07-29	Generative Diffusion Model Bootstraps Zero-shot Classification of Fetal Ultrasound Images In Underrepresented African Populations	Fangyijie Wang et.al.	2407.20072	link
2024-07-29	ImagiNet: A Multi-Content Dataset for Generalizable Synthetic Image Detection via Contrastive Learning	Delyan Boychev et.al.	2407.20020	link
2024-07-29	Reproducibility Study of "ITI-GEN: Inclusive Text-to-Image Generation"	Daniel Gallo Fernández et.al.	2407.19996	link
2024-07-29	HeadsetOff: Enabling Photorealistic Video Conferencing on Economical VR Headsets	Yili Jin et.al.	2407.19988	null
2024-07-26	Generative Adversarial Networks for Imputing Sparse Learning Performance	Liang Zhang et.al.	2407.18875	null
2024-07-26	Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment	Yuze Zheng et.al.	2407.18854	null
2024-07-26	Scalable Group Choreography via Variational Phase Manifold Learning	Nhat Le et.al.	2407.18839	null
2024-07-26	Revision of calcium and scandium abundances in Am stars based on NLTE calculations and comparison with diffusion stellar evolution models	L. I. Mashonkina et.al.	2407.18736	null
2024-07-26	BCTR: Bidirectional Conditioning Transformer for Scene Graph Generation	Peng Hao et.al.	2407.18715	null
2024-07-26	Q-gen: A Parameterized Quantum Circuit Generator	Yikai Mao et.al.	2407.18697	link
2024-07-26	Adversarial Robustification via Text-to-Image Diffusion Models	Daewon Choi et.al.	2407.18658	link
2024-07-26	Robust VAEs via Generating Process of Noise Augmented Data	Hiroo Irobe et.al.	2407.18632	null
2024-07-26	Denoising Lévy Probabilistic Models	Dario Shariatian et.al.	2407.18609	link
2024-07-26	How To Segment in 3D Using 2D Models: Automated 3D Segmentation of Prostate Cancer Metastatic Lesions on PET Volumes Using Multi-Angle Maximum Intensity Projections and Diffusion Models	Amirhosein Toosi et.al.	2407.18555	link
2024-07-25	RegionDrag: Fast Region-Based Image Editing with Diffusion Models	Jingyi Lu et.al.	2407.18247	null
2024-07-25	VGGHeads: A Large-Scale Synthetic Dataset for 3D Human Heads	Orest Kupyn et.al.	2407.18245	link
2024-07-25	CodedVO: Coded Visual Odometry	Sachin Shah et.al.	2407.18240	null
2024-07-25	SuperFlow: A Fully-Customized RTL-to-GDS Design Automation Flow for Adiabatic Quantum-Flux-Parametron Superconducting Circuits	Yanyue Xie et.al.	2407.18209	null
2024-07-25	Test2VA: Reusing GUI Test Cases for Voice Assistant Features Development in Mobile Applications	Garrett Weaver et.al.	2407.18155	null
2024-07-25	Self-supervised pre-training with diffusion model for few-shot landmark detection in x-ray images	Roberto Di Via et.al.	2407.18125	null
2024-07-25	Keypoint Promptable Re-Identification	Vladimir Somers et.al.	2407.18112	link
2024-07-25	SSTD: Stripe-Like Space Target Detection using Single-Point Supervision	Zijian Zhu et.al.	2407.18097	null
2024-07-25	Cross-Observatory Coordination with tilepy: A Novel Tool for Observations of Multi-Messenger Transient Events	Monica Seglar-Arroyo et.al.	2407.18076	null
2024-07-25	AttentionHand: Text-driven Controllable Hand Image Generation for 3D Hand Reconstruction in the Wild	Junho Park et.al.	2407.18034	link
2024-07-24	SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency	Yiming Xie et.al.	2407.17470	null
2024-07-24	BlueTempNet: A Temporal Multi-network Dataset of Social Interactions in Bluesky Social	Ujun Jeong et.al.	2407.17451	link
2024-07-24	ProvenanceWidgets: A Library of UI Control Elements to Track and Dynamically Overlay Analytic Provenance	Arpit Narechania et.al.	2407.17431	link
2024-07-24	CDDIP: Constrained Diffusion-Driven Deep Image Prior for Seismic Image Reconstruction	Paul Goyes-Peñafiel et.al.	2407.17402	link
2024-07-24	Cosmic ray susceptibility of the Terahertz Intensity Mapper detector arrays	Lun-Jun Liu et.al.	2407.17381	null
2024-07-24	ViPer: Visual Personalization of Generative Models via Individual Preference Learning	Sogand Salehi et.al.	2407.17365	null
2024-07-24	Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching	Yuyang Ding et.al.	2407.17349	link
2024-07-24	Quantum nonlocal modulation cancellation with distributed clocks	Stephen D. Chapman et.al.	2407.17330	null
2024-07-25	Enhanced Deep Learning Methodologies and MRI Selection Techniques for Dementia Diagnosis in the Elderly Population	Nikolaos Ntampakis et.al.	2407.17324	null
2024-07-24	Edge-Cloud Continuum Orchestration of Critical Services: A Smart-City Approach	Rodrigo Rosmaninho et.al.	2407.17314	null
2024-07-23	Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions	Fabio Tosi et.al.	2407.16698	link
2024-07-23	From Imitation to Refinement -- Residual RL for Precise Visual Assembly	Lars Ankile et.al.	2407.16677	null
2024-07-23	RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent	Huiyu Xu et.al.	2407.16667	null
2024-07-23	MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequence	Canyu Zhao et.al.	2407.16655	null
2024-07-23	Unveiling and Mitigating Bias in Audio Visual Segmentation	Peiwen Sun et.al.	2407.16638	null
2024-07-23	Knowledge-driven AI-generated data for accurate and interpretable breast ultrasound diagnoses	Haojun Yu et.al.	2407.16634	null
2024-07-23	GenRec: A Flexible Data Generator for Recommendations	Erica Coppolillo et.al.	2407.16594	null
2024-07-23	COALA: A Practical and Vision-Centric Federated Learning Platform	Weiming Zhuang et.al.	2407.16560	link
2024-07-23	DreamVTON: Customizing 3D Virtual Try-on with Personalized Diffusion Models	Zhenyu Xie et.al.	2407.16511	null
2024-07-23	qMRI Diffusor: Quantitative T1 Mapping of the Brain using a Denoising Diffusion Probabilistic Model	Shishuai Wang et.al.	2407.16477	null
2024-07-22	Artist: Aesthetically Controllable Text-Driven Stylization without Training	Ruixiang Jiang et.al.	2407.15842	link
2024-07-23	A Large-scale Benchmark Dataset for Commuting Origin-destination Matrix Generation	Can Rong et.al.	2407.15823	link
2024-07-22	Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget	Vikash Sehwag et.al.	2407.15811	null
2024-07-22	Quantum Computing for Phonon Scattering Effects on Thermal Conductivity	Xiangjun Tan et.al.	2407.15808	null
2024-07-22	Enhancing Mass Customization Manufacturing: Multiobjective Metaheuristic Algorithms for flow shop Production in Smart Industry	Diego Rossit et.al.	2407.15802	null
2024-07-22	Diffusion Model Based Resource Allocation Strategy in Ultra-Reliable Wireless Networked Control Systems	Amirhassan Babazadeh Darabi et.al.	2407.15784	null
2024-07-22	A Hamilton-Jacobi approach to road-field reaction-diffusion models	Christopher Henderson et.al.	2407.15760	null
2024-07-22	Diffusion for Out-of-Distribution Detection on Road Scenes and Beyond	Silvio Galesso et.al.	2407.15739	link
2024-07-22	DStruct2Design: Data and Benchmarks for Data Structure Driven Generative Floor Plan Design	Zhi Hao Luo et.al.	2407.15723	link
2024-07-22	Estimating Probability Densities with Transformer and Denoising Diffusion	Henry W. Leung et.al.	2407.15703	link
2024-07-19	DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks	Sarah Jabbour et.al.	2407.14509	null
2024-07-19	On Pre-training of Multimodal Language Models Customized for Chart Understanding	Wan-Cyuan Fan et.al.	2407.14506	null
2024-07-19	T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation	Kaiyue Sun et.al.	2407.14505	link
2024-07-19	M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models	Seunggeun Chi et.al.	2407.14502	null
2024-07-19	A Precision Cryogenic Positioning Stage for Detector Dithering and Flexure Compensation	Stephen A. Smee et.al.	2407.14493	null
2024-07-19	Contrastive Learning with Counterfactual Explanations for Radiology Report Generation	Mingjie Li et.al.	2407.14474	null
2024-07-19	Describe Data to get Science-Data-Ready Tooling: Awkward as a Target for Kaitai Struct YAML	Manasvi Goyal et.al.	2407.14461	null
2024-07-19	Co-synthesis of Histopathology Nuclei Image-Label Pairs using a Context-Conditioned Joint Diffusion Model	Seonghui Min et.al.	2407.14434	null
2024-07-19	Controllable and Efficient Multi-Class Pathology Nuclei Data Augmentation using Text-Conditioned Diffusion Models	Hyun-Jic Oh et.al.	2407.14426	null
2024-07-19	GLAudio Listens to the Sound of the Graph	Aurelio Sulser et.al.	2407.14387	link
2024-07-18	LogoSticker: Inserting Logos into Diffusion Models for Customized Generation	Mingkang Zhu et.al.	2407.13752	null
2024-07-18	Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review	Masatoshi Uehara et.al.	2407.13734	link
2024-07-18	Shaded Route Planning Using Active Segmentation and Identification of Satellite Images	Longchao Da et.al.	2407.13689	null
2024-07-18	PASTA: Controllable Part-Aware Shape Generation with Autoregressive Transformers	Songlin Li et.al.	2407.13677	link
2024-07-18	MeshSegmenter: Zero-Shot Mesh Semantic Segmentation via Texture Synthesis	Ziming Zhong et.al.	2407.13675	link
2024-07-18	Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models	Xiaoyu Zhu et.al.	2407.13642	null
2024-07-18	Training-free Composite Scene Generation for Layout-to-Image Synthesis	Jiaqi Liu et.al.	2407.13609	link
2024-07-18	EnergyDiff: Universal Time-Series Energy Data Generation using Diffusion Models	Nan Lin et.al.	2407.13538	null
2024-07-18	VeriQR: A Robustness Verification Tool for Quantum Machine Learning Models	Yanling Lin et.al.	2407.13533	null
2024-07-18	All Roads Lead to Rome? Exploring Representational Similarities Between Latent Spaces of Generative Image Models	Charumathi Badrinath et.al.	2407.13449	link
2024-07-17	SMooDi: Stylized Motion Diffusion Model	Lei Zhong et.al.	2407.12783	null
2024-07-17	VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control	Sherwin Bahmani et.al.	2407.12781	null
2024-07-17	Hallucination Index: An Image Quality Metric for Generative Reconstruction Models	Matthew Tivnan et.al.	2407.12780	null
2024-07-17	GroundUp: Rapid Sketch-Based 3D City Massing	Gizem Esra Unlu et.al.	2407.12739	null
2024-07-17	EchoSight: Advancing Visual-Language Models with Wiki Knowledge	Yibin Yan et.al.	2407.12735	null
2024-07-17	NL2Contact: Natural Language Guided 3D Hand-Object Contact Modeling with Diffusion Model	Zhongqun Zhang et.al.	2407.12727	null
2024-07-17	An Evaluation of Continual Learning for Advanced Node Semiconductor Defect Inspection	Amit Prasad et.al.	2407.12724	null
2024-07-17	Unlocking planetesimal magnetic field histories: a refined, versatile model for thermal evolution and dynamo generation	Hannah R. Sanderson et.al.	2407.12721	null
2024-07-17	SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow	Yuanzhi Zhu et.al.	2407.12718	link
2024-07-17	Teleoperation in Robot-assisted MIS with Adaptive RCM via Admittance Control	Ehsan Nasiri et.al.	2407.12711	null
2024-07-16	Efficient Training with Denoised Neural Weights	Yifan Gong et.al.	2407.11966	null
2024-07-16	UrbanWorld: An Urban World Model for 3D City Generation	Yu Shang et.al.	2407.11965	link
2024-07-16	Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design	Leo Klarner et.al.	2407.11942	link
2024-07-16	Code Documentation and Analysis to Secure Software Development	Paul Attie et.al.	2407.11934	null
2024-07-16	Global Optimisation of Black-Box Functions with Generative Models in the Wasserstein Space	Tigran Ramazyan et.al.	2407.11917	link
2024-07-16	Quantised Global Autoencoder: A Holistic Approach to Representing Visual Data	Tim Elsner et.al.	2407.11913	null
2024-07-16	Data-Juicer Sandbox: A Comprehensive Suite for Multimodal Data-Model Co-development	Daoyuan Chen et.al.	2407.11784	link
2024-07-16	Diffusion-driven self-assembly of emerin nanodomains at the nuclear envelope	Carlos D. Alas et.al.	2407.11758	null
2024-07-16	Generating Multi-Modal and Multi-Attribute Single-Cell Counts with CFGen	Alessandro Palma et.al.	2407.11734	link
2024-07-16	Theoretical Insights into CycleGAN: Analyzing Approximation and Estimation Errors in Unpaired Data Generation	Luwei Sun et.al.	2407.11678	null
2024-07-15	Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion	Yongyuan Liang et.al.	2407.10973	null
2024-07-15	Fast Matrix Multiplications for Lookup Table-Quantized LLMs	Han Guo et.al.	2407.10960	link
2024-07-15	InVi: Object Insertion In Videos Using Off-the-Shelf Diffusion Models	Nirat Saini et.al.	2407.10958	null
2024-07-16	DataDream: Few-shot Guided Dataset Generation	Jae Myung Kim et.al.	2407.10910	link
2024-07-15	Optical Diffusion Models for Image Generation	Ilker Oguz et.al.	2407.10897	null
2024-07-15	R3D-AD: Reconstruction via Diffusion for 3D Anomaly Detection	Zheyuan Zhou et.al.	2407.10862	null
2024-07-15	Physics-Inspired Generative Models in Medical Imaging: A Review	Dennis Hein et.al.	2407.10856	null
2024-07-15	Inferring dark energy properties from the scale factor parametrisation	Upala Mukhopadhayay et.al.	2407.10845	null
2024-07-15	MoE-DiffIR: Task-customized Diffusion Priors for Universal Compressed Image Restoration	Yulin Ren et.al.	2407.10833	null
2024-07-15	Foundational Autoraters: Taming Large Language Models for Better Automatic Evaluation	Tu Vu et.al.	2407.10817	null
2024-07-12	StyleSplat: 3D Object Style Transfer with Gaussian Splatting	Sahil Jain et.al.	2407.09473	null
2024-07-12	FairyLandAI: Personalized Fairy Tales utilizing ChatGPT and DALLE-3	Georgios Makridis et.al.	2407.09467	null
2024-07-12	The $\mu\mathcal{G}$ Language for Programming Graph Neural Networks	Matteo Belenchia et.al.	2407.09441	null
2024-07-12	Graph Neural Network Causal Explanation via Neural Causal Models	Arman Behnam et.al.	2407.09378	link
2024-07-12	Computationally Efficient Estimation of Large Probit Models	Patrick Ding et.al.	2407.09371	null
2024-07-12	Is Contrasting All You Need? Contrastive Learning for the Detection and Attribution of AI-generated Text	Lucio La Cava et.al.	2407.09364	null
2024-07-15	Any-Property-Conditional Molecule Generation with Self-Criticism using Spanning Trees	Alexia Jolicoeur-Martineau et.al.	2407.09357	link
2024-07-12	PID: Physics-Informed Diffusion Model for Infrared Image Generation	Fangyuan Mao et.al.	2407.09299	link
2024-07-12	Learning Distances from Data with Normalizing Flows and Score Matching	Peter Sorrenson et.al.	2407.09297	null
2024-07-12	Surgical Text-to-Image Generation	Chinedu Innocent Nwoye et.al.	2407.09230	null
2024-07-11	Video Diffusion Alignment via Reward Gradients	Mihir Prabhudesai et.al.	2407.08737	link
2024-07-11	Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models	Zhening Xing et.al.	2407.08701	null
2024-07-11	FAR-Trans: An Investment Dataset for Financial Asset Recommendation	Javier Sanz-Cruzado et.al.	2407.08692	null
2024-07-11	Scattering transforms on the sphere, application to large scale structure modelling	Louise Mousset et.al.	2407.08687	null
2024-07-11	CAD-Prompted Generative Models: A Pathway to Feasible and Novel Engineering Designs	Leah Chong et.al.	2407.08675	null
2024-07-11	Still-Moving: Customized Video Generation without Customized Video Data	Hila Chefer et.al.	2407.08674	null
2024-07-11	Controlling the Fidelity and Diversity of Deep Generative Models via Pseudo Density	Shuangqi Li et.al.	2407.08659	null
2024-07-11	Adaptive Smooth Non-Stationary Bandits	Joe Suk et.al.	2407.08654	null
2024-07-11	Fine-Tuning Stable Diffusion XL for Stylistic Icon Generation: A Comparison of Caption Size	Youssef Sultan et.al.	2407.08513	null
2024-07-11	Latent Conditional Diffusion-based Data Augmentation for Continuous-Time Dynamic Graph Model	Yuxing Tian et.al.	2407.08500	null
2024-07-10	Generative Image as Action Models	Mohit Shridhar et.al.	2407.07875	link
2024-07-10	Dynamical Measure Transport and Neural PDE Solvers for Sampling	Jingtong Sun et.al.	2407.07873	null
2024-07-10	Controlling Space and Time with Diffusion Models	Daniel Watson et.al.	2407.07860	null
2024-07-10	Generic Numerical Analysis of Stochastic Reaction Diffusion Model with applications in excitable media	Yahya Alnashri et.al.	2407.07834	null
2024-07-10	Universal and non-universal signatures in the scaling functions of critical variables	Gianluca Teza et.al.	2407.07782	null
2024-07-10	Towards Human-Like Driving: Active Inference in Autonomous Vehicle Control	Elahe Delavari et.al.	2407.07684	null
2024-07-10	VEnhancer: Generative Space-Time Enhancement for Video Generation	Jingwen He et.al.	2407.07667	null
2024-07-10	A Coding-Theoretic Analysis of Hyperspherical Prototypical Learning Geometry	Martin Lindström et.al.	2407.07664	link
2024-07-10	The heterogeneous impact of the EU-Canada agreement with causal machine learning	Lionel Fontagné et.al.	2407.07652	null
2024-07-11	MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis	Wanggui He et.al.	2407.07614	link
2024-07-09	ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction	Shaozhe Hao et.al.	2407.07077	link
2024-07-09	Latent Space Imaging	Matheus Souza et.al.	2407.07052	null
2024-07-09	Generative models of astrophysical fields with scattering transforms on the sphere	Louise Mousset et.al.	2407.07007	link
2024-07-10	PEER: Expertizing Domain-Specific Tasks with a Multi-Agent Framework and Tuning Methods	Yiying Wang et.al.	2407.06985	link
2024-07-09	Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach	Taolin Zhang et.al.	2407.06964	null
2024-07-09	RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models	Bowen Zhang et.al.	2407.06938	null
2024-07-09	HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance	Guian Fang et.al.	2407.06937	link
2024-07-09	Fine-grained large-scale content recommendations for MSX sellers	Manpreet Singh et.al.	2407.06910	null
2024-07-09	Enhanced Battery Degradation-Aware Scheduling for Distribution Network with Electric Vehicle Load	Vijay Babu Pamshetti et.al.	2407.06857	null
2024-07-09	A reaction-diffusion model for relapsing-remitting multiple sclerosis with a treatment term	Romina Travaglini et.al.	2407.06802	null
2024-07-08	Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images	Zhangyang Qi et.al.	2407.06191	null
2024-07-08	CrowdMoGen: Zero-Shot Text-Driven Collective Motion Generation	Xinying Guo et.al.	2407.06188	null
2024-07-08	JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation	Yu Zeng et.al.	2407.06187	null
2024-07-08	The Tug-of-War Between Deepfake Generation and Detection	Hannah Lee et.al.	2407.06174	null
2024-07-08	ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation	Ethan Chern et.al.	2407.06135	link
2024-07-08	Structured Generations: Using Hierarchical Clusters to guide Diffusion Models	Jorge da Silva Goncalves et.al.	2407.06124	link
2024-07-08	PerlDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Models	Jinhua Zhang et.al.	2407.06109	link
2024-07-08	Accelerating Diffusion for SAR-to-Optical Image Translation via Adversarial Consistency Distillation	Xinyu Bai et.al.	2407.06095	null
2024-07-08	Assessing Cardiomegaly in Dogs Using a Simple CNN Model	Nikhil Deekonda et.al.	2407.06092	null
2024-07-08	Layered Diffusion Model for One-Shot High Resolution Text-to-Image Synthesis	Emaad Khwaja et.al.	2407.06079	null
2024-07-05	RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation	Yuxuan Kuang et.al.	2407.04689	link
2024-07-05	Thermal and mechanical study of a parametrised cryostat model for optical characterisation of upcoming CMB experiments	Thomas J. L. J. Gascard et.al.	2407.04613	link
2024-07-08	PartCraft: Crafting Creative Objects by Parts	Kam Woh Ng et.al.	2407.04604	link
2024-07-05	Structural Constraint Integration in Generative Model for Discovery of Quantum Material Candidates	Ryotaro Okabe et.al.	2407.04557	null
2024-07-05	Unified continuous-time q-learning for mean-field game and mean-field control problems	Xiaoli Wei et.al.	2407.04521	null
2024-07-08	Speed-accuracy trade-off for the diffusion models: Wisdom from nonequilibrium thermodynamics and optimal transport	Kotaro Ikeda et.al.	2407.04495	null
2024-07-05	PROUD: PaRetO-gUided Diffusion Model for Multi-objective Generation	Yinghua Yao et.al.	2407.04493	link
2024-07-05	Dude: Dual Distribution-Aware Context Prompt Learning For Large Vision-Language Model	Duy M. H. Nguyen et.al.	2407.04489	null
2024-07-05	Leveraging Graph Structures to Detect Hallucinations in Large Language Models	Noa Nonkes et.al.	2407.04485	link
2024-07-05	VCD-Texture: Variance Alignment based 3D-2D Co-Denoising for Text-Guided Texturing	Shang Liu et.al.	2407.04461	null
2024-07-03	DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents	Yilun Xu et.al.	2407.03300	link
2024-07-03	Improved Noise Schedule for Diffusion Training	Tiankai Hang et.al.	2407.03297	null
2024-07-03	Anomaly-based Framework for Detecting Power Overloading Cyberattacks in Smart Grid AMI	Abdelaziz Amara Korba et.al.	2407.03264	null
2024-07-03	SOS! Soft Prompt Attack Against Open-Source Large Language Models	Ziqing Yang et.al.	2407.03160	null
2024-07-04	Spatio-Temporal Adaptive Diffusion Models for EEG Super-Resolution in Epilepsy Diagnosis	Tong Zhou et.al.	2407.03089	null
2024-07-03	Artificial Inductive Bias for Synthetic Tabular Data Generation in Data-Scarce Scenarios	Patricia A. Apellániz et.al.	2407.03080	link
2024-07-03	Electromagnetic Property Sensing Based on Diffusion Model in ISAC System	Yuhua Jiang et.al.	2407.03075	null
2024-07-03	Semantic-Aware Power Allocation for Generative Semantic Communications with Foundation Models	Chunmei Xu et.al.	2407.03050	null
2024-07-03	SlerpFace: Face Template Protection via Spherical Linear Interpolation	Zhizhou Zhong et.al.	2407.03043	null
2024-07-03	An Organism Starts with a Single Pix-Cell: A Neural Cellular Diffusion for High-Resolution Image Synthesis	Marawan Elbatel et.al.	2407.03018	link
2024-07-02	Magic Insert: Style-Aware Drag-and-Drop	Nataniel Ruiz et.al.	2407.02489	null
2024-07-02	Boosting Consistency in Story Visualization with Rich-Contextual Conditional Diffusion Models	Fei Shen et.al.	2407.02482	link
2024-07-02	A Pattern Language for Machine Learning Tasks	Benjamin Rodatz et.al.	2407.02424	null
2024-07-02	GCF: Graph Convolutional Networks for Facial Expression Recognition	Hozaifa Kassab et.al.	2407.02361	null
2024-07-02	MORPHEUS: Modeling Role from Personalized Dialogue History by Exploring and Utilizing Latent Space	Yihong Tang et.al.	2407.02345	null
2024-07-02	Choice-based time slot management in attended home delivery	Dorsa Abdolhamidi et.al.	2407.02339	null
2024-07-02	Mining Constraints from Reference Process Models for Detecting Best-Practice Violations in Event Log	Adrian Rebmann et.al.	2407.02336	link
2024-07-02	A tactical time slot management problem under mixed logit demand	Dorsa Abdolhamidi et.al.	2407.02308	null
2024-07-02	Renard: A Modular Pipeline for Extracting Character Networks from Narrative Texts	Arthur Amalvy et.al.	2407.02284	link
2024-07-03	Federated Distillation for Medical Image Classification: Towards Trustworthy Computer-Aided Diagnosis	Sufen Ren et.al.	2407.02261	null
2024-06-28	Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language	Yicheng Chen et.al.	2406.20085	null
2024-06-28	The hybrid Josephson rhombus: A superconducting element with tailored current-phase relation	L. Banszerus et.al.	2406.20082	null
2024-06-28	HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Model	Hieu T. Nguyen et.al.	2406.20077	null
2024-06-28	Modeling and LQR Control of Insect Sized Flapping Wing Robot	Daksh Dhingra et.al.	2406.20061	null
2024-06-28	Neural Differentiable Modeling with Diffusion-Based Super-resolution for Two-Dimensional Spatiotemporal Turbulence	Xiantao Fan et.al.	2406.20047	null
2024-06-28	Electrostatics-based particle sampling and approximate inference	Yongchao Huang et.al.	2406.20044	link
2024-06-28	HAITCH: A Framework for Distortion and Motion Correction in Fetal Multi-Shell Diffusion-Weighted MRI	Haykel Snoussi et.al.	2406.20042	null
2024-06-28	Concept Lens: Visually Analyzing the Consistency of Semantic Manipulation in GANs	Sangwon Jeong et.al.	2406.19987	null
2024-07-01	Text2Robot: Evolutionary Robot Design from Text Descriptions	Ryan P. Ringel et.al.	2406.19963	link
2024-06-28	Kolmogorov-Smirnov GAN	Maciej Falkiewicz et.al.	2406.19948	link
2024-06-27	Looking 3D: Anomaly Detection with 2D-3D Alignment	Ankan Bhunia et.al.	2406.19393	link
2024-06-27	Taming Data and Transformers for Audio Generation	Moayed Haji-Ali et.al.	2406.19388	null
2024-06-27	Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space	Core Francisco Park et.al.	2406.19370	link
2024-06-27	Accelerating Multiphase Flow Simulations with Denoising Diffusion Model Driven Initializations	Jaehong Chung et.al.	2406.19333	null
2024-06-27	Subtractive Training for Music Stem Insertion using Latent Diffusion Models	Ivan Villa-Renteria et.al.	2406.19328	null
2024-06-27	Efficient World Models with Context-Aware Tokenization	Vincent Micheli et.al.	2406.19320	link
2024-06-27	PNeRV: A Polynomial Neural Representation for Videos	Sonam Gupta et.al.	2406.19299	null
2024-06-27	Compositional Image Decomposition with Diffusion Models	Jocelin Su et.al.	2406.19298	null
2024-06-27	BISeizuRe: BERT-Inspired Seizure Data Representation to Improve Epilepsy Monitoring	Luca Benfenati et.al.	2406.19189	null
2024-06-27	On Pólya-Young urn models and growth processes	Markus Kuba et.al.	2406.19110	null
2024-06-26	MatchTime: Towards Automatic Soccer Game Commentary Generation	Jiayuan Rao et.al.	2406.18530	link
2024-06-26	MultiDiff: Consistent Novel View Synthesis from a Single Image	Norman Müller et.al.	2406.18524	null
2024-06-26	Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration	Kang Liao et.al.	2406.18516	link
2024-06-26	DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance	Younghyun Kim et.al.	2406.18459	link
2024-06-26	Cascading Large Language Models for Salient Event Graph Generation	Xingwei Tan et.al.	2406.18449	link
2024-06-26	Repeat and Concatenate: 2D to 3D Image Translation with 3D to 3D Generative Modeling	Abril Corona-Figueroa et.al.	2406.18422	link
2024-06-26	Towards diffusion models for large-scale sea-ice modelling	Tobias Sebastian Finn et.al.	2406.18417	null
2024-06-27	Stable Diffusion Segmentation for Biomedical Images with Single-step Reverse Process	Tianyu Lin et.al.	2406.18361	link
2024-06-26	Molecular Diffusion Models with Virtual Receptors	Matan Halfon et.al.	2406.18330	null
2024-06-27	Weak Reward Model Transforms Generative Models into Robust Causal Event Extraction Systems	Italo Luis da Silva et.al.	2406.18245	link
2024-06-25	DiffusionPDE: Generative PDE-Solving Under Partial Observation	Jiahe Huang et.al.	2406.17763	link
2024-06-25	MotionBooth: Motion-Aware Customized Text-to-Video Generation	Jianzong Wu et.al.	2406.17758	null
2024-06-25	Accelerating Clinical Evidence Synthesis with Large Language Models	Zifeng Wang et.al.	2406.17755	null
2024-06-25	Extensions of Panjer's recursion for mixed compound distributions	Spyridon M. Tzaninis et.al.	2406.17726	null
2024-06-25	PANDA: A self-driving lab for studying electrodeposited polymer films	Harley Quinn et.al.	2406.17725	null
2024-06-25	Unified Auto-Encoding with Masked Diffusion	Philippe Hansen-Estruch et.al.	2406.17688	link
2024-06-25	LaTable: Towards Large Tabular Models	Boris van Breugel et.al.	2406.17673	null
2024-06-26	SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond	Marco Comunità et.al.	2406.17672	null
2024-06-25	Banishing LLM Hallucinations Requires Rethinking Generalization	Johnny Li et.al.	2406.17642	null
2024-06-25	The experience of humans' and robots' mutual (im)politeness in enacted service scenarios: An empirical study	Victor Kaptelinin et.al.	2406.17641	null
2024-06-24	FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models	Haonan Qiu et.al.	2406.16863	link
2024-06-24	Dreamitate: Real-World Visuomotor Policy Learning via Video Generation	Junbang Liang et.al.	2406.16862	null
2024-06-24	DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation	Yuang Peng et.al.	2406.16855	link
2024-06-24	USDC: A Dataset of $\underline{U}$ser $\underline{S}$tance and $\underline{D}$ogmatism in Long $\underline{C}$onversations	Mounika Marreddy et.al.	2406.16833	null
2024-06-24	General Binding Affinity Guidance for Diffusion Models in Structure-Based Drug Design	Yue Jian et.al.	2406.16821	null
2024-06-24	ClotheDreamer: Text-Guided Garment Generation with 3D Gaussians	Yufei Liu et.al.	2406.16815	null
2024-06-24	Conformal time series decomposition with component-wise exchangeability	Derck W. E. Prinzhorn et.al.	2406.16766	link
2024-06-24	Inferring stochastic low-rank recurrent neural networks from neural data	Matthijs Pals et.al.	2406.16749	link
2024-06-24	Portrait3D: 3D Head Generation from Single In-the-wild Portrait Image	Jinkun Hao et.al.	2406.16710	null
2024-06-24	Geometry-Aware Score Distillation via 3D Consistent Noising and Gradient Consistency Modeling	Min-Seop Kwak et.al.	2406.16695	null
2024-06-21	Masked Extended Attention for Zero-Shot Virtual Try-On In The Wild	Nadav Orzech et.al.	2406.15331	null
2024-06-21	Rethinking Remote Sensing Change Detection With A Mask View	Xiaowen Ma et.al.	2406.15320	link
2024-06-21	You Only Acquire Sparse-channel (YOAS): A Unified Framework for Dense-channel EEG Generation	Hongyu Chen et.al.	2406.15269	null
2024-06-21	Evaluating Diversity in Automatic Poetry Generation	Yanran Chen et.al.	2406.15267	link
2024-06-21	Fingerprint Membership and Identity Inference Against Generative Adversarial Networks	Saverio Cavasin et.al.	2406.15253	null
2024-06-21	MantisScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation	Xuan He et.al.	2406.15252	null
2024-06-21	Unsupervised Bayesian Generation of Synthetic CT from CBCT Using Patient-Specific Score-Based Prior	Junbo Peng et.al.	2406.15219	null
2024-06-21	Sound and Fury, Signifying Nothing? Impact of Data Breach Disclosure Laws	Muhammad Zia Hydari et.al.	2406.15215	null
2024-06-21	Injecting Bias in Text-To-Image Models via Composite-Trigger Backdoors	Ali Naseh et.al.	2406.15213	link
2024-06-21	Exploring the Efficacy of Robotic Assistants with ChatGPT and Claude in Enhancing ADHD Therapy: Innovating Treatment Paradigms	Santiago Berrezueta-Guzman et.al.	2406.15198	null
2024-06-20	A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models	Xincheng Shuai et.al.	2406.14555	link
2024-06-21	Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation	Eyal Michaeli et.al.	2406.14551	link
2024-06-20	Consistency Models Made Easy	Zhengyang Geng et.al.	2406.14548	link
2024-06-20	IRASim: Learning Interactive Real-Robot Action Simulators	Fangqi Zhu et.al.	2406.14540	null
2024-06-20	Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps	Nikita Starodubcev et.al.	2406.14539	null
2024-06-20	Fantastic Copyrighted Beasts and How (Not) to Generate Them	Luxi He et.al.	2406.14526	null
2024-06-20	Photoacoustic methane detection assisted by a gas-filled anti-resonant hollow-core fiber laser	Cuiling Zhang et.al.	2406.14521	null
2024-06-20	V-LASIK: Consistent Glasses-Removal from Videos Using Synthetic Data	Rotem Shalev-Arkushin et.al.	2406.14510	null
2024-06-20	CodeRAG-Bench: Can Retrieval Augment Code Generation?	Zora Zhiruo Wang et.al.	2406.14497	link
2024-06-20	SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset	Josef Dai et.al.	2406.14477	link
2024-06-20	CollaFuse: Collaborative Diffusion Models	Simeon Allmendinger et.al.	2406.14429	link
2024-06-20	Active Diffusion Subsampling	Oisin Nolan et.al.	2406.14388	link
2024-06-20	Multicoloured Hardcore Model: Fast Mixing and Queueing	Sam Olesker-Taylor et.al.	2406.14376	null
2024-06-20	FairX: A comprehensive benchmarking tool for model analysis using fairness, utility, and explainability	Md Fahim Sikder et.al.	2406.14281	link
2024-06-20	In Tree Structure Should Sentence Be Generated	Yaguang Li et.al.	2406.14189	link
2024-06-20	CriDiff: Criss-cross Injection Diffusion Framework via Generative Pre-train for Prostate Segmentation	Tingwei Liu et.al.	2406.14186	link
2024-06-20	Tractable Equilibrium Computation in Markov Games through Risk Aversion	Eric Mazumdar et.al.	2406.14156	null
2024-06-20	ExVideo: Extending Video Diffusion Models via Parameter-Efficient Post-Tuning	Zhongjie Duan et.al.	2406.14130	link
2024-06-20	Dye4AI: Assuring Data Boundary on Generative AI Services	Shu Wang et.al.	2406.14114	null
2024-06-20	HeartBeat: Towards Controllable Echocardiography Video Synthesis with Multimodal Conditions-Guided Diffusion Models	Xinrui Zhou et.al.	2406.14098	null
2024-06-20	Bridging bulk and surface: An interacting particle system towards the field-road diffusion model	Matthieu Alfaro et.al.	2406.14093	null
2024-06-20	A Practical Diffusion Path for Sampling	Omar Chehab et.al.	2406.14040	null
2024-06-20	Leveraging eBPF and AI for Ransomware Nose Out	Arjun Sekar et.al.	2406.14020	null
2024-06-20	Feature Fusion Based on Mutual-Cross-Attention Mechanism for EEG Emotion Recognition	Yimin Zhao et.al.	2406.14014	link
2024-06-20	Exploring Changes in Nation Perception with Nationality-Assigned Personas in LLMs	Mahammed Kamruzzaman et.al.	2406.13993	null
2024-06-20	The Elusive Pursuit of Replicating PATE-GAN: Benchmarking, Auditing, Debugging	Georgi Ganev et.al.	2406.13985	link
2024-06-20	Similarity-aware Syncretic Latent Diffusion Model for Medical Image Translation with Representation Learning	Tingyi Lin et.al.	2406.13977	null
2024-06-20	Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models	Yuan Zhong et.al.	2406.13942	null
2024-06-20	EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations	Jie Ren et.al.	2406.13933	null
2024-06-20	Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions	Hamdireza Rouzegar et.al.	2406.13903	null
2024-06-19	INFusion: Diffusion Regularized Implicit Neural Representations for 2D and 3D accelerated MRI reconstruction	Yamin Arefeen et.al.	2406.13895	null
2024-06-19	Open Generative Large Language Models for Galician	Pablo Gamallo et.al.	2406.13893	null
2024-06-19	StackRAG Agent: Improving Developer Answers with Retrieval-Augmented Generation	Davit Abrahamyan et.al.	2406.13840	link
2024-06-19	RNA-FrameFlow: Flow Matching for de novo 3D RNA Backbone Design	Rishabh Anand et.al.	2406.13839	link
2024-06-19	COAC: Cross-layer Optimization of Accelerator Configurability for Efficient CNN Processing	Steven Colleman et.al.	2406.13752	null
2024-06-19	GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation	Baiqi Li et.al.	2406.13743	link
2024-06-19	Tree-Sliced Wasserstein Distance on a System of Lines	Viet-Hoang Tran et.al.	2406.13725	null
2024-06-19	Hitchhiker's guide on Energy-Based Models: a comprehensive review on the relation with other generative models, sampling and statistical physics	Davide Carbone et.al.	2406.13661	null
2024-06-19	Towards Minimal Targeted Updates of Language Models with Targeted Negative Training	Lily H. Zhang et.al.	2406.13660	link
2024-06-19	Stability and Generalizability in SDE Diffusion Models with Measure-Preserving Dynamics	Weitong Zhang et.al.	2406.13652	null
2024-06-19	On AI-Inspired UI-Design	Jialiang Wei et.al.	2406.13631	null
2024-06-19	Can AI be enabled to dynamical downscaling? Training a Latent Diffusion Model to mimic km-scale COSMO-CLM downscaling of ERA5 over Italy	Elena Tomasi et.al.	2406.13627	link
2024-06-19	Enhance the Image: Super Resolution using Artificial Intelligence in MRI	Ziyu Li et.al.	2406.13625	null
2024-06-19	Generative Modeling by Minimizing the Wasserstein-2 Loss	Yu-Jui Huang et.al.	2406.13619	null
2024-06-19	Parameter Training Efficiency Aware Resource Allocation for AIGC in Space-Air-Ground Integrated Networks	Liangxin Qian et.al.	2406.13602	null
2024-06-19	ModSec-Learn: Boosting ModSecurity with Machine Learning	Christian Scano et.al.	2406.13547	link
2024-06-19	Towards Cyber Threat Intelligence for the IoT	Alfonso Iacovazzi et.al.	2406.13543	null
2024-06-19	Image Distillation for Safe Data Sharing in Histopathology	Zhe Li et.al.	2406.13536	link
2024-06-19	Diffusion-based Generative Modeling with Discriminative Guidance for Streamable Speech Enhancement	Chenda Li et.al.	2406.13471	null
2024-06-19	Unifying nonlinearly constrained nonconvex optimization	Charlie Vanaret et.al.	2406.13454	link
2024-06-19	Federating to Grow Transformers with Constrained Resources without Model Sharing	Shikun Shen et.al.	2406.13450	null
2024-06-19	Multi-messenger modeling of the Monogem pulsar halo	Youyou Li et.al.	2406.13426	null
2024-06-19	Style-NeRF2NeRF: 3D Style Transfer From Style-Aligned Multi-View Images	Haruo Fujiwara et.al.	2406.13393	null
2024-06-19	Effective Edge-wise Representation Learning in Edge-Attributed Bipartite Graphs	Hewen Wang et.al.	2406.13369	null
2024-06-19	Situational Instructions Database: Task Guidance in Dynamic Environments	Muhammad Saif Ullah Khan et.al.	2406.13302	link
2024-06-19	ARDuP: Active Region Video Diffusion for Universal Policies	Shuaiyi Huang et.al.	2406.13301	null
2024-06-19	AniFaceDiff: High-Fidelity Face Reenactment via Facial Parametric Conditioned Diffusion Models	Ken Chen et.al.	2406.13272	null
2024-06-19	Self-Supervised Diffusion Model for 3-D Seismic Data Reconstruction	Xinyang Wang et.al.	2406.13252	null
2024-06-19	Optimizing Inventory Management through Multiobjective Reverse Logistics with Environmental Impact	I. B. Wadhawan et.al.	2406.13226	null
2024-06-19	Neural Residual Diffusion Models for Deep Scalable Vision Generation	Zhiyuan Ma et.al.	2406.13215	null
2024-06-19	Surgical Triplet Recognition via Diffusion Model	Daochang Liu et.al.	2406.13210	null
2024-06-19	Diffusion Model-based FOD Restoration from High Distortion in dMRI	Shuo Huang et.al.	2406.13209	null
2024-06-19	Toward Structure Fairness in Dynamic Graph Embedding: A Trend-aware Dual Debiasing Approach	Yicong Li et.al.	2406.13201	link
2024-06-19	Synthetic Context Generation for Question Generation	Naiming Liu et.al.	2406.13188	null
2024-06-19	Conditional score-based diffusion models for solving inverse problems in mechanics	Agnimitra Dasgupta et.al.	2406.13154	null
2024-06-19	von Mises Quasi-Processes for Bayesian Circular Regression	Yarden Cohen et.al.	2406.13151	null
2024-06-19	MCAD: Multi-modal Conditioned Adversarial Diffusion Model for High-Quality PET Image Reconstruction	Jiaqi Cui et.al.	2406.13150	null
2024-06-19	GVT2RPM: An Empirical Study for General Video Transformer Adaptation to Remote Physiological Measurement	Hao Wang et.al.	2406.13136	null
2024-06-19	Thruster-Assisted Incline Walking	Kaushik Venkatesh Krishnamurthy et.al.	2406.13118	null
2024-06-18	Sampling 3D Gaussian Scenes in Seconds with Latent Diffusion Models	Paul Henderson et.al.	2406.13099	null
2024-06-18	RITA: A Real-time Interactive Talking Avatars Framework	Wuxinlin Cheng et.al.	2406.13093	null
2024-06-18	PIPPIN: Generating variable length full events from partons	Guillaume Quétant et.al.	2406.13074	link
2024-06-18	MaskPure: Improving Defense Against Text Adversaries with Stochastic Purification	Harrison Gietz et.al.	2406.13066	link
2024-06-18	Traffic Prediction considering Multiple Levels of Spatial-temporal Information: A Multi-scale Graph Wavelet-based Approach	Zilin Bian et.al.	2406.13038	null
2024-06-18	Sharp detection of low-dimensional structure in probability measures via dimensional logarithmic Sobolev inequalities	Matthew T. C. Li et.al.	2406.13036	null
2024-06-18	Data Plagiarism Index: Characterizing the Privacy Risk of Data-Copying in Tabular Generative Models	Joshua Ward et.al.	2406.13012	null
2024-06-18	Synergizing Foundation Models and Federated Learning: A Survey	Shenghui Li et.al.	2406.12844	null
2024-06-18	Evaluating the design space of diffusion-based generative models	Yuqing Wang et.al.	2406.12839	null
2024-06-18	Neural Approximate Mirror Maps for Constrained Diffusion Models	Berthy T. Feng et.al.	2406.12816	null
2024-06-19	AITTI: Learning Adaptive Inclusive Token for Text-to-Image Generation	Xinyu Hou et.al.	2406.12805	link
2024-06-18	Extracting Training Data from Unconditional Diffusion Models	Yunhao Chen et.al.	2406.12752	null
2024-06-18	Useful stochastic bounds in time-varying queues with service and patience times having general joint distribution	Shreehari Anand Bodas et.al.	2406.12745	null
2024-06-18	SUPER: Selfie Undistortion and Head Pose Editing with Identity Preservation	Polina Karpikova et.al.	2406.12700	null
2024-06-18	Speak in the Scene: Diffusion-based Acoustic Scene Transfer toward Immersive Speech Generation	Miseul Kim et.al.	2406.12688	null
2024-06-18	GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models	Yongtao Ge et.al.	2406.12671	link
2024-06-18	Research and Implementation of Data Enhancement Techniques for Graph Neural Networks	Jingzhao Gu et.al.	2406.12640	null
2024-06-18	News Without Borders: Domain Adaptation of Multilingual Sentence Embeddings for Cross-lingual News Recommendation	Andreea Iana et.al.	2406.12634	link
2024-06-18	Learning Diffusion at Lightspeed	Antonio Terpin et.al.	2406.12616	null
2024-06-18	Unmasking the Veil: An Investigation into Concept Ablation for Privacy and Copyright Protection in Images	Shivank Garg et.al.	2406.12592	link
2024-06-18	Behavior-Dependent Linear Recurrent Units for Efficient Sequential Recommendation	Chengkai Liu et.al.	2406.12580	link
2024-06-18	Training Diffusion Models with Federated Learning	Matthijs de Goede et.al.	2406.12575	null
2024-06-18	P-Tailor: Customizing Personality Traits for Language Models via Mixture of Specialized LoRA Experts	Yuhao Dan et.al.	2406.12548	null
2024-06-18	Structured Detection for Simultaneous Super-Resolution and Optical Sectioning in Laser Scanning Microscopy	Alessandro Zunino et.al.	2406.12542	link
2024-06-18	Variational Distillation of Diffusion Policies into Mixture of Experts	Hongyi Zhou et.al.	2406.12538	null
2024-06-18	HumanSplat: Generalizable Single-Image Human Gaussian Splatting with Structure Priors	Panwang Pan et.al.	2406.12459	link
2024-06-18	Planning Using Schrödinger Bridge Diffusion Models	Adarsh Srivastava et.al.	2406.12458	link
2024-06-18	Deep Temporal Deaggregation: Large-Scale Spatio-Temporal Generative Models	David Bergström et.al.	2406.12423	null
2024-06-18	ROVER: RTL Optimization via Verified E-Graph Rewriting	Samuel Coward et.al.	2406.12421	null
2024-06-18	TADM: Temporally-Aware Diffusion Model for Neurodegenerative Progression on Brain MRI	Mattia Litrico et.al.	2406.12411	null
2024-06-18	SDNIA-YOLO: A Robust Object Detection Model for Extreme Weather Conditions	Yuexiong Ding et.al.	2406.12395	null

(back to top)

Vision-Language Models

Publish Date	Title	Authors	PDF	Code
2024-12-19	OpenEMMA: Open-Source Multimodal Model for End-to-End Autonomous Driving	Shuo Xing et.al.	2412.15208	null
2024-12-19	LlamaFusion: Adapting Pretrained Language Models for Multimodal Generation	Weijia Shi et.al.	2412.15188	null
2024-12-19	Qwen2.5 Technical Report	Qwen et.al.	2412.15115	null
2024-12-19	Progressive Multimodal Reasoning via Active Retrieval	Guanting Dong et.al.	2412.14835	null
2024-12-19	Explainable Tampered Text Detection via Multimodal Large Models	Chenfan Qu et.al.	2412.14816	null
2024-12-18	Descriptive Caption Enhancement with Visual Specialists for Multimodal Perception	Yanpeng Sun et.al.	2412.14233	link
2024-12-18	AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities	Guillaume Astruc et.al.	2412.14123	link
2024-12-19	G-VEval: A Versatile Metric for Evaluating Image and Video Captions Using GPT-4o	Tony Cheng Tong et.al.	2412.13647	link
2024-12-18	Detecting Machine-Generated Music with Explainability -- A Challenge and Early Benchmarks	Yupei Li et.al.	2412.13421	null
2024-12-17	DoPTA: Improving Document Layout Analysis using Patch-Text Alignment	Nikitha SR et.al.	2412.12902	null
2024-12-17	Multi-Dimensional Insights: Benchmarking Real-World Personalization in Large Multimodal Models	YiFan Zhang et.al.	2412.12606	null
2024-12-17	PBVS 2024 Solution: Self-Supervised Learning and Sampling Strategies for SAR Classification in Extreme Long-Tail Distribution	Yuhyun Kim et.al.	2412.12565	null
2024-12-17	Causal Diffusion Transformers for Generative Modeling	Chaorui Deng et.al.	2412.12095	link
2024-12-16	CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology	Yuxuan Sun et.al.	2412.12077	null
2024-12-16	Gramian Multimodal Representation Learning and Alignment	Giordano Cicchetti et.al.	2412.11959	null
2024-12-16	LMM-Regularized CLIP Embeddings for Image Classification	Maria Tzelepi et.al.	2412.11663	null
2024-12-15	Seeing the Forest and the Trees: Solving Visual Graph and Tree Based Data Structure Problems using Large Multimodal Models	Sebastian Gutierrez et.al.	2412.11088	null
2024-12-13	Apollo: An Exploration of Video Understanding in Large Multimodal Models	Orr Zohar et.al.	2412.10360	null
2024-12-13	Performance of ChatGPT on tasks involving physics visual representations: the case of the Brief Electricity and Magnetism Assessment	Giulia Polverini et.al.	2412.10019	null
2024-12-12	Vision-Language Models Represent Darker-Skinned Black Individuals as More Homogeneous than Lighter-Skinned Black Individuals	Messi H. J. Lee et.al.	2412.09668	null
2024-12-12	Exemplar Masking for Multimodal Incremental Learning	Yi-Lun Lee et.al.	2412.09549	link
2024-12-12	Embeddings are all you need! Achieving High Performance Medical Image Classification through Training-Free Embedding Analysis	Raj Hansini Khoiwal et.al.	2412.09445	null
2024-12-12	Enhancing Modality Representation and Alignment for Multimodal Cold-start Active Learning	Meng Shen et.al.	2412.09126	null
2024-12-12	A Wander Through the Multimodal Landscape: Efficient Transfer Learning via Low-rank Sequence Multimodal Adapter	Zirun Guo et.al.	2412.08979	null
2024-12-11	StreamChat: Chatting with Streaming Video	Jihao Liu et.al.	2412.08646	null
2024-12-11	Multimodal Latent Language Modeling with Next-Token Diffusion	Yutao Sun et.al.	2412.08635	link
2024-12-12	Design2GarmentCode: Turning Design Concepts to Tangible Garments Through Program Synthesis	Feng Zhou et.al.	2412.08603	null
2024-12-11	Illusory VQA: Benchmarking and Enhancing Multimodal Models on Visual Illusions	Mohammadmostafa Rostamkhani et.al.	2412.08169	link
2024-12-10	Explaining and Mitigating the Modality Gap in Contrastive Multimodal Learning	Can Yaras et.al.	2412.07909	null
2024-12-10	BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities	Sahal Shaji Mullappilly et.al.	2412.07769	link
2024-12-10	ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer	Jinyi Hu et.al.	2412.07720	link
2024-12-13	DriveMM: All-in-One Large Multimodal Model for Autonomous Driving	Zhijian Huang et.al.	2412.07689	link
2024-12-10	Driving with InternVL: Oustanding Champion in the Track on Driving with Language of the Autonomous Grand Challenge at CVPR 2024	Jiahan Li et.al.	2412.07247	null
2024-12-10	Maya: An Instruction Finetuned Multilingual Multimodal Model	Nahid Alam et.al.	2412.07112	link
2024-12-09	How to Merge Your Multimodal Models Over Time?	Sebastian Dziadzio et.al.	2412.06712	link
2024-12-09	Ranked from Within: Ranking Large Multimodal Models for Visual Question Answering Without Labels	Weijie Tu et.al.	2412.06461	null
2024-12-09	iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models	Lianyu Hu et.al.	2412.06263	link
2024-12-08	A Self-Learning Multimodal Approach for Fake News Detection	Hao Chen et.al.	2412.05843	null
2024-12-08	SILMM: Self-Improving Large Multimodal Models for Compositional Text-to-Image Generation	Leigang Qu et.al.	2412.05818	null
2024-12-07	WavFusion: Towards wav2vec 2.0 Multimodal Speech Emotion Recognition	Feng Li et.al.	2412.05558	null
2024-12-07	Comprehensive Evaluation of Multimodal AI Models in Medical Imaging Diagnosis: From Data Augmentation to Preference-Based Comparison	Cailian Ruan et.al.	2412.05536	null
2024-12-06	Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling	Zhe Chen et.al.	2412.05271	link
2024-12-05	Lattice Lingo: Effect of Textual Detail on Multimodal Learning for Property Prediction of Crystals	Mrigi Munjal et.al.	2412.04670	null
2024-12-05	BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks	Juan Rodriguez et.al.	2412.04626	null
2024-12-05	MageBench: Bridging Large Multimodal Models to Agents	Miaosen Zhang et.al.	2412.04531	link
2024-12-04	Video Quality Assessment: A Comprehensive Survey	Qi Zheng et.al.	2412.04508	link
2024-12-05	SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model	Zhenglin Huang et.al.	2412.04292	null
2024-12-05	CALMM-Drive: Confidence-Aware Autonomous Driving with Large Multimodal Model	Ruoyu Yao et.al.	2412.04209	null
2024-12-05	AIpparel: A Large Multimodal Generative Model for Digital Garments	Kiyohiro Nakayama et.al.	2412.03937	null
2024-12-05	MegaCOIN: Enhancing Medium-Grained Color Perception for Vision-Language Models	Ming-Chang Chiu et.al.	2412.03927	link
2024-12-04	Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning	Wujian Peng et.al.	2412.03565	link
2024-12-04	Training-Free Mitigation of Language Reasoning Degradation After Multimodal Instruction Tuning	Neale Ratzlaff et.al.	2412.03467	null
2024-12-06	SJTU: Spatial judgments in multimodal models towards unified segmentation through coordinate detection	Joongwon Chae et.al.	2412.02565	link
2024-12-03	Initial Study On Improving Segmentation By Combining Preoperative CT And Intraoperative CBCT Using Synthetic Data	Maximilian E. Tschuchnig et.al.	2412.02294	null
2024-12-05	CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy	Zhibo Yang et.al.	2412.02210	null
2024-12-03	VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding	Kangsan Kim et.al.	2412.02186	link
2024-12-04	Agri-LLaVA: Knowledge-Infused Large Multimodal Assistant on Agricultural Pests and Diseases	Liqiong Wang et.al.	2412.02158	link
2024-12-02	Attacks on multimodal models	Viacheslav Iablochnikov et.al.	2412.01725	link
2024-12-02	LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant	Yikun Liu et.al.	2412.01720	null
2024-12-01	VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation	Weiming Ren et.al.	2412.00927	null
2024-11-30	MaintAGT: Sim2Real-Guided Multimodal Large Model for Intelligent Maintenance with Chain-of-Thought Reasoning	Hongliang He et.al.	2412.00481	null
2024-11-30	Approximate Fiber Product: A Preliminary Algebraic-Geometric Perspective on Multimodal Embedding Alignment	Dongfang Zhao et.al.	2412.00373	null
2024-12-04	ROSE: Revolutionizing Open-Set Dense Segmentation with Patch-Wise Perceptual Large Multimodal Model	Kunyang Han et.al.	2412.00153	null
2024-11-28	Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers	Chancharik Mitra et.al.	2412.00142	null
2024-12-02	LUMIA: Linear probing for Unimodal and MultiModal Membership Inference Attacks leveraging internal LLM states	Luis Ibanez-Lissen et.al.	2411.19876	null
2024-11-29	SDR-GNN: Spectral Domain Reconstruction Graph Neural Network for Incomplete Multimodal Learning in Conversational Emotion Recognition	Fangze Fu et.al.	2411.19822	null
2024-11-29	JetFormer: An Autoregressive Generative Model of Raw Images and Text	Michael Tschannen et.al.	2411.19722	null
2024-11-28	Beyond Logit Lens: Contextual Embeddings for Robust Hallucination Detection & Grounding in VLMs	Anirudh Phukan et.al.	2411.19187	null
2024-11-28	Examining Multimodal Gender and Content Bias in ChatGPT-4o	Roberto Balestri et.al.	2411.19140	null
2024-11-28	ScratchEval: Are GPT-4o Smarter than My Child? Evaluating Large Multimodal Models with Visual Programming Challenges	Rao Fu et.al.	2411.18932	link
2024-11-27	Active Data Curation Effectively Distills Large-Scale Multimodal Models	Vishaal Udandarao et.al.	2411.18674	null
2024-11-27	AMPS: ASR with Multimodal Paraphrase Supervision	Amruta Parulekar et.al.	2411.18368	null
2024-12-03	Large Language Model-Brained GUI Agents: A Survey	Chaoyun Zhang et.al.	2411.18279	link
2024-11-27	Grid-augumented vision: A simple yet effective approach for enhanced spatial understanding in multi-modal agents	Joongwon Chae et.al.	2411.18270	link
2024-11-27	Multimodal Integration of Longitudinal Noninvasive Diagnostics for Survival Prediction in Immunotherapy Using Deep Learning	Melda Yeghaian et.al.	2411.18253	null
2024-11-26	NEMO: Can Multimodal LLMs Identify Attribute-Modified Objects?	Jiaxuan Li et.al.	2411.17794	null
2024-11-26	Visatronic: A Multimodal Decoder-Only Model for Speech Synthesis	Akshita Gupta et.al.	2411.17690	null
2024-11-26	AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM	Jiarui Wang et.al.	2411.17221	link
2024-11-26	Learning Robust Anymodal Segmentor with Unimodal and Cross-modal Distillation	Xu Zheng et.al.	2411.17141	link
2024-11-26	Relations, Negations, and Numbers: Looking for Logic in Generative Text-to-Image Models	Colin Conwell et.al.	2411.17066	link
2024-11-26	Multimodal Alignment and Fusion: A Survey	Songtao Li et.al.	2411.17040	null
2024-11-27	SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE	Yongwei Chen et.al.	2411.16856	null
2024-11-23	Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents	Jun Chen et.al.	2411.16740	link
2024-11-26	All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages	Ashmal Vayani et.al.	2411.16508	link
2024-11-25	Boosting 3D Object Generation through PBR Materials	Yitong Wang et.al.	2411.16080	null
2024-11-24	M3-CVC: Controllable Video Compression with Multimodal Generative Models	Rui Wan et.al.	2411.15798	null
2024-11-23	Knowledge Transfer Across Modalities with Natural Language Supervision	Carlo Alberto Barbano et.al.	2411.15611	null
2024-11-23	From Complexity to Parsimony: Integrating Latent Class Analysis to Uncover Multimodal Learning Patterns in Collaborative Learning	Lixiang Yan et.al.	2411.15590	null
2024-11-23	Botfip-LLM: An Enhanced Multimodal Scientific Computing Framework Leveraging Knowledge Distillation from Large Language Models	Tianhao Chen et.al.	2411.15525	null
2024-11-23	MambaVLT: Time-Evolving Multimodal State Space Model for Vision-Language Tracking	Xinqi Liu et.al.	2411.15459	null
2024-11-23	freePruner: A Training-free Approach for Large Multimodal Model Acceleration	Bingxin Xu et.al.	2411.15446	null
2024-11-22	PRIMUS: Pretraining IMU Encoders with Multimodal Self-Supervision	Arnav M. Das et.al.	2411.15127	null
2024-11-22	Large Multi-modal Models Can Interpret Features in Large Multi-modal Models	Kaichen Zhang et.al.	2411.14982	link
2024-11-25	Information Extraction from Heterogeneous Documents without Ground Truth Labels using Synthetic Label Generation and Knowledge Distillation	Aniket Bhattacharyya et.al.	2411.14957	null
2024-11-22	Benchmarking Multimodal Models for Ukrainian Language Understanding Across Academic and Cultural Domains	Yurii Paniv et.al.	2411.14647	null
2024-11-21	Generative AI for Music and Audio	Hao-Wen Dong et.al.	2411.14627	null
2024-11-21	FuseGPT: Learnable Layers Fusion of Generative Pre-trained Transformers	Zehua Pei et.al.	2411.14507	null
2024-11-21	MMGenBench: Evaluating the Limits of LMMs from the Text-to-Image Generation Perspective	Hailang Huang et.al.	2411.14062	link
2024-11-21	Multimodal 3D Reasoning Segmentation with Complex Scenes	Xueying Jiang et.al.	2411.13927	null
2024-11-20	VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation	Ziyang Luo et.al.	2411.13281	null
2024-11-19	VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge	Vishwesh Nath et.al.	2411.12915	null
2024-11-19	Mitigating Perception Bias: A Training-Free Approach to Enhance LMM for Image Quality Assessment	Siyi Pan et.al.	2411.12791	null
2024-11-18	MMBind: Unleashing the Potential of Distributed and Heterogeneous Data for Multimodal Learning in IoT	Xiaomin Ouyang et.al.	2411.12126	null
2024-11-17	SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization	Hongrui Jia et.al.	2411.11909	link
2024-11-18	The Power of Many: Multi-Agent Multimodal Models for Cultural Image Captioning	Longju Bai et.al.	2411.11758	link
2024-11-18	Artificial Scientific Discovery	Antonio Norelli et.al.	2411.11672	null
2024-11-18	InstruGen: Automatic Instruction Generation for Vision-and-Language Navigation Via Large Multimodal Models	Yu Yan et.al.	2411.11394	null
2024-11-19	SoK: Unifying Cybersecurity and Cybersafety of Multimodal Foundation Models with an Information Theory Approach	Ruoxi Sun et.al.	2411.11195	null
2024-11-16	ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models	Vipula Rawte et.al.	2411.10867	null
2024-11-19	MLAN: Language-Based Instruction Tuning Improves Zero-Shot Generalization of Multimodal Large Language Models	Jianhong Tu et.al.	2411.10557	link
2024-11-15	Everything is a Video: Unifying Modalities through Next-Frame Prediction	G. Thomas Hudson et.al.	2411.10503	null
2024-11-15	Weakly-Supervised Multimodal Learning on MIMIC-CXR	Andrea Agostini et.al.	2411.10356	link
2024-11-21	Instruction-Guided Editing Controls for Images and Multimedia: A Survey in LLM era	Thanh Tam Nguyen et.al.	2411.09955	link
2024-11-14	Cross-Modal Consistency in Multimodal Large Language Models	Xiang Zhang et.al.	2411.09273	null
2024-11-14	SmartInv: Multimodal Learning for Smart Contract Invariant Inference	Sally Junsong Wang et.al.	2411.09217	null
2024-11-13	Multimodal Object Detection using Depth and Image Data for Manufacturing Parts	Nazanin Mahjourian et.al.	2411.09062	null
2024-11-13	Bridging the Visual Gap: Fine-Tuning Multimodal Models with Knowledge-Adapted Captions	Moran Yanuka et.al.	2411.09018	null
2024-11-13	AstroM$^3$: A self-supervised multimodal model for astronomy	Mariia Rizhko et.al.	2411.08842	null
2024-11-13	Multimodal Instruction Tuning with Hybrid State Space Models	Jianing Zhou et.al.	2411.08840	null
2024-11-13	Retrieval Augmented Recipe Generation	Guoshan Liu et.al.	2411.08715	null
2024-11-12	DPU: Dynamic Prototype Updating for Multimodal Out-of-Distribution Detection	Shawn Li et.al.	2411.08227	link
2024-11-12	Leveraging Multimodal Models for Enhanced Neuroimaging Diagnostics in Alzheimer's Disease	Francesco Chiumento et.al.	2411.07871	null
2024-11-12	SparrowVQE: Visual Question Explanation for Course Content Understanding	Jialu Li et.al.	2411.07516	link
2024-11-12	BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions	Anas Awadalla et.al.	2411.07461	null
2024-11-11	Multimodal Fusion Balancing Through Game-Theoretic Regularization	Konstantinos Kontras et.al.	2411.07335	null
2024-11-11	OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision	Cong Wei et.al.	2411.07199	null
2024-11-09	M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework	Yew Ken Chia et.al.	2411.06176	null
2024-11-09	An Empirical Analysis on Spatial Reasoning Capabilities of Large Multimodal Models	Fatemeh Shiri et.al.	2411.06048	link
2024-11-08	Towards Low-Resource Harmful Meme Detection with LMM Agents	Jianzhao Huang et.al.	2411.05383	link
2024-11-08	Exploring the Alignment Landscape: LLMs and Geometric Deep Models in Protein Representation	Dong Shu et.al.	2411.05316	link
2024-11-07	HourVideo: 1-Hour Video-Language Understanding	Keshigeyan Chandrasegaran et.al.	2411.04998	link
2024-11-07	VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos	Shehan Munasinghe et.al.	2411.04923	null
2024-11-07	Exploring Hierarchical Molecular Graph Representation in Multimodal LLMs	Chengxin Hu et.al.	2411.04708	null
2024-11-06	AutoGameUI: Constructing High-Fidelity Game UIs via Multimodal Learning and Interactive Web-Based Tool	Zhongliang Tang et.al.	2411.03709	null
2024-11-05	MME-Finance: A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning	Ziliang Gan et.al.	2411.03314	null
2024-11-05	HumanVLM: Foundation for Human-Scene Vision-Language Model	Dawei Dai et.al.	2411.03034	null
2024-11-05	Toward Robust Incomplete Multimodal Sentiment Analysis via Hierarchical Representation Learning	Mingcheng Li et.al.	2411.02793	null
2024-11-11	INQUIRE: A Natural World Text-to-Image Retrieval Benchmark	Edward Vendrow et.al.	2411.02537	link
2024-11-04	See it, Think it, Sorted: Large Multimodal Models are Few-shot Time Series Anomaly Analyzers	Jiaxin Zhuang et.al.	2411.02465	null
2024-11-07	TableGPT2: A Large Multimodal Model with Tabular Data Integration	Aofeng Su et.al.	2411.02059	link
2024-11-04	Foundations and Recent Trends in Multimodal Mobile Agents: A Survey	Biao Wu et.al.	2411.02006	link
2024-11-04	KptLLM: Unveiling the Power of Large Language Model for Keypoint Comprehension	Jie Yang et.al.	2411.01846	null
2024-11-03	EEE-Bench: A Comprehensive Multimodal Electrical And Electronics Engineering Benchmark	Ming Li et.al.	2411.01492	null
2024-11-03	Classifier-guided Gradient Modulation for Enhanced Multimodal Learning	Zirun Guo et.al.	2411.01409	link
2024-11-02	LoRA-Contextualizing Adaptation of Large Multimodal Models for Long Document Understanding	Jian Chen et.al.	2411.01106	null
2024-11-01	Text2Freq: Learning Series Patterns from Text via Frequency Domain	Ming-Chih Lo et.al.	2411.00929	null
2024-11-01	V-LoRA: An Efficient and Flexible System Boosts Vision Applications with LoRA LMM	Liang Mi et.al.	2411.00915	null
2024-11-01	Analyzing Multimodal Integration in the Variational Autoencoder from an Information-Theoretic Perspective	Carlotta Langer et.al.	2411.00522	null
2024-10-31	TurtleBench: A Visual Programming Benchmark in Turtle Geometry	Sina Rismanchian et.al.	2411.00264	link
2024-10-31	ResiDual Transformer Alignment with Spectral Decomposition	Lorenzo Basile et.al.	2411.00246	null
2024-10-31	Nearest Neighbor Normalization Improves Multimodal Retrieval	Neil Chowdhury et.al.	2410.24114	link
2024-11-04	AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents	Yifan Xu et.al.	2410.24024	link
2024-10-31	Audio Is the Achilles' Heel: Red Teaming Audio Large Multimodal Models	Hao Yang et.al.	2410.23861	null
2024-10-30	CLIPErase: Efficient Unlearning of Visual-Textual Associations in CLIP	Tianyu Yang et.al.	2410.23330	null
2024-10-30	EMMA: End-to-End Multimodal Model for Autonomous Driving	Jyh-Jing Hwang et.al.	2410.23262	null
2024-10-29	ProMQA: Question Answering Dataset for Multimodal Procedural Activity Understanding	Kimihiro Hasegawa et.al.	2410.22211	link
2024-10-29	Beyond Text: Optimizing RAG with Multimodal Inputs for Industrial Applications	Monica Riedler et.al.	2410.21943	link
2024-10-28	AiSciVision: A Framework for Specializing Large Multimodal Models in Scientific Image Classification	Brendan Hogan et.al.	2410.21480	link
2024-10-27	Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse	Ryan Liu et.al.	2410.21333	null
2024-10-28	IndraEye: Infrared Electro-Optical UAV-based Perception Dataset for Robust Downstream Tasks	Manjunath D et.al.	2410.20953	link
2024-10-27	Generator Matching: Generative modeling with arbitrary Markov processes	Peter Holderrieth et.al.	2410.20587	null
2024-10-27	PaPaGei: Open Foundation Models for Optical Physiological Signals	Arvind Pillai et.al.	2410.20542	link
2024-10-25	Turn-by-Turn Indoor Navigation for the Visually Impaired	Santosh Srinivasaiah et.al.	2410.19954	null
2024-10-25	A Multimodal Approach For Endoscopic VCE Image Classification Using BiomedCLIP-PubMedBERT	Nagarajan Ganapathy et.al.	2410.19944	link
2024-10-25	OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization	Hongliang He et.al.	2410.19609	link
2024-10-24	Visual Text Matters: Improving Text-KVQA with Visual Text Entity Knowledge-aware Large Multimodal Assistant	Abhirama Subramanyam Penamakuri et.al.	2410.19144	link
2024-10-24	VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks	Lawrence Jang et.al.	2410.19100	null
2024-10-24	CAMEL-Bench: A Comprehensive Arabic LMM Benchmark	Sara Ghaboura et.al.	2410.18976	link
2024-10-24	Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques	David Ortiz-Perez et.al.	2410.18972	null
2024-10-24	OSCAR: Operating System Control via State-Aware Reasoning and Re-Planning	Xiaoqiang Wang et.al.	2410.18963	null
2024-10-24	A Survey of Multimodal Sarcasm Detection	Shafkat Farabi et.al.	2410.18882	null
2024-10-27	R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models	Linger Deng et.al.	2410.17885	link
2024-10-22	JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation	Shota Onohara et.al.	2410.17250	null
2024-10-22	An Eye for an AI: Evaluating GPT-4o's Visual Perception Skills and Geometric Reasoning Skills Using Computer Graphics Questions	Tony Haoran Feng et.al.	2410.16991	null
2024-10-21	DocEdit-v2: Document Structure Editing Via Multimodal LLM Grounding	Manan Suri et.al.	2410.16472	null
2024-10-21	Promoting cross-modal representations to improve multimodal foundation models for physiological signals	Ching Fang et.al.	2410.16424	null
2024-10-22	Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance	Zhangwei Gao et.al.	2410.16261	link
2024-10-22	MoRE: Multi-Modal Contrastive Pre-training with Transformers on X-Rays, ECGs, and Diagnostic Report	Samrajya Thapa et.al.	2410.16239	link
2024-10-21	Griffon-G: Bridging Vision-Language and Vision-Centric Tasks via Large Multimodal Models	Yufei Zhan et.al.	2410.16163	link
2024-10-21	LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset	Ruikun Zhang et.al.	2410.16095	link
2024-10-21	How to Build a Pre-trained Multimodal model for Simultaneously Chatting and Decision-making?	Zuojin Tang et.al.	2410.15885	null
2024-10-21	Multimodal Learning for Embryo Viability Prediction in Clinical IVF	Junsik Kim et.al.	2410.15581	null
2024-10-20	IPO: Interpretable Prompt Optimization for Vision-Language Models	Yingjun Du et.al.	2410.15397	link
2024-10-20	Modality-Fair Preference Optimization for Trustworthy MLLM Alignment	Songtao Jiang et.al.	2410.15334	null
2024-10-19	ChitroJera: A Regionally Relevant Visual Question Answering Dataset for Bangla	Deeparghya Dutta Barua et.al.	2410.14991	null
2024-10-19	SemiHVision: Enhancing Medical Multimodal Models with a Semi-Human Annotated Dataset and Fine-Tuned Instruction Generation	Junda Wang et.al.	2410.14948	link
2024-10-18	Croc: Pretraining Large Multimodal Models with Cross-Modal Comprehension	Yin Xie et.al.	2410.14332	link
2024-10-18	Personalized Image Generation with Large Multimodal Models	Yiyan Xu et.al.	2410.14170	null
2024-10-18	Coherence-Driven Multimodal Safety Dialogue with Active Learning for Embodied Agents	Sabit Hassan et.al.	2410.14141	null
2024-10-17	Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation	Chengyue Wu et.al.	2410.13848	link
2024-10-18	Harnessing Webpage UIs for Text-Rich Visual Understanding	Junpeng Liu et.al.	2410.13824	null
2024-10-17	Parameter-efficient Adaptation of Multilingual Multimodal Models for Low-resource ASR	Abhishek Gupta et.al.	2410.13445	null
2024-10-16	The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio	Sicong Leng et.al.	2410.12787	null
2024-10-16	HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks	Fengji Zhang et.al.	2410.12381	link
2024-10-15	CtrlSynth: Controllable Image Text Synthesis for Data-Efficient Multimodal Learning	Qingqing Cao et.al.	2410.11963	null
2024-10-15	Generalizable Spacecraft Trajectory Generation via Multimodal Learning with Transformers	Davide Celestini et.al.	2410.11723	null
2024-10-15	Unveiling the Mystery of Visual Attributes of Concrete and Abstract Concepts: Variability, Nearest Neighbors, and Challenging Categories	Tarun Tater et.al.	2410.11657	link
2024-10-15	On-the-fly Modulation for Balanced Multimodal Learning	Yake Wei et.al.	2410.11582	link
2024-10-15	Enhancing Unimodal Latent Representations in Multimodal VAEs through Iterative Amortized Inference	Yuta Oshima et.al.	2410.11403	null
2024-10-14	Saliency Guided Optimization of Diffusion Latents	Xiwen Wang et.al.	2410.10257	null
2024-10-14	MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models	Peng Xia et.al.	2410.10139	link
2024-10-13	LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models	Junyan Ye et.al.	2410.09732	null
2024-10-12	Reconstructive Visual Instruction Tuning	Haochen Wang et.al.	2410.09575	null
2024-10-11	Can GPTs Evaluate Graphic Design Based on Design Principles?	Daichi Haraguchi et.al.	2410.08885	null
2024-10-11	VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding	Houlun Chen et.al.	2410.08593	link
2024-10-10	ElasticTok: Adaptive Tokenization for Image and Video	Wilson Yan et.al.	2410.08368	null
2024-10-10	Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts	Sukwon Yun et.al.	2410.08245	link
2024-10-10	LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts	Anh-Quan Cao et.al.	2410.08211	null
2024-10-10	Emerging Pixel Grounding in Large Multimodal Models Without Grounding Supervision	Shengcao Cao et.al.	2410.08209	null
2024-10-10	MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models	Wenbo Hu et.al.	2410.08182	null
2024-10-10	Generated Bias: Auditing Internal Bias Dynamics of Text-To-Image Generative Models	Abhishek Mandal et.al.	2410.07884	null
2024-10-09	The Cognitive Capabilities of Generative AI: A Comparative Analysis with Human Benchmarks	Isaac R. Galatzer-Levy et.al.	2410.07391	null
2024-10-12	Deep Correlated Prompting for Visual Recognition with Missing Modalities	Lianyu Hu et.al.	2410.06558	link
2024-10-11	Chip-Tuning: Classify Before Language Models Say	Fangwei Zhu et.al.	2410.06541	link
2024-10-09	Does Spatial Cognition Emerge in Frontier Models?	Santhosh Kumar Ramakrishnan et.al.	2410.06468	null
2024-10-08	Multimodal Representation Learning using Adaptive Graph Construction	Weichen Huang et.al.	2410.06395	null
2024-10-08	Temporal Image Caption Retrieval Competition -- Description and Results	Jakub Pokrywka et.al.	2410.06314	null
2024-10-08	PDF-WuKong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling	Xudong Xie et.al.	2410.05970	link
2024-10-08	ModalPrompt: Dual-Modality Guided Prompt for Continual Learning of Large Multimodal Models	Fanhu Zeng et.al.	2410.05849	null
2024-10-08	Multimodal Large Language Models and Tunings: Vision, Language, Sensors, Audio, and Beyond	Soyeon Caren Han et.al.	2410.05608	link
2024-10-08	TeaserGen: Generating Teasers for Long Documentaries	Weihan Xu et.al.	2410.05586	null
2024-10-07	R-Bench: Are your Large Multimodal Model Robust to Real-world Corruptions?	Chunyi Li et.al.	2410.05474	link
2024-10-07	RespLLM: Unifying Audio and Text with Multimodal LLMs for Generalized Respiratory Health Prediction	Yuwei Zhang et.al.	2410.05361	null
2024-10-07	Patch is Enough: Naturalistic Adversarial Patch against Vision-Language Pre-training Models	Dehong Kong et.al.	2410.04884	null
2024-10-06	VISTA: A Visual and Textual Attention Dataset for Interpreting Multimodal Models	Harshit et.al.	2410.04609	null
2024-10-06	UniMuMo: Unified Text, Music and Motion Generation	Han Yang et.al.	2410.04534	link
2024-10-08	Gamified crowd-sourcing of high-quality data for visual fine-tuning	Shashank Yadav et.al.	2410.04038	null
2024-10-07	Multimodal Point-of-Interest Recommendation	Yuta Kanzawa et.al.	2410.03265	null
2024-10-04	Bridging the Gap between Text, Audio, Image, and Any Sequence: A Novel Approach using Gloss-based Annotation	Sen Fang et.al.	2410.03146	null
2024-10-04	AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark	Wenhao Chai et.al.	2410.03051	null
2024-10-07	CPFD: Confidence-aware Privileged Feature Distillation for Short Video Classification	Jinghao Shi et.al.	2410.03038	null
2024-10-07	MMP: Towards Robust Multi-Modal Learning with Masked Modality Projection	Niki Nezakati et.al.	2410.03010	null
2024-10-03	Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos	Jianrui Zhang et.al.	2410.02763	null
2024-10-03	Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models	Zhengfeng Lai et.al.	2410.02740	null
2024-10-04	Video Instruction Tuning With Synthetic Data	Yuanhan Zhang et.al.	2410.02713	null
2024-10-03	LLaVA-Critic: Learning to Evaluate Multimodal Models	Tianyi Xiong et.al.	2410.02712	null
2024-10-03	Plots Unlock Time-Series Understanding in Multimodal Models	Mayank Daswani et.al.	2410.02637	null
2024-10-02	Anchors Aweigh! Sail for Optimal Unified Multi-Modal Representations	Minoh Jeong et.al.	2410.02086	null
2024-10-02	Toward a Holistic Evaluation of Robustness in CLIP Models	Weijie Tu et.al.	2410.01534	null
2024-10-02	SHAP-CAT: A interpretable multi-modal framework enhancing WSI classification via virtual staining and shapley-value-based multimodal fusion	Jun Wang et.al.	2410.01408	null
2024-10-02	Backdooring Vision-Language Models with Out-Of-Distribution Data	Weimin Lyu et.al.	2410.01264	null
2024-10-02	OCC-MLLM: Empowering Multimodal Large Language Model For the Understanding of Occluded Objects	Wenmo Qiu et.al.	2410.01261	null
2024-09-30	Robin3D: Improving 3D Large Language Model via Robust Instruction Tuning	Weitai Kang et.al.	2410.00255	link
2024-09-30	Using Large Multimodal Models to Extract Knowledge Components for Knowledge Tracing from Multimedia Question Information	Hyeongdon Moon et.al.	2409.20167	link
2024-10-02	Visual Context Window Extension: A New Perspective for Long Video Understanding	Hongchen Wei et.al.	2409.20018	null
2024-09-30	Towards Robust Multimodal Sentiment Analysis with Incomplete Data	Haoyu Zhang et.al.	2409.20012	link
2024-09-28	FairPIVARA: Reducing and Assessing Biases in CLIP-Based Multimodal Models	Diego A. B. Moreira et.al.	2409.19474	link
2024-09-28	From Unimodal to Multimodal: Scaling up Projectors to Align Modalities	Mayug Maniparambil et.al.	2409.19425	null
2024-10-02	CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling	Jihai Zhang et.al.	2409.19291	link
2024-09-28	TrojVLM: Backdoor Attack Against Vision Language Models	Weimin Lyu et.al.	2409.19232	null
2024-09-27	Multimodal Markup Document Models for Graphic Design Completion	Kotaro Kikuchi et.al.	2409.19051	null
2024-09-27	Emu3: Next-Token Prediction is All You Need	Xinlong Wang et.al.	2409.18869	null
2024-09-27	Data Analysis in the Era of Generative AI	Jeevana Priya Inala et.al.	2409.18475	null
2024-09-26	MultiClimate: Multimodal Stance Detection on Climate Change Videos	Jiawen Wang et.al.	2409.18346	link
2024-09-26	LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness	Chenming Zhu et.al.	2409.18125	null
2024-09-26	GSON: A Group-based Social Navigation Framework with Large Multimodal Model	Shangyi Luo et.al.	2409.18084	null
2024-09-26	A Multimodal Single-Branch Embedding Network for Recommendation in Cold-Start and Missing Modality Scenarios	Christian Ganhör et.al.	2409.17864	link
2024-09-26	Harnessing Shared Relations via Multimodal Mixup Contrastive Learning for Multimodal Classification	Raja Kumar et.al.	2409.17777	link
2024-09-26	MIO: A Foundation Model on Multimodal Tokens	Zekun Wang et.al.	2409.17692	link
2024-09-25	Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models	Matt Deitke et.al.	2409.17146	link
2024-09-24	CDChat: A Large Multimodal Model for Remote Sensing Change Description	Mubashir Noman et.al.	2409.16261	link
2024-09-24	CLSP: High-Fidelity Contrastive Language-State Pre-training for Agent State Representation	Fuxian Huang et.al.	2409.15806	null
2024-09-18	Recommendation with Generative Models	Yashar Deldjoo et.al.	2409.15173	null
2024-09-23	With Ears to See and Eyes to Hear: Sound Symbolism Experiments with Multimodal Large Language Models	Tyler Loakman et.al.	2409.14917	link
2024-09-22	Patch Ranking: Efficient CLIP by Learning to Rank Local Patches	Cheng-En Wu et.al.	2409.14607	null
2024-09-22	Can-Do! A Dataset and Neuro-Symbolic Grounded Framework for Embodied Planning with Large Multimodal Models	Yew Ken Chia et.al.	2409.14277	null
2024-09-20	Brain-Cognition Fingerprinting via Graph-GCCA with Contrastive Learning	Yixin Wang et.al.	2409.13887	null
2024-09-20	Instruction-guided Multi-Granularity Segmentation and Captioning with Large Multimodal Model	Li Zhou et.al.	2409.13407	link
2024-09-20	A Novel Adaptive Fine-Tuning Algorithm for Multimodal Models: Self-Optimizing Classification and Selection of High-Quality Datasets in Remote Sensing	Yi Ren et.al.	2409.13345	null
2024-09-20	ChemDFM-X: Towards Large Multimodal Model for Chemistry	Zihan Zhao et.al.	2409.13194	null
2024-09-19	MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines	Dongzhi Jiang et.al.	2409.12959	null
2024-09-24	TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation	Junjie Wen et.al.	2409.12514	null
2024-09-18	Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution	Peng Wang et.al.	2409.12191	link
2024-09-18	All-in-one foundational models learning across quantum chemical levels	Yuxinxin Chen et.al.	2409.12015	link
2024-09-18	LMMCoDrive: Cooperative Driving with Large Multimodal Model	Haichao Liu et.al.	2409.11981	link
2024-09-16	MusicLIME: Explainable Multimodal Music Understanding	Theodoros Sotirou et.al.	2409.10496	link
2024-09-19	IRIS: Interactive Responsive Intelligent Segmentation for 3D Affordance Analysis	Meng Chu et.al.	2409.10078	null
2024-09-16	AceParse: A Comprehensive Dataset with Diverse Structured Texts for Academic Literature Parsing	Huawei Ji et.al.	2409.10016	link
2024-09-14	Keypoints-Integrated Instruction-Following Data Generation for Enhanced Human Pose Understanding in Multimodal Models	Dewen Zhang et.al.	2409.09306	null
2024-09-13	Interactive Masked Image Modeling for Multimodal Object Detection in Remote Sensing	Minh-Duc Vu et.al.	2409.08885	null
2024-09-13	A Multimodal Approach for Fluid Overload Prediction: Integrating Lung Ultrasound and Clinical Data	Tianqi Yang et.al.	2409.08790	null
2024-09-13	Dynamics of Collective Group Affect: Group-level Annotations and the Multimodal Modeling of Convergence and Divergence	Navin Raj Prabhu et.al.	2409.08578	null
2024-09-13	A Comprehensive Survey on Deep Multimodal Learning with Missing Modality	Renjie Wu et.al.	2409.07825	null
2024-09-12	Top-down Activity Representation Learning for Video Question Answering	Yanan Wang et.al.	2409.07748	null
2024-09-11	What to align in multimodal contrastive learning?	Benoit Dufumier et.al.	2409.07402	null
2024-09-11	MVLLaVA: An Intelligent Agent for Unified and Flexible Novel View Synthesis	Hanyu Jiang et.al.	2409.07129	null
2024-09-11	FSMDet: Vision-guided feature diffusion for fully sparse 3D detector	Tianran Liu et.al.	2409.06945	null
2024-09-16	Scaling Law Hypothesis for Multimodal Model	Qingyun Sun et.al.	2409.06754	null
2024-09-10	Multiclass Arrhythmia Classification using Smartwatch Photoplethysmography Signals Collected in Real-life Settings	Dong Han et.al.	2409.06147	null
2024-09-11	A Survey of Multimodal Composite Editing and Retrieval	Suyan Li et.al.	2409.05405	link
2024-09-05	Learning in Order! A Sequential Strategy to Learn Invariant Features for Multimodal Sentiment Analysis	Xianbing Zhao et.al.	2409.04473	null
2024-09-06	Generating Faithful and Salient Text from Multimodal Data	Tahsina Hashem et.al.	2409.03961	link
2024-09-06	CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models	Wentao Liu et.al.	2409.02834	link
2024-09-10	MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark	Xiang Yue et.al.	2409.02813	null
2024-09-04	Understanding eGFR Trajectories and Kidney Function Decline via Large Multimodal Models	Chih-Yuan Li et.al.	2409.02530	null
2024-09-03	Blocks as Probes: Dissecting Categorization Ability of Large Multimodal Models	Bin Fu et.al.	2409.01560	null
2024-09-03	Think Twice Before Recognizing: Large Multimodal Models for General Fine-grained Traffic Sign Recognition	Yaozong Gan et.al.	2409.01534	null
2024-09-02	Towards General Industrial Intelligence: A Survey on IIoT-Enhanced Continual Large Models	Jiao Chen et.al.	2409.01207	null
2024-09-02	Recoverable Compression: A Multimodal Vision Token Recovery Mechanism Guided by Text Information	Yi Chen et.al.	2409.01179	null
2024-08-31	Comparative Analysis of Modality Fusion Approaches for Audio-Visual Person Identification and Verification	Aref Farhadipour et.al.	2409.00562	null
2024-08-30	UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios	Baichuan Zhou et.al.	2408.17267	null
2024-08-29	Seeking the Sufficiency and Necessity Causal Features in Multimodal Representation Learning	Boyu Chen et.al.	2408.16577	null
2024-08-29	Toward Robust Early Detection of Alzheimer's Disease via an Integrated Multimodal Learning Approach	Yifei Chen et.al.	2408.16343	link
2024-08-28	Meta-Learn Unimodal Signals with Weak Supervision for Multimodal Sentiment Analysis	Sijie Mai et.al.	2408.16029	null
2024-08-28	ModalityMirror: Improving Audio Classification in Modality Heterogeneity Federated Learning with Multimodal Distillation	Tiantian Feng et.al.	2408.15803	null
2024-08-28	Visual Prompt Engineering for Medical Vision Language Models in Radiology	Stefan Denner et.al.	2408.15802	null
2024-08-27	X-Reflect: Cross-Reflection Prompting for Multimodal Recommendation	Hanjia Lyu et.al.	2408.15172	null
2024-08-27	The Benefits of Balance: From Information Projections to Variance Reduction	Lang Liu et.al.	2408.15065	null
2024-08-27	NeuralOOD: Improving Out-of-Distribution Generalization Performance with Brain-machine Fusion Learning Framework	Shuangchen Zhao et.al.	2408.14950	null
2024-08-26	MMR: Evaluating Reading Ability of Large Multimodal Models	Jian Chen et.al.	2408.14594	null
2024-09-03	Foundation Models for Music: A Survey	Yinghao Ma et.al.	2408.14340	link
2024-08-26	LMM-VQA: Advancing Video Quality Assessment with Large Multimodal Models	Qihang Ge et.al.	2408.14008	null
2024-08-27	Quantum Multimodal Contrastive Learning Framework	Chi-Sheng Chen et.al.	2408.13919	null
2024-08-25	Tangram: A Challenging Benchmark for Geometric Element Recognizing	Jiamin Tang et.al.	2408.13854	null
2024-08-25	Multimodal Ensemble with Conditional Feature Fusion for Dysgraphia Diagnosis in Children from Handwriting Samples	Jayakanth Kunhoth et.al.	2408.13754	null
2024-08-24	Preliminary Investigations of a Multi-Faceted Robust and Synergistic Approach in Semiconductor Electron Micrograph Analysis: Integrating Vision Transformers with Large Language and Multimodal Models	Sakhinana Sagar Srinivas et.al.	2408.13621	null
2024-08-23	Foundational Model for Electron Micrograph Analysis: Instruction-Tuning Small-Scale Language-and-Vision Assistant for Enterprise Adoption	Sakhinana Sagar Srinivas et.al.	2408.13248	null
2024-08-23	Indoor scene recognition from images under visual corruptions	Willams de Lima Costa et.al.	2408.13029	null
2024-08-23	Ada2I: Enhancing Modality Balance for Multimodal Conversational Emotion Recognition	Cam-Van Thi Nguyen et.al.	2408.12895	null
2024-08-23	Has Multimodal Learning Delivered Universal Intelligence in Healthcare? A Comprehensive Survey	Qika Lin et.al.	2408.12880	link
2024-08-22	Assessing Modality Bias in Video Question Answering Benchmarks with Multimodal Large Language Models	Jean Park et.al.	2408.12763	null
2024-08-22	Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization	Luyao Cheng et.al.	2408.12102	null
2024-08-22	Mental-Perceiver: Audio-Textual Multimodal Learning for Mental Health Assessment	Jinghui Qin et.al.	2408.12088	null
2024-08-21	GRAB: A Challenging GRaph Analysis Benchmark for Large Multimodal Models	Jonathan Roberts et.al.	2408.11817	null
2024-08-21	D-RMGPT: Robot-assisted collaborative tasks driven by large multimodal models	M. Forlini et.al.	2408.11761	null
2024-08-21	UniFashion: A Unified Vision-Language Model for Multimodal Fashion Retrieval and Generation	Xiangyu Zhao et.al.	2408.11305	link
2024-08-21	BearLLM: A Prior Knowledge-Enhanced Bearing Health Management Framework with Unified Vibration Signal Representation	Haotian Peng et.al.	2408.11281	link
2024-08-20	Exploring the use of Generative AI to Support Automated Just-in-Time Programming for Visual Scene Displays	Cynthia Zastudil et.al.	2408.11137	null
2024-08-21	SZTU-CMU at MER2024: Improving Emotion-LLaMA with Conv-Attention for Multimodal Emotion Recognition	Zebang Cheng et.al.	2408.10500	link
2024-08-19	Enhance Modality Robustness in Text-Centric Multimodal Alignment with Adversarial Prompting	Yun-Da Tsai et.al.	2408.09798	null
2024-08-19	Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation	Yunxin Li et.al.	2408.09787	link
2024-08-18	PA-LLaVA: A Large Language-Vision Assistant for Human Pathology Image Understanding	Dawei Dai et.al.	2408.09530	link
2024-08-17	Measuring Visual Sycophancy in Multimodal Models	Jaehyuk Lim et.al.	2408.09111	link
2024-08-16	AdaRank: Disagreement Based Module Rank Prediction for Low-rank Adaptation	Yihe Dong et.al.	2408.09015	link
2024-08-16	xGen-MM (BLIP-3): A Family of Open Large Multimodal Models	Le Xue et.al.	2408.08872	null
2024-08-16	Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs	Jinming Liu et.al.	2408.08575	null
2024-08-15	LLaVA-Surg: Towards Multimodal Surgical Assistant via Structured Surgical Video Learning	Jiajie Li et.al.	2408.07981	null
2024-08-15	MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark	Minxuan Zhou et.al.	2408.07543	link
2024-08-14	Modality Invariant Multimodal Learning to Handle Missing Modalities: A Single-Branch Approach	Muhammad Saad Saeed et.al.	2408.07445	null
2024-08-14	Robust Semi-supervised Multimodal Medical Image Segmentation via Cross Modality Collaboration	Xiaogen Zhon et.al.	2408.07341	link
2024-08-14	Enhancing Visual Question Answering through Ranking-Based Hybrid Training and Multimodal Fusion	Peiyuan Chen et.al.	2408.07303	null
2024-08-13	PathInsight: Instruction Tuning of Multimodal Datasets and Models for Intelligence Assisted Diagnosis in Histopathology	Xiaomin Wu et.al.	2408.07037	null
2024-08-13	EditScribe: Non-Visual Image Editing with Natural Language Verification Loops	Ruei-Che Chang et.al.	2408.06632	null
2024-08-13	CROME: Cross-Modal Adapters for Efficient Multimodal LLM	Sayna Ebrahimi et.al.	2408.06610	null
2024-08-13	Prioritizing Modalities: Flexible Importance Scheduling in Federated Multimodal Learning	Jieming Bian et.al.	2408.06549	null
2024-08-12	VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents	Xiao Liu et.al.	2408.06327	link
2024-08-11	HateSieve: A Contrastive Learning Framework for Detecting and Segmenting Hateful Content in Multimodal Memes	Xuanyu Su et.al.	2408.05794	null
2024-08-08	Enhancing Journalism with AI: A Study of Contextualized Image Captioning for News Articles using LLMs and LMMs	Aliki Anagnostopoulou et.al.	2408.04331	null
2024-08-06	LLaVA-OneVision: Easy Visual Task Transfer	Bo Li et.al.	2408.03326	link
2024-08-06	Multitask and Multimodal Neural Tuning for Large Models	Hao Sun et.al.	2408.03001	null
2024-08-06	Body of Her: A Preliminary Study on End-to-End Humanoid Agent	Tenglong Ao et.al.	2408.02879	null
2024-08-04	Distribution-Level Memory Recall for Continual Learning: Preserving Knowledge and Avoiding Confusion	Shaoxu Cheng et.al.	2408.02695	null
2024-08-02	A Systematic Review of Intermediate Fusion in Multimodal Deep Learning for Biomedical Applications	Valerio Guarrasi et.al.	2408.02686	null
2024-08-05	REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models	Agneet Chatterjee et.al.	2408.02231	null
2024-08-04	CACE-Net: Co-guidance Attention and Contrastive Enhancement for Effective Audio-Visual Event Localization	Xiang He et.al.	2408.01952	link
2024-08-02	MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models	Benno Weck et.al.	2408.01337	link
2024-08-05	Dissecting Dissonance: Benchmarking Large Multimodal Models Against Self-Contradictory Instructions	Jin Gao et.al.	2408.01091	link
2024-08-02	GraphAge: Unleashing the power of Graph Neural Network to Decode Epigenetic Aging	Saleh Sakib Ahmed et.al.	2408.00984	link
2024-08-01	MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities	Weihao Yu et.al.	2408.00765	link
2024-08-01	GalleryGPT: Analyzing Paintings with Large Multimodal Models	Yi Bin et.al.	2408.00491	link
2024-08-01	Everything We Hear: Towards Tackling Misinformation in Podcasts	Sachin Pathiyan Cherumanal et.al.	2408.00292	null
2024-08-01	OmniParser for Pure Vision Based GUI Agent	Yadong Lu et.al.	2408.00203	null
2024-07-30	Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection	Jinfa Huang et.al.	2407.21004	null
2024-07-30	HyperMM : Robust Multimodal Learning with Varying-sized Inputs	Hava Chaptoukaev et.al.	2407.20768	null
2024-07-30	Effectively Leveraging CLIP for Generating Situational Summaries of Images and Videos	Dhruv Verma et.al.	2407.20642	link
2024-07-29	Adversarial Robustness in RGB-Skeleton Action Recognition: Leveraging Attention Modality Reweighter	Chao Liu et.al.	2407.19981	null
2024-07-29	ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2	Wenjun Huang et.al.	2407.19832	null
2024-08-02	XLIP: Cross-modal Attention Masked Modelling for Medical Language-Image Pre-Training	Biao Wu et.al.	2407.19546	link
2024-07-28	Detached and Interactive Multimodal Learning	Yunfeng Fan et.al.	2407.19514	link
2024-07-27	Data Processing Techniques for Modern Multimodal Models	Yinheng Li et.al.	2407.19180	null
2024-07-26	MangaUB: A Manga Understanding Benchmark for Large Multimodal Models	Hikaru Ikuta et.al.	2407.19034	null
2024-07-26	Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment	Yuze Zheng et.al.	2407.18854	null
2024-07-26	ChatSchema: A pipeline of extracting structured information with Large Multimodal Models based on schema	Fei Wang et.al.	2407.18716	null
2024-07-25	Sparse vs Contiguous Adversarial Pixel Perturbations in Multimodal Models: An Empirical Analysis	Cristian-Alexandru Botocan et.al.	2407.18251	link
2024-07-25	$\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs	Vlad Sobal et.al.	2407.18134	null
2024-07-25	Cross-Vendor Reproducibility of Radiomics-based Machine Learning Models for Computer-aided Diagnosis	Jatin Chaudhary et.al.	2407.18060	null
2024-07-25	What does Kiki look like? Cross-modal associations between speech sounds and visual shapes in vision-and-language models	Tessa Verhoef et.al.	2407.17974	null
2024-07-25	Shapley Value-based Contrastive Alignment for Multimodal Information Extraction	Wen Luo et.al.	2407.17854	null
2024-07-25	Enhancing Model Performance: Another Approach to Vision-Language Instruction Tuning	Vedanshu et.al.	2407.17813	null
2024-07-25	KiVA: Kid-inspired Visual Analogies for Testing Large Multimodal Models	Eunice Yiu et.al.	2407.17773	link
2024-07-24	Testing Large Language Models on Driving Theory Knowledge and Skills for Connected Autonomous Vehicles	Zuoyin Tang et.al.	2407.17211	null
2024-07-23	Chameleon: Images Are What You Need For Multimodal Learning Robust To Missing Modalities	Muhammad Irzam Liaqat et.al.	2407.16243	null
2024-07-22	LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding	Haoning Wu et.al.	2407.15754	link
2024-07-22	Resource-Efficient Federated Multimodal Learning via Layer-wise and Progressive Training	Ye Lin Tun et.al.	2407.15426	null
2024-07-21	VideoGameBunny: Towards vision assistants for video games	Mohammad Reza Taesiri et.al.	2407.15295	null
2024-07-22	Patch-based Intuitive Multimodal Prototypes Network (PIMPNet) for Alzheimer's Disease classification	Lisa Anita De Santi et.al.	2407.14277	link
2024-07-18	Visual Haystacks: Answering Harder Questions About Sets of Images	Tsung-Han Wu et.al.	2407.13766	link
2024-07-17	Text- and Feature-based Models for Compound Multimodal Emotion Recognition in the Wild	Nicolas Richet et.al.	2407.12927	link
2024-07-16	ChatBCG: Can AI Read Your Slide Deck?	Nikita Singh et.al.	2407.12875	null
2024-07-17	LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models	Kaichen Zhang et.al.	2407.12772	link
2024-07-17	Missing Modality Prediction for Unpaired Multimodal Learning via Joint Embedding of Unimodal Models	Donggeun Kim et.al.	2407.12616	null
2024-07-17	E5-V: Universal Embeddings with Multimodal Large Language Models	Ting Jiang et.al.	2407.12580	link
2024-07-16	FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models	Pengxiang Li et.al.	2407.11522	null
2024-07-16	COMET: "Cone of experience" enhanced large multimodal model for mathematical problem generation	Sannyuya Liu et.al.	2407.11315	null
2024-07-15	OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models	Zijian Zhou et.al.	2407.11213	link
2024-07-15	FabGPT: An Efficient Large Multimodal Model for Complex Wafer Defect Knowledge Queries	Yuqi Jiang et.al.	2407.10810	null
2024-07-15	Scaling 3D Reasoning with LMMs to Large Robot Mission Environments Using Datagraphs	W. J. Meijer et.al.	2407.10743	null
2024-07-16	Qwen2 Technical Report	An Yang et.al.	2407.10671	link
2024-07-15	How and where does CLIP process negation?	Vincent Quantmeyer et.al.	2407.10488	null
2024-07-12	Diagnosing and Re-learning for Balanced Multimodal Learning	Yake Wei et.al.	2407.09705	link
2024-07-12	Unifying Sequences, Structures, and Descriptions for Any-to-Any Protein Generation with the Large Multimodal Model HelixProtX	Zhiyuan Chen et.al.	2407.09274	link
2024-07-12	DART: An Automated End-to-End Object Detection Pipeline with Data Diversification, Open-Vocabulary Bounding Box Annotation, Pseudo-Label Review, and Model Training	Chen Xin et.al.	2407.09174	link
2024-07-11	Emerging Practices for Large Multimodal Model (LMM) Assistance for People with Visual Impairments: Implications for Design	Jingyi Xie et.al.	2407.08882	null
2024-07-10	RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization	Xijie Huang et.al.	2407.08044	link
2024-07-10	LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models	Feng Li et.al.	2407.07895	link
2024-07-11	InstructLayout: Instruction-Driven 2D and 3D Layout Synthesis with Semantic Graph Prior	Chenguo Lin et.al.	2407.07580	null
2024-07-10	Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model	Wenqi Zhang et.al.	2407.07053	link
2024-07-08	ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation	Ethan Chern et.al.	2407.06135	link
2024-07-07	Multimodal Language Models for Domain-Specific Procedural Video Summarization	Nafisa Hussain et.al.	2407.05419	null
2024-07-07	Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition	Zirun Guo et.al.	2407.05374	link
2024-07-06	Enhance the Robustness of Text-Centric Multimodal Alignments	Ting-Yu Yen et.al.	2407.05036	null
2024-07-06	Completed Feature Disentanglement Learning for Multimodal MRIs Analysis	Tianling Liu et.al.	2407.04916	null
2024-07-06	MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension	Zekun Li et.al.	2407.04903	link
2024-07-05	VCoME: Verbal Video Composition with Multimodal Editing Effects	Weibo Gong et.al.	2407.04697	null
2024-07-05	Multimodal Classification via Modal-Aware Interactive Enhancement	Qing-Yuan Jiang et.al.	2407.04587	null
2024-07-05	Robust Multimodal Learning via Representation Decoupling	Shicai Wei et.al.	2407.04458	null
2024-07-05	Smart Vision-Language Reasoners	Denisa Roberts et.al.	2407.04212	link
2024-07-04	Investigating the Role of Instruction Variety and Task Difficulty in Robotic Manipulation Tasks	Amit Parekh et.al.	2407.03967	link
2024-07-04	ADAPT: Multimodal Learning for Detecting Physiological Changes under Missing Modalities	Julie Mordacq et.al.	2407.03836	link
2024-07-04	M$\mathbf{5}$ -- A Diverse Benchmark to Assess the Performance of Large Multimodal Models Across Multilingual and Multicultural Vision-Language Tasks	Florian Schneider et.al.	2407.03791	null
2024-07-03	HEMM: Holistic Evaluation of Multimodal Foundation Models	Paul Pu Liang et.al.	2407.03418	link
2024-07-02	Multi-Peptide: Multimodality Leveraged Language-Graph Learning of Peptide Properties	Srivathsan Badrinarayanan et.al.	2407.03380	link
2024-07-02	Understanding Alignment in Multimodal LLMs: A Comprehensive Study	Elmira Amirloo et.al.	2407.02477	null
2024-07-02	Synthetic Multimodal Question Generation	Ian Wu et.al.	2407.02233	null
2024-07-02	Crossroads of Continents: Automated Artifact Extraction for Cultural Adaptation with Large Multimodal Models	Anjishnu Mukherjee et.al.	2407.02067	link
2024-07-01	Empathic Grounding: Explorations using Multimodal Interaction and Large Language Models with Conversational Agents	Mehdi Arjmand et.al.	2407.01824	link
2024-07-01	We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?	Runqi Qiao et.al.	2407.01284	link
2024-07-01	Unaligning Everything: Or Aligning Any Text to Any Image in Multimodal Models	Shaeke Salman et.al.	2407.01157	null
2024-06-29	AI-powered multimodal modeling of personalized hemodynamics in aortic stenosis	Caglar Ozturk et.al.	2407.00535	null
2024-06-29	MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation	Jinsheng Huang et.al.	2407.00468	link
2024-06-29	How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models	Jaeyoung Lee et.al.	2407.00369	null
2024-06-28	PathGen-1.6M: 1.6 Million Pathology Image-text Pairs Generation through Multi-agent Collaboration	Yuxuan Sun et.al.	2407.00203	null
2024-06-28	EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model	Yuxuan Zhang et.al.	2406.20076	link
2024-06-28	InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding	Kirolos Ataallah et.al.	2406.19875	link
2024-06-28	MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis	Jun-Yan He et.al.	2406.19859	null
2024-06-28	MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment	Jihao Liu et.al.	2406.19736	link
2024-06-28	Enhancing Radiological Diagnosis: A Collaborative Approach Integrating AI and Human Expertise for Visual Miss Correction	Akash Awasthi et.al.	2406.19686	null
2024-06-28	SK-VQA: Synthetic Knowledge Generation at Scale for Training Context-Augmented Multimodal LLMs	Xin Su et.al.	2406.19593	null
2024-06-27	OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding	Tao Zhang et.al.	2406.19389	null
2024-06-28	FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts	Shubhankar Singh et.al.	2406.19237	null
2024-06-27	RAVEN: Multitask Retrieval Augmented Vision-Language Learning	Varun Nagaraj Rao et.al.	2406.19150	null
2024-06-27	DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming	Jiaxin Zhang et.al.	2406.19101	null
2024-06-27	Fairness and Bias in Multimodal AI: A Survey	Tosin Adewumi et.al.	2406.19097	null
2024-06-27	MissionGNN: Hierarchical Multimodal GNN-based Weakly Supervised Video Anomaly Recognition with Mission-Specific Knowledge Graph Generation	Sanggeon Yun et.al.	2406.18815	null
2024-06-26	MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data	William Berman et.al.	2406.18790	null
2024-06-26	S3: A Simple Strong Sample-effective Multimodal Dialog System	Elisei Rykov et.al.	2406.18305	link
2024-06-26	EHR-Based Mobile and Web Platform for Chronic Disease Risk Prediction Using Large Language Multimodal Models	Chun-Chieh Liao et.al.	2406.18087	null
2024-06-26	Speech2UnifiedExpressions: Synchronous Synthesis of Co-Speech Affective Face and Body Expressions from Affordable Inputs	Uttaran Bhattacharya et.al.	2406.18068	null
2024-06-25	Human-centered In-building Embodied Delivery Benchmark	Zhuoqun Xu et.al.	2406.17898	link
2024-06-25	InFiConD: Interactive No-code Fine-tuning with Concept-based Knowledge Distillation	Jinbin Huang et.al.	2406.17838	null
2024-06-25	Data curation via joint example selection further accelerates multimodal learning	Talfan Evans et.al.	2406.17711	null
2024-06-25	Towards Probing Speech-Specific Risks in Large Multimodal Models: A Taxonomy, Benchmark, and Insights	Hao Yang et.al.	2406.17430	link
2024-06-24	At First Sight: Zero-Shot Classification of Astronomical Images with Large Multimodal Models	Dimitrios Tanoglidis et.al.	2406.17057	null
2024-06-24	Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models	Jierun Chen et.al.	2406.16866	link
2024-06-24	Long Context Transfer from Language to Vision	Peiyuan Zhang et.al.	2406.16852	link
2024-06-24	QuadrupedGPT: Towards a Versatile Quadruped Agent in Open-ended Worlds	Ye Wang et.al.	2406.16578	null
2024-06-21	Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning	Brandon Huang et.al.	2406.15334	link
2024-06-21	Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models	Jiayu Wang et.al.	2406.14852	link
2024-06-20	Evaluating vision-capable chatbots in interpreting kinematics graphs: a comparative study of free and subscription-based models	Giulia Polverini et.al.	2406.14685	null
2024-06-20	Revealing Vision-Language Integration in the Brain with Multimodal Networks	Vighnesh Subramaniam et.al.	2406.14481	link
2024-06-25	iWISDM: Assessing instruction following in multimodal models at scale	Xiaoxuan Lei et.al.	2406.14343	link
2024-06-20	Two Giraffes in a Dirt Field: Using Game Play to Investigate Situation Modelling in Large Multimodal Models	Sherzod Hakimov et.al.	2406.14035	null
2024-06-20	Knowledge-driven Subspace Fusion and Gradient Coordination for Multi-modal Learning	Yupei Zhang et.al.	2406.13979	link
2024-06-20	PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents	Junjie Wang et.al.	2406.13923	null
2024-06-19	Through the Theory of Mind's Eye: Reading Minds with Multimodal Video Large Language Models	Zhawnen Chen et.al.	2406.13763	null
2024-06-19	GUI Action Narrator: Where and When Did That Action Take Place?	Qinchen Wu et.al.	2406.13719	null
2024-06-19	Is AI fun? HumorDB: a curated dataset and benchmark to investigate graphical humor	Veedant Jain et.al.	2406.13564	null
2024-06-19	VisualRWKV: Exploring Recurrent Neural Networks for Visual Language Models	Haowen Hou et.al.	2406.13362	link
2024-06-19	Learnable In-Context Vector for Visual Question Answering	Yingzhe Peng et.al.	2406.13185	link
2024-06-18	Synergizing Foundation Models and Federated Learning: A Survey	Shenghui Li et.al.	2406.12844	null
2024-06-18	OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI	Zhen Huang et.al.	2406.12753	link
2024-06-18	Disturbing Image Detection Using LMM-Elicited Emotion Embeddings	Maria Tzelepi et.al.	2406.12668	null
2024-06-18	Automatic benchmarking of large multimodal models via iterative experiment programming	Alessandro Conti et.al.	2406.12321	link
2024-06-18	Language and Multimodal Models in Sports: A Survey of Datasets and Applications	Haotian Xia et.al.	2406.12252	null
2024-06-17	VideoLLM-online: Online Video Large Language Model for Streaming Video	Joya Chen et.al.	2406.11816	null
2024-06-17	LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning	Dantong Niu et.al.	2406.11815	null
2024-06-17	Multimodal Learning To Improve Segmentation With Intraoperative CBCT & Preoperative CT	Maximilian E. Tschuchnig et.al.	2406.11650	null
2024-06-17	Program Synthesis Benchmark for Visual Programming in XLogoOnline Environment	Chao Wen et.al.	2406.11334	null
2024-06-17	VideoVista: A Versatile Benchmark for Video Understanding and Reasoning	Yunxin Li et.al.	2406.11303	null
2024-06-17	i-SRT: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective Judgment	Daechul Ahn et.al.	2406.11280	link
2024-06-17	MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens	Anas Awadalla et.al.	2406.11271	link
2024-06-17	Generative Visual Instruction Tuning	Jefferson Hernandez et.al.	2406.11262	link
2024-06-17	Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective	Yang Chen et.al.	2406.11249	null
2024-06-16	Investigating Video Reasoning Capability of Large Language Models with Tropes in Movies	Hung-Ting Su et.al.	2406.10923	null
2024-06-15	Beyond Raw Videos: Understanding Edited Videos with Large Multimodal Model	Lu Xu et.al.	2406.10484	link
2024-06-12	MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases	Rithesh Murthy et.al.	2406.10290	null
2024-06-14	VideoGUI: A Benchmark for GUI Automation from Instructional Videos	Kevin Qinghong Lin et.al.	2406.10227	null
2024-06-14	ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation	Chufan Shi et.al.	2406.09961	link
2024-06-14	BiVLC: Extending Vision-Language Compositionality Evaluation with Text-to-Image Retrieval	Imanol Miranda et.al.	2406.09952	link
2024-06-13	VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding	Muhammad Maaz et.al.	2406.09418	link
2024-06-13	Explore the Limits of Omni-modal Pretraining at Scale	Yiyuan Zhang et.al.	2406.09412	link
2024-06-14	4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities	Roman Bachmann et.al.	2406.09406	null
2024-06-13	Yo'LLaVA: Your Personalized Language and Vision Assistant	Thao Nguyen et.al.	2406.09400	link
2024-06-13	CMC-Bench: Towards a New Paradigm of Visual Signal Compression	Chunyi Li et.al.	2406.09356	link
2024-06-13	Comparison Visual Instruction Tuning	Wei Lin et.al.	2406.09240	null
2024-06-13	Zoom and Shift are All You Need	Jiahao Qin et.al.	2406.08866	null
2024-06-11	Embedding-based Multimodal Learning on Pan-Squamous Cell Carcinomas for Improved Survival Outcomes	Asim Waqas et.al.	2406.08521	null
2024-06-14	Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models	Yi-Fan Zhang et.al.	2406.08487	link
2024-06-13	OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text	Qingyun Li et.al.	2406.08418	link
2024-06-12	A Concept-Based Explainability Framework for Large Multimodal Models	Jayneel Parekh et.al.	2406.08074	link
2024-06-12	LVBench: An Extreme Long Video Understanding Benchmark	Weihan Wang et.al.	2406.08035	link
2024-06-11	Cognitive Insights Across Languages: Enhancing Multimodal Interview Analysis	David Ortiz-Perez et.al.	2406.07542	link
2024-06-11	Understanding Visual Concepts Across Models	Brandon Trabucco et.al.	2406.07506	link
2024-06-11	Unified Modeling Enhanced Multimodal Learning for Precision Neuro-Oncology	Huahui Yi et.al.	2406.07078	link
2024-06-14	BTS: Bridging Text and Sound Modalities for Metadata-Aided Respiratory Sound Classification	June-Woo Kim et.al.	2406.06786	link
2024-06-10	Vript: A Video Is Worth Thousands of Words	Dongjie Yang et.al.	2406.06040	link
2024-06-10	FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model	Yebin Lee et.al.	2406.06004	link
2024-06-10	CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark	David Romero et.al.	2406.05967	null
2024-06-09	Stealthy Targeted Backdoor Attacks against Image Captioning	Wenshu Fan et.al.	2406.05874	link
2024-06-09	F-LMM: Grounding Frozen Large Multimodal Models	Size Wu et.al.	2406.05821	link
2024-06-08	Generalist Multimodal AI: A Review of Architectures, Challenges and Opportunities	Sai Munikoti et.al.	2406.05496	null
2024-06-07	Semantic Segmentation on VSPW Dataset through Masked Video Consistency	Chen Liang et.al.	2406.04979	null
2024-06-07	Predictive Dynamic Fusion	Bing Cao et.al.	2406.04802	link
2024-06-07	MGIMM: Multi-Granularity Instruction Multimodal Model for Attribute-Guided Remote Sensing Image Detailed Description	Cong Yang et.al.	2406.04716	link
2024-06-07	AICoderEval: Improving AI Domain Code Generation of Large Language Models	Yinghui Xia et.al.	2406.04712	null
2024-06-06	GenAI Arena: An Open Evaluation Platform for Generative Models	Dongfu Jiang et.al.	2406.04485	null
2024-06-06	MAIRA-2: Grounded Radiology Report Generation	Shruthi Bannur et.al.	2406.04449	link
2024-06-06	DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs	Lingchen Meng et.al.	2406.04334	null
2024-06-06	BLSP-Emo: Towards Empathetic Large Speech-Language Models	Chen Wang et.al.	2406.03872	link
2024-06-05	Identification of Stone Deterioration Patterns with Large Multimodal Models	Daniele Corradetti et.al.	2406.03207	link
2024-06-05	Exploiting LMM-based knowledge for image classification tasks	Maria Tzelepi et.al.	2406.03071	null
2024-06-02	Multimodal Deep Learning for Low-Resource Settings: A Vector Embedding Alignment Approach for Healthcare Applications	David Restrepo et.al.	2406.02601	null
2024-06-04	Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning	Alex Jinpeng Wang et.al.	2406.02547	link
2024-06-04	Dealing with All-stage Missing Modality: Towards A Universal Model with Robust Reconstruction and Personalization	Yunpeng Zhao et.al.	2406.01987	null
2024-06-03	Automatic Fused Multimodal Deep Learning for Plant Identification	Alfreds Lapkovskis et.al.	2406.01455	link
2024-06-05	Pulmonary Embolism Mortality Prediction Using Multimodal Learning Based on Computed Tomography Angiography and Clinical Data	Zhusi Zhong et.al.	2406.01302	null
2024-06-03	Dragonfly: Multi-Resolution Zoom Supercharges Large Visual-Language Model	Kezhen Chen et.al.	2406.00977	link
2024-06-02	Learning Multimodal Behaviors from Scratch with Diffusion Policy Gradient	Zechu Li et.al.	2406.00681	null
2024-06-04	StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond	Pengyuan Lyu et.al.	2405.21013	null
2024-05-31	Don't Buy it! Reassessing the Ad Understanding Abilities of Contrastive Multimodal Models	A. Bavaresco et.al.	2405.20846	link
2024-06-17	Ovis: Structural Embedding Alignment for Multimodal Large Language Model	Shiyin Lu et.al.	2405.20797	link
2024-05-31	Vision-Language Meets the Skeleton: Progressively Distillation with Cross-Modal Knowledge for 3D Action Representation Learning	Yang Chen et.al.	2405.20606	link
2024-05-30	Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA	Qianqi Yan et.al.	2405.20421	link
2024-05-30	Retrieval Augmented Structured Generation: Business Document Information Extraction As Tool Use	Franz Louis Cesista et.al.	2405.20245	null
2024-05-31	Visual Attention Analysis in Online Learning	Miriam Navarro et.al.	2405.20091	null
2024-05-30	MM-Lego: Modular Biomedical Multimodal Models with Minimal Fine-Tuning	Konstantin Hemker et.al.	2405.19950	null
2024-05-30	Instruction-Guided Visual Masking	Jinliang Zheng et.al.	2405.19783	link
2024-05-29	Thermodynamically Informed Multimodal Learning of High-Dimensional Free Energy Models in Molecular Coarse Graining	Blake R. Duschatko et.al.	2405.19386	null
2024-06-09	LLMs Meet Multimodal Generation and Editing: A Survey	Yingqing He et.al.	2405.19334	link
2024-05-29	Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare	Hanwei Zhu et.al.	2405.19298	link
2024-05-31	Benchmarking and Improving Detail Image Caption	Hongyuan Dong et.al.	2405.19092	link
2024-05-29	Topological Perspectives on Optimal Multimodal Embedding Spaces	Abdul Aziz A. B et.al.	2405.18867	null
2024-05-29	Exploring Exotic Decays of the Higgs Boson to Multi-Photons at the LHC via Multimodal Learning Approaches	A. Hammad et.al.	2405.18834	null
2024-05-28	The Evolution of Multimodal Model Architectures	Shakti N. Wadekar et.al.	2405.17927	null
2024-05-28	Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment	Xin Xiao et.al.	2405.17871	link
2024-05-28	Full-Stack Allreduce on Multi-Rail Networks	Enda Yu et.al.	2405.17870	null
2024-05-28	MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance	Yake Wei et.al.	2405.17730	link
2024-05-27	Matryoshka Multimodal Models	Mu Cai et.al.	2405.17430	null
2024-05-27	XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser	Xianfu Cheng et.al.	2405.17336	link
2024-05-28	LLM-Optic: Unveiling the Capabilities of Large Language Models for Universal Visual Grounding	Haoyu Zhao et.al.	2405.17104	null
2024-05-27	Mitigating Noisy Correspondence by Geometrical Structure Consistency Learning	Zihua Zhao et.al.	2405.16996	link
2024-05-27	Multilingual Diversity Improves Vision-Language Representations	Thao Nguyen et.al.	2405.16915	null
2024-05-26	Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs	Mustafa Shukor et.al.	2405.16700	link
2024-05-25	How Well Do Deep Learning Models Capture Human Concepts? The Case of the Typicality Effect	Siddhartha K. Vemuri et.al.	2405.16128	null
2024-05-24	ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models	Chunjiang Ge et.al.	2405.15738	link
2024-05-24	Chain-of-Thought Prompting for Demographic Inference with Large Multimodal Models	Yongsheng Yu et.al.	2405.15687	null
2024-05-24	M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models	Hongyu Wang et.al.	2405.15638	link
2024-05-24	DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception	Run Luo et.al.	2405.15232	link
2024-05-24	Shopping Queries Image Dataset (SQID): An Image-Enriched ESCI Dataset for Exploring Multimodal Learning in Product Search	Marie Al Ghossein et.al.	2405.15190	link

Generative Weight Space Modeling

Publish Date	Title	Authors	PDF	Code
2024-12-19	DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation	Wang Zhao et.al.	2412.15200	null
2024-12-18	On the principle of linearized stability for quasilinear evolution equations in time-weighted spaces	Bogdan-Vasile Matioc et.al.	2412.13940	null
2024-12-17	On the Bäcklund transform and the stability of the line soliton of the KP-II equation on $\mathbb R^2$	Lorenzo Pompili et.al.	2412.12530	null
2024-12-13	On the embedding of weighted Sobolev spaces with applications to a planar nonlinear Schrödinger equation	Antonio Azzolini et.al.	2412.10067	null
2024-12-12	Modified scattering for the cubic dispersion-managed NLS	Jason Murphy et.al.	2412.09762	null
2024-12-12	LoRACLR: Contrastive Adaptation for Customization of Diffusion Models	Enis Simsar et.al.	2412.09622	null
2024-12-11	Exploring superconformal Yang-Mills theories through matrix Bessel kernels	Zoltan Bajnok et.al.	2412.08732	null
2024-12-09	Bilinear singular integral operators with kernels in weighted spaces	Petr Honzík et.al.	2412.07014	null
2024-12-04	Pixel-level and Semantic-level Adjustable Super-resolution: A Dual-LoRA Approach	Lingchen Sun et.al.	2412.03017	link
2024-11-21	Strong localization blurs criticality of time series for spreading phenomena on networks	Juliane T. Moraes et.al.	2412.01842	null
2024-12-02	Geometric invariant theory and stretched Kostka quasi-polynomials	Marc Besson et.al.	2412.01651	null
2024-11-29	Origin-Destination Demand Prediction: An Urban Radiation and Attraction Perspective	Xuan Ma et.al.	2412.00167	null
2024-11-29	Rényi complexity in mean-field disordered systems	Nina Javerzat et.al.	2411.19817	null
2024-11-28	An Extensive Evaluation of Factual Consistency in Large Language Models for Data-to-Text Generation	Joy Mahapatra et.al.	2411.19203	null
2024-11-27	Task Arithmetic Through The Lens Of One-Shot Federated Learning	Zhixu Tao et.al.	2411.18607	null
2024-11-25	Spectral properties of Lévy Fokker--Planck equations	Hardy Chan et.al.	2411.16424	null
2024-11-20	Nonlinear orbital stability of stationary shock profiles for the Lax-Wendroff scheme	Jean-François Coulombel et.al.	2411.13094	null
2024-11-26	Enhancing generalization in high energy physics using white-box adversarial attacks	Franck Rothen et.al.	2411.09296	null
2024-11-11	Minimal nilpotent finite $W$-algebra and cuspidal module category of $\mathfrak{sp}_{2n}$	Genqiang Liu et.al.	2411.06768	null
2024-11-07	Well-Posedness and Regularity of the Heat Equation with Robin Boundary Conditions in the Two-Dimensional Wedge	Marco Bravin et.al.	2411.04651	null
2024-11-04	SALSA: Soup-based Alignment Learning for Stronger Adaptation in RLHF	Atoosa Chegini et.al.	2411.01798	null
2024-12-06	Modular Duality in Deep Learning	Jeremy Bernstein et.al.	2410.21265	null
2024-10-26	MarDini: Masked Autoregressive Diffusion for Video Generation at Scale	Haozhe Liu et.al.	2410.20280	null
2024-10-25	Four-parameter Mittag-Leffler functions and their associated coherent states	Dušan Popov et.al.	2410.19462	null
2024-10-24	Bielik 7B v0.1: A Polish Language Model -- Development, Insights, and Evaluation	Krzysztof Ociepa et.al.	2410.18565	null
2024-10-21	Two dimensional delta Bose gas in a weighted space	Sudheesh Surendranath et.al.	2410.16550	null
2024-10-21	In Search of the Successful Interpolation: On the Role of Sharpness in CLIP Generalization	Alireza Abdollahpoorrostam et.al.	2410.16476	link
2024-10-23	Universal approximation results for neural networks with non-polynomial activation function over non-compact domains	Ariel Neufeld et.al.	2410.14759	null
2024-10-23	Harnessing Your DRAM and SSD for Sustainable and Accessible LLM Inference with Mixed-Precision and Multi-level Caching	Jie Peng et.al.	2410.14740	null
2024-10-16	Differential Shape Optimization with Image Representation for Photonic Design	Zhaocheng Liu et.al.	2410.13074	null
2024-10-15	Scaling Laws for Multilingual Language Models	Yifei He et.al.	2410.12883	null
2024-10-16	AutoSimTTF: A Fully Automatic Pipeline for Electric Field Simulation and Treatment Planning of Tumor Treating Fields	Minmin Wang et.al.	2410.12196	null
2024-10-15	Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence	Shangbin Feng et.al.	2410.11163	null
2024-10-14	Deep Linear Probe Generators for Weight Space Learning	Jonathan Kahana et.al.	2410.10811	null
2024-10-14	Generating Model Parameters for Controlling: Parameter Diffusion for Controllable Multi-Task Recommendation	Chenglei Shen et.al.	2410.10639	null
2024-10-14	MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer	Minghao Zhu et.al.	2410.10589	link
2024-10-15	Regions of Level $\ell$ of Catalan/Semiorder-Type Arrangements	Yanru Chen et.al.	2410.10198	null
2024-10-13	A Quantum Circuit-Based Compression Perspective for Parameter-Efficient Learning	Chen-Yu Liu et.al.	2410.09846	null
2024-10-11	Meta-Transfer Learning Empowered Temporal Graph Networks for Cross-City Real Estate Appraisal	Weijia Zhang et.al.	2410.08947	null
2024-10-09	Efficient Weight-Space Laplace-Gaussian Filtering and Smoothing for Sequential Deep Learning	Joanna Sliwa et.al.	2410.06800	null
2024-10-09	Revisiting Multi-Permutation Equivariance through the Lens of Irreducible Representations	Yonatan Sverdlov et.al.	2410.06665	link
2024-10-08	Weighted Embeddings for Low-Dimensional Graph Representation	Thomas Bläsius et.al.	2410.06042	null
2024-10-05	Computing ground states of Bose-Einstein condensation by normalized deep neural network	Weizhu Bao et.al.	2410.05319	link
2024-10-07	Hyper-Representations: Learning from Populations of Neural Networks	Konstantin Schürholt et.al.	2410.05107	link
2024-10-06	Integrable Modules of Map full Toroidal Lie Algebras	Pradeep Bisht et.al.	2410.04495	null
2024-10-06	Global well-posedness for the defocusing 3D quadratic NLS in the sharp critical space	Jia Shen et.al.	2410.04337	null
2024-10-05	Equivariant Neural Functional Networks for Transformers	Viet-Hoang Tran et.al.	2410.04209	null
2024-10-15	Learning on LoRAs: GL-Equivariant Processing of Low-Rank Weight Spaces for Large Finetuned Models	Theo Putterman et.al.	2410.04207	null
2024-10-04	Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Networks	Ann Huang et.al.	2410.03972	null
2024-10-04	Autoregressive Moving-average Attention Mechanism for Time Series Forecasting	Jiecheng Lu et.al.	2410.03159	link
2024-10-02	Composing Global Optimizers to Reasoning Tasks via Algebraic Objects in Neural Nets	Yuandong Tian et.al.	2410.01779	link
2024-10-01	SynCOM: A tool for simulating coronal outflows	Valmir Moraes Filho et.al.	2410.01004	null
2024-10-01	On the prime ideals of higher secant varieties of Veronese embeddings of small degrees	Katsuhisa Furukawa et.al.	2410.00652	null
2024-09-30	Old Optimizer, New Norm: An Anthology	Jeremy Bernstein et.al.	2409.20325	null
2024-09-27	Effects of Peierls phases in open linear chains	Anselmo M. Marques et.al.	2409.18780	null
2024-09-27	Density of states in neural networks: an in-depth exploration of learning in parameter space	Margherita Mele et.al.	2409.18683	null
2024-09-26	The time periodic problem for the Navier-Stokes equations in exterior domains in weighted spaces	Reinhard Farwig et.al.	2409.17590	null
2024-09-25	Scalable Ensemble Diversification for OOD Generalization and Detection	Alexander Rubinstein et.al.	2409.16797	null
2024-10-04	Lessons Learned from a Unifying Empirical Study of Parameter-Efficient Transfer Learning (PETL) in Visual Recognition	Zheda Mai et.al.	2409.16434	link
2024-09-24	VascX Models: Model Ensembles for Retinal Vascular Analysis from Color Fundus Images	Jose Vargas Quiros et.al.	2409.16016	link
2024-09-23	Efficient Large-Scale Quantum Optimization via Counterdiabatic Ansatz	Jie Liu et.al.	2409.15055	null
2024-09-24	Weighted Approximation By Max-Product Generalized Exponential Sampling Series	Satyaranjan Pradhan et.al.	2409.14884	null
2024-09-21	Weakly magnetized black holes in Einstein-ModMax theory	Haryanto M. Siahaan et.al.	2409.13967	null
2024-09-18	Monomial Matrix Group Equivariant Neural Functional Networks	Hoang V. Tran et.al.	2409.11697	link
2024-09-17	Existence of an extremal function of Sobolev critical embedding with an $α$-homogeneous weight	Petr Gurka et.al.	2409.11193	null
2024-09-16	Inferring stellar parameters and their uncertainties from high-resolution spectroscopy using invertible neural networks	Nils Candebat et.al.	2409.10621	null
2024-09-13	Non-unitary Wightman CFTs and non-unitary vertex algebras	Sebastiano Carpi et.al.	2409.08454	null
2024-09-12	Global well-posedness and scattering in weighted space for nonlinear Schrödinger equations below the Strauss exponent without gauge-invariance	Masaki Kawamoto et.al.	2409.08432	null
2024-09-09	Fast gradient-free optimization of excitations in variational quantum eigensolvers	Jonas Jäger et.al.	2409.05939	null
2024-09-06	SCARF: Scalable Continual Learning Framework for Memory-efficient Multiple Neural Radiance Fields	Yuze Wang et.al.	2409.04482	null
2024-09-04	Federated Quantum-Train with Batched Parameter Generation	Chen-Yu Liu et.al.	2409.02763	null
2024-09-16	Regret Analysis for Randomized Gaussian Process Upper Confidence Bound	Shion Takeno et.al.	2409.00979	null
2024-08-30	Abstracted Gaussian Prototypes for One-Shot Concept Learning	Chelsea Zou et.al.	2408.17251	link
2024-08-23	Emergence of global receptive fields capturing multipartite quantum correlations	Oleg M. Sotnikov et.al.	2408.13033	null
2024-08-22	Action of $\mathfrak{osp}(1|2n)$ on polynomials tensor $\mathbb{C}^{0|2n}$	Dwight Anderson Williams II et.al.
2024-08-19	Unimodal sequences and mixed false theta functions	Kevin Allen et.al.	2408.09789	null
2024-08-16	Onsager-Machlup functional for stochastic lattice dynamical systems driven by time-varying noise	Xinze Zhang et.al.	2408.08465	null
2024-08-10	Variational Inference Failures Under Model Symmetries: Permutation Invariant Posteriors for Bayesian Neural Networks	Yoav Gelberg et.al.	2408.05496	null
2024-08-09	Quasilinear parabolic equations with superlinear nonlinearities in critical spaces	Bogdan-Vasile Matioc et.al.	2408.05067	null
2024-08-08	A framework for generalizing toric inequalities for holographic entanglement entropy	Ning Bao et.al.	2408.04741	null
2024-08-07	Counterfactuals and Uncertainty-Based Explainable Paradigm for the Automated Detection and Segmentation of Renal Cysts in Computed Tomography Images: A Multi-Center Study	Zohaib Salahuddin et.al.	2408.03789	null
2024-08-05	BOTS-LM: Training Large Language Models for Setswana	Nathan Brown et.al.	2408.02239	null
2024-08-02	Conditional LoRA Parameter Generation	Xiaolong Jin et.al.	2408.01415	null
2024-08-01	Reclaiming Residual Knowledge: A Novel Paradigm to Low-Bit Quantization	Róisín Luo et.al.	2408.00923	null
2024-07-31	Semantic Codebook Learning for Dynamic Recommendation Models	Zheqi Lv et.al.	2408.00123	null
2024-07-29	Tensor product weight modules over the affine-Virasoro algebra	Qiu-Fan Chen et.al.	2407.19844	null
2024-07-24	Generalized Hilbert operators acting on weighted spaces of holomorphic functions with sup-norms	María J. Beltrán-Meneu et.al.	2407.17646	null
2024-07-24	Generalized Ordinal Priority Approach for Multi-Attribute Decision-Making under Incomplete Preference Information	Renlong Wang et.al.	2407.17099	null
2024-07-22	WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation	Zirui Shao et.al.	2407.15502	link
2024-07-18	FSP-Laplace: Function-Space Priors for the Laplace Approximation in Bayesian Deep Learning	Tristan Cinquin et.al.	2407.13711	null
2024-07-19	Parameter Generation of Quantum Approximate Optimization Algorithm with Diffusion Model	Fanxu Meng et.al.	2407.12242	null
2024-07-24	Effect Heterogeneity with Earth Observation in Randomized Controlled Trials: Exploring the Role of Data, Model, and Evaluation Metric Choice	Connor T. Jerzak et.al.	2407.11674	link
2024-07-15	Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion	Yongyuan Liang et.al.	2407.10973	null
2024-07-16	The well-posedness of generalized nonlinear wave equations on the lattice graph	Bobo Hua et.al.	2407.09815	null
2024-07-15	Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Cross-Regularization	Jinlong Li et.al.	2407.08374	null
2024-07-09	Fine-Tuning Linear Layers Only Is a Simple yet Effective Way for Task Arithmetic	Ruochen Jin et.al.	2407.07089	link
2024-07-04	Recovering Initial States in Semilinear Parabolic Problems from Time-Averages	Lina Sophie Schmitz et.al.	2407.03829	null
2024-07-01	A quantum deformation of the ${\mathcal N}=2$ superconformal algebra	H. Awata et.al.	2407.00901	null
2024-06-24	WARP: On the Benefits of Weight Averaged Rewarded Policies	Alexandre Ramé et.al.	2406.16768	null
2024-06-24	Improving robustness to corruptions with multiplicative weight perturbations	Trung Trinh et.al.	2406.16540	link
2024-06-21	Determination of certain mod $p$ Galois representations using local constancy	Abhik Ganguli et.al.	2406.15600	null
2024-06-21	Elliptic analysis on collapsing gravitational instantons modelled using the Gibbons-Hawking ansatz	Willem Adriaan Salm et.al.	2406.15008	null
2024-06-20	MEAT: Median-Ensemble Adversarial Training for Improving Robustness and Generalization	Zhaozhe Hu et.al.	2406.14259	link
2024-06-18	From Instance Training to Instruction Learning: Task Adapters Generation from Instructions	Huanxuan Liao et.al.	2406.12382	link
2024-06-17	Kaniadakis entropy in extreme gravitational and cosmological environments: a review on the state-of-the-art and future prospects	Giuseppe Gaetano Luciano et.al.	2406.11373	null
2024-06-16	Analysis and approximation of elliptic problems with Uhlenbeck structure in convex polytopes	Tadele Mengesha et.al.	2406.10762	null
2024-06-14	Towards Scalable and Versatile Weight Space Learning	Konstantin Schürholt et.al.	2406.09997	link
2024-06-13	Interpreting the Weight Space of Customized Diffusion Models	Amil Dravid et.al.	2406.09413	link
2024-06-12	Diffusion Soup: Model Merging for Text-to-Image Diffusion Models	Benjamin Biggs et.al.	2406.08431	null
2024-06-24	Cartan monopoles	Andrei Smilga et.al.	2406.06042	null
2024-06-08	Regularized Training with Generated Datasets for Name-Only Transfer of Vision-Language Models	Minho Park et.al.	2406.05432	link
2024-06-06	Regularized KL-Divergence for Well-Defined Function-Space Variational Inference in Bayesian neural networks	Tristan Cinquin et.al.	2406.04317	null
2024-06-06	A characterization of $(μ,ν)$-dichotomies via admissibility	Lucas Backes et.al.	2406.04126	null
2024-06-05	Reproducing Kernel Thesis of Hankel Operators on Weighted Hardy Spaces	Ana Čolović et.al.	2406.03106	null
2024-05-21	Backpropogation-Free Multi-modal On-Device Model Adaptation via Cloud-Device Collaboration	Wei Ji et.al.	2406.01601	null
2024-05-29	Thermodynamics of the most generalized form of Holographic Dark Energy and some particular cases with Corrected Entropies	Sanghati Saha et.al.	2405.20783	null
2024-06-20	The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof	Derek Lim et.al.	2405.20231	link
2024-05-28	Universal and Extensible Language-Vision Models for Organ Segmentation and Tumor Detection from Abdominal Computed Tomography	Jie Liu et.al.	2405.18356	link
2024-05-28	$C^2M^3$: Cycle-Consistent Multi-Model Merging	Donato Crisostomi et.al.	2405.17897	link
2024-05-27	Smoothing effects and extinction in finite time for fractional fast diffusions on Riemannian manifolds	Elvise Berchio et.al.	2405.17126	null
2024-05-31	FedSheafHN: Personalized Federated Learning on Graph-structured Data	Wenfei Liang et.al.	2405.16056	null
2024-05-27	HyperInterval: Hypernetwork approach to training weight interval regions in continual learning	Patryk Krukowski et.al.	2405.15444	link
2024-05-23	Scalable Optimization in the Modular Norm	Tim Large et.al.	2405.14813	link
2024-06-16	A refined Weyl character formula for comodules on $\operatorname{GL}_{2,A}$	Helge Øystein Maakestad et.al.	2405.09210	null
2024-05-13	Localizing Task Information for Improved Model Merging and Compression	Ke Wang et.al.	2405.07813	link
2024-05-13	$α$VIL: Learning to Leverage Auxiliary Tasks for Multitask Learning	Rafael Kourdis et.al.	2405.07769	null
2024-05-12	Approximation by a new sequence of operators involving Laguerre polynomials	Kapil Kumar et.al.	2405.07228	null
2024-05-06	Swarm intelligence for full Stokes dynamic imaging reconstruction of interferometric data	Alejandro Mus et.al.	2405.03330	null
2024-05-04	Large Deviation Principles of Invariant Measures of Stochastic Reaction-Diffusion Lattice Systems	Bixiang Wang et.al.	2405.02720	null
2024-05-03	The Immersed Inextensible Interface Problem in 2D Stokes Flow	Eduardo García-Juárez et.al.	2405.02446	null
2024-05-02	Customizing Text-to-Image Models with a Single Image Pair	Maxwell Jones et.al.	2405.01536	null
2024-04-25	Robust Fine-tuning for Pre-trained 3D Point Cloud Models	Zhibo Zhang et.al.	2404.16422	null
2024-04-23	The Geometry of the Set of Equivalent Linear Neural Networks	Jonathan Richard Shewchuk et.al.	2404.14855	null
2024-04-24	Nonexistence of solutions to parabolic problems with a potential on weighted graphs	Dario D. Monticelli et.al.	2404.12058	null
2024-04-17	On the relaxation to equilibrium of a quantum oscillator interacting with a radiation field	Pierre-A. Vuillermot et.al.	2404.11329	null
2024-04-15	Higher-curvature gravity in AdS$_3$, holographic $c$-theorems and black hole microstates	Mariano Chernicoff et.al.	2404.10128	null
2024-04-16	Asymptotic-preserving approximations for stochastic incompressible viscous fluids and SPDEs on graph	Jianbo Cui et.al.	2404.09168	null
2024-04-09	Perspective on Physical Interpretations of Rényi Entropy in Statistical Mechanics	Misaki Ozawa et.al.	2404.06436	null
2024-04-09	A gluing construction of singular solutions for a fully non-linear equation in conformal geometry	María Fernanda Espinal et.al.	2404.05965	null
2024-04-05	Dissipative Euler flows originating from circular vortex filaments	Francisco Gancedo et.al.	2404.04250	null
2024-04-05	Macdonald characters from a new formula for Macdonald polynomials	Houcine Ben Dali et.al.	2404.03904	null
2024-04-04	Fundamental inequalities for the iterated Fourier-cosine convolution with Gaussian weight and its application	Nguyen Thi Hong Phuong et.al.	2404.03609	null
2024-03-29	Embracing Unknown Step by Step: Towards Reliable Sparse Training in Real World	Bowen Lei et.al.	2403.20047	link
2024-03-28	Model Stock: All we need is just a few fine-tuned models	Dong-Hwan Jang et.al.	2403.19522	link
2024-03-26	A location Invariant Statistic-Based Consistent Estimation Method for Three-Parameter Generalized Exponential Distribution	Kiran Prajapat et.al.	2403.17609	null
2024-06-03	FissionFusion: Fast Geometric Generation and Hierarchical Souping for Medical Image Analysis	Santosh Sanjeev et.al.	2403.13341	link
2024-06-18	Learning Useful Representations of Recurrent Neural Network Weight Matrices	Vincent Herrmann et.al.	2403.11998	link
2024-03-16	Function-space Parameterization of Neural Networks for Sequential Learning	Aidan Scannell et.al.	2403.10929	link
2024-03-14	Imprints of Barrow-Tsallis Cosmology in Primordial Gravitational Waves	Petr Jizba et.al.	2403.09797	null
2024-03-14	Eigenvariety for partially classical Hilbert modular forms	Mladen Dimitrov et.al.	2403.09784	null
2024-03-12	The solenoidal Heisenberg Virasoro algebra and its simple weight modules	Boujemaa Agrebaoui et.al.	2403.07381	null
2024-03-10	FrameQuant: Flexible Low-Bit Quantization for Transformers	Harshavardhan Adepu et.al.	2403.06082	link
2024-03-06	The solenoidal Virasoro algebra and its simple weight modules	Boujemaa Agrebaoui et.al.	2403.03753	null
2024-03-05	Tensor Decomposition-based Time Varying Channel Estimation for mmWave MIMO-OFDM Systems	Ruizhe Wang et.al.	2403.02942	null
2024-03-05	Neural Redshift: Random Networks are not Random Functions	Damien Teney et.al.	2403.02241	null
2024-03-04	Tiny fluctuations of the averaging process around its degenerate steady state	Federico Sau et.al.	2403.02032	null
2024-03-15	Training-Free Pretrained Model Merging	Zhengqi Xu et.al.	2403.01753	link
2024-04-22	HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances	Supreeth Narasimhaswamy et.al.	2403.01693	null
2024-03-13	TOOLVERIFIER: Generalization to New Tools via Self-Verification	Dheeraj Mekala et.al.	2402.14158	link
2024-02-21	Computing Tangent Spaces to Eigenvarieties	James Rawson et.al.	2402.13799	null
2024-05-28	Neural Network Parameter Diffusion	Kai Wang et.al.	2402.13144	link
2024-02-19	Exponential attractors for a nonlocal delayed reaction-diffusion equation on an unbounded domain	Wenjie Hu et.al.	2402.11856	null
2024-02-18	Discrete Neural Algorithmic Reasoning	Gleb Rodionov et.al.	2402.11628	link
2024-02-17	Uncertainty Quantification of Graph Convolution Neural Network Models of Evolving Processes	Jeremiah Hauth et.al.	2402.11179	null
2024-06-06	Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning	Tuc Nguyen et.al.	2402.10639	null
2024-02-14	TAI-GAN: A Temporally and Anatomically Informed Generative Adversarial Network for early-to-late frame conversion in dynamic cardiac PET inter-frame motion correction	Xueqi Guo et.al.	2402.09567	null
2024-02-14	The cohomology of $p$-adic Deligne-Luszitg schemes of Coxeter type	Alexander B. Ivanov et.al.	2402.09017	null
2024-02-09	The Asymptotic Structure of Cosmological Integrals	Paolo Benincasa et.al.	2402.06558	null
2024-02-07	Universal Neural Functionals	Allan Zhou et.al.	2402.05232	link
2024-02-06	Maximal regularity and optimal control for a non-local Cahn-Hilliard tumour growth model	Matteo Fornoni et.al.	2402.04204	null
2024-02-06	Improved Generalization of Weight Space Networks via Augmentations	Aviv Shamsian et.al.	2402.04081	link
2024-02-02	Training-time Neuron Alignment through Permutation Subspace for Improving Linear Mode Connectivity and Model Fusion	Zexi Li et.al.	2402.01342	null
2024-02-01	Understanding Neural Network Systems for Image Analysis using Vector Spaces and Inverse Maps	Rebecca Pattichis et.al.	2402.00261	link
2024-01-26	Do deep neural networks utilize the weight space efficiently?	Onur Can Koyun et.al.	2401.16438	null
2024-01-22	On strong growth conditions for weighted spaces of entire functions	Gerhard Schindl et.al.	2401.14330	null
2024-01-24	Task structure and nonlinearity jointly determine learned representational geometry	Matteo Alleman et.al.	2401.13558	null
2024-01-25	Sparse Domination of Singular Bilinear Forms on Non-Homogeneous spaces	Paco Villarroya et.al.	2401.13130	null
2024-01-22	WARM: On the Benefits of Weight Averaged Reward Models	Alexandre Ramé et.al.	2401.12187	null
2024-01-17	Cesàro operators associated with Borel measures acting on weighted spaces of holomorphic functions with sup-norm	Maria José Beltrán Meneu et.al.	2401.09406	null
2024-01-15	Singular fractal dimension at periodicity cascades in parameters spaces	Carlos E. P. Abreu et.al.	2401.07648	null
2024-01-17	Computing Fringe Presentations of Multigraded Persistence Modules	Fabian Lenzen et.al.	2401.06008	null
2024-01-10	Grimoire is All You Need for Enhancing Large Language Models	Ding Chen et.al.	2401.03385	link
2024-03-26	Artificial Intelligence for Operations Research: Revolutionizing the Operations Research Process	Zhenan Fan et.al.	2401.03244	null
2023-12-31	A Compact Representation for Bayesian Neural Networks By Removing Permutation Symmetry	Tim Z. Xiao et.al.	2401.00611	link
2023-12-28	Fractional non-homogeneous counting process	Nick Laskin et.al.	2312.17389	null
2023-12-28	Some unimodal sequences of Kronecker coefficients	Alimzhan Amanov et.al.	2312.17054	null
2023-12-24	The Vlasov-Maxwell-Boltzmann/Landau system with polynomial perturbation near Maxwellian	Chuqi Cao et.al.	2312.15510	null
2023-12-22	Emage: Non-Autoregressive Text-to-Image Generation	Zhangyin Feng et.al.	2312.14988	null
2023-12-21	Hypercyclic shifts on lattice graphs	Anton Baranov et.al.	2312.13934	null
2023-12-21	Scattering for 2d semi-relativistic Hartree equations with short range potential	Changhun Yang et.al.	2312.13606	null
2023-12-21	Entropic Inflation in Presence of Scalar Field	Sergei D. Odintsov et.al.	2312.13587	null
2023-12-30	Time is Encoded in the Weights of Finetuned Language Models	Kai Nylund et.al.	2312.13401	link
2023-12-14	Efficient momentum space approach to superconductivity in quasiperiodic systems	Mao Yoshii et.al.	2312.09124	null
2023-12-13	Best one-sided algebraic approximation by average modulus	Raheam A. Al-Saphory et.al.	2312.08407	null
2023-12-19	Well-Posedness of Quasilinear Parabolic Equations in Time-Weighted Spaces	Bogdan Matioc et.al.	2312.07974	null
2023-12-12	Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models	Arnav Chavan et.al.	2312.07046	link
2023-12-11	Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks	MohammadReza Davari et.al.	2312.06795	null
2023-12-08	Stoichiometry preservation and generalization of Bilger mixture fraction for non-premixed combustion with differential molecular diffusion	Haifeng Wang et.al.	2312.05204	null
2023-12-01	New polyconvolution product for Fourier-cosine and Laplace integral operators and their applications	Trinh Tuan et.al.	2312.00764	null
2023-11-30	Modelling Einstein cluster using Einasto profile	Ritwik Acharyya et.al.	2311.18622	null
2023-11-27	Extraction of the microscopic properties of quasi-particles using deep neural networks	Olga Soloveva et.al.	2311.15984	null
2024-01-24	Deep Latent Force Models: ODE-based Process Convolutions for Bayesian Deep Learning	Thomas Baldwin-McDonald et.al.	2311.14828	null

(back to top)

Data Distillation

Publish Date	Title	Authors	PDF	Code
2024-10-25	FLiP: Privacy-Preserving Federated Learning based on the Principle of Least Privileg	ShiMao Xu et.al.	2410.19548	null
2024-10-25	SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models	Jahyun Koo et.al.	2410.19503	null
2024-10-24	AlignCap: Aligning Speech Emotion Captioning to Human Preferences	Ziqi Liang et.al.	2410.19134	null
2024-10-24	High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws	M. Emrullah Ildiz et.al.	2410.18837	null
2024-10-24	Knowledge Distillation Using Frontier Open-source LLMs: Generalizability and the Role of Synthetic Data	Anup Shirgaonkar et.al.	2410.18588	null
2024-10-24	SIKeD: Self-guided Iterative Knowledge Distillation for mathematical reasoning	Shivam Adarsh et.al.	2410.18574	link
2024-10-23	ELAICHI: Enhancing Low-resource TTS by Addressing Infrequent and Low-frequency Character Bigrams	Srija Anand et.al.	2410.17901	null
2024-10-23	Towards Active Participant-Centric Vertical Federated Learning: Some Representations May Be All You Need	Jon Irureta et.al.	2410.17648	null
2024-10-23	Towards Effective Data-Free Knowledge Distillation via Diverse Diffusion Augmentation	Muquan Li et.al.	2410.17606	link
2024-10-23	Physics-driven AI for Channel Estimation in Cellular Network	Xiaoqian Qi et.al.	2410.17525	null
2024-10-22	MiniPLM: Knowledge Distillation for Pre-Training Language Models	Yuxian Gu et.al.	2410.17215	link
2024-10-22	Emphasizing Discriminative Features for Dataset Distillation in Complex Scenarios	Kai Wang et.al.	2410.17193	link
2024-10-22	CK4Gen: A Knowledge Distillation Framework for Generating High-Utility Synthetic Survival Datasets in Healthcare	Nicholas I-Hsien Kuo et.al.	2410.16872	null
2024-10-22	AttriPrompter: Auto-Prompting with Attribute Semantics for Zero-shot Nuclei Detection via Visual-Language Pre-trained Models	Yongjian Wu et.al.	2410.16820	link
2024-10-22	SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation	Jing-Jing Li et.al.	2410.16665	null
2024-10-21	Pre-training Distillation for Large Language Models: A Design Space Exploration	Hao Peng et.al.	2410.16215	null
2024-10-18	Interpreting Microbiome Relative Abundance Data Using Symbolic Regression	Swagatam Haldar et.al.	2410.16109	link
2024-10-21	Are Large-scale Soft Labels Necessary for Large-scale Dataset Distillation?	Lingao Xiao et.al.	2410.15919	link
2024-10-21	Model Mimic Attack: Knowledge Distillation for Provably Transferable Adversarial Examples	Kirill Lukyanov et.al.	2410.15889	null
2024-10-20	Hybrid Memory Replay: Blending Real and Distilled Data for Class Incremental Learning	Jiangtao Kong et.al.	2410.15372	null
2024-10-20	GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning	Haiwen Diao et.al.	2410.15266	link
2024-10-19	LLaVA-Ultra: Large Chinese Language and Vision Assistant for Ultrasound	Xuechen Guo et.al.	2410.15074	null
2024-10-19	Improving Pronunciation and Accent Conversion through Knowledge Distillation And Synthetic Ground-Truth from Native TTS	Tuan Nam Nguyen et.al.	2410.14997	null
2024-10-17	CAKD: A Correlation-Aware Knowledge Distillation Framework Based on Decoupling Kullback-Leibler Divergence	Zao Zhang et.al.	2410.14741	null
2024-10-18	Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation	Shuai Zhao et.al.	2410.14425	link
2024-10-18	Preview-based Category Contrastive Learning for Knowledge Distillation	Muhe Ding et.al.	2410.14143	null
2024-10-17	Leveraging Fine-Tuned Language Models for Efficient and Accurate Smart Contract Auditing	Zhiyuan Wei et.al.	2410.13918	link
2024-10-17	GDeR: Safeguarding Efficiency, Balancing, and Robustness via Prototypical Graph Pruning	Guibin Zhang et.al.	2410.13761	link
2024-10-17	An Active Learning Framework for Inclusive Generation by Large Language Models	Sabit Hassan et.al.	2410.13641	null
2024-10-18	Towards Satellite Non-IID Imagery: A Spectral Clustering-Assisted Federated Learning Approach	Luyao Zou et.al.	2410.13602	null
2024-10-17	Enhancing Dataset Distillation via Label Inconsistency Elimination and Learning Pattern Refinement	Chuhao Zhou et.al.	2410.13311	link
2024-10-18	Cyber Attacks Prevention Towards Prosumer-based EV Charging Stations: An Edge-assisted Federated Prototype Knowledge Distillation Approach	Luyao Zou et.al.	2410.13260	null
2024-10-16	TAS: Distilling Arbitrary Teacher and Student via a Hybrid Assistant	Guopeng Li et.al.	2410.12342	null
2024-10-16	Optimizing YOLOv5s Object Detection through Knowledge Distillation algorithm	Guanming Huang et.al.	2410.12259	null
2024-10-16	TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration	Yiwei Guo et.al.	2410.12183	link
2024-10-17	SAM-Guided Masked Token Prediction for 3D Scene Understanding	Zhimin Chen et.al.	2410.12158	null
2024-10-15	MoE-Pruner: Pruning Mixture-of-Experts Large Language Model using the Hints from Its Router	Yanyue Xie et.al.	2410.12013	null
2024-10-15	Breaking Modality Gap in RGBT Tracking: Coupled Knowledge Distillation	Andong Lu et.al.	2410.11586	link
2024-10-15	Learning from Imperfect Data: Towards Efficient Knowledge Distillation of Autoregressive Language Models for Text-to-SQL	Qihuang Zhong et.al.	2410.11371	null
2024-10-15	Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling	Wenda Xu et.al.	2410.11325	null
2024-10-14	BrainMVP: Multi-modal Vision Pre-training for Brain Image Analysis using Multi-parametric MRI	Shaohao Rui et.al.	2410.10604	null
2024-10-14	ROSAR: An Adversarial Re-Training Framework for Robust Side-Scan Sonar Object Detection	Martin Aubard et.al.	2410.10554	link
2024-10-14	Temperature-Centric Investigation of Speculative Decoding with Knowledge Distillation	Siru Ouyang et.al.	2410.10141	null
2024-10-14	REHRSeg: Unleashing the Power of Self-Supervised Super-Resolution for Resource-Efficient 3D MRI Segmentation	Zhiyun Song et.al.	2410.10097	null
2024-10-15	Self-Data Distillation for Recovering Quality in Pruned Large Language Models	Vithursan Thangarasa et.al.	2410.09982	null
2024-10-13	Generalized Group Data Attribution	Dan Ley et.al.	2410.09940	null
2024-10-12	Distilling Invariant Representations with Dual Augmentation	Nikolaos Giakoumoglou et.al.	2410.09474	null
2024-10-12	Declarative Knowledge Distillation from Large Language Models for Visual Question Answering Datasets	Thomas Eiter et.al.	2410.09428	link
2024-10-15	Transforming In-Vehicle Network Intrusion Detection: VAE-based Knowledge Distillation Meets Explainable AI	Muhammet Anil Yagiz et.al.	2410.09043	null
2024-10-11	Mentor-KD: Making Small Language Models Better Multi-step Reasoners	Hojae Lee et.al.	2410.09037	link
2024-10-11	Contrastive Knowledge Distillation for Robust Multimodal Sentiment Analysis	Zhongyi Sang et.al.	2410.08692	null
2024-10-11	DistDD: Distributed Data Distillation Aggregation through Gradient Matching	Peiran Wang et.al.	2410.08665	null
2024-10-11	GAI-Enabled Explainable Personalized Federated Semi-Supervised Learning	Yubo Peng et.al.	2410.08634	null
2024-10-11	Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both	Abhijnan Nath et.al.	2410.08458	null
2024-10-10	What is Left After Distillation? How Knowledge Transfer Impacts Fairness and Bias	Aida Mohammadshahi et.al.	2410.08407	null
2024-10-10	A Lightweight Target-Driven Network of Stereo Matching for Inland Waterways	Jing Su et.al.	2410.07915	null
2024-10-10	SNN-PAR: Energy Efficient Pedestrian Attribute Recognition via Spiking Neural Networks	Haiyang Wang et.al.	2410.07857	link
2024-10-12	Relational Diffusion Distillation for Efficient Image Generation	Weilun Feng et.al.	2410.07679	link
2024-10-10	Teddy: Efficient Large-Scale Dataset Distillation via Taylor-Approximated Matching	Ruonan Yu et.al.	2410.07579
