[arxiv 2025.01] PERSE: Personalized 3D Generative Avatars from A Single Portrait [PDF,Page]
[arxiv 2024.10] MotionAura: Generating High-Quality and Motion Consistent Videos using Discrete Diffusion [PDF]
[arxiv 2024.10] ControlMM: Controllable Masked Motion Generation [PDF,Page]
[arxiv 2024.10] MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations [PDF,Page]
[arxiv 2024.10] Multi-modal Pose Diffuser: A Multimodal Generative Conditional Pose Prior [PDF]
[arxiv 2024.10] LEAD: Latent Realignment for Human Motion Diffusion [PDF]
[arxiv 2024.10] MotionCLR: Motion Generation and Training-free Editing via Understanding Attention Mechanisms [PDF,Page]
[arxiv 2024.11] KMM: Key Frame Mask Mamba for Extended Motion Generation [PDF,Page]
[arxiv 2024.11] Rethinking Diffusion for Text-Driven Human Motion Generation [PDF]
[arxiv 2024.11] SMGDiff: Soccer Motion Generation using diffusion probabilistic models [PDF]
[arxiv 2024.11] DRiVE: Diffusion-based Rigging Empowers Generation of Versatile and Expressive Characters [PDF,Page]
[arxiv 2024.11] UniPose: A Unified Multimodal Framework for Human Pose Comprehension, Generation and Editing [PDF]
[arxiv 2024.12] AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward [PDF,Page]
[arxiv 2024.12] AdaVLN: Towards Visual Language Navigation in Continuous Indoor Environments with Moving Humans [PDF,Page]
[arxiv 2024.12] InfiniDreamer: Arbitrarily Long Human Motion Generation via Segment Score Distillation [PDF]
[arxiv 2024.12] One Shot, One Talk: Whole-body Talking Avatar from a Single Image [PDF,Page]
[arxiv 2024.12] SoPo: Text-to-Motion Generation Using Semi-Online Preference Optimization [PDF,Page]
[arxiv 2024.12] Retrieving Semantics from the Deep: an RAG Solution for Gesture Synthesis [PDF,Page]
[arxiv 2024.12] CoMA: Compositional Human Motion Generation with Multi-modal Agents [PDF,Page]
[arxiv 2024.12] Move-in-2D: 2D-Conditioned Human Motion Generation [PDF,Page]
[arxiv 2024.12] Motion-2-to-3: Leveraging 2D Motion Data to Boost 3D Motion Generation [PDF,Page]
[arxiv 2024.12] ScaMo: Exploring the Scaling Law in Autoregressive Motion Generation Model [PDF,Page]
[arxiv 2024.10] Emphasizing Semantic Consistency of Salient Posture for Speech-Driven Gesture Generation [PDF]
[arxiv 2024.10] Learning Interaction-aware 3D Gaussian Splatting for One-shot Hand Avatars [PDF,Page]
[arxiv 2024.11] UniHands: Unifying Various Wild-Collected Keypoints for Personalized Hand Reconstruction [PDF,Page]
[arxiv 2024.12] FoundHand: Large-Scale Domain-Specific Learning for Controllable Hand Image Generation [PDF]
[arxiv 2024.12] HandOS: 3D Hand Reconstruction in One Stage [PDF,Page]
[arxiv 2024.12] GigaHands: A Massive Annotated Dataset of Bimanual Hand Activities [PDF,Page]
[arxiv 2024.12] Dyn-HaMR: Recovering 4D Interacting Hand Motion from a Dynamic Camera [PDF,Page]
[arxiv 2024.05] Scaling Up Dynamic Human-Scene Interaction Modeling [PDF,Page]
[arxiv 2024.06] Introducing HOT3D: An Egocentric Dataset for 3D Hand and Object Tracking [PDF,Page]
[arxiv 2024.10] Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes [PDF,Page]
[arxiv 2024.09] DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors [PDF,Page]
[arxiv 2024.10] GraspDiffusion: Synthesizing Realistic Whole-body Hand-Object Interaction [PDF,Page]
[arxiv 2024.11] AnchorCrafter: Animate CyberAnchors Selling Your Products via Human-Object Interacting Video Generation [PDF,Page]
[arxiv 2024.12] HOT3D: Hand and Object Tracking in 3D from Egocentric Multi-View Videos [PDF,Page]
[arxiv 2024.12] OOD-HOI: Text-Driven 3D Whole-Body Human-Object Interactions Generation Beyond Training Domains [PDF,Page]
[arxiv 2024.12] FIction: 4D Future Interaction Prediction from Video [PDF,Page]
[arxiv 2024.12] TriDi: Trilateral Diffusion of 3D Humans, Objects and Interactions [PDF,Page]
[arxiv 2024.12] ContextHOI: Spatial Context Learning for Human-Object Interaction Detection [PDF]
[arxiv 2025.01] DiffGrasp: Whole-Body Grasping Synthesis Guided by Object Motion Using a Diffusion Model [PDF,Page]
[arxiv 2024.12] Orchestrating the Symphony of Prompt Distribution Learning for Human-Object Interaction Detection [PDF]
[arxiv 2024.12] HandsOnVLM: Vision-Language Models for Hand-Object Interaction Prediction [PDF]
[arxiv 2025.01] Interacted Object Grounding in Spatio-Temporal Human-Object Interactions [PDF,Page]
[arxiv 2024.10] DepthSplat: Connecting Gaussian Splatting and Depth [PDF,Page]
[arxiv 2024.10] Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats [PDF,Page]
[arxiv 2024.12] SCENIC: Scene-aware Semantic Navigation with Instruction-guided Control [PDF,Page]
[arxiv 2024.12] ZeroHSI: Zero-Shot 4D Human-Scene Interaction [PDF,Page]
[arxiv 2024.11] FreeCap: Hybrid Calibration-Free Motion Capture in Open Environments [PDF]