add inference pipeline
xuyangcao committed Nov 14, 2024
1 parent b76f62d commit 51d41c9
Showing 2 changed files with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions README.md
@@ -8,6 +8,8 @@ We propose JoyVASA, a diffusion-based method for generating facial dynamics and

![Inference Pipeline](assets/imgs/pipeline_inference.png)

**Inference Pipeline of the proposed JoyVASA.** Given a reference image, we first extract its 3D facial appearance feature with the appearance encoder and its learned motion information with the motion encoder. For the input speech, audio features are first extracted with the wav2vec2 encoder. Audio-driven motion sequences are then sampled in a sliding-window fashion from the diffusion model trained in the second stage. The target keypoints are computed from the canonical source keypoints and the sampled target motion sequences. Finally, the 3D facial appearance feature is warped according to the source and target keypoints and rendered by a generator to produce the output video.
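The sliding-window sampling step can be illustrated with a minimal sketch. This is not the JoyVASA implementation: `sliding_windows`, `sample_motion`, and the `sampler` callable are hypothetical names, the window and stride values are chosen by the caller, and averaging overlapping frames is only one possible way to stitch windows together (the actual model may instead condition each window on previously generated frames).

```python
from typing import Callable, List, Optional, Sequence, Tuple

Frame = List[float]  # one feature or motion vector per frame


def sliding_windows(num_frames: int, window: int, stride: int) -> List[Tuple[int, int]]:
    """Return (start, end) pairs that together cover frames [0, num_frames)."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if num_frames <= 0:
        return []
    starts = list(range(0, max(num_frames - window, 0) + 1, stride))
    # Add a final window aligned to the end if the regular stride left a gap.
    if starts[-1] + window < num_frames:
        starts.append(num_frames - window)
    return [(s, min(s + window, num_frames)) for s in starts]


def sample_motion(
    audio_feats: Sequence[Frame],
    sampler: Callable[[Sequence[Frame]], List[Frame]],
    window: int,
    stride: int,
) -> List[Frame]:
    """Run a hypothetical per-window sampler and average overlapping frames.

    `sampler` stands in for the diffusion model: it receives the audio
    features of one window and returns one motion vector per frame.
    """
    num_frames = len(audio_feats)
    sums: Optional[List[Frame]] = None
    counts = [0] * num_frames
    for start, end in sliding_windows(num_frames, window, stride):
        chunk = sampler(audio_feats[start:end])
        if sums is None:
            # Allocate the accumulator once the motion dimension is known.
            sums = [[0.0] * len(chunk[0]) for _ in range(num_frames)]
        for offset, frame in enumerate(chunk):
            for d, value in enumerate(frame):
                sums[start + offset][d] += value
            counts[start + offset] += 1
    if sums is None:
        return []
    return [[v / counts[t] for v in sums[t]] for t in range(num_frames)]
```

With a real model, `sampler` would wrap the diffusion sampling call for one window of wav2vec2 features; here any callable that maps a window of frames to the same number of motion vectors works.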

## ⚙️ Installation

System requirements:
Binary file added assets/imgs/pipeline_inference.png