
🚀Spatiotemporal Skip Guidance for Enhanced Video Diffusion Sampling✨

📑Paper

🌐Project Page

🎥Video Examples

Below are example videos showcasing the enhanced video quality achieved through STG:

HunyuanVideo

hunyuan1.mp4
hunyuan2.mp4

Mochi

mochi1.mp4
mochi2.mp4

CogVideoX

cogvideox1.mp4
cogvideox2.mp4

LTX-Video

ltxvideo1.mp4

SVD (Stable Video Diffusion)

svd1.mp4

🗺️Start Guide

  1. 🍡Mochi

    • For installation and requirements, refer to the official repository.

    • Update demos/config.py with your desired settings (a hypothetical illustration of such settings follows this guide) and run:

      python ./demos/cli.py
  2. 🌌HunyuanVideo

    Using CFG (Default Model):

    torchrun --nproc_per_node=4 sample_video.py \
     --video-size 544 960 \
     --video-length 65 \
     --infer-steps 50 \
     --prompt "A time traveler steps out of a glowing portal into a Victorian-era street filled with horse-drawn carriages, realistic style." \
     --flow-reverse \
     --seed 42 \
     --ulysses-degree 4 \
     --ring-degree 1 \
     --save-path ./results

    To use STG, run the following command:

    torchrun --nproc_per_node=4 sample_video.py \
     --video-size 544 960 \
     --video-length 65 \
     --infer-steps 50 \
     --prompt "A time traveler steps out of a glowing portal into a Victorian-era street filled with horse-drawn carriages, realistic style." \
     --flow-reverse \
     --seed 42 \
     --ulysses-degree 4 \
     --ring-degree 1 \
     --save-path ./results \
     --stg-mode "STG-R" \
     --stg-block-idx 2 \
     --stg-scale 2.0

    Key Parameters:

    • stg_mode: Only STG-R is supported.
    • stg_scale: 2.0 is recommended.
    • stg_block_idx: Specify the block index for applying STG.
  3. 🏎️LTX-Video

    Using CFG (Default Model):

    python inference.py --ckpt_dir './weights' --prompt "A man ..."

    To use STG, run the following command:

    python inference.py --ckpt_dir './weights' --prompt "A man ..." --stg_mode stg-a --stg_scale 1.0 --stg_block_idx 19 --do_rescaling True

    Key Parameters:

    • stg_mode: Choose between stg-a or stg-r.
    • stg_scale: Recommended values are ≤2.0.
    • stg_block_idx: Specify the block index for applying STG.
    • do_rescaling: Set to True to enable rescaling.
  4. 🧪Diffusers

    The Diffusers implementation currently supports Mochi, HunyuanVideo, CogVideoX, and SVD.

    To run the test script, refer to the test.py file in each folder. Below is an example using Mochi:

    # test.py
    import torch
    from pipeline_stg_mochi import MochiSTGPipeline
    from diffusers.utils import export_to_video
    import os
    
    # Load the pipeline
    pipe = MochiSTGPipeline.from_pretrained("genmo/mochi-1-preview", variant="bf16", torch_dtype=torch.bfloat16)
    
    pipe.enable_vae_tiling()
    pipe = pipe.to("cuda")
    
    #--------Option--------#
    prompt = "A slow-motion capture of a beautiful woman in a flowing dress spinning in a field of sunflowers, with petals swirling around her, realistic style."
    stg_mode = "STG-R" 
    stg_applied_layers_idx = [35]
    stg_scale = 0.8 # 0.0 for CFG (default)
    do_rescaling = True # False (default)
    #----------------------#
    
    # Generate video frames
    frames = pipe(
        prompt, 
        num_frames=84,
        stg_mode=stg_mode,
        stg_applied_layers_idx=stg_applied_layers_idx,
        stg_scale=stg_scale,
        do_rescaling=do_rescaling
    ).frames[0]
    ...

    For details on memory efficiency, inference acceleration, and more, refer to each model's original page.
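
One way to finish the truncated test.py example above is to write the returned frames to disk with the export_to_video helper that the script already imports. This is a minimal sketch, not necessarily how the repository's test.py continues; the output filename and fps are assumptions:

    # Save the generated frames as an .mp4 file.
    # export_to_video is the diffusers.utils helper imported in test.py above;
    # the output filename and fps here are placeholder choices.
    output_path = "mochi_stg_sample.mp4"
    export_to_video(frames, output_path, fps=30)
    print(f"Saved video to {os.path.abspath(output_path)}")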
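
For step 1 (Mochi), demos/config.py is not reproduced here, so the following is only a hypothetical illustration of the kind of settings it might expose, reusing the STG parameter names that appear elsewhere in this README; check the actual file in the repository for the real keys:

    # Hypothetical demos/config.py entries -- the real keys in the repository may differ.
    prompt = "A slow-motion capture of a beautiful woman in a flowing dress spinning in a field of sunflowers, realistic style."
    stg_mode = "STG-R"              # STG variant, as in the Diffusers example above
    stg_scale = 0.8                 # 0.0 falls back to plain CFG
    stg_applied_layers_idx = [35]   # transformer block indices to perturb
    do_rescaling = True             # enable output rescaling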

🛠️Todos

  • Implement STG on diffusers
  • Update STG with Open-Sora, SVD

🙏Acknowledgements

This project is built upon the official implementations of the models showcased above.
