
Guiding a Diffusion Transformer with the Internal Dynamics of Itself (IG)
Official PyTorch Implementation

Guiding a Diffusion Transformer with the Internal Dynamics of Itself
Xingyu Zhou¹, Qifan Li¹, Xiaobin Hu², Hai Chen³,⁴, Shuhang Gu¹*
¹University of Electronic Science and Technology of China  ²National University of Singapore
³Sun Yat-sen University  ⁴North China Institute of Computer Systems Engineering
*Corresponding Author

LightningDiT+IG samples

💥 News

  • [2025.12.31] We have released the paper and code of IG.

🌟 Highlight

  • 🔥New SOTA on 256 × 256 ImageNet generation: LightningDiT-XL/1 + IG sets a new state of the art with FID = 1.07 (random sampling FID = 1.19) on ImageNet, while achieving FID = 1.24 (random sampling FID = 1.34) without classifier-free guidance.

  • Simple enough, powerful enough: We present Internal Guidance (IG), a simple yet powerful guidance mechanism for Diffusion Transformers. All it requires is an additional intermediate supervision during training.

  • Intermediate supervision: A simple intermediate supervision alone achieves an effect similar to specially designed self-supervised learning regularizations.

  • Improved Performance: IG accelerates training and improves generation performance for DiTs, SiTs and LightningDiT.

📝 Results

  • State-of-the-art performance on ImageNet 256×256 with FID = 1.19 (random sampling).
  • State-of-the-art performance on ImageNet 256×256 with FID = 1.07 (uniform balanced sampling).

🏡 Environment Setup

conda create -n IG python=3.12 -y
conda activate IG
pip install -r requirements.txt
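
If you want to confirm that the environment ends up with a CUDA-enabled PyTorch build before launching any jobs, a quick check (just a convenience, not part of the repo) is:

import torch
print(torch.__version__)          # version installed from requirements.txt
print(torch.cuda.is_available())  # should print True on a GPU machine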

📜 Dataset Preparation

Currently, we provide experiments for ImageNet. Place the dataset wherever you like and point the training scripts to it via the --data-dir argument.
Note that we preprocess the data for faster training. Please refer to the preprocessing guide for SiTs and the README.md for LightningDiTs for detailed guidance.

🔥 Training

Here we provide the training code for SiTs and LightningDiTs.

Training with SiT + IG
cd SiT
accelerate launch --config_file configs/default.yaml train.py \
  --mixed-precision="fp16" \
  --seed=0 \
  --path-type="linear" \
  --prediction="v" \
  --resolution=256 \
  --batch-size=32 \
  --weighting="uniform" \
  --model="SiT-XL/2" \
  --encoder-depth=8 \
  --output-dir="exps" \
  --exp-name="sitxl-ab820-t0.2-res256" \
  --data-dir=[YOUR_DATA_PATH]

This script will automatically create a folder under exps to save logs, samples, and checkpoints. You can adjust the following options:

  • --model: Choose from [SiT-B/2, SiT-L/2, SiT-XL/2]
  • --encoder-depth: The intermediate transformer block whose output receives the auxiliary supervision (see the sketch after this list)
  • --output-dir: Any directory that you want to save checkpoints, samples, and logs
  • --exp-name: Any string name (the folder will be created under output-dir)
  • --batch-size: The per-GPU batch size (by default we use 1 node with 8 GPUs). Adjust this value to your GPU count so that the global batch size is 256 (for example, 8 GPUs × 32 = 256, or 4 GPUs × 64 = 256).
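
For intuition only, below is a minimal, hypothetical sketch of how an auxiliary intermediate-supervision loss at the block selected by --encoder-depth could be wired into a diffusion training step. The interface names (return_hidden, aux_head, lambda_aux) are illustrative, not this repository's API; please refer to SiT/train.py and the paper for the exact formulation.

import torch.nn.functional as F

def training_step(model, x_t, t, y, target, encoder_depth=8, lambda_aux=1.0):
    # Hypothetical interface: run the transformer once and also return the
    # hidden states of every block (not the repo's actual API).
    pred, hidden_states = model(x_t, t, y, return_hidden=True)

    # Standard diffusion objective (e.g., v-prediction for SiT).
    loss_main = F.mse_loss(pred, target)

    # Auxiliary supervision on the intermediate block selected by --encoder-depth.
    # Here we simply reuse the main target through a hypothetical projection head;
    # the actual target and head follow the paper.
    h_mid = hidden_states[encoder_depth - 1]
    loss_aux = F.mse_loss(model.aux_head(h_mid), target)

    return loss_main + lambda_aux * loss_aux
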
Training with LightningDiT + IG
cd LightningDiT
bash run_train.sh configs/lightningdit_xl_vavae_f16d32.yaml

This script will automatically create a folder under output to save logs and checkpoints. Options can be adjusted in the config file, following the original LightningDiT.

🌠 Evaluation

Here we provide the sampling code (random sampling) for SiTs and LightningDiTs to generate samples for evaluation; the resulting .npz file can be used with the ADM evaluation suite.

You can download our pretrained models here:

Model                   | Image Resolution | Epochs | FID-50K | Inception Score
SiT-XL/2 + IG           | 256×256          | 800    | 1.46    | 265.7
LightningDiT-XL/1 + IG  | 256×256          | 680    | 1.19    | 269.0
Sampling with SiT + IG
cd SiT
bash gen.sh

Note that there are several options in the gen.sh file that you need to fill in:

  • SAMPLE_DIR: Base directory to save the generated images and .npz file
  • CKPT: Checkpoint path (this can also point to a local copy of the pretrained checkpoint provided above)
Sampling with LightningDiT + IG
cd LightningDiT
bash run_inference.sh configs/lightningdit_xl_vavae_f16d32.yaml
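
Before running the ADM evaluation suite on a generated .npz, it can help to sanity-check the sample batch. The snippet below assumes the usual ADM-suite layout (an arr_0 array of uint8 images in NHWC order); the file name is a placeholder for your own output.

import numpy as np

samples = np.load("samples.npz")["arr_0"]   # placeholder path to the generated batch
print(samples.shape, samples.dtype)          # expect (50000, 256, 256, 3) uint8
assert samples.dtype == np.uint8
assert samples.shape[1:] == (256, 256, 3)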

📣 Note

This code may not exactly reproduce the results reported in the paper, due to potential human error during the preparation and cleaning of the code for release, as well as differences in hardware. If you encounter any difficulties reproducing our findings, please don't hesitate to let us know.

🤝🏻 Acknowledgement

This code is mainly built upon the SRA, LightningDiT, and RAE repositories. Thanks for their solid work!

🌺 Citation

If you find IG useful, please kindly cite our paper:

@article{zhou2025guiding,
  title={Guiding a Diffusion Transformer with the Internal Dynamics of Itself},
  author={Zhou, Xingyu and Li, Qifan and Hu, Xiaobin and Chen, Hai and Gu, Shuhang},
  journal={arXiv preprint arXiv:2512.24176},
  year={2025}
}
