Junjie Hu,
Tianyang Han,
Kai Ma,
Jialin Gao,
Song Yang,
Xianhua He,
Junfeng Luo,
Xiaoming Wei,
Wenqiang Zhang
- ✅ [2025.07.18] Our paper is now available on arXiv.
- ✅ [2026.01.12] We have released our PositionIC model for FLUX on HuggingFace!
- ⬜ Datasets and PositionIC-v2 model with enhanced generation capabilities coming soon.
```bash
# Create a new conda environment
conda create -n PositionIC python=3.10 -y
conda activate PositionIC

# Install PyTorch (adjust according to your CUDA version)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

# Install project dependencies
pip install -r requirements.txt
```

The evaluation JSON file should be a list of dictionaries. Each dictionary represents one test sample with the following structure:
```json
[
  {
    "ref_img": ["path/to/subject1.png", "path/to/subject2.png"],
    "prompt": "a boy sitting on the chair in a garden.",
    "img_bbox": [[0.25, 0.35, 0.60, 0.75], [0.35, 0.30, 0.58, 0.67]]
  }
]
```

| Field | Type | Description |
|---|---|---|
| `ref_img` | `List[str]` | Paths to reference images (subjects to be customized). The sequence corresponds to far-to-near placement: when bounding boxes overlap, later subjects occlude earlier ones, so Subject2 will overlap and obscure Subject1 wherever their positions coincide. |
| `prompt` | `str` | Text prompt describing the desired output scene. |
| `img_bbox` | `List[List[float]]` | Target bounding box for each subject in the output image, as `[x_min, y_min, x_max, y_max]` in normalized coordinates (0.0–1.0). |
Note: The bounding box coordinates are normalized to [0, 1], where (0, 0) is the top-left corner and (1, 1) is the bottom-right corner of the image.
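To illustrate the coordinate convention, here is a minimal sketch (not part of the released code; `normalize_bbox` is a hypothetical helper) that converts a pixel-space box into the normalized format `img_bbox` expects:

```python
# Hypothetical helper: convert a pixel-space bounding box to the normalized
# [x_min, y_min, x_max, y_max] format, with (0, 0) at the top-left corner
# and (1, 1) at the bottom-right corner of the image.
def normalize_bbox(bbox_px, img_width, img_height):
    x_min, y_min, x_max, y_max = bbox_px
    return [
        x_min / img_width,
        y_min / img_height,
        x_max / img_width,
        y_max / img_height,
    ]

# A box spanning pixels (256, 384) to (640, 768) in a 1024x1024 image:
print(normalize_bbox((256, 384, 640, 768), 1024, 1024))
# → [0.25, 0.375, 0.625, 0.75]
```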
We provide a toy JSON file for testing at `assert/test.json`.
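An evaluation file in the same format can also be generated programmatically. The sketch below writes a one-sample JSON file; the image paths and the output filename `my_eval.json` are placeholders, not files shipped with this repo:

```python
import json

# Build one evaluation sample in the schema described above.
# Paths are hypothetical placeholders; replace them with your own images.
samples = [
    {
        "ref_img": ["subjects/dog.png", "subjects/hat.png"],
        "prompt": "a dog wearing a hat on a beach.",
        # One [x_min, y_min, x_max, y_max] box per reference image,
        # normalized to [0, 1]; later boxes occlude earlier ones on overlap.
        "img_bbox": [[0.20, 0.40, 0.70, 0.95], [0.30, 0.25, 0.60, 0.50]],
    }
]

with open("my_eval.json", "w") as f:
    json.dump(samples, f, indent=2)
```

The resulting file can be passed directly to `--eval_json_path`.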
```bash
cd PositionIC
```

Run inference using `inference.sh` or execute directly:

```bash
python inference.py \
    --eval_json_path ./assert/test.json \
    --dit_lora_path "ScottHan/PositionIC" \
    --saved_dir "./samples" \
    --width 1024 \
    --height 1024 \
    --ref_size 512 \
    --seed 3074 \
    --rope_type "uno" \
    --a 5
```

Our code is built upon UNO. We sincerely thank the authors for their excellent work and open-source contribution.
If you find our work helpful for your research, please consider giving us a star ⭐ and citing our paper:
```bibtex
@article{hu2025positionic,
  title={Positionic: Unified position and identity consistency for image customization},
  author={Hu, Junjie and Han, Tianyang and Ma, Kai and Gao, Jialin and Yang, Song and He, Xianhua and Luo, Junfeng and Wei, Xiaoming and Zhang, Wenqiang},
  journal={arXiv preprint arXiv:2507.13861},
  year={2025}
}
```

This project is licensed under the Apache-2.0 License.