
MeiGen-AI/PositionIC


PositionIC: Unified Position and Identity Consistency for Image Customization


Junjie Hu, Tianyang Han, Kai Ma, Jialin Gao, Song Yang
Xianhua He, Junfeng Luo, Xiaoming Wei, Wenqiang Zhang


🔥 News

  • [2025.07.18] Our paper is now available on arXiv.
  • [2026.01.12] We have released our PositionIC model for FLUX on HuggingFace!
  • ⬜ Datasets and PositionIC-v2 model with enhanced generation capabilities coming soon.

⚡ Quick Start

🔧 Environment Setup

# Create a new conda environment
conda create -n PositionIC python=3.10 -y
conda activate PositionIC

# Install PyTorch (adjust according to your CUDA version)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

# Install project dependencies
pip install -r requirements.txt

📋 Data Format

The evaluation JSON file should be a list of dictionaries. Each dictionary represents one test sample with the following structure:

[
    {
        "ref_img": ["path/to/subject1.png", "path/to/subject2.png"],
        "prompt": "a boy sitting on the chair in a garden.",
        "img_bbox": [[0.25, 0.35, 0.60, 0.75], [0.35, 0.30, 0.58, 0.67]]
    }
]

| Field | Type | Description |
| --- | --- | --- |
| ref_img | List[str] | Paths to the reference images (the subjects to be customized). List them from farthest to nearest: where target boxes overlap, a later subject occludes an earlier one, so in the example above subject2 is drawn over subject1. |
| prompt | str | Text prompt describing the desired output scene. |
| img_bbox | List[List[float]] | Target bounding box in the output image for each subject, in the same order as ref_img. Format: [x_min, y_min, x_max, y_max] in normalized coordinates (0.0-1.0). |
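
The far-to-near ordering of ref_img can be easy to get wrong when samples are written by hand. The sketch below builds one sample from a list of subjects and sorts them so that the farthest subject comes first. The helper name `build_sample` and the per-subject `depth` value are hypothetical and are not part of the PositionIC code; only the three output keys come from the format described here.

```python
import json


def build_sample(prompt, subjects):
    """Build one evaluation sample.

    subjects: list of (image_path, bbox, depth) tuples, where a larger
    depth means farther from the camera. `depth` is a hypothetical
    helper value used only for sorting; it is not written to the JSON.
    """
    # Farthest subject first, nearest last, so later subjects occlude earlier ones.
    ordered = sorted(subjects, key=lambda s: s[2], reverse=True)
    return {
        "ref_img": [path for path, _, _ in ordered],
        "prompt": prompt,
        "img_bbox": [list(bbox) for _, bbox, _ in ordered],
    }


sample = build_sample(
    "a boy sitting on the chair in a garden.",
    [
        ("path/to/subject2.png", (0.35, 0.30, 0.58, 0.67), 1.0),  # nearer
        ("path/to/subject1.png", (0.25, 0.35, 0.60, 0.75), 2.0),  # farther
    ],
)

# The evaluation file is a list of such samples.
print(json.dumps([sample], indent=4))
```

Writing the file with `json.dumps` also guarantees valid JSON, which avoids hand-editing mistakes such as trailing commas.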

Note: The bounding box coordinates are normalized to [0, 1], where (0, 0) is the top-left corner and (1, 1) is the bottom-right corner of the image.
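
To sanity-check a box before running inference, it can help to see where it lands in pixels. The following is a minimal sketch of that conversion for a given output size; `bbox_to_pixels` is a hypothetical helper, and the rounding it uses is an assumption that may differ from how the model handles coordinates internally.

```python
def bbox_to_pixels(bbox, width, height):
    """Convert [x_min, y_min, x_max, y_max] in [0, 1] to pixel coordinates.

    (0, 0) is the top-left corner of the image. Rounding to the nearest
    pixel is an assumption made for illustration.
    """
    x_min, y_min, x_max, y_max = bbox
    return (
        round(x_min * width),
        round(y_min * height),
        round(x_max * width),
        round(y_max * height),
    )


# First box of the example sample on a 1024x1024 output.
print(bbox_to_pixels([0.25, 0.35, 0.60, 0.75], 1024, 1024))  # (256, 358, 614, 768)
```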

We provide a toy JSON file for testing at assert/test.json.
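
Before pointing inference at your own file, a quick structural check can catch common mistakes. The sketch below checks the three fields described above; `validate_samples` and `load_and_validate` are hypothetical helpers, not part of the repository, and the strict `x_min < x_max`, `y_min < y_max` check is an assumption about what counts as a usable box.

```python
import json


def validate_samples(samples):
    """Return a list of human-readable problems; an empty list means OK."""
    if not isinstance(samples, list):
        return ["top level must be a list of samples"]
    errors = []
    for i, sample in enumerate(samples):
        if not isinstance(sample, dict):
            errors.append(f"sample {i}: must be a dictionary")
            continue
        refs = sample.get("ref_img")
        boxes = sample.get("img_bbox")
        if not isinstance(sample.get("prompt"), str):
            errors.append(f"sample {i}: 'prompt' must be a string")
        if not isinstance(refs, list) or not all(isinstance(r, str) for r in refs):
            errors.append(f"sample {i}: 'ref_img' must be a list of strings")
            continue
        if not isinstance(boxes, list) or len(boxes) != len(refs):
            errors.append(f"sample {i}: 'img_bbox' needs one box per reference image")
            continue
        for j, box in enumerate(boxes):
            ok = (
                isinstance(box, list)
                and len(box) == 4
                and all(isinstance(v, (int, float)) for v in box)
                and 0.0 <= box[0] < box[2] <= 1.0  # x_min < x_max, both in [0, 1]
                and 0.0 <= box[1] < box[3] <= 1.0  # y_min < y_max, both in [0, 1]
            )
            if not ok:
                errors.append(f"sample {i}, box {j}: expected [x_min, y_min, x_max, y_max] in [0, 1]")
    return errors


def load_and_validate(path):
    """Parse the JSON file at `path` and validate its structure."""
    with open(path, "r", encoding="utf-8") as f:
        return validate_samples(json.load(f))  # json.load raises on invalid JSON
```

For example, `load_and_validate("./assert/test.json")` should return an empty list if the file follows the format above.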

🚀 Inference

cd PositionIC

Run inference using inference.sh or execute directly:

python inference.py \
    --eval_json_path ./assert/test.json \
    --dit_lora_path "ScottHan/PositionIC" \
    --saved_dir "./samples" \
    --width 1024 \
    --height 1024 \
    --ref_size 512 \
    --seed 3074 \
    --rope_type "uno" \
    --a 5

🙏 Acknowledgments

Our code is built upon UNO. We sincerely thank the authors for their excellent work and open-source contribution.


🌟 Citation

If you find our work helpful for your research, please consider giving us a star ⭐ and citing our paper:

@article{hu2025positionic,
  title={{PositionIC}: Unified Position and Identity Consistency for Image Customization},
  author={Hu, Junjie and Han, Tianyang and Ma, Kai and Gao, Jialin and Yang, Song and He, Xianhua and Luo, Junfeng and Wei, Xiaoming and Zhang, Wenqiang},
  journal={arXiv preprint arXiv:2507.13861},
  year={2025}
}

📄 License

This project is licensed under the Apache-2.0 License.
