Use an environment with Python 3.8 or higher, or create a new one:
conda create --name divcon python==3.8.0
conda activate divcon
and install the required libraries:
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
pip install -r requirements.txt
pip install git+https://github.com/CompVis/taming-transformers.git
pip install git+https://github.com/openai/CLIP.git
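After installation, an optional sanity check confirms that the installed PyTorch build can see the GPU (the CUDA 11.7 build from the command above is assumed):

```python
# Optional sanity check that the installed PyTorch build can see the GPU.
import torch

print(torch.__version__)
print(torch.cuda.is_available())  # should print True on a working CUDA setup
```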
We provide scripts to generate layouts for the HRS and NSR-1K benchmarks. First navigate to `./LLM_gen_layout` and set up your OpenAI authentication at line 47:
cd LLM_gen_layout
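As a rough sketch, the authentication line typically assigns your key to the client; this assumes the script uses the legacy `openai` Python client, and the actual line 47 may differ:

```python
# Sketch only: the actual authentication line in the script may differ.
import os
import openai

openai.api_key = os.environ.get("OPENAI_API_KEY")  # or assign the key string directly
```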
Then run:
# for numerical prompts
python llm_gen_layout_counting.py --dataset HRS
# or for spatial prompts
python llm_gen_layout_spatial.py --dataset HRS
The generated layouts will be saved to `./LLM_gen_layout` by default. We also provide pre-generated layouts for both benchmarks in the `./LLM_gen_layout` directory.
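The `.p` files are Python pickles; a minimal sketch for inspecting one is below (the structure of the stored object is specific to this repository):

```python
# Inspect a generated layout file; the stored object's structure is repository-specific.
import pickle

with open("LLM_gen_layout/HRS_counting.p", "rb") as f:
    layouts = pickle.load(f)

print(type(layouts))
```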
Download the layout-conditioned model GLIGEN and put the checkpoints in `gligen_checkpoints`.
To generate images from numerical prompts in HRS using layouts predicted by DivCon, run:
python divcon_gen.py --ckpt gligen_checkpoints/diffusion_pytorch_model.bin --file_save HRS \
  --type counting --pred_layout ./LLM_gen_layout/HRS_counting.p
where:
- `--ckpt`: path to the GLIGEN checkpoint
- `--file_save`: path to save the generated images
- `--type`: the category to test, `counting` or `spatial`
- `--pred_layout`: path to the predicted layout from the LLM
- `--use_llm`: whether to use the LLM to generate the layout

If you're using the LLM (GPT-4), set your OpenAI API key as follows:
export OPENAI_API_KEY='your-api-key'
You can modify these input parameters to generate images for different benchmarks or categories.
To evaluate the raw layouts, navigate to `LLM_gen_layout` and run:
cd LLM_gen_layout
# for numerical prompts in HRS benchmark
python eval_counting_layout.py --pred_layout HRS_counting.p
# or for spatial prompts in HRS benchmark
python eval_spatial_layout.py --pred_layout HRS_spatial.p
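For intuition, a spatial check typically compares predicted box positions. The sketch below is a hypothetical "left of" test on boxes in `(x1, y1, x2, y2)` format, not the repository's exact metric:

```python
# Hypothetical "left of" check on two boxes in (x1, y1, x2, y2) format.
def left_of(box_a, box_b):
    center_a = (box_a[0] + box_a[2]) / 2  # horizontal center of box A
    center_b = (box_b[0] + box_b[2]) / 2  # horizontal center of box B
    return center_a < center_b

print(left_of((0, 0, 10, 10), (20, 0, 30, 10)))  # True: A lies left of B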
To evaluate the generated images using YOLOv8, navigate to `evaluation` and first run:
cd evaluation
python YOLOv8.py --in_folder ../visual/HRS_img --out_file HRS_detect.p
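For reference, the detection step can be approximated with the `ultralytics` package; this is a minimal sketch, with the model variant, image extensions, and pickle layout as assumptions rather than the repository's exact `YOLOv8.py`:

```python
# Minimal sketch of the detection step; model variant, image extensions,
# and the pickle layout are assumptions, not the repository's exact format.
import pickle
from pathlib import Path

from ultralytics import YOLO

model = YOLO("yolov8x.pt")
detections = {}
for img in sorted(Path("../visual/HRS_img").iterdir()):
    if img.suffix.lower() not in {".png", ".jpg", ".jpeg"}:
        continue
    result = model(str(img))[0]  # one Results object per image
    detections[img.name] = [
        (result.names[int(cls)], box.tolist())  # (class name, xyxy box)
        for cls, box in zip(result.boxes.cls, result.boxes.xyxy)
    ]

with open("HRS_detect.p", "wb") as f:
    pickle.dump(detections, f)
```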
then run the evaluation scripts:
# for numerical prompts in HRS benchmark
python eval_counting.py --in_result detection_result/HRS_detect.p
# or for spatial prompts in HRS benchmark
python eval_spatial.py --in_result detection_result/HRS_detect.p
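As a hypothetical illustration of the counting metric, an image can be scored correct when the detected count of each prompted class matches the prompt; the detection-file structure and prompt parsing below are assumptions:

```python
# Hypothetical counting check; assumes detections map image names to
# [(class_name, box), ...] and that per-prompt expected counts are available.
import pickle
from collections import Counter

with open("detection_result/HRS_detect.p", "rb") as f:
    detections = pickle.load(f)

def counting_correct(labels, expected_counts):
    """expected_counts: {class_name: count} parsed from the prompt."""
    observed = Counter(labels)
    return all(observed[name] == n for name, n in expected_counts.items())
```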
This project is built upon the foundational work from GLIGEN and Attention-Refocusing.