Feature: Add Z-Image-Turbo regional guidance #8672
I will start my review after the base z-image-turbo support is merged, which should be soon.
Where are we on this one? Now that the primary Z-Image and ControlNet PRs are in, this should be pretty straightforward? Can you rebase this, please? I'll run through the code and tests.
f4929ae to 56c7fbc
Implements regional prompting for Z-Image (S3-DiT Transformer) allowing different prompts to affect different image regions using attention masks.

Backend changes:
- Add ZImageRegionalPromptingExtension for mask preparation
- Add ZImageTextConditioning and ZImageRegionalTextConditioning data classes
- Patch transformer forward to inject 4D regional attention masks
- Use additive float mask (0.0 attend, -inf block) in bfloat16 for compatibility
- Alternate regional/full attention layers for global coherence

Frontend changes:
- Update buildZImageGraph to support regional conditioning collectors
- Update addRegions to create z_image_text_encoder nodes for regions
- Update addZImageLoRAs to handle optional negCond when guidance_scale=0
- Add Z-Image validation (no IP adapters, no autoNegative)
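The backend bullets above describe an additive mask in which 0.0 lets a query attend to a key and -inf blocks it, with regional and full-attention layers alternating. A minimal sketch of that idea follows, using plain Python lists instead of bfloat16 tensors so it stays dependency-free. The function names, the per-token region-id inputs, the same-region-only attention rule, and the even/odd layer split are all assumptions made for illustration, not the PR's actual code. The `[img_tokens, txt_tokens]` layout is taken from the PR summary, and the PR's mask is 4D, whereas this sketch builds only a 2D `[seq, seq]` slice.

```python
NEG_INF = float("-inf")


def build_regional_mask(img_region_ids, txt_region_ids):
    """Build a 2D additive attention mask as a nested list [seq][seq].

    Hypothetical inputs: img_region_ids[i] is the region index of image
    token i, txt_region_ids[j] is the region index of text token j.
    The sequence layout is assumed to be [img_tokens, txt_tokens].
    """
    # Concatenate in [img, txt] order so index q maps to one region id.
    region_of = list(img_region_ids) + list(txt_region_ids)
    seq_len = len(region_of)

    # Start fully blocked, then open the allowed pairs.
    mask = [[NEG_INF] * seq_len for _ in range(seq_len)]
    for q in range(seq_len):
        for k in range(seq_len):
            # Assumed rule: a token attends only to tokens of its own region.
            # The diagonal is always open, so no row is fully blocked.
            if region_of[q] == region_of[k]:
                mask[q][k] = 0.0
    return mask


def mask_for_layer(layer_index, regional_mask):
    """Alternate regional and full attention across layers.

    Assumption: even layers are regional, odd layers use full attention
    (represented here as None, meaning "no mask").
    """
    return regional_mask if layer_index % 2 == 0 else None
```

For example, `build_regional_mask([0, 0, 1, 1], [0, 1])` describes four image tokens split across two regions plus one text token per region. Under the assumed rule, image token 0 can see the region-0 text token at index 4 but not the region-1 text token at index 5.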
Fix Windows path again
56c7fbc to 1a37af3
Changed the guidance_scale check from > 0 to > 1 for Z-Image models. Since Z-Image uses guidance_scale=1.0 as "no CFG" (matching FLUX convention), negative conditioning should only be created when guidance_scale > 1.
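To illustrate why 1.0 is the natural threshold, here is a small sketch that assumes the common classifier-free guidance blend `uncond + scale * (cond - uncond)`. The source only states that `guidance_scale=1.0` means "no CFG"; the exact blend used by the Z-Image denoise code is not shown here, and both function names are hypothetical.

```python
def needs_negative_conditioning(guidance_scale: float) -> bool:
    """Negative conditioning is only useful when CFG is active (scale > 1)."""
    return guidance_scale > 1.0


def apply_cfg(cond, uncond, guidance_scale):
    """Assumed CFG blend over two equal-length lists of predictions."""
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]
```

Under this assumed blend, a scale of exactly 1.0 collapses to the conditional prediction, so the negative branch contributes nothing. With the earlier `> 0` check, a negative prompt would still have been encoded at scale 1.0.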
lstein
left a comment
@blessedcoolant has indicated that code review and functional testing were successful and that the PR can be merged.
Summary
This PR adds Regional Guidance support for Z-Image (S3-DiT Transformer) models, enabling users to apply different prompts to different regions of the image using attention masks.
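As a rough illustration of how an additive attention mask restricts which tokens influence each other, the sketch below adds a mask to raw attention scores before a softmax. It is a generic, dependency-free example of the 0.0 / -inf convention, not code from this PR, and `masked_softmax` is a hypothetical helper name.

```python
import math


def masked_softmax(scores, additive_mask):
    """Softmax over one row of attention scores with an additive mask.

    Mask entries of 0.0 leave a score unchanged; -inf drives its weight
    to exactly zero. A row that is entirely -inf would produce NaN, so a
    real mask must leave at least one key open per query.
    """
    shifted = [s + m for s, m in zip(scores, additive_mask)]
    peak = max(shifted)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in shifted]
    total = sum(exps)
    return [e / total for e in exps]
```

With scores `[1.0, 1.0, 1.0]` and mask `[0.0, -inf, 0.0]`, the blocked key gets zero weight and the two open keys share the attention equally.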
Key implementation details:
Backend:
- `ZImageRegionalPromptingExtension` class that builds regional attention masks
- `ZImageTextConditioning` and `ZImageRegionalTextConditioning` dataclasses for managing regional text embeddings
- `patch_transformer_for_regional_prompting` context manager
- Additive float mask (`0.0` = attend, `-inf` = block) in `bfloat16` dtype
- Sequence order `[img_tokens, txt_tokens]` (different from FLUX's `[txt_tokens, img_tokens]`)

Frontend:
- Update `buildZImageGraph.ts` to support regional conditioning collectors
- Update `addRegions.ts` to create `z_image_text_encoder` nodes for Z-Image regions
- Update `addZImageLoRAs.ts` to handle optional `negCond` when `guidance_scale=0`
- Z-Image validation in `validators.ts` (no IP adapters, no autoNegative support)
- Negative conditioning check on `guidance_scale > 0`

Related Issues / Discussions
#8670
Extends Z-Image support (from the Z-Image-Turbo PR) with regional prompting capabilities.
QA Instructions
Merge Plan
Should be merged after the main Z-Image support PR (`feat/z-image-turbo-support`), as this builds on top of that implementation.
Checklist
What's New copy (if doing a release after this PR)