A collection of code samples for working with Stability AI's models. This repo will be used for technical assets that accompany blog posts on https://stability.ai/learning-hub
Model | Inference Time (seconds) | GPU / CPU |
---|---|---|
SD3.5 M | 4 s | NVIDIA H100 GPU with 80 GB of VRAM |
4-Bit Quantized SD3.5 L | 18 s | NVIDIA H100 GPU with 80 GB of VRAM; inference partially offloaded to the CPU (AMD EPYC 7R13) of an AWS EC2 p5.48xlarge instance |
SD3.5 L | 7 s | NVIDIA H100 GPU with 80 GB of VRAM |
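
The 4-bit row in the table was run with part of the inference offloaded to the CPU. The sketch below shows one way such a setup might be loaded and timed with the Hugging Face `diffusers` library. It is not necessarily how the numbers in the table were produced: the model ID `stabilityai/stable-diffusion-3.5-large`, the bfloat16 compute dtype, and the helper names `nf4_quantization_kwargs`, `load_nf4_sd35_large`, and `time_call` are all assumptions made for illustration.

```python
import time


def nf4_quantization_kwargs():
    """Keyword arguments for a 4-bit NF4 bitsandbytes config (assumed settings)."""
    return {"load_in_4bit": True, "bnb_4bit_quant_type": "nf4"}


def load_nf4_sd35_large(model_id="stabilityai/stable-diffusion-3.5-large"):
    """Load SD3.5 L with a 4-bit transformer and model CPU offload enabled.

    Needs torch, diffusers, bitsandbytes, and accelerate. The imports are
    deferred so this file can be imported without those packages installed.
    """
    import torch
    from diffusers import (
        BitsAndBytesConfig,
        SD3Transformer2DModel,
        StableDiffusion3Pipeline,
    )

    quant_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.bfloat16, **nf4_quantization_kwargs()
    )
    # Only the transformer is quantized; the other components load as usual.
    transformer = SD3Transformer2DModel.from_pretrained(
        model_id,
        subfolder="transformer",
        quantization_config=quant_config,
        torch_dtype=torch.bfloat16,
    )
    pipe = StableDiffusion3Pipeline.from_pretrained(
        model_id, transformer=transformer, torch_dtype=torch.bfloat16
    )
    # Moves sub-models between CPU and GPU as they are needed.
    pipe.enable_model_cpu_offload()
    return pipe


def time_call(fn, *args, **kwargs):
    """Return (result, elapsed_seconds) for a single call to fn."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

With a loaded pipeline, `time_call(pipe, prompt="Children's birthday party")` would return the pipeline output together with the wall-clock time of that one call.
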
TL;DR: The key to removing objects isn't negative prompting but positive prompting for object placement
- A good test of negative prompting is object removal; for example (model: 4-Bit Quantized SD3.5 L):
  - `prompt`: Children's birthday party
  - `negative_prompt`: No birthday cake
- Quantization reduces the precision of the model's weights from 32-bit floating point to 4-bit floating point
- This reduction in precision makes negative prompting more effective
- For the base model of Stable Diffusion 3.5 Large (with no quantization or modifications), including the API, negative prompting actually works extremely well; for example:
  - `prompt`: A group of elves hunting a dragon, 4k cinema
  - `negative_prompt`: No green grass
- For the base model of Stable Diffusion 3.5 Large, negative prompting of specific objects (like a birthday cake) is highly dependent on prompt structure and guidance scale; for example:
  - `prompt`: Three children sitting at a dining table. There is a white table cloth on the table. There are balloons in the background. The kids are wearing party hats. The background is a sunny day at a park.
  - `negative_prompt`: [blank]
  - `guidance_scale`: 7.5
- The key to removing objects isn't negative prompting but positive prompting for object placement
- This is explained in the Stable Diffusion 3.5 Prompt Guide
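
The positive-prompting approach from the list above can be sketched in a few lines: describe where every wanted object goes, leave the unwanted object out of the prompt entirely, and keep the negative prompt blank. The helper names `build_request` and `generate` are hypothetical, and `pipe` is assumed to be an already loaded `diffusers`-style pipeline that accepts `prompt`, `negative_prompt`, and `guidance_scale` keyword arguments.

```python
# Scene description from the example above: each object that should appear
# is placed explicitly, and the birthday cake is simply never mentioned.
PARTY_SCENE = [
    "Three children sitting at a dining table",
    "There is a white table cloth on the table",
    "There are balloons in the background",
    "The kids are wearing party hats",
    "The background is a sunny day at a park",
]


def build_request(scene_lines, negative_prompt="", guidance_scale=7.5):
    """Join placement sentences into one prompt and bundle the call arguments."""
    sentences = [
        line.strip().rstrip(".") + "." for line in scene_lines if line.strip()
    ]
    return {
        "prompt": " ".join(sentences),
        "negative_prompt": negative_prompt,
        "guidance_scale": guidance_scale,
    }


def generate(pipe, request):
    """Run the pipeline with the bundled arguments and return the first image.

    `pipe` is an assumption: any callable that takes these keyword arguments
    and returns an object with an `images` list.
    """
    return pipe(**request).images[0]


request = build_request(PARTY_SCENE)
```

Keeping the scene as a list of short sentences makes it easy to add or drop one object at a time and compare the results.
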
The guidance_scale parameter has a significant impact on image generation with Stable Diffusion 3.5 models:
A higher guidance scale value encourages the model to generate images closely linked to the text prompt at the expense of lower image quality
Image quality can vary drastically based on the `guidance_scale` value. The screenshots below provide some recommended `guidance_scale` settings for three Stable Diffusion 3.5 models:
- Stable Diffusion 3.5 Large (SD3.5 L)
- 4-Bit Quantized Stable Diffusion 3.5 Large (NF4 SD3.5 L)
- Stable Diffusion 3.5 Medium (SD3.5 M)
Model | guidance_scale (float 1-10) | Example |
---|---|---|
SD3.5 L | guidance_scale=2.5 | |
NF4 SD3.5 L | guidance_scale=7.5 | |
SD3.5 M | guidance_scale=5.0 | |
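
The recommended values in the table can be kept in a small lookup so that each model gets its own default. This is only a sketch: the names `RECOMMENDED_GUIDANCE_SCALE` and `guidance_scale_for` are hypothetical, and the 1 to 10 range check simply mirrors the range given in the table header.

```python
# Recommended guidance_scale per model, copied from the table above.
RECOMMENDED_GUIDANCE_SCALE = {
    "SD3.5 L": 2.5,
    "NF4 SD3.5 L": 7.5,
    "SD3.5 M": 5.0,
}


def guidance_scale_for(model, override=None):
    """Return the recommended guidance_scale for a model, or a checked override."""
    if override is None:
        value = RECOMMENDED_GUIDANCE_SCALE[model]
    else:
        value = float(override)
    # The table lists guidance_scale as a float between 1 and 10.
    if not 1.0 <= value <= 10.0:
        raise ValueError(f"guidance_scale must be between 1 and 10, got {value}")
    return value
```

For example, `guidance_scale_for("NF4 SD3.5 L")` returns the table's value for the quantized model, while `guidance_scale_for("SD3.5 M", override=4)` lets a caller experiment within the allowed range.
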