v3.10.1 release (Generic NCNN Upscaler)
v3.10.1 release with Windows installer.
Due to the size of the packaged python environment, the installer is within a multi-part zip file.
The multipart zip can be extracted using 7-Zip: https://www.7-zip.org/
Download both dgenerate_installer.zip.001 and dgenerate_installer.zip.002 to a folder.
Unzip dgenerate_installer.zip.001 to a directory (right click, 7-Zip -> Extract to "dgenerate_installer"), then run dgenerate_installer\dgenerate.msi to install.
dgenerate will be installed under C:\Program Files\dgenerate by default, with an isolated Python environment provided.
The install directory will be added to PATH, and dgenerate will be available from the command line.
Portable Install
A portable install is provided via dgenerate_portable.zip.001 and dgenerate_portable.zip.002. These contain nothing but the dgenerate executable and a frozen Python environment, which can be placed anywhere.
v3.10.1 Features & Fixes
1.) Generic NCNN upscaler
ncnn has been added as a package extra. When ncnn is installed, the new image processor upscaler-ncnn is available for generic upscaling using NCNN, and should work with models converted from ONNX format. This is included by default in the Windows installer / portable install environment that is attached to each release.
This upscaler supports tiling just as the normal upscaler image processor does, with essentially the same tiling options but slightly different defaults.
It does not use the device argument, but instead a combination of use-gpu=True and gpu-index=N for enabling Vulkan accelerated GPU use on a specific GPU.
By default this processor runs on the CPU.
This is because the Vulkan allocator conflicts heavily with the torch CUDA allocator used for diffusion and other image processors when they are placed on the same GPU, and having both allocators on the same GPU can cause hard system lockups.
You can safely use this upscaler at the same time as torch based models by running it on another GPU that torch is not going to be using.
Once you have used this processor, be aware that the process will always exit with a non-zero return code. This is because the GPU context and certain ncnn objects cannot be cleaned up properly through the ncnn Python bindings before the process shuts down. This technically causes an access violation / segfault inside ncnn. I am not sure what bad behavior this will cause on Linux, but on Windows the process exits with no side effects or hang-ups other than the non-zero return code.
See: dgenerate --image-processor-help upscaler-ncnn
And also: Upscaling With NCNN Upscaler Models in the readme.
2.) Memory Management
Image processors now have size estimates, which are used as a heuristic for clearing out CPU side memory belonging to the diffusion model cache before the processor is loaded into memory. This should help prevent avoidable out of memory conditions caused by an image processor model loading while the diffusion model cache is using most of the system's memory.
This size estimate is also used as a heuristic for freeing up VRAM, in particular VRAM held by the last called diffusion pipeline if it is still resident.
If an image processor still runs out of memory because its actual execution allocates large amounts of VRAM, it will attempt to free memory and then try again. If an OOM occurs on the second try, the OOM is raised.
Diffusion invocations will now attempt to clear memory and try again in the same fashion for CUDA out of memory errors, but not for CPU side out of memory errors, which are more easily prevented by the heuristics already in place.
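The free-then-retry-once behavior described above can be sketched roughly as follows. This is a minimal illustration, not dgenerate's implementation: `run_with_oom_retry`, `operation`, and `free_memory` are hypothetical names, and the built-in `MemoryError` stands in for whatever out of memory exception the real code catches.

```python
import gc


def run_with_oom_retry(operation, free_memory, oom_errors=(MemoryError,)):
    """Run operation(); on an out of memory error, free memory and retry once.

    All names here are hypothetical and for illustration only.
    """
    try:
        return operation()
    except oom_errors:
        # First failure: release whatever can be released before retrying.
        free_memory()
        gc.collect()
    # The retry happens outside the except block, so a second out of memory
    # error is not caught and propagates to the caller.
    return operation()
```

With torch, one would presumably pass `oom_errors=(torch.cuda.OutOfMemoryError,)` and a `free_memory` callback that drops cached models and calls `torch.cuda.empty_cache()`.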
The main current enemy of this application running for long periods of time is VRAM fragmentation, which is not avoidable with the default CUDA allocator.
The example runner script in the examples folder has been rewritten to isolate each top level folder in the examples directory to a subprocess when not running with the --subprocess-only flag.
The only way to clear out the memory fragmentation after running so many models of different sizes is to end the process, so each directory is isolated to a subprocess. This takes advantage of dgenerate's caching behaviors within the directory, while avoiding excessive memory fragmentation by confining a medium sized chunk of examples to one process.
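The per-directory isolation described above can be sketched roughly as follows. This is not the actual run.py; `top_level_dirs`, `run_isolated`, and `build_command` are hypothetical names used only to illustrate the idea of one child process per top level folder.

```python
import subprocess
from pathlib import Path


def top_level_dirs(root):
    """Return the immediate subdirectories of root, sorted by name."""
    return sorted(p for p in Path(root).iterdir() if p.is_dir())


def run_isolated(root, build_command):
    """Run one child process per top level directory and collect exit codes.

    build_command is a hypothetical callback that returns the argument list
    to execute for a given directory.
    """
    results = {}
    for directory in top_level_dirs(root):
        # Each directory gets a fresh process, so any memory fragmentation
        # built up while running it is discarded when the child exits.
        completed = subprocess.run(build_command(directory))
        results[directory.name] = completed.returncode
    return results
```

Anything cached inside a child process is shared by all the examples in that directory, which is the trade-off the paragraph above describes.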
There is also now an option --torch-debug in the run.py script which, if enabled, will try to dump information about objects stuck in VRAM after an OOM condition, and generate a Graphviz graph of possible reference cycles. Currently I cannot find any evidence of anything sticking around after dgenerate tries to clean up VRAM.
dgenerate now sets a PYTORCH_CUDA_ALLOC_CONF value max_split_size_mb of 512 before importing torch.
It also sets PYTORCH_CUDA_LAUNCH_BLOCKING to 0 by default.
These can be overridden in your environment.
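One way such overridable defaults can be applied is sketched below. This is an assumption about the mechanism, not dgenerate's actual code: `apply_cuda_env_defaults` is a hypothetical name, and the `max_split_size_mb:512` string assumes the `key:value` form that PyTorch documents for PYTORCH_CUDA_ALLOC_CONF.

```python
import os


def apply_cuda_env_defaults(env=os.environ):
    """Set allocator-related defaults without overriding existing values.

    Hypothetical helper for illustration; variable names are taken from
    the release notes above.
    """
    # setdefault only writes when the variable is not already present,
    # which is what lets a value from the user's environment win.
    env.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:512")
    env.setdefault("PYTORCH_CUDA_LAUNCH_BLOCKING", "0")
    return env
```

To match the behavior described above, the helper would be called before the first `import torch`.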
3.) Fetch CivitAI model links with --sub-command civitai-links
CivitAI has made a change to their website UI (*had some sort of outage) which makes right-click copying of direct API links to models no longer possible.
I have written a dgenerate sub-command that can fetch API hard links to CivitAI models on a model page and display them to you next to their model titles.
The links that this command generates can be given directly to dgenerate, or used with the \download directive in order to download the model from CivitAI.
For example, you can use dgenerate --sub-command civitai-links https://civitai.com/models/4384/dreamshaper to list all available model links for that model using the CivitAI API.
You can use the --token argument of the sub-command to append an API token to the generated links, which is sometimes needed for downloading specific models.
You can also use this sub-command as the directive \civitai_links in a config / shell mode or the Console UI.
See: dgenerate --sub-command civitai-links --help, or \civitai_links --help from a config / shell mode or the Console UI.
4.) Config / Shell - Environmental Variable Manipulation
You can now use the directives \env and \unset_env to manipulate environmental variables.
# using with no args prints the entire environment
\env
# you can set multiple environmental variables at once
\env MY_ENV_VAR=1 MY_ENV_VAR2=2
# undefine them in the same manner
\unset_env MY_ENV_VAR MY_ENV_VAR2
See: dgenerate --directives-help env unset_env
5.) Config / Shell - Indirect Assignment
The config / shell language that is built into dgenerate now supports indirect assignment.
You can use a basic template expansion or environmental variable expansion to select the name of a template variable.
This now works for \set, \sete, \setp, and \env.
It also works for \unset and \unset_env.
All other directives which accepted a variable name already supported this.
\set var_name foo
\set {{ var_name }} bar
# prints bar
\print {{ foo }}
\env VAR_NAME=BAZ
\env $VAR_NAME=qux
# prints qux
\print $BAZ
6.) Config / Shell - Feature Flags and Platform Detection
The config template functions have_feature(feature_name) and platform() have been added.
# have_feature returns bool
# Do we have Flax/Jax?
\print {{ have_feature('flax') }}
# Do we have NCNN?
\print {{ have_feature('ncnn') }}
# platform() returns the platform.system() string from Python's platform module
# prints: Windows, Linux, Darwin, etc.
\print {{ platform() }}
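For readers more familiar with Python than with the config language, the following is a rough analogue of what these two template functions report. It assumes, without confirmation from these notes, that a feature check amounts to testing whether an optional package is importable; `have_module` and `current_platform` are hypothetical names, not dgenerate APIs.

```python
import importlib.util
import platform


def have_module(name):
    """Return True if a top-level module called name can be imported."""
    return importlib.util.find_spec(name) is not None


def current_platform():
    """Return the OS name, e.g. 'Windows', 'Linux', or 'Darwin'."""
    return platform.system()
```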
7.) Exception handling fixes in dgenerate.invoker
The methods in this library module were only capable of throwing dgenerate.DgenerateUsageError when they should have been throwing more fine grained error types when requested to do so with throw=True.
8.) Config / Shell - Parsing fixes
Streaming heredoc templates discarded newlines from the end of the Jinja stream chunks, resulting in hard-to-notice issues with Jinja control structures used as top level templates, mostly when the result of the heredoc template was being interpreted by the shell.
9.) Image processor library API improvements
Image processors will now throw when you pass a PIL image with a mode value that the processor cannot understand.
Currently, all image processors only understand RGB images.
10.) Console UI updates
Removed antiquated recipes related to image upscaling in favor of Generic Image Process and Generic Image Process (to directory).
From the generic image process recipes you can just select the upscaler or upscaler-ncnn processor from a drop-down and fill out its parameters to perform upscaling.
All image processors now expose parameters provided by their base class in the UI, such as device, output-file, output-overwrite, and model-offload.
This makes it possible to select a debug image output location with a file select dialog, which is useful if you are trying to use an image processor as a pre-processor for diffusion and need to see the image that is being passed to diffusion for debugging purposes.
The device argument is hidden in the UI where not applicable, such as the Generic Image Process recipes where the UI selects the device for the whole command instead of via an image processor URI argument.
The device URI argument for image processors is available when selecting pre / post processors for AI image generation from the UI as well as when using the Insert Image Processor URI edit feature.
You can now specify the frame-start and frame-end URI arguments for frame slicing when using the Image Seed URI builder UI.
Fixed minor syntax highlighting bugs related to indirect variable assignments.