ModelPatcher Overhaul and Hook Support #5583
Conversation
…xed fp8 support for model-as-lora feature
…d added call prepare current keyframe on hooks in calc_cond_batch
… for loras/model-as-loras, small renaming/refactoring
…odelPatcher at sampling time
…n work on better Create Hook Model As LoRA node
…implement ModelPatcher callbacks, attachments, and additional_models
…ed additional_models support in ModelPatcher, conds, and hooks
…ond to organize hooks by type
…er properties, improved AutoPatcherEjector usage in partially_load
…delPatcher for emb_patch and forward_timestep_embed_patch, added helper functions for removing callbacks/wrappers/additional_models by key, added custom_should_register prop to hooks
…s due to hooks should be offloaded in hooks_backup
…ormer_options as additional parameter, made the model_options stored in extra_args in inner_sample be a clone of the original model_options instead of same ref
…se __future__ so that I can use the better type annotations
All of my manual testing is complete - the PR can be merged at any time if all looks fine to @comfyanonymous.
…ous/ComfyUI#5583 The interface for this *will* change. See #63
@Kosinkadink If I create a workflow that sets a LoRA hook with no scheduling, it seems that it reloads the hook on every sampling run even if just the sampler seed changes. It causes pretty significant latency. I think it's because the patches get unloaded immediately after sampling even though there's no need to do so. Is there a way to use the hook mechanism in a way that avoids this at the moment? My quick testing shows an increase from 13.8s to 16.5s when only changing the seed after warmup (sometimes even up to 18s).
A lot of my current optimization was focused on the worst case scenarios of different hooks needing to be applied to different conditioning, so to prevent any memory issues from cached weights not being cleared, I currently have the model always purge newly registered (AKA added at sample time) hook patches and clear cached hooked weight calculations. In cases where there is only a single hook group to apply, I could make it not revert the model to its unhooked state at the end of sampling, so that if nothing gets changed with the hooks/ModelPatcher, it would not need to redo hooked weight application. However, that introduces some extra complexity that could introduce bugs I don't want to deal with currently - I've been working on this for 3 months, and in its current state it hasn't even been released to be tested by a wide variety of peeps. Once it gets merged and it appears to be working fine in general, I'd be down to add an optimization for that edge case.
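For illustration, the single-hook-group optimization described above could amount to a caching scheme along these lines. This is a minimal standalone sketch with hypothetical names, not the actual ModelPatcher code in this PR:

```python
# Hypothetical sketch: remember which hook group is currently applied and skip
# re-patching when the same group is requested on the next sampling run.
class HookedWeightCache:
    def __init__(self):
        self._applied_group_key = None   # identity of the hook group currently applied
        self._backup = {}                # original weights, so the model can be reverted

    def apply(self, weights, hook_group_key, patch_fn):
        """Apply patches for hook_group_key, skipping work if it is already applied."""
        if hook_group_key == self._applied_group_key:
            return weights               # nothing changed since last run; reuse as-is
        self.revert(weights)
        self._backup = dict(weights)     # remember unpatched values
        patch_fn(weights)                # mutate weights in place
        self._applied_group_key = hook_group_key
        return weights

    def revert(self, weights):
        """Restore the unpatched weights (what always happens today after sampling)."""
        if self._applied_group_key is not None:
            weights.update(self._backup)
            self._applied_group_key = None
            self._backup = {}

weights = {"layer.weight": 1.0}
cache = HookedWeightCache()
cache.apply(weights, "lora_A@1.0", lambda w: w.update({"layer.weight": 1.5}))
cache.apply(weights, "lora_A@1.0", lambda w: w.update({"layer.weight": 1.5}))  # skipped: same group
cache.revert(weights)  # back to {"layer.weight": 1.0}
```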
@Kosinkadink Fair. This PR is definitely big enough already.
…ns_scheduled call
…uld have their own additional_models, and add robustness for circular additional_models references
…oks_improved_memory
Thanks for the considerable effort it must have taken to make this so readable. That is tremendously helpful.
@Kosinkadink I've tried your lorahookmasking workflow, using a single checkpoint and one lora file for each Set Clip Hooks node. While it did manage to perfectly avoid lora bleeding, the generation time is 10x longer than usual for a 12-step 880x768 px image. RAM usage also skyrockets immediately. The model is PrefectPonyV2XL (6.4 GB) and the loras have been shrunk to 70 MB each, but this didn't help at all. Is there some explanation for why this is so slow? The workflow is exactly your original one, but loading 2 loras instead of 1 lora and 1 checkpoint as lora.
This PR merges all changes from the improved_memory branch, and expands ModelPatcher + transformer_options to allow for different weights and properties to be applied for selected conditioning. This is done by introducing a hook design pattern, where conditioning and CLIP can have hooks attached to change their behavior at sample time; before, this was hardcoded for specific things like `controlnet` and `gligen`.

I did not find any memory or performance regression in my testing, but more testing would be good; I will try to get some folks to test out this branch alongside the corresponding rework_modelpatcher branches in AnimateDiff-Evolved and Advanced-ControlNet that make use of the new functionality.
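To make the hook idea concrete, here is a minimal standalone illustration of the pattern described above, where conditioning carries hooks that are applied only while that conditioning is being evaluated. The class and function names are hypothetical stand-ins, not the actual ComfyUI types added by this PR:

```python
# Hypothetical sketch of hooks attached to conditioning. A real implementation
# batches conds that share the same hooks; this just shows the apply/revert flow.
from dataclasses import dataclass, field

@dataclass
class WeightHook:
    name: str
    strength: float = 1.0

    def apply(self, model):
        print(f"patching {self.name} into model at strength {self.strength}")

    def revert(self, model):
        print(f"removing {self.name} from model")

@dataclass
class Cond:
    text: str
    hooks: list = field(default_factory=list)

def eval_conds_with_hooks(model, conds):
    """Stand-in for what a calc_cond_batch-style function does with hooked conds."""
    outputs = []
    for cond in conds:
        for hook in cond.hooks:
            hook.apply(model)
        outputs.append(f"eval({cond.text})")
        for hook in cond.hooks:
            hook.revert(model)
    return outputs

# Two conds that want different LoRA-like weights applied only for themselves:
conds = [Cond("a cat", hooks=[WeightHook("lora_cat")]),
         Cond("a dog", hooks=[WeightHook("lora_dog", 0.8)])]
print(eval_conds_with_hooks(model=None, conds=conds))
```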
Related PRs in those repos that will be merged when this does:
Kosinkadink/ComfyUI-AnimateDiff-Evolved#498
Kosinkadink/ComfyUI-Advanced-ControlNet#198
Remaining TODO:
Breaking Changes:
- The `get_control` function now takes `transformer_options` as a required parameter; if a custom node wrote its own function to overwrite the built-in `calc_cond_batch` function, it will result in an error when executing. It will be an easy fix for any affected nodes; the only one I can think of off the top of my head is TiledDiffusion. (A rough sketch of the fix follows.)
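For affected custom nodes, the fix is essentially threading `transformer_options` through to the `get_control` call. The snippet below is a hedged illustration: the surrounding parameter names are assumptions, and it assumes `transformer_options` is simply passed as the final argument, per the breaking change described above:

```python
# Hypothetical helper showing the before/after shape of the get_control call.
def call_get_control(control, x_noisy, t, cond, batched_number, model_options):
    transformer_options = model_options.get("transformer_options", {})
    # Before this PR: control.get_control(x_noisy, t, cond, batched_number)
    # After this PR, transformer_options is required as well:
    return control.get_control(x_noisy, t, cond, batched_number, transformer_options)
```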
Features:
- `wrappers`: functions that will automatically handle passing an executor into wrapper functions, to facilitate wrapping in a predictable manner (see the sketch after this list). Since there is no limitation on the names of wrapper functions, some custom nodes could decide to expose extending their own functionality with other nodes through the wrapper system. The cost of wrapping is imperceptibly low, so more wrapper support can be added upon need/request.
- `callbacks`: can instead be used to extend ModelPatcher functions, avoiding the need for hacky ModelPatcher inheritance for cases where wrapping wouldn't make sense. Same as wrappers, more callbacks can be added upon need/request.
- `model_options`: only clones objects stored inside it that have an `on_model_patcher_clone()` callable.
- `comfy.patcher_extension`: allows for easy modification and classification via `CallbacksMP` and `WrappersMP`. In a future PR, patches should be exposed in a similar way.
- `get_control` functions now take in `transformer_options` as an input, allowing them to add their own patches, wrappers, etc. as desired.
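As a rough illustration of the executor-based wrapper mechanism mentioned in the first feature bullet, the standalone sketch below shows how wrappers compose when each one receives an executor it must call to continue the chain. The names are hypothetical stand-ins, not the real `comfy.patcher_extension` interface:

```python
# Hypothetical sketch: each wrapper gets an executor and decides when (or whether)
# to call the rest of the chain, so wrapping order stays predictable.
class WrapperExecutor:
    def __init__(self, original, wrappers):
        self.original = original
        self.wrappers = list(wrappers)

    def execute(self, *args, **kwargs):
        if not self.wrappers:
            return self.original(*args, **kwargs)
        wrapper = self.wrappers[0]
        inner = WrapperExecutor(self.original, self.wrappers[1:])
        return wrapper(inner, *args, **kwargs)

def sample(x):
    return x * 2

def logging_wrapper(executor, *args, **kwargs):
    print("before sample", args)
    out = executor.execute(*args, **kwargs)
    print("after sample", out)
    return out

def scaling_wrapper(executor, *args, **kwargs):
    return executor.execute(*args, **kwargs) + 1

executor = WrapperExecutor(sample, [logging_wrapper, scaling_wrapper])
print(executor.execute(10))  # logging runs outermost, scaling innermost -> 21
```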