
Solving issue of different ckpt files with the same hash #2459

Closed
wants to merge 4 commits

Conversation

jn-jairo
Collaborator

Issue

This issue may happen with specific ckpt files when they are merged with interpolation pairs that add up to 1, as you can see in the screenshot below:

image

First I thought it could be just the first characters, but it turns out the whole hash is equal, as you can see:

image

Checking the files proves their contents are different:

image

So only the bytes used to create the hash are equal, as I could verify:

image

Solution

To solve this issue I added an option to create the hash using the entire model, save it to a .sha256 file next to the .ckpt file on the first execution, and read it from that file on the following executions.

image

The content of the sha256 file follows the default sha256 format, so it can be checked with the command sha256sum -c model.sha256. You can even create the sha256 file yourself with a command like sha256sum -b model.ckpt > model.sha256 (or equivalent) and upload it to the models folder along with the ckpt to avoid the hash generation time on the first start.
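For example, generating the sidecar file up front and verifying it later (using a dummy file here in place of a real checkpoint):

```shell
# Dummy stand-in for a real checkpoint; a real model.ckpt works the same way.
echo "dummy weights" > model.ckpt

# Generate the .sha256 sidecar once (what the first execution would do):
sha256sum -b model.ckpt > model.sha256

# Later runs (or anyone downloading the pair) can verify it:
sha256sum -c model.sha256   # prints: model.ckpt: OK
```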

This solves the issue of different ckpt files having the same hash without breaking anything for those who don't have this issue. Only the first execution is slower; the following executions are as fast as the default hash code.

The new hashes for comparison:

image

Environment this was tested in

  • OS: Ubuntu 20.04.5 LTS Linux 5.4.0-128-generic x86_64
  • Browser: Brave 1.44.108 Chromium 106.0.5249.103 64 bits
  • Graphic card: NVIDIA GeForce MX150 2GB

@dfaker
Collaborator

dfaker commented Oct 13, 2022

@raefu had a nice and consistently fast solution to this: hashing the zip directory section at the end of the file so it's a hash of the attributes and crcs of all the contents.

@jn-jairo
Collaborator Author

@dfaker cool, but I don't know how to do it, so someone will need to help with that solution.

@d8ahazard
Collaborator

Silly question - would this break existing hashes stored in images?

If so, then I'd definitely want an option to enable this or not. Or use a "hashv2" param in infotext or something.

@jn-jairo
Collaborator Author

@d8ahazard yes, the new hash will be different from the old one. There is an option to enable this, and the default is disabled. But your idea to add something to the info text to differentiate is good; I think we should discuss it and find a good way to indicate which hash method was used.

Some options:

  • Add v2 to the hash like: 87d1ac53-v2
  • Use more characters instead of 8: 87d1ac53ab
  • Use another param in info text: Model hash: 87d1ac53, Model hash version: v2

@dfaker
Collaborator

dfaker commented Oct 14, 2022

@dfaker cool, but I don't know how to do it, so someone will need to help with that solution.

it's a small section at the end of the file, starting with the signature 0x02014b50; taking the last MB of the .pt should capture it.

@AUTOMATIC1111
Owner

the biggest problem is what we do with all current model hashes

@0xdevalias

0xdevalias commented Oct 24, 2022

Some options:

Add v2 to the hash like: 87d1ac53-v2
Use more characters instead of 8: 87d1ac53ab
Use another param in info text: Model hash: 87d1ac53, Model hash version: v2

I like the hash-v2 and/or Model hash: 87d1ac53, Model hash version: v2 options out of the 3 suggestions here. I feel like the 'use more characters' option is too obscure/'magic' feeling. I tend to personally prefer explicitness.

the biggest problem is what we do with all current model hashes

Just riffing off the top of my head here and I haven't fully thought this through, but if the original hashing method is kept alongside this new method (particularly if the new method designates itself as a v2/etc in the hash in some identifiable way), then presumably it would be possible to look up a hash either the 'old' (current) v1 way or the new v2 way.

There will still be the edge cases where the current v1 hash clashes for distinct models obviously, but in those cases perhaps it could just show a list of the models that match, and potentially offer to upgrade the embedded hash to the v2 hash. (I'm not actually familiar with the workflow around how the hashes are used, so if the above doesn't match the reality of how they're used, adjust/ignore as appropriate)

eg.

OldV1Hash   NewV2Hash     Lookup (v1->v2)               Reverse Lookup (v2->v1)
AAAAAAAA    A2345678-v2   A2345678-v2                   AAAAAAAA
BBBBBBBB    B2345678-v2   B2345678-v2                   BBBBBBBB
CCCCCCCC    C2345678-v2   C2345678-v2 or D2345678-v2    CCCCCCCC
CCCCCCCC    D2345678-v2   C2345678-v2 or D2345678-v2    CCCCCCCC
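A two-way lookup like that could be backed by something as simple as this sketch (the model names and hashes are just the placeholder values from the example; real code would build the index by hashing the files in the models folder with both methods):

```python
# Hypothetical in-memory index mapping each model to both hash versions.
MODELS = {
    "model_a.ckpt": {"v1": "AAAAAAAA", "v2": "A2345678-v2"},
    "model_b.ckpt": {"v1": "BBBBBBBB", "v2": "B2345678-v2"},
    "model_c.ckpt": {"v1": "CCCCCCCC", "v2": "C2345678-v2"},
    "model_d.ckpt": {"v1": "CCCCCCCC", "v2": "D2345678-v2"},  # v1 collision with model_c
}

def lookup(model_hash):
    """Return every model whose v1 or v2 hash matches; v1 lookups may be ambiguous."""
    return sorted(name for name, hashes in MODELS.items()
                  if model_hash in hashes.values())
```

A colliding v1 hash then naturally returns the list of candidate models, which could be shown to the user with an offer to upgrade the embedded hash to v2.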

@0xdevalias

0xdevalias commented Oct 24, 2022

Also, just as a 'prior art' reference, this (hashing the entire *.ckpt file + outputting the full hash as a *.sha256) appears to be what InvokeAI is currently doing:

https://github.com/invoke-ai/InvokeAI/blob/4b95c422bde493bf7eb3068c6d3473b0e85a1179/ldm/invoke/model_cache.py#L263-L281

# ~/dev/stable-diffusion/InvokeAI/models/ldm/stable-diffusion-v1

⇒  ls
model.ckpt  model.sha256

⇒  cat model.sha256
fe4efff1e174c627256e44ec2991ba279b3816e364b49f9be2abc0b3ff3f8556%

⇒  time sha256sum --binary model.ckpt
fe4efff1e174c627256e44ec2991ba279b3816e364b49f9be2abc0b3ff3f8556 *model.ckpt
sha256sum --binary model.ckpt  18.98s user 0.60s system 99% cpu 19.728 total

@jn-jairo
Collaborator Author

@0xdevalias thank you, I was thinking the same. I just didn't have time to code it; I will take a look at that InvokeAI code to see if it helps.

@0xdevalias

0xdevalias commented Oct 24, 2022

@dfaker: @raefu had a nice and consistently fast solution to this: hashing the zip directory section at the end of the file so it's a hash of the attributes and crcs of all the contents.

@dfaker: it's a small section at the end of the file, starting with the signature 0x02014b50, taking the last MB of the .pt sould capture it.

So sounds like you could probably open the *.ckpt file as read binary, seek to the end, check that it has the EOCD (0x06054b50), then read backwards till you find the CFDH (0x02014b50), then take that slice of data and sha256 hash it.
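A minimal sketch of that idea (taking a slightly different route than scanning for the CFDH: read the central directory offset and size out of the EOCD record, as the PoC discussed below does; this assumes the .ckpt is a plain zip with no trailing archive comment, so the EOCD is exactly the last 22 bytes, and the function name is mine):

```python
import hashlib
import os
import struct

EOCD_SIG = b"\x50\x4b\x05\x06"  # 'end of central directory' signature (0x06054b50)

def hash_zip_central_directory(path):
    """SHA-256 of the zip central directory: cheap to read, and it covers
    the names, sizes, and CRC32s of every file in the archive."""
    with open(path, "rb") as fh:
        fh.seek(-22, os.SEEK_END)  # EOCD record is 22 bytes when the comment is empty
        eocd = fh.read(22)
        if eocd[:4] != EOCD_SIG:
            raise ValueError("EOCD not found; trailing comment or corrupt file?")
        # CD size is at bytes 12-15 of the EOCD, CD offset at bytes 16-19
        cd_size, cd_offset = struct.unpack("<II", eocd[12:20])
        fh.seek(cd_offset, os.SEEK_SET)
        return hashlib.sha256(fh.read(cd_size)).hexdigest()
```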

Some quick Google/StackOverflow results with some example code:

Though this might be a little 'low-level', and may just be worth seeing if it's possible to use an existing python zip lib such as:

@0xdevalias

0xdevalias commented Oct 24, 2022

So my brain got curious, and decided to dive into writing a little PoC script for this:

This script will efficiently read the *.ckpt zip file by seeking to the 'end of central directory' (EOCD) record, reading the 'central directory' (CD) offset + length from it, then seeking to the CD offset, reading the CD record in, and then calculating the SHA256 of the CD. It also does some basic error/sanity checking along the way to ensure the file doesn't seem to be corrupted.

Running it looks like this:

⇒  ./quick-zip-sha256hash.py
>> *.ckpt file looks good!
>> Calculating sha256 hash of the Zip Central Directory from the *.ckpt weights file
>> sha256(ckpt_cd) = 685bf114177d8ed310eead5838d4ca5aa6e396a64ab978ca91a0dbfcb6247f02 (0.00s)

The code also writes the hash out to model.sha256-cd:

⇒  cat model.sha256-cd
685bf114177d8ed310eead5838d4ca5aa6e396a64ab978ca91a0dbfcb6247f02%

Note that model.ckpt here is sd-v1-4.ckpt, so all going to plan, running my script on your copy of SD 1.4 should give the same hash.

@jn-jairo
Collaborator Author

@0xdevalias Sorry, your code didn't work for other models; I got this error for the model wd-v1-2-full-ema.ckpt:
Didn't find the *.ckpt Zip file Central Directory (CD) signature where we expected to. Is the *.ckpt corrupted?

@0xdevalias

I don't have that particular *.ckpt file, so I can't really check locally, but the PoC code is fairly straightforward, so it shouldn't be too hard to follow along, add some debugging/print statements/etc, and explore what's leading to that error being hit.

You can see the code that raises that error here. Essentially it seeks to the file offset that the EOCD told it should be the start of the CD record, then attempts to read in the length of the record's bytes. It then checks the first 4 bytes to see if they match the 'magic number' that defines the EOCD record start, and if not, throws that error:

  # Seek to where we expect the Central Directory (CD) record to start and read it in
  fh.seek(ckpt_eocd_cd_offset, os.SEEK_SET)
  ckpt_cd = fh.read(ckpt_eocd_cd_size_bytes)

  # https://en.wikipedia.org/wiki/ZIP_(file_format)#Central_directory_file_header
  ckpt_cd_sig = ckpt_cd[0:4]

  if ckpt_cd_sig != cd_sig:
    raise Exception("Didn't find the *.ckpt Zip file Central Directory (CD) signature where we expected to. Is the *.ckpt corrupted?")
Original message I wrote before I realised I read the error you pasted wrong; in case it's helpful still

I don't have that particular *.ckpt file, so I can't really check locally, but if you look at the note in the PoC code at that section, it suggests that if the *.ckpt has comments in it, then this current naive approach won't work, and you'll need to seek further back in chunks and then scan for the 'magic string' signature to find the start of the EOCD.

https://github.com/0xdevalias/poc-quick-zip-sha256-hash/blob/main/quick-zip-sha256hash.py#L35-L37

if ckpt_eocd_sig != eocd_sig:
    raise Exception("Didn't find the *.ckpt Zip file End of Central Directory (EOCD) signature where we expected to. Is the *.ckpt corrupted, or does the Zip file have comments in it?")
    # NOTE: If the Zip file has comments, then you'd need to seek further back in chunks, and search for the EOCD signature

Shouldn't be too hard to implement, as all the bits and pieces are already there. But I'll leave that as an exploration/exercise for the reader to implement :) (aka: feel free to iterate on my proof of concept to make it more robust/cover edge cases/etc)
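For what it's worth, the backward scan described in that note could look something like this sketch (the function name is mine; it reads the largest possible tail, since a zip comment is capped at 65535 bytes, and searches it for the EOCD signature):

```python
import os

EOCD_SIG = b"\x50\x4b\x05\x06"   # 'end of central directory' signature
MAX_TAIL = 65535 + 22            # max zip comment length plus the fixed EOCD part

def find_eocd_offset(path):
    """Return the absolute file offset of the EOCD record, handling zip
    files that carry a trailing comment by scanning the tail backwards."""
    size = os.path.getsize(path)
    tail_len = min(size, MAX_TAIL)
    with open(path, "rb") as fh:
        fh.seek(size - tail_len)
        tail = fh.read(tail_len)
    idx = tail.rfind(EOCD_SIG)   # last occurrence = the real EOCD
    if idx == -1:
        raise ValueError("EOCD signature not found; not a zip file?")
    return size - tail_len + idx
```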

@jn-jairo
Collaborator Author

Updated the code:

  • added option to select the hash version (default: 1)
  • added option to show the old hash along with the new one (default: False)
  • added backward compatibility with v1, so uploading an info or x/y plot with a v1 hash will select the correct model, no matter which hash version is configured

image

@RupertAvery

RupertAvery commented Nov 9, 2022

My approach is to sum the CRC32's of the files inside the /archive folder and subdirectories (/archive/data).

See #4478

To be clear, it's more of a quick and dirty content-hash rather than a true hash, but for my intended purpose, it's fast and generates unique values.

The reason being, I would like for there to be support for an embedded diffusion-specific metadata file, containing info about the model, most especially the trigger words and descriptions.

By limiting the hash function to model-specific files, we can freely inject and modify metadata without worrying about breaking the hash.

The hash function can be a sum of CRC32s, or a SHA256 of the concatenation, whichever is more unique. From my limited tests, a CRC32 sum of all model files is good enough to distinguish most checkpoints, even ones with similar base trainings, and is as fast as the current method.

The SD community sorely needs to embed metadata into ckpts, with the explosion of different models and their trained triggers.

Currently if you download a dreambooth model, there is no way to know the triggers unless you find the original download site or the post in reddit that the author wrote.

Having an embedded metadata that documents the triggers and other info will be extremely useful to the community.

Also suggest keeping the existing hash, but adding a hash-v2 in the UI and PNGInfo to avoid breaking existing hashes. Eventually(?) we can phase out the old hash.

Here's my implementation of the v2 hash:

def model_hash_v2(filename):
    try:
        import zipfile
        z = zipfile.ZipFile(filename, "r")
        crc_sum = 0
        for info in z.infolist():
            if info.filename.startswith("archive"):
                # mask each step so the running sum stays within 32 bits
                # and the formatted hash stays 8 hex digits
                crc_sum = (crc_sum + info.CRC) & 0xFFFFFFFF
        return '{:08x}'.format(crc_sum)

    except FileNotFoundError:
        return 'NOFILE'

@Jonseed

Jonseed commented Nov 9, 2022

I can confirm that @RupertAvery's way of hashing produces unique hashes, and it is very fast. I also like the idea of adding metadata to the ckpt files, as they are proliferating in number and becoming increasingly difficult to organize.

Can metadata be added to the textual inversion embeddings, hypernetwork embeddings, and aesthetic gradients too? Or would this have to be a separate file? Should these also have a hash to uniquely identify them?

@0xdevalias

0xdevalias commented Nov 9, 2022

My approach is to sum the CRC32's of the files inside the /archive folder and subdirectories (/archive/data).

See #4478

To be clear, it's more of a quick and dirty content-hash rather than a true hash, but for my intended purpose, it's fast and generates unique values.

I can confirm that @RupertAvery's way of hashing produces unique hashes, and it is very fast.

@RupertAvery @Jonseed That's basically the same solution proposed earlier in this PR:

@raefu had a nice and consistently fast solution to this: hashing the zip directory section at the end of the file so it's a hash of the attributes and crcs of all the contents.

Originally posted by @dfaker in #2459 (comment)

And PoC implemented by me above:

So my brain got curious, and decided to dive into writing a little PoC script for this:

This script will efficiently read the *.ckpt zip file by seeking to the 'end of central directory' (EOCD) record, reading the 'central directory' (CD) offset + length from it, then seeking to the CD offset, reading the CD record in, and then calculating the SHA256 of the CD. It also does some basic error/sanity checking along the way to ensure the file doesn't seem to be corrupted.

..snip..

Originally posted by @0xdevalias in #2459 (comment)


Also suggest keeping the existing hash, but adding a hash-v2 in the UI and PNGInfo to avoid breaking existing hashes. Eventually(?) we can phase out the old hash.

@RupertAvery See prior discussions above in this PR about exactly that:

Some options:

Add v2 to the hash like: 87d1ac53-v2
Use more characters instead of 8: 87d1ac53ab
Use another param in info text: Model hash: 87d1ac53, Model hash version: v2

I like the hash-v2 and/or Model hash: 87d1ac53, Model hash version: v2 options out of the 3 suggestions here. I feel like the 'use more characters' option is too obscure/'magic' feeling. I tend to personally prefer explicitness.

the biggest problem is what we do with all current model hashes

Just riffing off the top of my head here and I haven't fully thought this through, but if the original hashing method is kept alongside this new method (particularly if the new method designates itself as a v2/etc in the hash in some identifiable way), then presumably it would be possible to look up a hash either the 'old' (current) v1 way or the new v2 way.

There will still be the edge cases where the current v1 hash clashes for distinct models obviously, but in those cases perhaps it could just show a list of the models that match, and potentially offer to upgrade the embedded hash to the v2 hash. (I'm not actually familiar with the workflow around how the hashes are used, so if the above doesn't match the reality of how they're used, adjust/ignore as appropriate)

eg.

OldV1Hash   NewV2Hash     Lookup (v1->v2)               Reverse Lookup (v2->v1)
AAAAAAAA    A2345678-v2   A2345678-v2                   AAAAAAAA
BBBBBBBB    B2345678-v2   B2345678-v2                   BBBBBBBB
CCCCCCCC    C2345678-v2   C2345678-v2 or D2345678-v2    CCCCCCCC
CCCCCCCC    D2345678-v2   C2345678-v2 or D2345678-v2    CCCCCCCC

Originally posted by @0xdevalias in #2459 (comment)

Updated the code:

  • added option to select the hash version (default: 1)
  • added option to show the old hash along with the new one (default: False)
  • added backward compatibility with v1, so uploading an info or x/y plot with a v1 hash will select the correct model, no matter which hash version is configured

..snip..

Originally posted by @jn-jairo in #2459 (comment)


The reason being, I would like for there to be support for an embedded diffusion-specific metadata file, containing info about the model, most especially the trigger words and descriptions.

Having an embedded metadata that documents the triggers and other info will be extremely useful to the community.

👏🏻👌🏻 Agreed.

@jn-jairo
Collaborator Author

I made a new PR #4546 with the code of this PR and the code suggested by @RupertAvery in #2459 (comment)

@RupertAvery just to be clear, I did this PR before your feature request #4478, the proposal of this PR is to solve the ckpt with the same hash while keeping a backward compatibility, which this PR does, if you wish to add a metadata file inside the ckpt file feel free to open a PR with that code later.

@RupertAvery

RupertAvery commented Nov 12, 2022

So I just tried loading a checkpoint with a diffusion.json file in the root of the zip, and it looks like torch itself refuses to load the model weights if there is anything outside of the archive folder; moving the file into that folder allowed the model to load.

It's an easy fix, not one I'm entirely happy with.

            if info.filename != "archive/diffusion.json":
                sum = (sum + info.CRC) & 0xFFFFFFFF

@JustMaier
Contributor

This problem still exists, right?
Do we at least have an idea of what the new standard will be? I need to hash all of the models on Civitai (391 models) and I'd rather not have to spend the bandwidth to do it more than once :P

@RupertAvery

With safetensors, the proposed method of hashing the CRCs no longer works, because safetensors aren't zip files. This leads to the question: how can we hash safetensors properly?

@JustMaier I've also thought about indexing the civitai models, and whether it's possible to read just the part of the file necessary to compute the hashes, or maybe ask civitai to expose the hashes in an API.
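One possible answer for safetensors, sketched under the assumption that the format is an 8-byte little-endian header length followed by the JSON header and then the raw tensor data: hash only the tensor payload, so later edits to the metadata header don't invalidate the hash. (The function name is mine; this isn't an agreed standard.)

```python
import hashlib
import struct

def safetensors_payload_sha256(path, chunk_size=1 << 20):
    """SHA-256 of only the tensor data in a .safetensors file: skip the
    8-byte little-endian header-length prefix and the JSON header it
    describes, then hash the rest in chunks to keep memory use low."""
    with open(path, "rb") as fh:
        (header_len,) = struct.unpack("<Q", fh.read(8))
        fh.seek(8 + header_len)
        digest = hashlib.sha256()
        while True:
            block = fh.read(chunk_size)
            if not block:
                break
            digest.update(block)
    return digest.hexdigest()
```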

@JustMaier
Contributor

So I'm the guy behind Civitai. We don't have the hashes right now because I wasn't sure what was going to be done about this issue, and I didn't want to pull everything down for hashing more than once.

Since CRC hashing isn't an option for safetensors, I think the standard should be SHA256. If I understand correctly, the concern with that is that it will take longer to compute, right?

@RupertAvery

RupertAvery commented Dec 15, 2022

Hi! The problem with computing hashes is that Automatic1111 does it in real time, which is probably why such a shortcut method was used.

If we're going to add a new format anyway, i.e. safetensors, then I advocate creating a standard container for checkpoints, safetensor OR ckpt, with metadata to say which it is, what the hash is, and have a whole lot of space for author information.

Just a 2MB header with space for plaintext json metadata and a precomputed SHA256 would be good enough, though 2MB is probably overkill. Having that empty padded space would allow authors to freely edit their metadata without having to repack the actual weights. Also, the sha256 is just there for anyone who wants to actually check it.

We just need tools to enable authors to move to the new format, and support from the WebUIs to read it. Like they say, build it and they will come.

It's an additional burden on finetuners, but hopefully dreambooth UIs can integrate this into their process.

It isn't even a new format, it's just an additional header, with the actual data offset. I don't know how easy it is for pytorch to load a file from an offset instead of directly though. If it is possible, we don't even have to break anything, it's just an alternate way of loading checkpoints into memory.

Does anybody know if this is feasible?

also, what's a good extension for this container format?

.diffusion?
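Purely to illustrate the header idea described above (every constant and name here is hypothetical, not an agreed format), a roundtrip of such a container could look like:

```python
import json
import struct

MAGIC = b"DIFF"                 # hypothetical magic bytes
HEADER_SIZE = 2 * 1024 * 1024   # the padded 2MB metadata region floated above

def write_container(path, metadata, weights):
    """Write: magic, metadata length, zero-padded JSON, then the raw payload."""
    meta = json.dumps(metadata).encode("utf-8")
    if len(meta) > HEADER_SIZE - 8:
        raise ValueError("metadata too large for the fixed header")
    with open(path, "wb") as fh:
        fh.write(MAGIC)
        fh.write(struct.pack("<I", len(meta)))
        fh.write(meta.ljust(HEADER_SIZE - 8, b"\x00"))  # pad to a fixed payload offset
        fh.write(weights)

def read_container(path):
    """Read back the metadata, then the untouched payload at its fixed offset."""
    with open(path, "rb") as fh:
        if fh.read(4) != MAGIC:
            raise ValueError("not a container file")
        (meta_len,) = struct.unpack("<I", fh.read(4))
        metadata = json.loads(fh.read(meta_len).decode("utf-8"))
        fh.seek(HEADER_SIZE)    # payload always starts here; metadata edits
        return metadata, fh.read()  # never require repacking the weights
```

Because the payload always starts at the same fixed offset, rewriting the metadata never touches the weights, which is exactly the "freely edit without repacking" property described above.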

@JustMaier
Contributor

We just need tools to enable authors to move to the new format, and support from the WebUIs to read it. Like they say, build it and they will come.

I like the idea of this; there is plenty of metadata that could and should be included in models (think merge tracking etc). The challenge would be propagating the standard. I think this could be helped with easy-to-use python packages that make implementing the standard easier, plus additional tools that end users can use to add metadata to existing AI art resources (checkpoints, embeds, hypernetworks, etc). Additionally, I'd like to think that if we implemented it as part of Civitai, we could automatically apply this metadata to the 1,284 resources we're currently housing and help start the trend.

One bonus of coming up with a metadata standard is that it can continue to be used as new checkpoint formats or other AI art resources are released. For example, LORAs just hit the scene; wouldn't it be great if they were able to include the same metadata format?

@RupertAvery

Yeah, and we could extend this concept to embeddings and hypernetworks. A different extension perhaps, but having a header there.

I'm the author of Diffusion Toolkit,

https://github.com/RupertAvery/DiffusionToolkit

and I see that as a way to make it accessible to windows users. I plan to put more checkpoint related tooling into it anyway.

@JustMaier
Contributor

I wonder if there is a way to include the metadata without making the files not work in tools that don’t support the metadata so that adoption doesn’t have to be blocked by “waiting for my favorite generator to support the format”

great tool btw. I need to dig into it to see how you’ve pulled out the metadata from each of those tools. I need the same thing on Civitai. Right now it only supports AUTO metadata. Got a file I should look at?

also, if you're serious about this, we should start a proposal somewhere to see if we can gather any input and get a few tool maintainers on board.

@RupertAvery

RupertAvery commented Dec 19, 2022

I wonder if there is a way to include the metadata without making the files not work in tools that don’t support the metadata so that adoption doesn’t have to be blocked by “waiting for my favorite generator to support the format”

That was my original goal with ckpts: since they are zip files, they can contain anything without breaking the loading; you just have to tweak some scripts to allow the file we're going to add. The possibility of that went away with safetensors.

Another way of course is just to have a .json file next to the ckpt. Instant "metadata".

Even then, we still have to somehow add support to GUIs to read the metadata and display it somewhere useful. It would help if we could get someone already familiar with gradio and a1111's gui code on board. Like maybe an extension author. I'm willing to dive in, but I'm a little busy with other things right now.

Right now it only supports AUTO metadata. Got a file I should look at?

Everything is in Metadata.cs. If you look at the closed issues, there will be images there with test data.

also, if you're serious about this, we should start a proposal somewhere to see if we can gather any input and get a few tool maintainers on board.

We'll just have to try to push this forward as much as possible by making a branch and promoting it (like telling anyone willing to try the fork). Unfortunately, not everyone will be git-savvy or willing to wait for us to merge the constant influx of commits from the main repo.

By including metadata authoring in Diffusion Toolkit, I hope to generate some hype for it. I could start with storing it in a side-along json file, or in the database. It's only useful to Diffusion Toolkit, but at least users can manage their models a bit.

I actually thought of loading sample images and other information from civitai when viewing models in Toolkit, and was hoping to reach out to you for that, but I don't know if that's okay (probably isn't).

You have a great site that really contributes to making models searchable and accessible, and to documenting their information (triggers) where possible.

I have started something by building a file wrapper that SHOULD make it so that, when it gets read by the consumer, it offsets everything. It kind of works; right now I'm testing it with an offset of zero, so I don't have to actually offset everything, just as a proof of concept. It works for the zip file loader part (I'm testing it on a ckpt renamed to .diffusion), but as soon as it gets into torch.load, I get this:

  File "D:\conda\AUTOMATIC1111\stable-diffusion-webui\venv\lib\site-packages\torch\serialization.py", line 920, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: A load persistent id instruction was encountered,
but no persistent_load function was specified.

This is the wrapper diffusion_format.py:

import io

class DiffusionFile:

    def __init__(self, file: str) -> None:
        print(f"Opening {file}")
        self.fp = open(file, 'rb')
        self.offset = 0
        self.fp.seek(self.offset)
        self.safetensors = False

    def __enter__(self):
        print("__enter__...")
        return self


    def __exit__(self, exc_type, exc_value, traceback):
        print("__exit__...")
        return

    def readinto(self, buffer):
        print("ReadInto...")
        return self.fp.readinto(buffer)

    def fileno(self):
        print("FileNo...")
        return self.fp.fileno()

    def seekable(self):
        print("Seekable...")
        return True

    def peek(self, size: int):
        print("Peeking...")
        return self.fp.peek(size)

    def isatty(self):
        print("IsATty...")
        return self.fp.isatty()


    def detach(self):
        print("Detaching...")
        return self.fp.detach()

    def flush(self):
        self.fp.flush()
        return

    def close(self):
        print("Closing...")
        self.fp.close()
        return

    def read(self, size: int = None):
        print("Reading...")
        return self.fp.read(size)

    def read1(self, size: int = -1):
        print("Read1...")
        return self.fp.read1(size)

    def readinto1(self, buffer):
        print("ReadInto1...")
        return self.fp.readinto1(buffer)

    def readline(self, size: int = None): 
        print("Reading Line...")
        return self.fp.readline(size)

    def tell(self):
        print("telling...")
        return self.fp.tell() - self.offset

    def seek(self, offset: int, whence: int = 0):
        print("Seeking...")
        if whence == 0:
            offset += self.offset  # shift absolute seeks past the header
        return self.fp.seek(offset, whence) - self.offset

    def writable(self):
        print("writable...")
        return False


    def readable(self):
        print("readable...")
        return True

    def getclosed(self):
        print('getclosed')
        return self.fp.closed

    def getmode(self):
        print("getmode...")
        return self.fp.mode

    def getname(self):
        print("getname...")
        return self.fp.name

    name=property(getname)

    mode=property(getmode)

    closed=property(getclosed)

def load_file(filename):
    print("Loading diffusion file...")
    return DiffusionFile(filename) 

And this is where I inject it in sd_models.py

from modules import diffusion_format

...

def read_state_dict(checkpoint_file, print_global_state=False, map_location=None):
    _, extension = os.path.splitext(checkpoint_file)

    has_safetensors = False

    if extension.lower() == ".diffusion":
        checkpoint_file = diffusion_format.load_file(checkpoint_file)
        has_safetensors = checkpoint_file.safetensors
    else:
        has_safetensors = extension.lower() == ".safetensors"

    if has_safetensors:
        pl_sd = safetensors.torch.load_file(checkpoint_file, device=map_location or shared.weight_load_location)
    else:
        pl_sd = torch.load(checkpoint_file, map_location=map_location or shared.weight_load_location)

This is going way off topic of course.

We should definitely start some issue or something somewhere else, focused on making a container format and defining its spec. But where so that we can still get good visibility from other devs?

@RupertAvery

See this repository for more information and as a place for discussion on the proposed container format:

https://github.com/RupertAvery/DiffusionFormat

@aka7774

aka7774 commented Jan 11, 2023

With hash_file = os.path.splitext(filename)[0] + '.sha256', two files whose hashes are different:

  • a.ckpt
  • a.safetensors

would collide on the same a.sha256 file. Using filename + '.sha256' instead keeps them distinct:

  • a.ckpt.sha256
  • a.safetensors.sha256

@jn-jairo
Collaborator Author

With hash_file = os.path.splitext(filename)[0] + '.sha256', two files whose hashes are different:

  • a.ckpt
  • a.safetensors

would collide on the same a.sha256 file. Using filename + '.sha256' instead keeps them distinct:

  • a.ckpt.sha256
  • a.safetensors.sha256

All of this PR was made before safetensors existed in this project. This PR isn't going anywhere because auto didn't show interest in a new hash method, so I won't bother updating it until there is some chance of a new hash method being accepted in this repo.

@aka7774

aka7774 commented Jan 11, 2023

This extension is working well
https://github.com/aka7774/sd_infotext_ex

@jn-jairo
Collaborator Author

Issue solved by a95f135

@jn-jairo jn-jairo closed this Jan 14, 2023
Atry pushed a commit to Atry/stable-diffusion-webui that referenced this pull request Jul 11, 2024
* Revert "Revert "Add Densepose (TorchScript)""

* 🐛 Fix unload