CHD improved compression algorithm #7402
It would also cause compatibility issues if people start distributing CHDs using a new compression algorithm that isn't supported in the applications that are out there now, and it would add another third-party library as a hard dependency. It's not such a simple decision to add.
This appears to be the same as issue #7386.
There was a plan to use https://github.com/aaru-dps/libaaruformat as the basis for a major CHD feature update around now, and introducing additional compression codecs at that time would've been natural, but Claunia's been too busy to work on that library.
It would be great if chd could be a streamable archival format, so you
can compress/decompress cue/bin (or any other files) but still have the
benefit of multiple compression algorithms available.
It might get traction outside of emulation.
That was the intent - have CHD wrap "original" formats like cue/bin and iso and use libaaruformat to decode the formats, with the CHD layer providing transparent (de)compression. It would greatly boost the number of formats supported, make it so problems with our interpretation of the format didn't require remaking the CHD, and put a specialist in such matters (Claunia) in charge of the format decoding.
Hi all. In the design of the aaruformat library I found three problems that would be fixed in the aaruformat v2 design:
The design for V2 was completed in March, when a global pandemic brought everything to a halt. Because V1 was so intrinsically linked to Aaru (basically being a part of it, not independent), I need to start the move to Aaru 6.0 so I can start implementing V2 in the library and have test images to ensure it works properly. I'm doing it as fast as I can with my little spare time, as I have not been able to find grants for my work on Aaru that would let me dedicate to it 100%.
Apologies if this is not the right place to discuss this, but what is the status of CHDv6? I think it's currently awaiting claunia's Aaru v2 but is there any work being done regardless? Where can I read about progress, road map, discussions, etc?
To finally wrap this up, I want to say that I do like claunia's ambitions with Aaru, and I think the aif format is a great concept (with even more emphasis on preservation, e.g. storing the PSX disc wobble), but sadly it hasn't gained any traction. I can't wait to see what a collaboration would look like with the excellent decoding capabilities of Aaru, the familiar on-the-rise container CHD, and the huge MAME project backing it up. Cheers. P.S. I'm not demanding anything here, I'm just writing these thoughts down so I can get them out of my head.
@Anuskuss Dang, you nailed that write-up. I hope all of that is taken into consideration for V6. 🤞 Personally, I think there's a huge missed compression opportunity in not finding a way to compress 2 - 4 disc games into a single CHD (compressing two discs into one archive basically makes it the size of a single disc, since so much is reused between discs). Not to mention the benefit for emulators being able to choose a disc without a user making a manual m3u file to link them.
The development of AaruFormat V2 is open and you can follow it here, where you can also peep at the official formal specification. It is going slower than I intended because it's me alone doing it and things in life took a 180-degree turn in 2022. I have not really made a roadmap because the specification itself is what we intend to implement, so it works kind of like a roadmap. The only thing V2 will have that is not yet written down is support for Data Position Measurements.

@lonkelle we have that planned for AaruFormat V3, as we need to ensure V2 is working fine before adding such a complex feature.

As for the complaints that emulators are not using AaruFormat: that has never been my target with the format. My target is preservation, and our userbase is quite happy with the format even if no emulator supports it. I would love for emulators to support it, but that does not depend on me (nor does whether AFV2 becomes CHDv6), and I cannot focus my energy on convincing people to. I just focus on making the format able to preserve any media.

Hope this solves your doubts. If you want to have specific discussions or ask questions about AFV1 or AFV2, please feel free to drop by the repository linked above.
That doesn't sound right. See mame/src/lib/util/chdcodec.cpp, lines 1151 to 1152 at a504bde.
If, give or take, you use 64MB you shouldn't get near those numbers (unless you are trying to run more than four parallel streams at once or something).
Overhead… as in CPU time required for compression?
*lzma2. Speaking of which, I'd like to bring attention to a recent discovery I made.
I suppose the xbox and wii guys would also like to have a word about that. p.s. FWIW brotli is also very competitive with zstd
#11827 adds support for Zstandard compression in CHD files, as well as zip archives. By default, chdman will not enable Zstandard compression, so CHD files will be compatible with existing software. You can enable Zstandard compression when creating or copying a CHD with the … option. For CD-ROM media, a good setting to try is …
I'm curious, what happens if someone creates a V5 CHD that uses zstd and attempts to use it with another program that supports CHD? Will there be an easy-to-parse error? I'm also a bit curious as to why this didn't end up bumping the CHD version.
The error handling will depend on what the other program does, we don't control that. CHDMAN doesn't use Zstandard by default for that reason. The people who wanted it can have it, but we're not forcing people to use it and would probably recommend that maintainers of torrent sets or whatever not rush into it. |
If the program is using MAME's … Previous versions of MAME itself will report the image as "not found" when auditing media, as MAME isn't particularly detailed/friendly when it comes to dealing with invalid, unsupported or corrupt CHD files. I assume other CHD implementations also return an error on encountering an unsupported codec FourCC when opening a CHD file.
You should always be checking error codes, and there's already an error code that covers this situation.
CHD V5 already includes support for adding codecs (in much the same way that the zip file format doesn't need to change when a compression method is added). A V5 CHD file can specify up to four codecs. We don't frequently define new codecs. As @rb6502 already pointed out, chdman will not enable Zstandard by default when creating CHD files, so compatibility won't be broken unless you choose to use it.
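To illustrate how fixed codec slots make a container extensible without a version bump, here is a hedged Python sketch of packing up to four FourCC codec tags into fixed-width fields. This only models the idea described above (up to four codecs per V5 CHD); it is not the actual CHD header layout, and the field order, byte layout, and function names are assumptions made up for this example.

```python
import struct

def pack_codec_slots(codecs):
    """Pack up to four FourCC codec tags into fixed 4-byte slots.

    Illustrative only: models the idea of a V5 CHD specifying up to
    four codecs; this is NOT the real CHD header layout. Unused slots
    are zero-filled so a reader can tell how many codecs are in use.
    """
    if len(codecs) > 4:
        raise ValueError("at most four codecs")
    slots = [c.encode("ascii") for c in codecs]
    slots += [b"\0\0\0\0"] * (4 - len(slots))
    for s in slots:
        if len(s) != 4:
            raise ValueError("codec tags must be exactly four characters")
    return struct.pack(">4s4s4s4s", *slots)

def unpack_codec_slots(blob):
    """Inverse of pack_codec_slots: drop empty slots, decode the rest."""
    slots = struct.unpack(">4s4s4s4s", blob)
    return [s.decode("ascii") for s in slots if s != b"\0\0\0\0"]
```

A reader that encounters a tag it does not recognise can then fail cleanly with an "unsupported codec" error rather than misreading the file, which matches the behaviour discussed above.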
@cuavas Great work! Does this use zstd compression level 19? |
Well, you could read the source and see… It uses whatever … Since compression isn't done on-the-fly and decompression speed is largely insensitive to the compression level, it favours higher compression.
I'm not salty that none of my suggestions were acknowledged, but has at least my last point been taken care of? I remember that I used to zero out the error correction, which resulted in better compression. Reed–Solomon EDC/ECC could easily be regenerated when extracting, so there's no point in storing that information.
If you want to change the world, you have to submit pull requests. WindyFairy just added multi-session disc support to CHD, for instance. I'm not super enthusiastic about throwing away data though, on the off chance that it's important for protected discs or something along those lines.
C is sadly above my weight class, but even if it weren't, I don't think it's that easy to implement. I mean, the Reed–Solomon can be copy-pasted, but then you'd have to handle the case where it's deliberately wrong (like you said, maybe for piracy detection or something) and store that somewhere. The end goal would be to get rid of everything that's not user data and only store what differs from the default case (e.g. submode). Then you could have a MODE2/2352 track stored with a 2048 block size, giving you max compression.
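A minimal sketch of the "store only user data" idea above, assuming Mode 2 Form 1 sectors with standard mastering. The offsets follow the usual raw-sector layout (12-byte sync, 3-byte address, 1 mode byte, 8-byte subheader before the 2048 user-data bytes); the function name is made up for illustration, and a real implementation would have to detect and preserve any non-standard sync/header/ECC bytes rather than assume they can be regenerated.

```python
SECTOR = 2352      # raw Mode 2 sector size
USER_OFFSET = 24   # 12 sync + 3 address + 1 mode + 8 subheader
USER_SIZE = 2048   # Mode 2 Form 1 user data

def strip_to_user_data(raw):
    """Keep only the 2048 user-data bytes of each 2352-byte sector.

    Illustrative sketch: assumes standard mastering, so the stripped
    sync/header/subheader/EDC/ECC could be regenerated on extraction.
    """
    assert len(raw) % SECTOR == 0, "input must be whole raw sectors"
    out = bytearray()
    for off in range(0, len(raw), SECTOR):
        out += raw[off + USER_OFFSET : off + USER_OFFSET + USER_SIZE]
    return bytes(out)
```

The payoff is that the compressor then only ever sees user data, exactly the "2048 block size" case described above.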
Errm, but aren't ECC/EDC already zeroed before compression and regenerated during decompression?
Nah.

```
$ du -m *.chd
285	noecc.chd
298	normal.chd
```

```python
with open('/tmp/normal.bin', 'rb') as o, open('/tmp/noecc.bin', 'wb') as n:
    while b := o.read(2352):
        n.write(b[:-280])
        n.write(b'\0' * 280)
```
Yeah, see mame/src/lib/util/chdcodec.cpp, line 370 at 6483414: the code clears the header and ECC if they are "standard".
Anuskuss and p1pkin, you guys are right, it only works partially so it's incomplete. nocash documented it: http://problemkaputt.de/psxspx-cdrom-disk-images-chd-mame.htm "The ECC-Filter works only for 930h-byte sectors (920h does also contain ECC, but CHD can't filter that, resulting in very bad compression ratio)".
@robzorua not sure if that's the problem; an actual raw CD sector is always 2352 (930h) bytes long (plus subcodes), there is no such thing as a 2336 (920h) byte sector.
They’re referring to CD-ROM XA Mode 2, Form 2. It uses the following pattern: 12 bytes sync, 3 bytes address, 1 byte mode, 8 bytes subheader, 2324 bytes user data, and 4 bytes EDC.
The total size is 12+3+1+8+2324+4 = 2352 bytes per sector. If you assume standard mastering and no data errors you can reconstruct the sync pattern and CRC. That leaves 3+1+8+2324 = 2336 meaningful bytes. Most mastering software supports supplying data files with 2336 bytes per sector for CD-ROM XA Mode 2 tracks.

However “920h does also contain ECC” is just plain incorrect. The whole point of Mode 2, Form 2 is that it omits the extra in-band ECC data to allow 276 more data bytes per sector. You trade redundancy and error tolerance for space and speed.

But this is getting way off-topic. We aren’t talking about changing the way data is stored inside CHD files here. The issue was just requesting support for Zstandard compression in CHD files, which has been implemented.
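The byte accounting above can be double-checked in a few lines; the field names follow the usual CD-ROM XA Mode 2, Form 2 description.

```python
# CD-ROM XA Mode 2, Form 2 sector layout (bytes per field)
FORM2 = {
    "sync": 12,        # fixed pattern, reconstructible under standard mastering
    "address": 3,      # minute/second/frame
    "mode": 1,         # always 2 for XA tracks
    "subheader": 8,    # file/channel/submode/coding, repeated twice
    "user_data": 2324,
    "edc": 4,          # optional CRC over the sector
}

total = sum(FORM2.values())
assert total == 2352  # full raw sector, 930h bytes

# Dropping the reconstructible sync and the EDC leaves the 2336-byte
# payload that most mastering software accepts for XA Mode 2 tracks.
meaningful = total - FORM2["sync"] - FORM2["edc"]
assert meaningful == 2336
```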
CHD is a streamable compression format for backup images developed by the mame project.
It's getting traction way beyond mame, and is quickly becoming the de facto compressed format for the entire emulation community.
CHD cuts the image into smaller blocks, compressed individually.
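The block-based scheme just described (cut the image into hunks, compress each one independently) can be sketched in a few lines of Python. This is an illustrative model only, not CHD's actual on-disk layout: the hunk size and function names are made up for this example, and stdlib `zlib` stands in for CHD's codecs.

```python
import zlib

HUNK_SIZE = 4096  # hypothetical block size; real CHDs use a configurable hunk size

def compress_hunks(data, level=9):
    """Split data into fixed-size hunks and compress each independently,
    so any hunk can later be decompressed without touching the others."""
    return [zlib.compress(data[off:off + HUNK_SIZE], level)
            for off in range(0, len(data), HUNK_SIZE)]

def read_hunk(hunks, index):
    """Random access: decompress only the requested hunk."""
    return zlib.decompress(hunks[index])

# 16 KiB of sample "disc" data, i.e. four hunks
image = bytes(range(256)) * 64
hunks = compress_hunks(image)
assert read_hunk(hunks, 1) == image[HUNK_SIZE:2 * HUNK_SIZE]
```

Per-hunk independence is what makes decompression speed matter so much here: an emulator seeking around a disc image pays the decompression cost on every hunk it touches.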
There are, as far as I know, 3 compression algorithms which are accessible: lzma (`cdlz`), zlib (`cdzl`), and flac (`cdfl`). flac is specific to sound data, while lzma and zlib are general-purpose lossless codecs. lzma compresses better, but is also slower to decompress.

CHD utilities like `chdman` seem to highly value space, and therefore tend to lean towards lzma. That's fine, if one disregards decompression speed. But on many devices, notably ARM smartphones and tablets, decompression speed is a real hog. This basically leaves zlib as the only alternative.

More recently, a new compression algorithm, zstd, has started to become mainstream. Granted, it is especially successful in cloud center and server space, but not only: it's even documented as part of the updated zip and 7zip format specifications, has a public IETF RFC, and can even be used to compress web traffic.

What makes zstd interesting with regards to CHD? To begin with, it has excellent decompression speed: roughly 16x faster than lzma, and 3-4x faster than zlib. One would expect to pay for this speed in compression ratio, but that's effectively not the case: at its highest settings, zstd compresses within a few % of lzma, with small to negligible differences.

At the very least, compared to zlib, it's an all-win, and a substantial one. For these properties, zstd is already used for transparent compression in file systems such as squashfs, which serve a similar use case as CHD.

High compression ratio, blazing fast decompression speed: could that be an interesting evolution of the CHD format?

From a format perspective, it's probably not a huge deal: just an additional tag for a new format. I presume the current format already uses tags to distinguish between none, zlib, lzma and flac.

From an ecosystem perspective though, it can be trickier: it will require support from the decoder first, before any encoder can make use of the new feature.