Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Sega CD/Saturn disc hashing #4024

Merged
merged 3 commits into from
Sep 14, 2024

Conversation

kalimag
Copy link
Contributor

@kalimag kalimag commented Sep 8, 2024

Fix DiscHasher.OldHash() method used to hash non-PSX discs

  • Don't hash generated (empty) lead-in/lead-out tracks
  • Limit to track length, not absolute LBA of next track

Previously, Saturn and Sega CD discs are all hashed to the same value D41D8CD98F00B204E9800998ECF8427E (empty sequence). This happens because the method is trying to hash the non-existent lead-in track and track.NextTrack.LBA is 0 (the LBA of the "real" first track).

Hashing PC Engine discs already mostly worked because the first track is usually an audio track and the lead-in track copies the mode of the first track and is skipped. A handful of discs where the first data track is <512 sectors may generate a different hash now, because the old implementation was reading past the end of the track. The only examples of that I could find were some unlicensed games that weren't in the gamedb.

Jaguar CD hashing still doesn't work because all tracks are marked as audio on those discs.

Check if completed:

- Don't hash generated (empty) lead-in/lead-out tracks
- Limit to track length, not absolute LBA of next track
@YoshiRulz
Copy link
Member

Does this resolve #1075 then? That would be huge.

@kalimag
Copy link
Contributor Author

kalimag commented Sep 8, 2024

Kind of? I tested a fair number of Saturn, Sega CD and PC Engine discs, and didn't find any collisions between different games. However, different revisions/bad dumps do occasionally produce the same hash, which is expected since it's only hashing the first track. I'm not sure if this is a problem and what level of quality you're looking for with the disc hashes.

BizHawk now has another set of console-specific disc hashing functions in the RA code. I don't know how well they deal with revisions, but perhaps reusing those for general hashing would also be an option. They currently live in the EmuHawk project, but other than swapping out the RA ConsoleID for BizHawks DiscType or SystemId it probably wouldn't be hard to move them into the disc or common project.

@kalimag kalimag changed the title Fix non-PSX disc hashing somewhat Fix Sega CD/Saturn disc hashing Sep 8, 2024
@CasualPokePlayer
Copy link
Member

BizHawk now has another set of console-specific disc hashing functions in the RA code. I don't know how well they deal with revisions, but perhaps reusing those for general hashing would also be an option. They currently live in the EmuHawk project, but other than swapping out the RA ConsoleID for BizHawks DiscType or SystemId it probably wouldn't be hard to move them into the disc or common project.

FWIW, doing this could also help solve the issue of 3DS hashing, as rcheevos has their own limited hashing method for it. The downside of course, is it requires firmware (aes_keys.txt / seeddb.bin) in the case the game is encrypted (although if the firmware is not present the game couldn't be loaded anyways).

The disc hashing code too is also meant to be more a secondary implementation compared to the rcheevos implementation, although I've been mulling considering just removing it and deferring the hashing code entirely to rcheevos, like debug hashing code does (it has callbacks set in place so it will end up using our own disc system just fine). For general hashing however, it could perhaps still be better to have our own implementation so we are not directly subject to rcheevos in normal use (i.e. most users).

@CasualPokePlayer
Copy link
Member

CasualPokePlayer commented Sep 8, 2024

Does this resolve #1075 then? That would be huge.

Jaguar CD still won't work with this approach. Fundementally, Jaguar CDs are "audio" CDs, they never have "data tracks" (this was a design decision by Atari to squeeze out more space, at of course the obvious cost of no error correction data and so a much higher risk of disc read failures). You'd want a specialized disc hashing approach here, likely doing what RA does.

@kalimag
Copy link
Contributor Author

kalimag commented Sep 10, 2024

  • Is there any reason not to also hash the TOC for non-PSX discs? (apart from Many PSX ID Hashes Are Probably Wrong #3760)
  • Should there be special handling for old movies saved with the meaningless D41D8CD98F00B204E9800998ECF8427E hash?
  • Is it alright to just duplicate the Jaguar CD hashing code from the RA integration into the DiscSystem project to cover that system as well?

@YoshiRulz
Copy link
Member

Should there be special handling for old movies saved with the meaningless D41D8CD98F00B204E9800998ECF8427E hash?

No, people "upgrading" their movies are expected to correct header and sync settings themselves.

Is it alright to just duplicate the Jaguar CD hashing code from the RA integration into the DiscSystem project to cover that system as well?

Only if it's not trivial to move it. public methods in DiscSystem are available in the RetroAchievements classes.

Reuse RetroAchievements implementation, move into `DiscHasher`
@YoshiRulz YoshiRulz merged commit e2a6942 into TASEmulators:master Sep 14, 2024
4 checks passed
@YoshiRulz
Copy link
Member

I'm afraid this didn't end up making it into RC1. I think it's because I pushed master:release without pulling master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants