Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to check R-W pack parity during refine #193

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

hcs64
Copy link
Contributor

@hcs64 hcs64 commented Oct 20, 2024

This implements a --refine-subchannel-rw option that enables R-W pack P parity checking.

This provides some small improvement when refining CD+G discs: Over 6 refines on an old disc (Lou Reed - New York), it went from 627 errored packs (237 CD+G) to 554 (205). Without the R-W check it only got down to 621 (230) over 6 refines. (These pack error counts are from redumper-extract-rw, which will correct the errors.)

A pack is interleaved across three sectors, but if there are only one or two errors in a pack it is possible to locate it so only the affected sector needs to be retried. These errors could be corrected anyway, but it will let us get some more use out of refine for discs that use R-W.

{
read_entry(fs_sub, (uint8_t *)sector_subcode_rw_context.data(), CD_SUBCODE_SIZE, lba_index - 2, CD_RW_PACK_CONTEXT_SECTORS, 0, 0);

if(!valid_rw_packs(sector_subcode_rw_context))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a bug here, though I think it only affects the count: If Q was corrected above, and that also corrected R-W, then sector_subcode was already written to the file, so it will look here like it had already been correct. What's needed is to defer write_entry until after both the Q and R-W checks, probably set a store_subcode flag.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's going to be hard to keep the count accurate anyway, given that a sector may be marked correct by a read of the next sector. In the interest of not overcomplicating things I'm going to leave this as-is here, unless you'd prefer I handle it more thoroughly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had an error in my initial store_subcode approach, the latest commit implements it correctly: the final number of R-W errors matches what will be initially counted on a subsequent run, before it would miscount if both Q and R-W were corrected on the same read.

It's still possible that a sector becomes valid only due to subsequent reads, but this should be rare (it would involve at least 3 errors in a pack, which I haven't yet seen), and anyway it will only affect the count. There's some outside chance that a subsequent read will make an earlier sector invalid so that it ought to be re-read, but this seems so unlikely that it isn't worth special handling. It's rather more likely that a re-read for Q makes R-W become invalid; this will be at least retried if there are retries left, but it will also throw off the count.

cd/cd_dump.ixx Outdated Show resolved Hide resolved
hcs64 added 2 commits October 22, 2024 00:54
Previously it wouldn't count if Q was also corrected
@superg
Copy link
Owner

superg commented Oct 22, 2024

That's a lot of code. I will be checking that out on a weekend.

@superg
Copy link
Owner

superg commented Nov 5, 2024

On a high level, what is the benefit of checking if R-W pack is valid (across 3 sectors) vs subchannel Q check (across 1 sector)?

I'll explain:
If Q is damaged, it's crc doesn't match e.g. for each sector we check Q crc and mark it to refine if it's incorrect.
Due to the way subchannel is bit interleaved, bad Q crc usually means that part of R-W pack is also bad. Rereading that sector's subchannel until we get Q crc match usually fixes it's 1/3 R-W counterpart.

To summarize, there are two ways to check if whole subchannel is valid:

  1. check Q crc of a given sector
  2. check R-W parity (we'll get to that a bit later but for the simplicity I omit the details here) of 3 sectors

So far, I don't understand why to do (2) that is much more complicated if (1) is enough to establish that subchannel is incorrect?

@superg
Copy link
Owner

superg commented Nov 5, 2024

Another question, CD-TEXT is stored in R-W (lead-in). Internally, CD-TEXT is structured into the packs and there is same 16-bit CRC field, example: https://github.com/superg/redumper/blob/main/cd/toc.ixx#L421

Is there any correlation of this CD-TEXT structured data and R-W packs structure?

@hcs64
Copy link
Contributor Author

hcs64 commented Nov 5, 2024

Thanks for looking over it!

Due to the way subchannel is bit interleaved, bad Q crc usually means that part of R-W pack is also bad. Rereading that sector's subchannel until we get Q crc match usually fixes it's 1/3 R-W counterpart.

It's true that rereading for bad Q might fix R-W, but my goal is to retry sectors with only R-W errors.

From an example dump: The initial dump had Q: 156, R-W: 547 (sectors with errors of each type). A few refine passes later it's down to Q: 98, R-W: 463. That's 58 fewer Q, 84 fewer R-W. Even if each reread for Q corrected one R-W (not given), those other 84-58=26 would not have been retried.

Statistical aside: If your model is that errors occur by replacing a whole byte by another byte at random, there's still a ~1/2 chance of the Q bit being correct when P-W is replaced (127 same Q / 255 wrong bytes), while that chance is only 1/85 for R-W (3 same R-W / 255 wrong bytes), suggesting ~40 times as many R-W errors. The odds are a lot better than that, though, since the error is roughly independent for each bit. With error probability p for one bit you'll expect 1-(1-p)**6 probability of error in a 6 bit symbol, (1-(1-p)**6)/p approaches 6 at low error rates (around a reasonable 1/100,000). (The limit is also 6 when considered over a whole 96 byte subcode sector.) In practice I see ~3-5 times more sectors with R-W errors than Q.

Is there any correlation of this CD-TEXT structured data and R-W packs structure?

I haven't implemented checking CD-Text, but as I understand it CD-Text doesn't use this same R-W pack structure:

There is no interleave or sector crossing, each 24 byte pack of CD-Text is self-contained, the sector is divided evenly into 4 of these. CD-Text data is bit packed directly into the 6 bits of R-W, so a 24 byte pack has 24*6/8=18 bytes of CD-Text payload.

As you say there is only the 16 bit CRC; this is error detecion, not correction. CD-Text handles errors through repetition since there is so little data: Each pack is repeated several times and the drive picks one with each sequence number that has a good CRC. I think this is similar to how TOC is encoded in Q?

So in theory you could use that to detect R-W errors where CD-Text is present. It's a bit simpler because you don't need to consider the interleave, you only need to re-read exactly the sector with the error.

@hcs64
Copy link
Contributor Author

hcs64 commented Nov 5, 2024

Even if each reread for Q corrected one R-W (not given), those other 84-58=26 would not have been retried.

Also note that this isn't a case of R-W being corrected due to a re-read of a neighboring sector. We're able to mark just one of those sectors bad for R-W if we can locate the error within the pack, and that was the case with all errors in this dump. In practice even 2 errors in a pack is rare on my discs.

@superg
Copy link
Owner

superg commented Nov 7, 2024

From an example dump: The initial dump had Q: 156, R-W: 547 (sectors with errors of each type). A few refine passes later it's down to Q: 98, R-W: 463. That's 58 fewer Q, 84 fewer R-W. Even if each reread for Q corrected one R-W (not given), those other 84-58=26 would not have been retried.

Statistical aside: If your model is that errors occur by replacing a whole byte by another byte at random, there's still a ~1/2 chance of the Q bit being correct when P-W is replaced (127 same Q / 255 wrong bytes), while that chance is only 1/85 for R-W (3 same R-W / 255 wrong bytes), suggesting ~40 times as many R-W errors. The odds are a lot better than that, though, since the error is roughly independent for each bit. With error probability p for one bit you'll expect 1-(1-p)**6 probability of error in a 6 bit symbol, (1-(1-p)**6)/p approaches 6 at low error rates (around a reasonable 1/100,000). (The limit is also 6 when considered over a whole 96 byte subcode sector.) In practice I see ~3-5 times more sectors with R-W errors than Q.

All makes sense, thanks for the clarification.

There is no interleave or sector crossing, each 24 byte pack of CD-Text is self-contained, the sector is divided evenly into 4 of these. CD-Text data is bit packed directly into the 6 bits of R-W, so a 24 byte pack has 24*6/8=18 bytes of CD-Text payload.

I see, so this is yet another structure within R-W.

As you say there is only the 16 bit CRC; this is error detecion, not correction. CD-Text handles errors through repetition since there is so little data: Each pack is repeated several times and the drive picks one with each sequence number that has a good CRC. I think this is similar to how TOC is encoded in Q?

Exactly, each TOC entry is repeated 3 times and additionally tracks cycle repeatedly until pre-gap (LBA -75). I think drive uses Q crc to choose which TOC entries are correct. It makes sense that in parallel in those same TOC R-W subchannel data there is CD-TEXT encoded. We are able to extract whole raw subchannel for lead-in using PLEXTOR but I actually never looked into R-W there as we get CD-TEXT data parsed more easily using read_toc SCSI command.

Now that I understand it a bit more I will go over the code and get that merged. Sorry for this taking so long, I'm really busy lately and just too tired to see the meaningful code end of the day, conversations are easier :).

Couple questions: GF64 math - did you backport that from Rust?
I am asking this to make sure I don't taint redumper with an incompatible license, if there is such.

Also thanks for writing the "test", there is literally no good testing infra here, I just slapped something quick to make sure I don't introduce regressions in core components like MSF handling and descrambling.

@hcs64
Copy link
Contributor Author

hcs64 commented Nov 7, 2024

Now that I understand it a bit more I will go over the code and get that merged. Sorry for this taking so long, I'm really busy lately and just too tired to see the meaningful code end of the day, conversations are easier :).

No rush, thanks a lot whenever you can get to it. I have the CI builds for the rips I wanted to do in the meantime.

Couple questions: GF64 math - did you backport that from Rust?
I am asking this to make sure I don't taint redumper with an incompatible license, if there is such.

It's not ported. I mostly linked to gf256 because it has a great tutorial explaining the data types and algorithms, especially the Galois field and Reed-Solomon modules. I found those very helpful, I occasionally also used the Wikipedia articles I linked (though they tend to be too general), and Berlekamp's original "Nonbinary BCH decoding" paper on Berlekamp-Massey was helpful (even if I still don't completely understand the algorithm).

Also thanks for writing the "test", there is literally no good testing infra here, I just slapped something quick to make sure I don't introduce regressions in core components like MSF handling and descrambling.

Errors are fairly rare so I didn't have confidence I'd implemented it right until I tested every detectable error location and a few unrecoverable error patterns. The two error case can take a while to run in a debug build, though.

Copy link
Owner

@superg superg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is solid. I left feedback on couple of small things.
The only thing to mention is that I have a new dump/refine code under redumper dumpnew refinenew. It's not 100% ready yet, still working on it. Once it's done, I will backport your changes there, shouldn't be an issue.


// buffers
std::vector<uint8_t> sector_data(CD_DATA_SIZE);
std::vector<uint8_t> sector_subcode(CD_SUBCODE_SIZE);
std::array<uint8_t, CD_SUBCODE_SIZE * CD_RW_PACK_CONTEXT_SECTORS> sector_subcode_rw_context;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason this is std::array?
I don't see this as performance bottleneck so it can be std::vector just to keep it more tidy with the other buffers.

@@ -531,6 +535,19 @@ export bool redumper_dump_cd(Context &ctx, const Options &options, bool refine)
if(options.refine_subchannel)
refine_sector = true;
}

if(options.check_subchannel_rw && lba >= 2 && lba + 2 < lba_end)
{
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using lba_start here would be more correct. Usually I try to read as early as I can, plextor allows -75 (1 second) and asus a little bit less than 150 (2 seconds). lba_start is already set to the lowest possible value so something like this should be good:
if(options.check_subchannel_rw && lba >= lba_start + 2 && lba + 2 < lba_end)
If I understand everything correctly, each sector needs two leading sectors and two trailing sectors (including the current one it's 5 in total)

The only thing I am not sure of is if pre-gap contains same RW recovery data but I'd say it's likely.

read_entry(fs_sub, (uint8_t *)sector_subcode_rw_context.data(), CD_SUBCODE_SIZE, lba_index - 2, CD_RW_PACK_CONTEXT_SECTORS, 0, 0);
if(!valid_rw_packs(sector_subcode_rw_context))
{
if(options.verbose)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer not to have logging here, it will spam a lot of lines prior to refining.

read_entry(fs_sub, (uint8_t *)sector_subcode.data(), CD_SUBCODE_SIZE, lba_index, 1, 0, 0);
ChannelQ Q;
subcode_extract_channel((uint8_t *)&Q, sector_subcode.data(), Subchannel::Q);
if(!Q.isValid())
read = true;

// optionally check R-W pack parity
if(options.check_subchannel_rw && lba >= 2 && lba + 2 < lba_overread && !read)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as an earlier comment to use lba_start

if(inside_range(lba, error_ranges) == nullptr)
--errors_q;
}

if(options.check_subchannel_rw && lba >= 2 && lba + 2 < lba_overread)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as an earlier comment to use lba_start

LOGC_RF("{} [{:3}%] LBA: {:6}/{}, errors: {{ SCSI: {}, C2: {}, Q: {} }}", spinner_animation(),
percentage(refine_processed * refine_retries + refine_counter, refine_count * refine_retries), lba, lba_overread, errors_scsi, errors_c2, errors_q);
std::string rw_log;
if(options.check_subchannel_rw)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's useful to display R-W errors even if we don't specify an option. Doesn't have to be changed in this PR, can be done later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants