-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add option to check R-W pack parity during refine #193
base: main
Are you sure you want to change the base?
Conversation
{ | ||
read_entry(fs_sub, (uint8_t *)sector_subcode_rw_context.data(), CD_SUBCODE_SIZE, lba_index - 2, CD_RW_PACK_CONTEXT_SECTORS, 0, 0); | ||
|
||
if(!valid_rw_packs(sector_subcode_rw_context)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a bug here, though I think it only affects the count: If Q was corrected above, and that also corrected R-W, then sector_subcode
was already written to the file, so it will look here like it had already been correct. What's needed is to defer write_entry
until after both the Q and R-W checks, probably set a store_subcode
flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's going to be hard to keep the count accurate anyway, given that a sector may be marked correct by a read of the next sector. In the interest of not overcomplicating things I'm going to leave this as-is here, unless you'd prefer I handle it more thoroughly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had an error in my initial store_subcode
approach, the latest commit implements it correctly: the final number of R-W errors matches what will be initially counted on a subsequent run, before it would miscount if both Q and R-W were corrected on the same read.
It's still possible that a sector becomes valid only due to subsequent reads, but this should be rare (it would involve at least 3 errors in a pack, which I haven't yet seen), and anyway it will only affect the count. There's some outside chance that a subsequent read will make an earlier sector invalid so that it ought to be re-read, but this seems so unlikely that it isn't worth special handling. It's rather more likely that a re-read for Q makes R-W become invalid; this will be at least retried if there are retries left, but it will also throw off the count.
Previously it wouldn't count if Q was also corrected
That's a lot of code. I will be checking that out on a weekend. |
On a high level, what is the benefit of checking if R-W pack is valid (across 3 sectors) vs subchannel Q check (across 1 sector)? I'll explain: To summarize, there are two ways to check if whole subchannel is valid:
So far, I don't understand why to do (2) that is much more complicated if (1) is enough to establish that subchannel is incorrect? |
Another question, CD-TEXT is stored in R-W (lead-in). Internally, CD-TEXT is structured into the packs and there is same 16-bit CRC field, example: https://github.com/superg/redumper/blob/main/cd/toc.ixx#L421 Is there any correlation of this CD-TEXT structured data and R-W packs structure? |
Thanks for looking over it!
It's true that rereading for bad Q might fix R-W, but my goal is to retry sectors with only R-W errors. From an example dump: The initial dump had Q: 156, R-W: 547 (sectors with errors of each type). A few refine passes later it's down to Q: 98, R-W: 463. That's 58 fewer Q, 84 fewer R-W. Even if each reread for Q corrected one R-W (not given), those other 84-58=26 would not have been retried. Statistical aside: If your model is that errors occur by replacing a whole byte by another byte at random, there's still a ~1/2 chance of the Q bit being correct when P-W is replaced (127 same Q / 255 wrong bytes), while that chance is only 1/85 for R-W (3 same R-W / 255 wrong bytes), suggesting ~40 times as many R-W errors. The odds are a lot better than that, though, since the error is roughly independent for each bit. With error probability
I haven't implemented checking CD-Text, but as I understand it CD-Text doesn't use this same R-W pack structure: There is no interleave or sector crossing, each 24 byte pack of CD-Text is self-contained, the sector is divided evenly into 4 of these. CD-Text data is bit packed directly into the 6 bits of R-W, so a 24 byte pack has As you say there is only the 16 bit CRC; this is error detecion, not correction. CD-Text handles errors through repetition since there is so little data: Each pack is repeated several times and the drive picks one with each sequence number that has a good CRC. I think this is similar to how TOC is encoded in Q? So in theory you could use that to detect R-W errors where CD-Text is present. It's a bit simpler because you don't need to consider the interleave, you only need to re-read exactly the sector with the error. |
Also note that this isn't a case of R-W being corrected due to a re-read of a neighboring sector. We're able to mark just one of those sectors bad for R-W if we can locate the error within the pack, and that was the case with all errors in this dump. In practice even 2 errors in a pack is rare on my discs. |
All makes sense, thanks for the clarification.
I see, so this is yet another structure within R-W.
Exactly, each TOC entry is repeated 3 times and additionally tracks cycle repeatedly until pre-gap (LBA -75). I think drive uses Q crc to choose which TOC entries are correct. It makes sense that in parallel in those same TOC R-W subchannel data there is CD-TEXT encoded. We are able to extract whole raw subchannel for lead-in using PLEXTOR but I actually never looked into R-W there as we get CD-TEXT data parsed more easily using read_toc SCSI command. Now that I understand it a bit more I will go over the code and get that merged. Sorry for this taking so long, I'm really busy lately and just too tired to see the meaningful code end of the day, conversations are easier :). Couple questions: GF64 math - did you backport that from Rust? Also thanks for writing the "test", there is literally no good testing infra here, I just slapped something quick to make sure I don't introduce regressions in core components like MSF handling and descrambling. |
No rush, thanks a lot whenever you can get to it. I have the CI builds for the rips I wanted to do in the meantime.
It's not ported. I mostly linked to gf256 because it has a great tutorial explaining the data types and algorithms, especially the Galois field and Reed-Solomon modules. I found those very helpful, I occasionally also used the Wikipedia articles I linked (though they tend to be too general), and Berlekamp's original "Nonbinary BCH decoding" paper on Berlekamp-Massey was helpful (even if I still don't completely understand the algorithm).
Errors are fairly rare so I didn't have confidence I'd implemented it right until I tested every detectable error location and a few unrecoverable error patterns. The two error case can take a while to run in a debug build, though. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is solid. I left feedback on couple of small things.
The only thing to mention is that I have a new dump/refine code under redumper dumpnew refinenew
. It's not 100% ready yet, still working on it. Once it's done, I will backport your changes there, shouldn't be an issue.
|
||
// buffers | ||
std::vector<uint8_t> sector_data(CD_DATA_SIZE); | ||
std::vector<uint8_t> sector_subcode(CD_SUBCODE_SIZE); | ||
std::array<uint8_t, CD_SUBCODE_SIZE * CD_RW_PACK_CONTEXT_SECTORS> sector_subcode_rw_context; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any particular reason this is std::array?
I don't see this as performance bottleneck so it can be std::vector just to keep it more tidy with the other buffers.
@@ -531,6 +535,19 @@ export bool redumper_dump_cd(Context &ctx, const Options &options, bool refine) | |||
if(options.refine_subchannel) | |||
refine_sector = true; | |||
} | |||
|
|||
if(options.check_subchannel_rw && lba >= 2 && lba + 2 < lba_end) | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think using lba_start here would be more correct. Usually I try to read as early as I can, plextor allows -75 (1 second) and asus a little bit less than 150 (2 seconds). lba_start is already set to the lowest possible value so something like this should be good:
if(options.check_subchannel_rw && lba >= lba_start + 2 && lba + 2 < lba_end)
If I understand everything correctly, each sector needs two leading sectors and two trailing sectors (including the current one it's 5 in total)
The only thing I am not sure of is if pre-gap contains same RW recovery data but I'd say it's likely.
read_entry(fs_sub, (uint8_t *)sector_subcode_rw_context.data(), CD_SUBCODE_SIZE, lba_index - 2, CD_RW_PACK_CONTEXT_SECTORS, 0, 0); | ||
if(!valid_rw_packs(sector_subcode_rw_context)) | ||
{ | ||
if(options.verbose) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer not to have logging here, it will spam a lot of lines prior to refining.
read_entry(fs_sub, (uint8_t *)sector_subcode.data(), CD_SUBCODE_SIZE, lba_index, 1, 0, 0); | ||
ChannelQ Q; | ||
subcode_extract_channel((uint8_t *)&Q, sector_subcode.data(), Subchannel::Q); | ||
if(!Q.isValid()) | ||
read = true; | ||
|
||
// optionally check R-W pack parity | ||
if(options.check_subchannel_rw && lba >= 2 && lba + 2 < lba_overread && !read) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as an earlier comment to use lba_start
if(inside_range(lba, error_ranges) == nullptr) | ||
--errors_q; | ||
} | ||
|
||
if(options.check_subchannel_rw && lba >= 2 && lba + 2 < lba_overread) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as an earlier comment to use lba_start
LOGC_RF("{} [{:3}%] LBA: {:6}/{}, errors: {{ SCSI: {}, C2: {}, Q: {} }}", spinner_animation(), | ||
percentage(refine_processed * refine_retries + refine_counter, refine_count * refine_retries), lba, lba_overread, errors_scsi, errors_c2, errors_q); | ||
std::string rw_log; | ||
if(options.check_subchannel_rw) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's useful to display R-W errors even if we don't specify an option. Doesn't have to be changed in this PR, can be done later.
This implements a
--refine-subchannel-rw
option that enables R-W pack P parity checking.This provides some small improvement when refining CD+G discs: Over 6 refines on an old disc (Lou Reed - New York), it went from 627 errored packs (237 CD+G) to 554 (205). Without the R-W check it only got down to 621 (230) over 6 refines. (These pack error counts are from redumper-extract-rw, which will correct the errors.)
A pack is interleaved across three sectors, but if there are only one or two errors in a pack it is possible to locate it so only the affected sector needs to be retried. These errors could be corrected anyway, but it will let us get some more use out of refine for discs that use R-W.