Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Have the (cli) encoder skip unextractable frames #114

Merged
merged 4 commits into from
Nov 26, 2024

Conversation

sz3
Copy link
Owner

@sz3 sz3 commented Nov 25, 2024

In investigating what might be causing #108, I remembered an error case I'd run into in the past but had thought/assumed/hoped was very rare: generating an "unextractable" frame -- that is, one which cimbar_extract will fail to extract (notwithstanding a hypothetical perfect extraction algorithm). This happens due to the extraction algorithm getting confused by the data part of the image and failing to find the corner markers.

Turns out it's rare, but not rare enough. In my testing, anywhere between 1-2% (!) of generated frames are bogus in this way. This is ok for purposes of the main use case -- there will be another unique frame every 66ms or so. But for offline cases where we generate an image (or series of images) and check it later, it's not ok.

QR codes address this issue by having different "masking patterns" that can be selected. We don't have that option, but we do have a brute force way to fix it: check the generated image after we generate, and make sure it extracts. If it doesn't, we skip that frame and try again.

Currently this is only used in the cimbar cli, not in the web encoder. I'll need to look at the performance implications some more to see if having the web encoder make the same check is a good idea.

sz3 added 4 commits November 25, 2024 01:44
Some cimbar frames are bad and will fail extraction due to the data
aligning itself to match the anchor pattern.

This is a design bug/quirk that doesn't matter *too* much for the
on-the-fly "encode data on one screen, decode via camera" use case.
We're constantly generating new frames. But for any offline,
more "QR-code" like usage -- for example printing -- having a bad image
is much more troublesome.

The fix/workaround is to check the generated image after we generate,
and make sure it extracts. If it doesn't, we skip that frame and try
again.

This is currently only used the cli (and various test code), due to
potential performance concerns.
Ran some tests on how common this failure mode is, and 1-2%(!) of frames
tend to be "bad" (and thus are thrown out). We should never ever hit
this 5-in-a-row case, but it'd also be pretty brutal to infinite loop
due to a bug in the will_it_scan() function...
It modestly helps make decodes easier in the single frame case, but the
1-2% failures we see means we need to either:
A. add the new scan sanity check. This might end up being the way to go,
but it has performance drawbacks.
B. remove the special case and generate some extra frames. Special cases
are annoying anyway.
@sz3 sz3 mentioned this pull request Nov 25, 2024
: _dark(dark)
, _skip(skip? skip : std::min(img.rows, img.cols) / 60)
, _mergeCutoff(img.cols / 30)
, _anchorSize(30)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking it might be time to clang-format the source tree...


constexpr int limit = 50;
Corners corners(points);
if (corners.top_left().x() > limit or corners.top_left().y() > limit)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check here is pretty simple. We run an extract on a freshly encoded image -- no blurring or skewing or any other camera-induced weirdness. We don't even need error correction for an image like this, it's pristine. The corners the scanner finds should be right where we expect them: at the edges of the image.

But if the (essentially random) data pattern is adversarial enough, the scanner will find a "corner" elsewhere, and we can detect/reject this frame as bad.

@@ -49,7 +49,7 @@ TEST_CASE( "EncoderTest/testFountain.4c", "[unit]" )
enc.set_legacy_mode();
assertEquals( 3, enc.encode_fountain(inputFile, outPrefix, 0) );

std::vector<uint64_t> hashes = {0xbb1cc62b662abfe5, 0xf586f6466a5b194, 0x8c2f0f40e6ecb08b};
std::vector<uint64_t> hashes = {0xbb1cc62b662abfe5, 0xf586f6466a5b194, 0x93a3830d042966e1};
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out we were hitting the error case here!

continue;

// else, we gotta make forward progress. And it's probably a bug?
std::cerr << fmt::format("generated {} bad frames in a row. This really shouldn't happen, maybe report a bug. :(", consecutiveScansFailed) << std::endl;
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect that the likely way we hit this message is that the Scanner code gets messed up and everything (or many things) fail to scan. We could risk the infinite loop since that kind of failure should get caught by tests/etc, but... this seems smarter to me.

@sz3 sz3 merged commit 9d4eb6c into master Nov 26, 2024
8 checks passed
@sz3 sz3 deleted the encoder-skip-unextractable-frames branch November 26, 2024 08:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant