Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decoding improvements, and (somewhat) more flexible config for alternate grid dimensions #73

Merged
merged 38 commits into from
Jun 1, 2023

Conversation

sz3
Copy link
Owner

@sz3 sz3 commented Jun 1, 2023

This started off as an experimental branch, so it's quite long-lived and has a number of things wrapped up in it:

  • bugfix for bitmatrix decode (I think undefined behavior due to this was the culprit for the Android 12 crash/bug...)
  • fix for the build of cimbar.js on emscripten latest (until they change it again)
  • added a "cooldown" functionality to the flood fill decode algorithm. As we iterate through the cells and create decode instructions for the next cell (organized by min-error on a heap), we will also track which direction we just went. e.g. if our previous cell drifted left during its decode, we will not drift left for our next cell. This is a simple and (probably) safe way to prevent runaway errors during the decode.
    • this only uses the adjacent cell that added us (or scheduled us) for decoding, not the other 3 adjacent cells.
    • we don't skip the center "no drift" no-op direction -- we always hope this is the one we're picking.
  • added more seed/start positions for the flood fill decode. We were initially doing 4 -- now we are doing 8.
  • added functionality to queue more decode cells based on a successful string of adjacent tile decodes
    • the idea is that if we are "right down the middle of the plate" with our decodes on the same line (no skew adjustments, low error rate), we can do better than just queuing up the adjacent 4 cells for the tile. We can also safely queue up a few cells along the same row/column we've noticed we seem to have a good handle on.
      • this acts as a layer of redundancy for finding the right location for a tile. (rather than just depending on one of the 4 adjacent tiles to be correct, we can skip over bad tiles in some cases.
    • the threshold is <= 3 error bits. This is fairly conservative for the 8x8 decode, so it may make sense to make it configurable.
  • the Config class has been split up in order to support alternate configurations. The cimbar config is still specified at compile time, but the important parameters can be swapped out by adding a new struct to the "GridConf.h" file.
  • various changes/fixes have been made to accomodate smaller cell sizes. 5x5 and 6x6 (vs the current 8x8) are the ones I was experimenting with.
    • this mostly involves changes to the decode.
  • added a work-in-progress 5x5 tileset (with 4 symbols == 2 symbol bits per tile) and configuration. Some parts of this are well vetted, others are not. More improvements need to be made to the color decode before 5x5 becomes particularly usable. That said, under the right conditions, transfer rates look pretty good...

sz3 added 30 commits May 31, 2023 21:22
This is the start.

bitmaps.h will also be changing, but I'm going to wait until I'm more
confident in the tileset.

Fix: we were using `cell_size()` in a few places we should've been using
`cell_offset()` 😱
He can still deal with READLEN=8 too -- it's easier to test this way.
Tests will be broken.

`special_case_fuzzy_ahash()` is an optimization that's no longer used --
this change is unecessary, but preserved for posterity. (I'll delete the
function and tests shortly)
…hash

There's unfortunately still some stuff that is hard-coded based on the
target sizes (notably: ahash_result's magical read offsets).
... use it in fountain_chunk_size(). The existing calculation was built
around the 8x8 grid size (9300 bytes).
The interaction between ecc_bytes/ecc_block_size and fountain_chunks is
pretty diabolical. it'd be nice to lock that in somehow.
Removing various hard coded "155"s, and whatnot
Caveat: we're currently exposing image_size/anchor_size in the config
header. This is pretty wacky, and possibly a bad idea. But it also might
get the job done for now...?

(we'll see)
The idea is that if we drift "right" when decoding a given cell, we
should exclude the same drift for the next cell we check. This *might*
help with runaway decode errors?

Tests remain quite scuffed.
…tions

The idea being that we might be 1px off in both dimensions (frankly, it
might be worse than that...), and that it doesn't do us any harm to run
the extra checks in this one instance.

Probably
The idea being to limit the amount of magic numbers embedded various
places, so we can switch between tile sizes "on the fly" (it still needs
to happen at compile time).
These are only necessary for 8x8, but oh well
(i.e. 8x8, but a hypothetical 7x7 would also use it)
Part eyeballing, part experimentation -- probably overfit to my sample
data either way, but so were the previous settings 😬

`3` is what we've been using for 8x8, so it does feel reasonable to fall
back to it.
The idea here is that there are some circumstances we can do better than
"we think the 4 adjacent tiles are here".
If we see a line of tiles that are right on the money drift wise
(i.e. they require no adjustment), we can be reasonably confident we're
doing ok, and take the step of looking ahead in either direction (in
practice one of these will be a no-op, since we *came* from that direction)

This seems to lead to a small-but-consistent improvement in decode quality.
Some of these constexprs seem a bit ambitious for now.
}
cv::adaptiveThreshold(symbols, symbols, 255, cv::ADAPTIVE_THRESH_MEAN_C, cv::THRESH_BINARY, blockSize, 0);
cv::adaptiveThreshold(symbols, symbols, 255, cv::ADAPTIVE_THRESH_MEAN_C, cv::THRESH_BINARY, blockSize, -5);
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The magic formula is grayscale algorithm (e.g. what RGB formula we use -- happens to be opencv's default) + blocksize + the constant. (3,-5) is my current best guess, but it'll be prone to some minor overfitting.

{
return true;
return std::pow(cells_per_col(), 2) - std::pow(corner_padding(), 2) * 4;
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of this has been moved to the header file. The cpp is now just the "smart" helper functions.

unsigned compression_level();
}
protected:
using GridConf = Conf8x8;
Copy link
Owner Author

@sz3 sz3 Jun 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where one would swap in an alternate grid config, if one wanted to do such a thing.


static constexpr unsigned fountain_chunks_per_frame()
{
return 10;
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this (and interleave_partitions) needs to be in the GridConf too, and will probably do that in a followup PR.

_heap.push({betweenMarkerBlock, 1});
_heap.push({betweenMarkerBlock+_cellFinder.dimensions()-1, 1});
_heap.push({lastElem-betweenMarkerBlock, 1});
_heap.push({lastElem-(betweenMarkerBlock+_cellFinder.dimensions()-1), 1});
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our 4 additional seed positions. We could add more, if it seemed to help.


auto& [_, prev_error, prev_cooldown] = _instructions[index];
// in the case where we have consecutive high confidence cells with no drift changes,
// it's safe(ish) to aggressively queue a few more cells
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^

int llidx = adj[ll];
if (rridx >= 0 and llidx >= 0)
{
std::array<int,4> horizon = {-1, -1, -1, -1};
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason this is array<4> is simply convenience. That's the size we expect for the directional calculation (up,down,left,right), so it's reused here.


static constexpr unsigned cell_size = 8;
static constexpr unsigned cell_offset = 8;
static constexpr unsigned cells_per_col = 112;
Copy link
Owner Author

@sz3 sz3 Jun 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our grid configurations. This is the current 8x8 config.

(see file for full config)


static constexpr unsigned cell_size = 5;
static constexpr unsigned cell_offset = 9;
static constexpr unsigned cells_per_col = 162;
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the 5x5 config I've been playing with. It works somewhat, but needs improvements to the color decoding algorithm before it's ready for prime time.

string expected = "0=0 99=8 12097=9 12100=0 12196=1 12197=25 12198=33 12200=5 12201=0 12295=32 "
"12296=32 12297=46 12298=32 12299=34 12300=30 12301=32 12394=32 12395=57 "
"12396=37 12397=38 12398=10 12399=15";
string expected = "0=0 99=8 600=33 710=28 711=30 821=8 822=22 823=19 934=55 11688=55 11799=25 "
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are regression tests. Changing the flood fill decode algorithm (new seeds, new behavior on a string of successful decodes) breaks them.

@@ -48,7 +48,7 @@ set (LINK_WASM_LIST
-s USE_GLFW=3
-s FILESYSTEM=0
-s TOTAL_MEMORY=134217728
-s EXPORTED_FUNCTIONS='["_render","_next_frame","_initialize_GL","_encode","_configure"]'
-s EXPORTED_FUNCTIONS='["_render","_next_frame","_initialize_GL","_encode","_configure","_malloc","_free"]'
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A change to emscripten makes this necessary. I imagine there's some situation where someone wouldn't want these exported, but I'm not sure what it is.

@@ -30,6 +30,7 @@ class Decoder

protected:
unsigned _eccBytes;
unsigned _eccBlockSize;
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this actually needs to be a member variable. But it's somewhat moot I think.

else // # cv4
assertEquals( "59ddb2516b4ff5a528aebe538a22b736a6714263a454d20e146e1ffbba36c5ae", get_hash(decodedFile) );
if (CV_VERSION_MAJOR == 4)
assertEquals( "0f74a76cb1f59df7a42449a3527d464d913d12a03bffa51d6f53828724c3feb1", get_hash(decodedFile) );
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another intentional regression test breakage.

@@ -11,8 +11,8 @@
class Deskewer
{
public:
Deskewer(unsigned total_size=1024, unsigned anchor_size=30);
int total_size() const;
Deskewer(unsigned image_size=0, unsigned anchor_size=0);
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These will now be defer to the config.

be.extract(22, 32, 42, 52, 62, 72, 82, 92)
be.extract_tuple( be.pattern(6) ),
be.extract_tuple( be.pattern(7) ),
be.extract_tuple( be.pattern(8) )
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were already pretty magical -- to deal with configurable cell dimensions, they are now sadly even more magical.

be.extract(12, 22, 32, 42, 52, 62, 72, 82),
be.extract_tuple( be.pattern(3) ),
be.extract_tuple( be.pattern(4) ),
be.extract_tuple( be.pattern(5) ),
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the one we actually use -- ignoring the corners. These represent the 64 distinct bits we need for each of the 5 compares (popcnts) we do for each potential symbol for each cell decode. For 8x8 (with 16 possibilities), that's 16*5 = 80 compares.

@@ -37,72 +37,39 @@ namespace image_hash
return res;
}

inline ahash_result special_case_fuzzy_ahash(const cv::Mat& gray, unsigned mode)
template <unsigned CELLSIZE>
inline ahash_result<CELLSIZE> fuzzy_ahash(const cv::Mat& img, uchar threshold=0, unsigned mode=ahash_result<CELLSIZE>::ALL)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed some dead code.

This, sadly, now takes a template param. Could you tell?


static constexpr auto pattern(unsigned id)
{
return get_offsets(id%3 + (id/3)*(READLEN+2));
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some indecipherable magic. I should add a comment. The tests may help in understanding what's going on here.

The magic number 3 comes from our 2 redundant rows/cols (e.g. left,center,right) -- we convert our numbering scheme in extract_fast() into a real initial bitposition that we'll use to extract our 64 bits (8x8) from the 100 bits (10x10) we have.

assertEquals( "2 9 16 23 30", tuple_to_str(be.get_offsets(2)) );

assertEquals( "7 14 21 28 35", tuple_to_str(be.pattern(3)) );
assertEquals( "7 14 21 28 35", tuple_to_str(be.get_offsets(7)) );
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we assert that the magic incantation acts as expected.

const uint8_t* cv = reinterpret_cast<const uint8_t*>(&mval);
uint8_t val = cv[0] << 7 | cv[1] << 6 | cv[2] << 5 | cv[3] << 4 | cv[4] << 3 | cv[5] << 2 | cv[6] << 1 | cv[7];
// TODO: what about endianness???
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔

, _good(_image.cols >= Config::image_size() and _image.rows >= Config::image_size())
: _image(img)
, _cellSize(Config::cell_size() + 2)
, _positions(Config::cell_spacing(), Config::cells_per_col(), Config::cell_offset(), Config::corner_padding())
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

num_cells() was a misleading name. cells_per_col() (and row) is what it is.

@sz3 sz3 merged commit d9bd2a2 into master Jun 1, 2023
@sz3 sz3 deleted the bugfix-improve-and-5x5 branch June 1, 2023 05:42
sz3 added a commit to sz3/cfc that referenced this pull request Jun 24, 2023
Keeping libcimbar master up to date. The most interesting included PR is
probably sz3/libcimbar#73.
sz3 added a commit that referenced this pull request Feb 8, 2024
Keeping libcimbar master up to date. The most interesting included PR is
probably #73.
sz3 added a commit that referenced this pull request Mar 13, 2024
f219e87 Merge pull request #75 from sz3/web-fix
20f1601 The GL window height, or width, or both, needs to be divisible by 4?
98368d7 Pass dummy values to configure() to use config defaults
1b09a77 Use Config settings for cimbar_js defaults
b392e57 Pin the emscripten/emsdk docker image
d9bd2a2 Merge pull request #73 from sz3/bugfix-improve-and-5x5
709a348 Add back a few test cases, add some comments, and check in the 5x5 tileset
b9d9d03 Simplify calculate_cooldown() + a comment for FloodDecodePositions::update()
55fcadb gh workflow updates
6baf0c4 Odd emscripten workaround -- not sure if this is the "right way", but it works
9f11473 Fix wasm packaging script
04528bb Add the 5x5 tileset to bitmaps.h + missing include
dce6833 Fix off-by-one type bug!
d452652 Update regression tests
8c891d0 Put back the Config.cpp for broader compiler support...
55956de Update more tests?
9204892 Misc fix
4d2943c Experimenting with ways to improve the flood decode
6daa947 3,-5 for adaptiveThreshhold on symbols seems better?
18e5480 Put back the skip param for mean_rgb on larger cell sizes
1ec077b Running afoul of constexpr rules
4e33841 Put us back on 8x8
ec74f7c For now, put the uint128s back
02d4573 Have ahash_result take a template param for cell size
e2b55fc + do all bitextract index magic in "pattern()" function
24a291a Add some compile-time magic to auto-generate the bit extract pattern
173087c Move grid params into their own file?
cad107c Minor tweaks to make it easier to switch between grid sizes
f932602 Special case for running the "ALL" check (9-way compare) on seed locations
92a33ef More warnings \o/
3479d39 Better(?) threshold params, and a thought
99e0f51 Drift cooldown?
f3cdabe More const, and add more seed positions for the decode?
6849234 These should all be const...
f9a9dd8 Scale window size off the size of the image.
3616c36 Pass image_size/anchor_size to extractor...
7766cfd Attempt to use 988x988 grid...?
c4f8750 A 5x5 that works with both 4 bit and 5 bit
c34ae88 Calculate total_cells() and frame capacity, and ...
cdd2f5c Update ahash_result for 5-bit reads + lots of test fixes for average_hash
ec62980 Changes to average_hash...
e2cab43 Small changes to average_hash(), decode_color()
731dfb1 Update the bitextractor to deal with reads other than len=8
1fd265d Minimal changes for encoding with a grid of 5x5s
b0aae78 Merge pull request #68 from sz3/packaging-cimbar-html
6ca0650 Post cimbar_js.html as part of package build.
0906f68 Script to bundle cimbar_js's asm.js build into a single html file
c8529e3 Merge pull request #63 from sz3/stdin
d17dcf3 -f is now the default
21f9932 A bit silly, but: py3.6 compatibility for now
2ff867d Test cli?
d646ec7 Have fountain_decoder_sink optionally print its work...
892579a Have the encoder be stdin-aware too
21cc6a7 🤔
aad249d Read filenames from stdin iff no inputs are provided, + ...
074d813 Make fountain encoding the default. Add StdinLineReader for decodes.
89762c1 An idea

git-subtree-dir: app/src/cpp/libcimbar
git-subtree-split: f219e87
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant