Skip to content

Conversation

@marete
Copy link
Collaborator

@marete marete commented Jul 15, 2025

This will be used in a subsequent PR to correctly determine if 2 Bigtable row keys are adjacent, given that a Bigtable row key must be at most 4KiB in length.

The context is that the function we currently use (which is in the Bigtable client library) to do the check that 2 row keys are adjacent order-wise always assumes that the next string in lexicographic order is obtained by adding a '\0' char, which assumes that row keys may be of unlimited length when in fact they are limited to 4KiB, in Bigtable.


This change is Reviewable

…gth are adjacent.

This will be used in a subsequent PR to correctly determine if 2 Bigtable row
keys are adjacent, given that a Bigtable row key must be at most 4KiB
in length.
@marete marete requested review from dopiera and prawilny July 15, 2025 19:57
@marete marete changed the title feat: A function to determine if 2 strings of a maximum length are lexicographically adjacent. feat: A function to determine if 2 strings of some maximum length are lexicographically adjacent. Jul 15, 2025
Copy link
Collaborator

@prawilny prawilny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll review the test after the fixes land.

Reviewed 2 of 3 files at r1, all commit messages.
Reviewable status: 2 of 3 files reviewed, 2 unresolved discussions (waiting on @dopiera)


google/cloud/bigtable/emulator/range_set.cc line 616 at r1 (raw file):

  // parts. However, row keys that are at maximum length are the rare
  // exception, presumably.
  successor_to_a = a;

nit: why not copy it in the initialization (with std::string successor_to_a = a)?


google/cloud/bigtable/emulator/range_set.cc line 618 at r1 (raw file):

  successor_to_a = a;

  successor_to_a[index_to_inc.value()] += 1;

I think it's wrong.

In the examples, let's assume that char is signed (and 127 is the max value).

According to this function, ConsecutiveStringsOfMaxLen([0, 127], [1,127], 2) would be true, whereas it should be false.

It should be true for ConsecutiveStringsOfMaxLen([0, 127], [1,-128], 2) instead.

I think the cleanest way to finding the successor would be to use find_if() and fill() from https://en.cppreference.com/w/cpp/algorithm.html to find the non-max value, fill everything following it with MIN_CHAR and then increment the value as you do here.
We'd avoid pointer math (std::distance) that way.

@marete marete changed the title feat: A function to determine if 2 strings of some maximum length are lexicographically adjacent. feat: A function to determine if 2 strings of some maximum length are lexicographically consecutive. Jul 17, 2025
@marete marete changed the title feat: A function to determine if 2 strings of some maximum length are lexicographically consecutive. feat: impl: Determine if 2 strings of some maximum length are lexicographically consecutive. Jul 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants