Remove primary index from Blockstore special-column keys #33419
Conversation
Okay, here we go @steviez. Aside from the tests (obviously), there is a bit more work to be done. Namely, I think there are several places we can check the primary-index max_slot info to determine whether or not we even need to fall back to the deprecated keys (sketched below). If we decide not to support reads of the old-format keys, this whole thing gets a lot simpler. We'd probably want to keep the …
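For illustration, here is one way such a gate could look. This is a minimal sketch: `HighestPrimaryIndexSlot`, `read_with_fallback`, and the closure parameters are hypothetical stand-ins, not the actual Blockstore API.

```rust
/// Highest slot that may still be covered by old-format
/// (primary-index) keys, per the legacy index metadata.
/// (Hypothetical type, for illustration only.)
pub struct HighestPrimaryIndexSlot(pub Option<u64>);

/// Read a value, only falling back to the deprecated key format when
/// old-format entries could plausibly exist for this slot.
pub fn read_with_fallback(
    slot: u64,
    read_new: impl FnOnce() -> Option<Vec<u8>>,
    read_deprecated: impl FnOnce() -> Option<Vec<u8>>,
    max_primary_index_slot: &HighestPrimaryIndexSlot,
) -> Option<Vec<u8>> {
    if let Some(value) = read_new() {
        return Some(value);
    }
    match max_primary_index_slot.0 {
        // Old-format entries can only cover slots up to max_slot,
        // so skip the second point-lookup for anything newer.
        Some(max_slot) if slot <= max_slot => read_deprecated(),
        _ => None,
    }
}
```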
Force-pushed from 80278f6 to 7eb5502
Force-pushed from cfd6824 to 1e5dd0f
Update here: everything builds and tests pass. If we're going to do this, I should probably add tests for the fallback behavior in: …
And verify that purges work with deprecated and new indexes in tests like: …
Codecov Report

@@            Coverage Diff            @@
##           master   #33419     +/-   ##
=========================================
- Coverage    81.7%    81.7%    -0.1%
=========================================
  Files         807      807
  Lines      218252   218172      -80
=========================================
- Hits       178438   178354      -84
- Misses      39814    39818       +4
Force-pushed from 4132186 to a12e45e
Force-pushed from 692f552 to 2a51033
In general, I think this PR looks to be in a pretty good state and the logic makes sense to me. A few general questions I have:
- By supporting both key types in the column, we have backwards compatibility in regards to being able to load data from an old Blockstore. However, one other case that we've typically tried to support is a client version downgrade. Supposing this lands in v1.17, the following sequence could cause issues:
  1. Upgrade node to v1.17
  2. Operate the node for a bit such that data is written to the Blockstore
  3. Downgrade node to v1.16
  4. A query is issued that hits a key/value pair written by v1.17; the v1.16 software doesn't know how to handle it and panics
- For a node that is hooked up with bigtable, the order of operations for read attempts for transaction status is (sketched below):
  1. Look for the new key format
  2. Look for the old key format
  3. Hit bigtable
- For the above, I'm wondering if there is a way that we can detect that all of the entries with the old format are gone. The benefit of doing so is that we don't issue a second read locally, and either go straight to bigtable or, if no bigtable, just return a miss without performing extra I/O.
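For context, a minimal sketch of that three-tier read order, assuming closures as stand-ins for the real column reads and the bigtable client (not the actual code paths):

```rust
/// Try the new key format first, then the deprecated local format,
/// and only then fall back to the remote bigtable lookup.
pub fn get_transaction_status(
    read_new_key: impl FnOnce() -> Option<String>,
    read_old_key: impl FnOnce() -> Option<String>,
    query_bigtable: impl FnOnce() -> Option<String>,
) -> Option<String> {
    read_new_key()
        .or_else(read_old_key) // local fallback: deprecated key format
        .or_else(query_bigtable) // last resort: remote bigtable read
}
```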
Yeah, that's a great question. I've been thinking about this a bit. One approach we could take is to separate out all the code supporting reads (and purges) of both key formats in the special columns and backport that (to v1.16, in your example), and then merge just the code that writes the new format into the later minor release (v1.17 in your example). There are options that would involve less code in the backport, i.e. updating the various read methods to just not panic on a different key format, or to not panic on the specific new format. Let's chat about what makes the most sense.
Currently, I think the only way to determine that the old-format entries are irrelevant is to do a read of the transaction_status_index_cf entries to look at max_slots. We could potentially instead stick something else in memory in Blockstore to track it and prevent the db read (could replace …). Incidentally, I'm realizing the max_slot assumptions may be wrong in the case of a validator that upgrades and then downgrades again. Just a note to us to spend a little extra time thinking this through.
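A rough sketch of what keeping that state in memory could look like. `IndexMeta`, the field names, and the methods are assumptions for illustration, not the real transaction_status_index_cf types; the idea is to scan the (at most two) legacy index entries once, cache the highest slot they cover, and consult the cached value on every read.

```rust
use std::sync::RwLock;

/// Stand-in for a legacy TransactionStatusIndex entry (hypothetical).
pub struct IndexMeta {
    pub max_slot: u64,
}

pub struct Blockstore {
    /// None once all old-format entries have been purged.
    highest_primary_index_slot: RwLock<Option<u64>>,
}

impl Blockstore {
    /// Record the highest slot covered by the legacy index entries.
    pub fn set_highest_primary_index_slot(&self, entries: &[IndexMeta]) {
        let highest = entries.iter().map(|meta| meta.max_slot).max();
        *self.highest_primary_index_slot.write().unwrap() = highest;
    }

    /// Old-format reads are only worth attempting at or below the
    /// cached slot; otherwise skip the second db read entirely.
    pub fn should_try_deprecated_key(&self, slot: u64) -> bool {
        matches!(
            *self.highest_primary_index_slot.read().unwrap(),
            Some(max_slot) if slot <= max_slot
        )
    }
}
```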
Force-pushed from 157d852 to d5c264a
Force-pushed from 9d52ebe to 5923d84
Good amount of back and forth but think we made it!
To leave a paper-trail, @CriesofCarrots is going to prep a backport of the read functionality for v1.17. The approach is described in more detail here.
Here is said backport: #33617
Problem
RPC getSignaturesForAddress ordering is not consistent: because the AddressSignatures column is keyed by (Address, Slot, Signature), iterated values within the same slot are ordered by Signature bytes rather than by their order within the block (lots of issues about this, e.g. #22456).
We now have access to a transaction's index within the block when writing the transaction status to blockstore, so we can rekey the column.
Meanwhile, both the TransactionStatus and AddressSignatures column keys include a primary index; this is purely vestigial, as the PrimaryIndex purge that used the index has since been replaced by the CompactionFilter purge.
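For illustration, a sketch of the old and new AddressSignatures key layouts implied by the description above. The field widths (32-byte Pubkey, 64-byte Signature) are assumptions, not the exact column schema:

```rust
type Slot = u64;

/// Old AddressSignatures key: a vestigial primary index, then
/// (address, slot, signature). Within a slot, RocksDB iteration
/// order is driven by the raw signature bytes, hence the
/// inconsistent getSignaturesForAddress ordering.
pub struct OldAddressSignaturesKey {
    pub primary_index: u64,
    pub address: [u8; 32],
    pub slot: Slot,
    pub signature: [u8; 64],
}

/// New key: primary index dropped, and the transaction's index within
/// the block inserted ahead of the signature, so iteration within a
/// slot follows block order.
pub struct NewAddressSignaturesKey {
    pub address: [u8; 32],
    pub slot: Slot,
    pub transaction_index: u32,
    pub signature: [u8; 64],
}
```

Because RocksDB compares keys lexicographically, placing transaction_index before the signature (assuming big-endian encoding of the integer fields) makes within-slot iteration follow block order.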
Summary of Changes
- Remove primary index from special-column keys
- Add a bunch of plumbing to remain backward compatible with existing data written with the old keys
- Fix Blockstore::get_confirmed_signatures_for_address2() in the process

Fixes #32519
Fixes #22456
Fixes #21843