
Remove primary index from Blockstore special-column keys #33419

Merged
57 commits merged into master on Oct 10, 2023

Conversation

@CriesofCarrots (Contributor) commented Sep 27, 2023

Problem

RPC getSignaturesForAddress ordering is not consistent: because the AddressSignatures column is keyed by (Address, Slot, Signature), iterated values within the same slot are ordered by Signature bytes rather than by the order the transactions appear in the block (there are many issues about this, e.g. #22456).

We now have access to a transaction's index within the block when writing the transaction status to blockstore, so we can rekey the column.

Meanwhile, both the TransactionStatus and AddressSignatures column keys include a primary index; this is purely vestigial, as the PrimaryIndex purge that used the index has since been replaced by the CompactionFilter purge.
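
To make the ordering problem concrete, here is a minimal, self-contained sketch comparing how the two AddressSignatures key shapes sort entries within one slot. The types, field order, and widths are illustrative assumptions based on the description above, not the actual Blockstore definitions:

```rust
// Illustrative stand-ins; the real types live in solana-sdk / blockstore_db.rs.
type Pubkey = [u8; 32];
type Signature = [u8; 64];

// Old shape: within one (address, slot), entries sort by raw signature bytes,
// which has nothing to do with the order the transactions appeared in the block.
type OldKey = (Pubkey, u64 /* slot */, Signature);

// New shape (assumed): the transaction's index within the block sits before the
// signature, so iteration within a slot follows block order.
type NewKey = (Pubkey, u64 /* slot */, u32 /* tx index */, Signature);

fn main() {
    let addr: Pubkey = [0u8; 32];
    // Transaction A executed first (index 0) but has the larger signature bytes.
    let (sig_a, sig_b): (Signature, Signature) = ([0xff; 64], [0x01; 64]);

    let mut old: [OldKey; 2] = [(addr, 5, sig_a), (addr, 5, sig_b)];
    old.sort();
    // Old keys put transaction B first, purely because 0x01.. < 0xff..
    assert_eq!(old[0].2, sig_b);

    let mut new: [NewKey; 2] = [(addr, 5, 0, sig_a), (addr, 5, 1, sig_b)];
    new.sort();
    // New keys preserve block order: transaction A (index 0) comes first.
    assert_eq!(new[0].3, sig_a);
}
```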

Summary of Changes

  • Remove the primary index from the special-column keys (a sketch of the key change follows below)
  • Add plumbing to remain backward compatible with existing data written with the old keys
  • Fix Blockstore::get_confirmed_signatures_for_address2() in the process
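
As a rough illustration of the key change, the sketch below shows a TransactionStatus-style key losing its leading primary-index u64. The exact field order, widths, and endianness are assumptions; the concrete layouts live in blockstore_db.rs:

```rust
use std::convert::TryInto;

type Slot = u64;
type Signature = [u8; 64];

// Old on-disk key: (primary_index: u64, signature, slot) -- 80 bytes here.
fn deprecated_key(primary_index: u64, signature: &Signature, slot: Slot) -> Vec<u8> {
    let mut key = Vec::with_capacity(8 + 64 + 8);
    key.extend_from_slice(&primary_index.to_be_bytes());
    key.extend_from_slice(signature);
    key.extend_from_slice(&slot.to_be_bytes());
    key
}

// New on-disk key: (signature, slot) -- 72 bytes; the vestigial index is gone.
fn current_key(signature: &Signature, slot: Slot) -> Vec<u8> {
    let mut key = Vec::with_capacity(64 + 8);
    key.extend_from_slice(signature);
    key.extend_from_slice(&slot.to_be_bytes());
    key
}

fn main() {
    let sig = [7u8; 64];
    assert_eq!(deprecated_key(0, &sig, 42).len(), 80);
    assert_eq!(current_key(&sig, 42).len(), 72);

    // The length difference is one way reads can tell the two formats apart.
    let new = current_key(&sig, 42);
    let slot = Slot::from_be_bytes(new[64..72].try_into().unwrap());
    assert_eq!(slot, 42);
}
```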

Fixes #32519
Fixes #22456
Fixes #21843

@CriesofCarrots CriesofCarrots added the noCI Suppress CI on this Pull Request label Sep 27, 2023
@CriesofCarrots (Contributor, Author) commented Sep 27, 2023

Okay, here we go @steviez. The blockstore_db.rs changes are probably the best place to start, unless you want to walk through the commits.

Aside from the tests (obviously), there is a bit more work to be done. Namely, I think there are several places where we can check the primary-index max_slot info to determine whether we even need to fall back to the deprecated keys, e.g. purge_special_columns_exact() and get_confirmed_signatures_for_address2().
(I determined read_transaction_status() would have the same number of Blockstore reads either way.)

If we decide not to support reads of the old-format keys, this whole thing gets a lot simpler. We'd probably want to keep the iter_filtered() method and some sort of fallible try_index() method, but that's about it for the plumbing.
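
For reference, a minimal sketch of what a fallible try_index() plus an iter_filtered()-style helper could look like if only the new format were readable. The key layout and lengths are the same assumptions as in the PR-description sketch, and the real signatures in blockstore_db.rs will differ:

```rust
use std::convert::TryInto;

type Slot = u64;
type Signature = [u8; 64];

const CURRENT_KEY_LEN: usize = 64 + 8; // assumed (signature, slot) layout

// Decode a current-format key; signal (rather than panic on) anything else,
// such as an old 80-byte key.
fn try_index(key: &[u8]) -> Option<(Signature, Slot)> {
    if key.len() != CURRENT_KEY_LEN {
        // Deprecated (or corrupt) key: let the caller skip it or fall back.
        return None;
    }
    let signature: Signature = key[..64].try_into().ok()?;
    let slot = Slot::from_be_bytes(key[64..72].try_into().ok()?);
    Some((signature, slot))
}

// An iter_filtered()-style helper then simply drops keys that fail to decode
// instead of aborting the whole iteration.
fn iter_filtered<'a>(
    raw_keys: impl Iterator<Item = &'a [u8]> + 'a,
) -> impl Iterator<Item = (Signature, Slot)> + 'a {
    raw_keys.filter_map(try_index)
}

fn main() {
    let mut good = vec![9u8; 64];
    good.extend_from_slice(&42u64.to_be_bytes());
    let stale = vec![0u8; 80]; // old-format key, silently ignored
    let keys: Vec<&[u8]> = vec![good.as_slice(), stale.as_slice()];

    let decoded: Vec<_> = iter_filtered(keys.into_iter()).collect();
    assert_eq!(decoded.len(), 1);
    assert_eq!(decoded[0].1, 42);
}
```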

@CriesofCarrots CriesofCarrots added CI Pull Request is ready to enter CI and removed noCI Suppress CI on this Pull Request labels Sep 29, 2023
@solana-grimes solana-grimes removed the CI Pull Request is ready to enter CI label Sep 29, 2023
@CriesofCarrots (Contributor, Author) commented Sep 29, 2023

Update here: everything builds and tests pass. If we're going to do this, I should probably add tests for the fallback behavior in:

  • read_transaction_status()
  • get_transaction_status_with_counter()
  • get_confirmed_signatures_for_address2()

And that purges work with both deprecated and new indexes (a toy sketch of what that requires follows the list) in tests like:

  • test_purge_transaction_status_exact()
  • test_purge_special_columns_compaction_filter()
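
As a toy illustration of what "purges work with both indexes" has to mean: a slot purge must match old 80-byte keys as well as new 72-byte keys, otherwise stale entries linger. This is a self-contained sketch reusing the assumed key layouts from above, not the real Blockstore purge code:

```rust
use std::collections::BTreeMap;
use std::convert::TryInto;

const DEPRECATED_KEY_LEN: usize = 8 + 64 + 8; // (primary_index, signature, slot)
const CURRENT_KEY_LEN: usize = 64 + 8; // (signature, slot)

// Pull the slot out of either key format (assumed to be the trailing u64).
fn key_slot(key: &[u8]) -> Option<u64> {
    let slot_bytes = match key.len() {
        CURRENT_KEY_LEN | DEPRECATED_KEY_LEN => &key[key.len() - 8..],
        _ => return None,
    };
    Some(u64::from_be_bytes(slot_bytes.try_into().ok()?))
}

// Purge every entry, old- or new-format, whose slot falls in `slots`.
fn purge_slots(column: &mut BTreeMap<Vec<u8>, Vec<u8>>, slots: std::ops::RangeInclusive<u64>) {
    column.retain(|key, _| match key_slot(key) {
        Some(slot) => !slots.contains(&slot),
        None => true,
    });
}

fn main() {
    let mut column: BTreeMap<Vec<u8>, Vec<u8>> = BTreeMap::new();

    // One new-format and one old-format entry, both at slot 5.
    let mut new_key = vec![1u8; 64];
    new_key.extend_from_slice(&5u64.to_be_bytes());
    let mut old_key = vec![0u8; 8 + 64];
    old_key.extend_from_slice(&5u64.to_be_bytes());
    column.insert(new_key, vec![]);
    column.insert(old_key, vec![]);

    purge_slots(&mut column, 5..=5);
    assert!(column.is_empty(), "both key formats must be purged");
}
```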

codecov bot commented Sep 29, 2023

Codecov Report

Merging #33419 (5923d84) into master (1d91b60) will decrease coverage by 0.1%.
Report is 2 commits behind head on master.
The diff coverage is 92.8%.

@@            Coverage Diff            @@
##           master   #33419     +/-   ##
=========================================
- Coverage    81.7%    81.7%   -0.1%     
=========================================
  Files         807      807             
  Lines      218252   218172     -80     
=========================================
- Hits       178438   178354     -84     
- Misses      39814    39818      +4     

@CriesofCarrots (Contributor, Author) commented:

Blockstore::get_signatures_for_address2 is a hot mess. I'm leaning toward only supporting the new AddressSignatures column key in that method.

@steviez (Contributor) left a comment


In general, I think this PR is in a pretty good state and the logic makes sense to me. A few general questions:

  1. By supporting both key types in the column, we have backward compatibility for loading data from an old Blockstore. However, one other case that we've typically tried to support is a client version downgrade. Supposing this lands in v1.17, the following sequence could cause issues:
    • Upgrade the node to v1.17
    • Operate the node for a bit so that data is written to the Blockstore
    • Downgrade the node to v1.16
    • A query is issued that hits a key/value pair written by v1.17; the v1.16 software doesn't know how to handle it and panics
  2. For a node that is hooked up with bigtable, the order of operations for read attempts for transaction status is:
    1. Look for new key format
    2. Look for old key format
    3. Hit bigtable
  3. For the above, I'm wondering if there is a way we can detect that all of the entries with the old format are gone. The benefit would be that we don't issue a second read locally, and either go straight to bigtable or, if there is no bigtable, just return a miss without performing extra I/O.
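
A tiny sketch of the lookup order in point 2, with closures standing in for the real Blockstore and bigtable reads (illustrative only; the actual methods and types differ):

```rust
struct TxStatus {
    slot: u64,
}

fn get_status(
    read_current: impl Fn() -> Option<TxStatus>,    // 1. new key format
    read_deprecated: impl Fn() -> Option<TxStatus>, // 2. old key format
    read_bigtable: impl Fn() -> Option<TxStatus>,   // 3. long-term store
) -> Option<TxStatus> {
    read_current()
        .or_else(read_deprecated)
        .or_else(read_bigtable)
}

fn main() {
    // Only the deprecated-format read hits; the bigtable closure never runs.
    let status = get_status(
        || None,
        || Some(TxStatus { slot: 7 }),
        || -> Option<TxStatus> { unreachable!("bigtable should not be queried") },
    );
    assert_eq!(status.map(|s| s.slot), Some(7));
}
```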

@CriesofCarrots (Contributor, Author) commented:

1. By supporting both key types in the column, we have backward compatibility for loading data from an old `Blockstore`. However, one other case that we've typically tried to support is a client version downgrade.

Yeah, that's a great question; I've been thinking about this a bit. One approach we could take is to separate out all the code supporting reads (and purges) of both key formats in the special columns and backport that (to v1.16, in your example), and then merge just the code that writes the new format into the later minor release (v1.17 in your example).

There are options that would involve less code in the backport, i.e. updating the various read methods to just not panic on a different key format, or to not panic on the specific new format. Let's chat about what makes the most sense.

3. For the above, I'm wondering if there is a way we can detect that all of the entries with the old format are gone. The benefit would be that we don't issue a second read locally, and either go straight to bigtable or, if there is no bigtable, just return a miss without performing extra I/O.

Currently, I think the only way to determine that the old-format entries are irrelevant is to do a read of the transaction_status_index_cf entries and look at their max_slot values. We could instead keep something in memory in Blockstore to track this and avoid the db read (it could replace Blockstore::active_transaction_status_index(), as that should be unneeded going forward).
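
A hedged sketch of that idea: if the highest slot covered by old-format entries is known (cached in memory or read once), the deprecated-key read can be skipped entirely for anything newer. The names below are illustrative, not the actual Blockstore fields or methods:

```rust
// `Meta` stands in for whatever value the special column stores.
type Meta = u64;

fn get_status_with_fallback(
    target_slot: u64,
    // Highest slot present in old-format entries; None once they are purged.
    highest_deprecated_slot: Option<u64>,
    read_current: impl Fn() -> Option<Meta>,
    read_deprecated: impl Fn() -> Option<Meta>,
) -> Option<Meta> {
    read_current().or_else(|| match highest_deprecated_slot {
        // Old entries could still cover this slot; pay for the second read.
        Some(max_slot) if target_slot <= max_slot => read_deprecated(),
        // Old entries are gone, or all older than the target: skip the I/O.
        _ => None,
    })
}

fn main() {
    // Old-format data only reaches slot 100, so a query at slot 200 never
    // touches the deprecated keys at all.
    let miss = get_status_with_fallback(
        200,
        Some(100),
        || None,
        || -> Option<Meta> { unreachable!("deprecated read should be skipped") },
    );
    assert!(miss.is_none());

    // A query at slot 90 still falls back and finds the old-format entry.
    let hit = get_status_with_fallback(90, Some(100), || None, || Some(7));
    assert_eq!(hit, Some(7));
}
```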

Incidentally, I'm realizing the max_slot assumptions may be wrong in the case of a validator that upgrades and then downgrades again. Just a note to us to spend a little extra time thinking this through.

@CriesofCarrots CriesofCarrots marked this pull request as ready for review October 9, 2023 17:10
@steviez (Contributor) left a comment


Good amount of back and forth, but I think we made it!

To leave a paper trail: @CriesofCarrots is going to prep a backport of the read functionality for v1.17. The approach is described in more detail here.

@CriesofCarrots (Contributor, Author) commented:

@CriesofCarrots is going to prep a backport of the read functionality for v1.17

Here is said backport: #33617
