
feat(bandwidth_scheduler) - generate bandwidth requests based on receipts in outgoing buffers v2.0 #12464

Open · wants to merge 45 commits into master
Conversation

jancionear (Contributor):

This is a rewrite of #12375, following the redesign that we discussed offline.

Main changes

  • Use a TrieQueue instead of a single value with all of the receipt groups
  • ReceiptGroup now contains both size and gas
  • Keep information about total gas, size and number of receipts for every outgoing buffer. Useful for resharding.
  • Don't read the first receipt when generating bandwidth requests. Doing so doesn't play well with witness size limits: reading the first receipt could add up to 4 MB of storage proof, and we would need to do it for every outgoing buffer.
  • Groups now have an upper size bound, which ensures that the size of the first group is at most max_receipt_size (as long as each individual receipt is at most max_receipt_size); see the sketch after this list.
  • Use the version of StateStoredReceipt to determine whether the metadata should be updated, instead of adding fields which keep the version of the metadata.
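To make the group-splitting rule concrete, here is a minimal sketch of a receipt group that tracks both size and gas, plus the upper-bound check. The names (ReceiptGroup, ReceiptGroupsConfig, should_start_new_group) come from the discussion below, but the exact fields and signatures here are assumptions, not the PR's code:

```rust
/// Hedged sketch, not the PR's actual definitions.
pub struct ReceiptGroup {
    /// Total size of the receipts aggregated in this group, in bytes.
    pub size: u64,
    /// Total gas attached to the receipts aggregated in this group.
    pub gas: u128,
}

pub struct ReceiptGroupsConfig {
    /// Upper bound on group size, chosen to be at most max_receipt_size.
    pub size_upper_bound: u64,
}

impl ReceiptGroupsConfig {
    /// Start a new group if appending the receipt would push the current
    /// group above the size bound. A single receipt larger than the bound
    /// still ends up alone in its own group.
    pub fn should_start_new_group(&self, last_group: &ReceiptGroup, receipt_size: u64) -> bool {
        last_group.size + receipt_size > self.size_upper_bound
    }
}
```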

The code is ready but untested; I need to write a bunch of tests. You can take a look at the code in the meantime - it'll be easier to apply any changes before all the tests are written.

The PR is meant to be reviewed commit-by-commit.

Currently TrieQueue only allows storing
receipts. I would like to store other things
in a TrieQueue, so let's make the item type generic.
Generic items will be needed to load the OutgoingMetadatas
in the next commit. I put these changes in a separate
commit for clarity.
TrieQueueIterator returns Result<Item, StorageError>,
so it can't be passed to the function that creates a bandwidth request.
Modify the function to take iterators over results instead
of iterators over values.
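As a rough illustration of these two commits (a sketch under assumptions, not the PR's code): the item type becomes an associated type bounded by the Borsh traits, and downstream functions accept iterators over Results so storage errors can propagate. StorageError here is a stand-in type:

```rust
use borsh::{BorshDeserialize, BorshSerialize};

/// Stand-in error type for illustration.
#[derive(Debug)]
pub struct StorageError;

/// Simplified version of the generic queue trait: the stored item is now an
/// associated type instead of being hard-coded to receipts.
pub trait TrieQueue {
    type Item<'a>: BorshDeserialize + BorshSerialize;
}

/// A function building a bandwidth request can now consume the fallible
/// iterator produced by TrieQueueIterator directly. Here it just sums the
/// group sizes and propagates the first storage error it encounters.
pub fn total_requested_size(
    group_sizes: impl Iterator<Item = Result<u64, StorageError>>,
) -> Result<u64, StorageError> {
    let mut total = 0u64;
    for size in group_sizes {
        total = total.saturating_add(size?);
    }
    Ok(total)
}
```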
@wacban (Contributor) left a comment:

Just a high-level check, but looks good to me. I'll leave the proper review to @shreyan-gupta.

Comment on lines +118 to +123
/// The V1 of StateStoredReceipt.
/// The data is the same as in V0.
/// Outgoing buffer metadata is updated only for versions V1 and higher.
/// The receipts start being stored as V1 after the protocol change that introduced
/// outgoing buffer metadata. Having a separate variant makes it clear whether the
/// outgoing buffer metadata should be updated when a receipt is stored/removed.
Contributor:

Nice, I think it's pretty neat!
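For readers skimming the thread, a hedged sketch of the idea in the doc comment above; the enum shape is simplified and only the decision logic is shown (should_update_outgoing_metadatas is the method name used later in this thread):

```rust
// Simplified sketch, not the actual nearcore types: the stored variant itself
// tells the runtime whether outgoing buffer metadata must be kept in sync.
enum StateStoredReceipt {
    V0, // stored before the protocol change; metadata is not maintained
    V1, // same data as V0, but metadata updates apply
}

impl StateStoredReceipt {
    fn should_update_outgoing_metadatas(&self) -> bool {
        matches!(self, StateStoredReceipt::V1)
    }
}
```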

Comment on lines 60 to 64
/// Stores `ReceiptGroupsQueueData` for the receipt groups queue
/// which corresponds to the buffered receipts to `receiver_shard`.
pub const OUTGOING_RECEIPT_GROUPS_QUEUE_DATA: u8 = 16;
/// A single item of `ReceiptGroupsQueue`. Values are of type `ReceiptGroup`.
pub const OUTGOING_RECEIPT_GROUPS_QUEUE_ITEM: u8 = 17;
Contributor:

Naming-wise I think it would be clearer to call these BUFFERED_RECEIPT_GROUPS_* here, just to make it clear that this is connected to the BUFFERED_RECEIPT* columns.

I like "queue data" and "queue item" even though they don't follow the existing convention of _INDICES plus just the type - I think it's fine, as the queue data stores more than just indices.

@jancionear (Author), Nov 15, 2024:

I can change it to BUFFERED_RECEIPT_GROUPS_QUEUE_DATA/ITEM. To me "outgoing" is more specific than "buffered": buffering is a generic action that could happen in many different contexts, while "outgoing" clearly refers to the outgoing receipts. But I don't care that much, buffered is also fine.
Ideally it would be "BUFFERED_OUTGOING" x.x

Maybe Shreyan also has an opinion here? I'll wait for his review before renaming it.

Contributor:

In general I agree that outgoing would be better, but now that we already have BUFFERED I'd rather keep things consistent. Yeah, it's just a nit, I'm not too attached to either name.

Contributor:

I don't mind either prefix, BUFFERED_RECEIPT_GROUP_* or OUTGOING_RECEIPT_GROUP_*, but I would still like to stick with the *_INDICES suffix as it gives a clear indication of what to expect there.

Looking at _INDICES I can expect there to exist a TrieQueue etc. etc. but looking directly at _DATA and _ITEM doesn't tell me much. I understand we are storing a bit more than just the indices in our case but the primary purpose is still to track indices. This is me coming from figuring out resharding, sitting and staring at all the trie key names and trying to understand what they mean...

Maybe OUTGOING_RECEIPT_GROUP_INDICES and OUTGOING_RECEIPT_GROUP?

jancionear (Author):

I don't really like INDICES; it misrepresents what the data really is. There is a comment pointing to ReceiptGroupsQueueData, so if someone is interested in what exactly the data is, they can quickly go there and take a look.

I'll change it to BUFFERED_RECEIPT_GROUPS_QUEUE_DATA/ITEM. I think Wac liked it as well, so that's 2 votes for this version.

Comment on lines -62 to -64

// NOTE: NEW_COLUMN = 15 will be the last unique nibble in the trie!
// Consider demultiplexing on 15 and using 2-nibble prefixes.
Contributor:

thank you

Comment on lines 272 to 279
TrieKey::OutgoingReceiptGroupsQueueData { .. } => {
col::OUTGOING_RECEIPT_GROUPS_QUEUE_DATA.len() + std::mem::size_of::<u64>()
}
TrieKey::OutgoingReceiptGroupsQueueItem { index, .. } => {
col::OUTGOING_RECEIPT_GROUPS_QUEUE_ITEM.len()
+ std::mem::size_of::<u64>()
+ std::mem::size_of_val(index)
}
Contributor:

mini nit: Maybe size_of::<ShardId>()?

jancionear (Author):

I like size_of_val more than size_of: it's impossible to make a mistake, since it always calculates the actual size of the field. I only used size_of for the shard_id because I wasn't sure how size_of_val would behave with the ShardId wrapper.

Contributor:

Ah, sorry, the formatting was messed up; I wanted to suggest size_of::<ShardId>(), but the angle brackets made ShardId disappear.
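A tiny illustration of the trade-off discussed above; the ShardId newtype here is a stand-in for the real wrapper type, not near_primitives' ShardId:

```rust
// Illustration only: a stand-in newtype.
#[derive(Clone, Copy)]
struct ShardId(u64);

fn main() {
    let index: u64 = 7;
    let shard_id = ShardId(3);

    // size_of_val measures the stored value, so it keeps working even if the
    // field's type changes later.
    assert_eq!(std::mem::size_of_val(&index), 8);

    // size_of::<T>() requires spelling out the type; for a newtype wrapping a
    // u64, both forms report 8 bytes.
    assert_eq!(std::mem::size_of::<ShardId>(), 8);
    assert_eq!(std::mem::size_of_val(&shard_id), 8);
}
```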

pub fn default_config() -> Self {
// TODO(bandwidth_scheduler) - put in runtime config
ReceiptGroupsConfig {
size_lower_bound: ByteSize::kb(90),
Contributor:

Should the lower bound be the same as the diff between two consecutive requests? Is that 100kB?

@jancionear (Author), Nov 15, 2024:

The diff between two consecutive values that can be requested is ~111 kB.
All groups will be at least as large as lower_bound, and that should be smaller than the diff between two values. I'm not sure what the ideal value is 🤔 I guess the smaller we make the groups, the better the precision of the request. I could reduce the group size to something like 50 kB.
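One hedged way to see where a spacing of roughly this size could come from (this is an assumption about the request granularity, not taken from the PR): if the requestable values were spread evenly between base_bandwidth and max_shard_bandwidth over roughly 40 steps, the spacing works out to about 110 kB:

```rust
// Back-of-the-envelope check under assumed parameters; the number of
// requestable values (40) is a guess for illustration, not the PR's constant.
fn main() {
    let base_bandwidth: u64 = 100_000; // values as in the scheduler params from the test log below
    let max_shard_bandwidth: u64 = 4_500_000;
    let assumed_num_values: u64 = 40;

    let spacing = (max_shard_bandwidth - base_bandwidth) / assumed_num_values;
    println!("approximate spacing between requestable values: {spacing} bytes");
    // Prints 110000, the same order as the ~111 kB mentioned above.
}
```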

@shreyan-gupta (Contributor) left a comment:

Looks great overall! Thank you so much!

@@ -90,6 +92,10 @@ impl ShardTries {
TrieUpdate::new(self.get_view_trie_for_shard(shard_uid, state_root))
}

pub fn get_shard_uids(&self) -> &[ShardUId] {
Contributor:

Where are we using this function? Is it not possible to use the shard layout to get the shard_uids?

jancionear (Author):

It's used by the runtime params estimator, in estimator::apply_action_receipt.
The EstimatorContext::testbed function manually creates data for a shard; there's no epoch manager or shard layout.
It would be possible to refactor the estimator to use a shard layout, but the easiest way was to get the list of shards from the list of tries per shard. AFAIU there's only one shard at the moment, so it's not a big issue.

Contributor:

Yeah, if it's only used in the estimator, let's use some other way of getting the shard ids. The estimator isn't even used currently, so any solution, including just hardcoding the shard_ids, should be fine.

I'm afraid that sometime in the future people will see this get_shard_uids function in tries and start using it instead of the proper method of getting them from the shard layout.

@@ -60,6 +58,8 @@ pub struct OutgoingReceiptBuffer<'parent> {
/// queue items. Based on that, a common push(), pop(), len(), and iter()
/// implementation is provided as trait default implementation.
pub trait TrieQueue {
type Item<'a>: BorshDeserialize + BorshSerialize;
Contributor:

Could we add a quick todo comment above to remove the 'a lifetime indicator once we remove Cow from receipt?

@@ -189,6 +252,8 @@ impl DelayedReceiptQueue {
}

impl TrieQueue for DelayedReceiptQueue {
type Item<'a> = ReceiptOrStateStoredReceipt<'a>;
Contributor:

For DelayedReceipts, we don't really store any metadata right? Could we change the type of Item to just Receipt?

@jancionear (Author), Nov 18, 2024:

The delayed receipt queue also uses state stored receipts. (See DelayedReceiptQueueWrapper). We need to update congestion info every time a receipt is added/removed from the delayed queue, and the stored metadata makes sure that the metadata stays consistent across protocol changes.

core/store/src/trie/receipts_column_helper.rs (outdated, resolved)
/// receipts.
pub struct ReceiptIterator<'a> {
/// Read-only iterator over items stored in a TrieQueue.
pub struct TrieQueueIterator<'a, Queue: TrieQueue> {
Contributor:

Nice! Love the generalization!

/// Corresponds to receipts stored in the outgoing buffer to this shard.
receiver_shard: ShardId,
/// Persistent data, stored in the trie.
data: ReceiptGroupsQueueDataV0,
Contributor:

Why aren't we storing ReceiptGroupsQueueData directly?

jancionear (Author):

It's easier to work with ReceiptGroupsQueueDataV0, as it's a normal struct with fields that can be accessed. Using the enum would require decomposing it every time or adding access methods :/
The code can be reorganised in the future if we end up needing to use DataV1.
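A sketch of the pattern being described, with assumed field names (the real ReceiptGroupsQueueDataV0 may hold different fields): the plain V0 struct is what the code works with in memory, while the versioned enum exists mainly at the (de)serialization boundary:

```rust
use borsh::{BorshDeserialize, BorshSerialize};

/// Hypothetical sketch of the versioned-data pattern under discussion.
#[derive(BorshSerialize, BorshDeserialize, Default, Debug)]
pub struct ReceiptGroupsQueueDataV0 {
    pub first_index: u64,
    pub next_available_index: u64,
    pub total_size: u64,
    pub total_gas: u128,
    pub total_receipts_num: u64,
}

#[derive(BorshSerialize, BorshDeserialize, Debug)]
pub enum ReceiptGroupsQueueData {
    V0(ReceiptGroupsQueueDataV0),
}

impl ReceiptGroupsQueueData {
    /// Working with the enum requires decomposing it (or adding accessors),
    /// which is why the in-memory struct keeps the V0 form directly.
    pub fn into_v0(self) -> ReceiptGroupsQueueDataV0 {
        match self {
            ReceiptGroupsQueueData::V0(data) => data,
        }
    }
}
```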

/// to determine the size and structure of receipts in the outgoing buffer and make
/// bandwidth requests based on them.
#[derive(Debug)]
pub struct ReceiptGroupsQueue {
Contributor:

This struct is used only within this file. Could we remove pub from all functions and declarations?

@jancionear (Author), Nov 18, 2024:

It's also used in the receipt sink. When generating bandwidth requests, ReceiptSink fetches the ReceiptGroupsQueue for some shard and generates the bandwidth requests based on the groups in the queue.
In the previous iteration of the PR I had a separate struct that served as a public interface for outgoing metadata to some shard, and the queue could be hidden in a private module, but Wac complained that it was redundant so I removed it 😅
#12375 (comment)

Contributor:

Apologies, I had done just a quick "find all references"

// Take out the last group from the queue and inspect it.
match self.pop_back(state_update)? {
Some(mut last_group) => {
if self.groups_config.should_start_new_group(&last_group, receipt_size, receipt_gas)
Contributor:

It seems like overkill to store groups_config as part of ReceiptGroupsQueue given that it's only used here. I would consider passing it through the update_on_receipt_pushed function, especially since it doesn't need to be publicly exposed.

jancionear (Author):

It's convenient to have it there, and I think it makes sense - a receipt group queue operates using some group config. It's nice for ergonomics: you don't need to pass the config every time. I like it and I'd prefer to keep it this way.
The config is not saved to the state if that's what you're worried about.

Contributor:

I agree with @shreyan-gupta here, it doesn't seem like the right place for configuration. How bad is it to pass it around?

Also how can the configuration be updated in the future? Currently it's set in the ctor.

jancionear (Author):

> I agree with @shreyan-gupta here, it doesn't seem like the right place for configuration. How bad is it to pass it around?

Eh, ok, I'll try to change it to a parameter and see how that goes. But IMO there's nothing wrong with keeping it inside the struct. I see it as the struct being configured in the constructor.

> I agree with @shreyan-gupta here, it doesn't seem like the right place for configuration. How bad is it to pass it around?
>
> Also how can the configuration be updated in the future? Currently it's set in the ctor.

The configuration is not saved. It's generated and passed to the ctor for every new block. Generating a new configuration is enough to update it; it'll be immediately applied to new groups.
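For context, the two shapes being debated, sketched with hypothetical signatures (neither block is the PR's code):

```rust
// Hypothetical sketch of both options; ReceiptGroupsConfig stands in for a
// plain value type carrying the group size/gas bounds.
struct ReceiptGroupsConfig;

// Option 1 (current PR): the config is injected in the constructor and lives
// in the struct for the duration of one block's processing.
struct QueueWithConfig {
    groups_config: ReceiptGroupsConfig,
}

impl QueueWithConfig {
    fn new(groups_config: ReceiptGroupsConfig) -> Self {
        Self { groups_config }
    }

    fn update_on_receipt_pushed(&mut self, _receipt_size: u64, _receipt_gas: u64) {
        // would consult self.groups_config here
    }
}

// Option 2 (review suggestion): keep the queue config-free and pass the
// config explicitly at each call site.
struct QueueWithoutConfig;

impl QueueWithoutConfig {
    fn update_on_receipt_pushed(
        &mut self,
        _receipt_size: u64,
        _receipt_gas: u64,
        _groups_config: &ReceiptGroupsConfig,
    ) {
        // would consult the passed-in config here
    }
}
```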

/// Returns empty metadata for protocol versions which don't support metadata.
/// Make sure to pass shard ids for every shard that has receipts in the outgoing buffer,
/// otherwise the metadata for it will be overwritten with empty metadata.
pub fn load(
Contributor:

So, question about all the load functions that we have here...

  • Can we structure the code in such a way that the load functions are only called with BandwidthScheduler feature enabled?
  • In which case, can we get rid of all logic that assumes optional loading? Examples are update_* functions where we do .entry().or_insert_with() and ReceiptGroupsQueue::load() returning Option<Self>?
  • Initialization on enabling BandwidthScheduler feature could be handled by checking if trie key exists and returning default.

It becomes hard to think about all cases where we may or may not be returning/initializing a value, considering we don't have consistent checks for BandwidthScheduler feature.

jancionear (Author):

The code works the way it does because:

  • There are already some outgoing buffers, and we need to create matching metadata for them
  • During resharding a new shard could suddenly appear, and we need to initialize the metadata for it
  • We can't really scan the trie for existing outgoing buffers because the TrieAccess trait doesn't have an iterator

The easiest way to achieve this was to initialize the metadata when a receipt for it appears.

> Can we structure the code in such a way that the load functions are only called with BandwidthScheduler feature enabled?

That's the case right now. OutgoingMetadatas::load doesn't do anything when the feature is not enabled. And nothing is pushed/popped from the metadata before the protocol change.

> In which case, can we get rid of all logic that assumes optional loading? Examples are update_* functions where we do .entry().or_insert_with() and ReceiptGroupsQueue::load() returning Option<Self>?

During resharding there could be a new shard, and the or_insert_with is meant to handle this case.

> Initialization on enabling BandwidthScheduler feature could be handled by checking if trie key exists and returning default.

That is kind of what's happening: OutgoingMetadatas::load tries to load the metadata, and if the trie doesn't contain the key, it creates a new one.

But I'm open to other designs as well, it's a bit convoluted :/
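A hedged sketch of the lazy-initialization pattern described here: metadata for a shard is created the first time a receipt for that shard shows up, which also covers shards that appear during resharding. The types below are illustrative stand-ins, not the real OutgoingMetadatas implementation:

```rust
use std::collections::HashMap;

type ShardId = u64;

#[derive(Default)]
struct ReceiptGroupsQueue {
    total_size: u64,
    total_gas: u128,
}

#[derive(Default)]
struct OutgoingMetadatas {
    metadatas: HashMap<ShardId, ReceiptGroupsQueue>,
}

impl OutgoingMetadatas {
    fn update_on_receipt_pushed(&mut self, shard: ShardId, size: u64, gas: u128) {
        // Create empty metadata on first use instead of scanning the trie
        // for existing outgoing buffers up front.
        let queue = self.metadatas.entry(shard).or_insert_with(ReceiptGroupsQueue::default);
        queue.total_size += size;
        queue.total_gas += gas;
    }
}
```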

Contributor:

> That's the case right now. OutgoingMetadatas::load doesn't do anything when the feature is not enabled. And nothing is pushed/popped from the metadata before the protocol change.

Oh I see, the calls to update_on_receipt_pushed and update_on_receipt_popped are guarded by receipt.should_update_outgoing_metadatas(), hmm... That's not at all intuitive.

I was about to ask how the initialization of OutgoingMetadatas works on protocol change, but I think I see what's going on. So, I guess we can continue to process V0 stored receipts (which don't update metadata) while creating and processing V1 stored receipts after the protocol upgrade?

jancionear and others added 7 commits November 18, 2024 14:09
Co-authored-by: Shreyan Gupta <shreyan.gupta96@gmail.com> (×7)
pub gas_lower_bound: Gas,
/// All receipt groups aim to have gas below this threshold.
pub gas_upper_bound: Gas,
}
jancionear (Author):

I spent some more time thinking about the optimal lower and upper bound sizes for receipt groups, and I think I'll:

  • Remove the lower bound
  • Reduce the upper bound to something like 100kB

This means that a new group will be started if adding a new receipt to the group would cause its size to go above 100kB.

As a result all groups will be below 100kB, except for the groups that contain a single receipt with size above 100kB.

AFAIU having groups like this will produce optimal bandwidth requests - identical to the ones that we would produce if we walked over all of the receipts individually.

The diff between two values that can be requested is ~110kB. If all groups were below 100kB we would always produce optimal requests - each consecutive group would either request a value or move to the next value, just like individual receipts.

When there are receipts larger than 100kB we will naturally skip some of the values, but this is identical to what would happen if we operated on individual receipts. The next group smaller than 100kB will either request the same value or move to the next one, just like individual receipts.

A bit handwavy, but I feel that it works. I can make some drawings if needed.

Contributor:

Yeah, makes sense. I honestly had the same thought about why we were maintaining both lower and upper bounds when we could do with just one, but I didn't raise it in the original review.

One thing that wasn't clear to me was what the expected behavior should be in the following scenario:

  • 99 receipts of 1 kB each
  • 1 receipt of 500 kB

Should the ideal/expected behavior be case A or case B?

Case A
Group 1: 99 receipts, 99 kB total
Group 2: 1 receipt, 500 kB

Case B
Group 1: 100 receipts, 599 kB total

I would imagine case A is probably more ideal than case B for bandwidth scheduling, but it's harder to "code it up" compared to a simple upper bound impl?

Contributor:

Which is why I had originally not questioned the upper and lower bounds.

jancionear (Author):

My initial design was (B). The main reason was that I wanted the guarantee that every group is at least 90 kB, to be able to guarantee that the list of groups stays small enough. This was important when all receipt groups were kept in one trie value. Now that doesn't really matter.

(B) keeps the number of groups lower, but it has serious disadvantages.
The first disadvantage is that a group can get larger than max_receipt_size, and there's no guarantee that the bandwidth scheduler will be able to grant that much bandwidth. To deal with this I initially read the first receipt and subtracted its size from the first group, but that causes witness size issues because the first receipt could be big. So instead I added the size_upper_bound to keep the size of receipt groups below max_receipt_size.
The second disadvantage is that it doesn't produce optimal bandwidth requests. Smaller groups would be more precise.

In the end I moved to (A) because it doesn't have any of these disadvantages. The number of groups can be a bit larger, but that shouldn't matter when it's a TrieQueue.
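A small sketch of the resulting behavior (assumed helper, not the PR's code): with only an upper bound, the 99×1 kB + 1×500 kB example from above splits as in case A, with the oversized receipt alone in its own group:

```rust
// A new group starts whenever appending a receipt would push the current
// group above the bound, so a single oversized receipt always ends up alone.
fn group_sizes(receipt_sizes: &[u64], upper_bound: u64) -> Vec<u64> {
    let mut groups: Vec<u64> = Vec::new();
    for &size in receipt_sizes {
        match groups.last_mut() {
            Some(last) if *last + size <= upper_bound => *last += size,
            _ => groups.push(size),
        }
    }
    groups
}

fn main() {
    // 99 receipts of 1 kB followed by one 500 kB receipt, with a 100 kB bound.
    let mut sizes = vec![1_000u64; 99];
    sizes.push(500_000);
    let groups = group_sizes(&sizes, 100_000);
    assert_eq!(groups, vec![99_000, 500_000]); // case A: [99 kB, 500 kB]
}
```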

Contributor:

Sounds good! :)

@jancionear (Author):

I added a bunch of tests.

They actually found a bug - the updated ReceiptGroupsQueueData was not saved to the trie when removing a receipt from the first group in the queue. This was because modify_first didn't write the indices when the first item was modified but not removed from the queue. That makes sense on its own (the queue indices don't change in that case), but it didn't work with the ReceiptGroupsQueue, which saves its data (containing the total gas/size) when write_indices is called.
Eh, using write_indices for writing data is a bit ugly :/
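A hedged, self-contained sketch of the failure mode (the types and methods here are stand-ins, not the real TrieQueue code):

```rust
// Stand-in types; not the real TrieQueue / ReceiptGroupsQueue code.
struct QueueData {
    first_index: u64,
    total_size: u64, // extra data that rides along with the indices record
}

struct Queue {
    data: QueueData,
    persisted: bool, // whether `data` has been written back to the trie
}

impl Queue {
    /// Buggy variant: only persist when the front element is popped, because
    /// the indices themselves don't change on a pure in-place modification.
    /// The piggybacked total_size update is then silently lost.
    fn on_front_modified_buggy(&mut self, new_total_size: u64, popped: bool) {
        self.data.total_size = new_total_size;
        if popped {
            self.data.first_index += 1;
            self.persisted = true; // write_indices would run here
        } else {
            self.persisted = false; // bug: total_size changed but is never written
        }
    }

    /// Fixed variant: persist whenever anything in `data` changed.
    fn on_front_modified_fixed(&mut self, new_total_size: u64, popped: bool) {
        self.data.total_size = new_total_size;
        if popped {
            self.data.first_index += 1;
        }
        self.persisted = true; // always write back
    }
}
```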

There's also a larger testloop test which spawns a workload that causes a lot of cross-shard receipts to be sent (and some of them buffered). The test inspects the buffered receipts, groups and generated bandwidth requests and makes sure that they look correct.
Getting it to work was a bit of a pain. The first reason is that I needed to add a lot of access keys, as one pair of (account, access_key) can only run one transaction at a time. The other reason is that it's pretty hard to achieve an interesting distribution of buffered receipts. Until now I thought about the outgoing buffer as a queue where new receipts are pushed at the end and receipts from the front are sent out. But this is actually not the case - the logic is "if I can send this out, do it, otherwise buffer the receipt". The result is that most of the receipts in the outgoing buffers are large - small receipts can usually be sent out immediately; it's the large ones that get buffered.
I tuned the test a bit and currently the buffered receipts look something like this:

Height: 10034
Current scheduler params: BandwidthSchedulerParams { base_bandwidth: 100000, max_shard_bandwidth: 4500000, max_receipt_size: 4194304, max_allowance: 4500000 }
Verifying bandwidth requests s2.v0 -> 0 
Buffered receipts: len: 27, total size: 3.9 MB, first_ten_sizes: [39055, 245311, 753, 3503, 236, 66562, 66404, 3460, 244868, 2167]
Verifying bandwidth requests s2.v0 -> 1 
Buffered receipts: len: 24, total size: 17.5 MB, first_ten_sizes: [224807, 4194304, 1727, 117005, 299615, 4194304, 106816, 413, 2262, 1296]
Verifying bandwidth requests s2.v0 -> 2 
Buffered receipts: len: 5, total size: 8.4 MB, first_ten_sizes: [236557, 1093098, 2696078, 153967, 4194304]
Verifying bandwidth requests s1.v0 -> 0 
Buffered receipts: len: 23, total size: 9.2 MB, first_ten_sizes: [291957, 154867, 1181, 928, 3473, 2499, 759, 2621, 3051, 140721]
Verifying bandwidth requests s1.v0 -> 1 
Buffered receipts: len: 1, total size: 4.2 MB, first_ten_sizes: [4194304]
Verifying bandwidth requests s1.v0 -> 2 
Buffered receipts: len: 31, total size: 6.2 MB, first_ten_sizes: [573857, 135912, 28894, 565, 2783, 134740, 159326, 219111, 630, 3012]
Verifying bandwidth requests s0.v0 -> 1 
Buffered receipts: len: 1, total size: 4.2 MB, first_ten_sizes: [4194304]

One more thing to try could be burning a lot of gas to cause a shard to become fully congested, so that small receipts would be buffered as well. But then the transactions would get rejected... Eh, maybe it's good enough.

codecov bot commented Nov 21, 2024:

Codecov Report

Attention: Patch coverage is 91.13697% with 99 lines in your changes missing coverage. Please review.

Project coverage is 70.06%. Comparing base (9e4933b) to head (78eb384).
Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| core/store/src/trie/receipts_column_helper.rs | 89.28% | 13 Missing and 8 partials ⚠️ |
| ...egration-tests/src/test_loop/utils/transactions.rs | 85.60% | 15 Missing and 4 partials ⚠️ |
| runtime/runtime/src/lib.rs | 32.14% | 18 Missing and 1 partial ⚠️ |
| core/primitives/src/bandwidth_scheduler.rs | 54.54% | 14 Missing and 1 partial ⚠️ |
| runtime/runtime/src/congestion_control.rs | 91.93% | 2 Missing and 8 partials ⚠️ |
| core/store/src/trie/outgoing_metadata.rs | 98.39% | 0 Missing and 8 partials ⚠️ |
| core/primitives/src/receipt.rs | 90.69% | 1 Missing and 3 partials ⚠️ |
| core/primitives/src/types.rs | 0.00% | 2 Missing ⚠️ |
| tools/state-viewer/src/congestion_control.rs | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #12464      +/-   ##
==========================================
+ Coverage   69.84%   70.06%   +0.21%     
==========================================
  Files         838      840       +2     
  Lines      169410   170225     +815     
  Branches   169410   170225     +815     
==========================================
+ Hits       118323   119264     +941     
+ Misses      45840    45805      -35     
+ Partials     5247     5156      -91     
| Flag | Coverage Δ |
|---|---|
| backward-compatibility | 0.16% <0.00%> (-0.01%) ⬇️ |
| db-migration | 0.16% <0.00%> (-0.01%) ⬇️ |
| genesis-check | 1.27% <0.00%> (-0.02%) ⬇️ |
| linux | 69.29% <77.17%> (+0.12%) ⬆️ |
| linux-nightly | 69.63% <90.86%> (+0.20%) ⬆️ |
| macos | 51.26% <75.73%> (+0.26%) ⬆️ |
| pytests | 1.58% <0.00%> (-0.02%) ⬇️ |
| sanity-checks | 1.39% <0.00%> (-0.02%) ⬇️ |
| unittests | 69.88% <91.13%> (+0.21%) ⬆️ |
| upgradability | 0.20% <0.00%> (-0.01%) ⬇️ |

@shreyan-gupta (Contributor) left a comment:

LGTM!

Just one minor comment for trie keys that I missed in previous reviews.

I skimmed through the tests and they look good, didn't go into too much detail.

@@ -338,6 +361,15 @@ impl TrieKey {
buf.extend(&index.to_le_bytes());
}
TrieKey::BandwidthSchedulerState => buf.push(col::BANDWIDTH_SCHEDULER_STATE),
TrieKey::BufferedReceiptGroupsQueueData { receiving_shard: receiver_shard } => {
buf.push(col::BUFFERED_RECEIPT_GROUPS_QUEUE_DATA);
buf.extend(&receiver_shard.to_le_bytes());
Contributor:

Oh I think the renaming might not have worked here and below, could you please change receiver_shard to receiving_shard?

Also, could we do something like above where we convert shard_id to u16 first?

jancionear (Author):

> Oh I think the renaming might not have worked here and below, could you please change receiver_shard to receiving_shard?

👍, fixed. I used rust-analyzer to rename the variable and it didn't work properly here.

> Also, could we do something like above where we convert shard_id to u16 first?

I don't like the idea of converting the shard id from u64 to u16 in TrieKey. It makes sense in bandwidth requests, where we care about optimizing space, but for trie keys I'd prefer to prioritize being future-proof. I don't think we gain much by optimizing it to u16 there.
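A sketch of the encoding trade-off being discussed (illustrative only, not the actual TrieKey code): keeping the shard id as u64 in the key costs 8 bytes instead of 2, but doesn't bake a 65535-shard limit into the trie layout.

```rust
fn push_shard_id_u64(buf: &mut Vec<u8>, shard_id: u64) {
    buf.extend(&shard_id.to_le_bytes()); // 8 bytes, future-proof
}

fn push_shard_id_u16(buf: &mut Vec<u8>, shard_id: u64) {
    let small: u16 = u16::try_from(shard_id).expect("shard id must fit in u16");
    buf.extend(&small.to_le_bytes()); // 2 bytes, saves space but caps shard ids
}
```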

/// Runs a transaction until completion.
/// Works in a non-blocking way which allows to run multiple transactions in parallel.
/// Is meant to use with run_until.
pub struct TransactionRunner {
Contributor:

noice!!

@jancionear (Author), Nov 21, 2024:

I'm not sure if that's the way it's supposed to be done in TestLoop, but I think it's a useful utility to have. It's better than spawning transactions and hoping for the best 😅

Contributor:

Yeah, I noticed the change making run_until take a FnMut, but I couldn't see too much of a downside other than the fact that you don't really know when the condition function is called within testloop.

For example, incrementing some counter within the condition FnMut might not be a great idea.
