
feat(bandwidth_scheduler) - generate bandwidth requests based on receipts in outgoing buffers v2.0 #12464

Open · wants to merge 45 commits into master
Conversation

jancionear (Contributor):

This is a rewrite of #12375, following the redesign that we discussed offline.

Main changes

  • Use a TrieQueue instead of a single value with all of the receipt groups
  • ReceiptGroup now contains both size and gas
  • Keep information about total gas, size and number of receipts for every outgoing buffer. Useful for resharding.
  • Don't read the first receipt when generating bandwidth requests. Doing so doesn't play well with witness size limits: reading the first receipt could add up to 4 MB of storage proof, and we would need to do it for every outgoing buffer.
  • Groups now have an upper size bound, which ensures that the size of the first group is at most max_receipt_size (as long as each individual receipt is at most max_receipt_size); see the sketch after this list.
  • Use the version of StateStoredReceipt to determine whether the metadata should be updated, instead of adding fields which keep the version of the metadata.
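To make the group-splitting rule concrete, here is a minimal sketch of a receipt group that tracks both size and gas, plus the upper-bound check. The names (ReceiptGroup, ReceiptGroupsConfig, should_start_new_group) come from the discussion below, but the exact fields and signatures here are assumptions, not the PR's code:

```rust
/// Hedged sketch, not the PR's actual definitions.
pub struct ReceiptGroup {
    /// Total size of the receipts aggregated in this group, in bytes.
    pub size: u64,
    /// Total gas attached to the receipts aggregated in this group.
    pub gas: u128,
}

pub struct ReceiptGroupsConfig {
    /// Upper bound on group size, chosen to be at most max_receipt_size.
    pub size_upper_bound: u64,
}

impl ReceiptGroupsConfig {
    /// Start a new group if appending the receipt would push the current
    /// group above the size bound. A single receipt larger than the bound
    /// still ends up alone in its own group.
    pub fn should_start_new_group(&self, last_group: &ReceiptGroup, receipt_size: u64) -> bool {
        last_group.size + receipt_size > self.size_upper_bound
    }
}
```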

The code is ready but untested; I need to write a bunch of tests. You can take a look at the code in the meantime - it'll be easier to apply any changes before all the tests are written.

The PR is meant to be reviewed commit-by-commit.

Currently TrieQueue only allows storing
receipts. I would like to store other things
in a TrieQueue, so let's make the item type generic.
Generic items will be needed to load the OutgoingMetadatas
in the next commit. I put these changes in a separate
commit for clarity.
TrieQueueIterator returns Result<Item, StorageError>,
so it can't be passed to the function that creates a bandwidth request.
Modify the function to take iterators over results instead
of iterators over values.
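As a rough illustration of these two commits (a sketch under assumptions, not the PR's code): the item type becomes an associated type bounded by the Borsh traits, and downstream functions accept iterators over Results so storage errors can propagate. StorageError here is a stand-in type:

```rust
use borsh::{BorshDeserialize, BorshSerialize};

/// Stand-in error type for illustration.
#[derive(Debug)]
pub struct StorageError;

/// Simplified version of the generic queue trait: the stored item is now an
/// associated type instead of being hard-coded to receipts.
pub trait TrieQueue {
    type Item<'a>: BorshDeserialize + BorshSerialize;
}

/// A function building a bandwidth request can now consume the fallible
/// iterator produced by TrieQueueIterator directly. Here it just sums the
/// group sizes and propagates the first storage error it encounters.
pub fn total_requested_size(
    group_sizes: impl Iterator<Item = Result<u64, StorageError>>,
) -> Result<u64, StorageError> {
    let mut total = 0u64;
    for size in group_sizes {
        total = total.saturating_add(size?);
    }
    Ok(total)
}
```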
@wacban (Contributor) left a comment:

Just a high-level check, but looks good to me. I'll leave the proper review to @shreyan-gupta.

Comment on lines +118 to +123
/// The V1 of StateStoredReceipt.
/// The data is the same as in V0.
/// Outgoing buffer metadata is updated only for versions V1 and higher.
/// The receipts start being stored as V1 after the protocol change that introduced
/// outgoing buffer metadata. Having a separate variant makes it clear whether the
/// outgoing buffer metadata should be updated when a receipt is stored/removed.
Contributor:

Nice, I think it's pretty neat!
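For readers skimming the thread, a hedged sketch of the idea in the doc comment above; the enum shape is simplified and only the decision logic is shown (should_update_outgoing_metadatas is the method name used later in this thread):

```rust
// Simplified sketch, not the actual nearcore types: the stored variant itself
// tells the runtime whether outgoing buffer metadata must be kept in sync.
enum StateStoredReceipt {
    V0, // stored before the protocol change; metadata is not maintained
    V1, // same data as V0, but metadata updates apply
}

impl StateStoredReceipt {
    fn should_update_outgoing_metadatas(&self) -> bool {
        matches!(self, StateStoredReceipt::V1)
    }
}
```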

Comment on lines 60 to 64
/// Stores `ReceiptGroupsQueueData` for the receipt groups queue
/// which corresponds to the buffered receipts to `receiver_shard`.
pub const OUTGOING_RECEIPT_GROUPS_QUEUE_DATA: u8 = 16;
/// A single item of `ReceiptGroupsQueue`. Values are of type `ReceiptGroup`.
pub const OUTGOING_RECEIPT_GROUPS_QUEUE_ITEM: u8 = 17;
Contributor:

Naming-wise I think it would be clearer to call these BUFFERED_RECEIPT_GROUPS_* here, just to make it clear that this is connected to the BUFFERED_RECEIPT* columns.

I like "queue data" and "queue item" even though they don't follow the existing convention of _INDICES plus just the type - I think it's fine, as the queue data stores more than just indices.

@jancionear (Author), Nov 15, 2024:

I can change it to BUFFERED_RECEIPT_GROUPS_QUEUE_DATA/ITEM. To me "outgoing" is more specific than "buffered": buffering is a generic action that could happen in many different contexts, while "outgoing" clearly refers to the outgoing receipts. But I don't care that much, buffered is also fine.
Ideally it would be "BUFFERED_OUTGOING" x.x

Maybe Shreyan also has an opinion here? I'll wait for his review before renaming it.

Contributor:

In general I agree that outgoing would be better, but now that we already have BUFFERED I'd rather keep things consistent. Yeah, it's just a nit, I'm not too attached to either name.

Contributor:

I don't mind either prefix, BUFFERED_RECEIPT_GROUP_* or OUTGOING_RECEIPT_GROUP_*, but I would still like to stick with the *_INDICES suffix as it gives a clear indication of what to expect there.

Looking at _INDICES I can expect there to exist a TrieQueue etc. etc. but looking directly at _DATA and _ITEM doesn't tell me much. I understand we are storing a bit more than just the indices in our case but the primary purpose is still to track indices. This is me coming from figuring out resharding, sitting and staring at all the trie key names and trying to understand what they mean...

Maybe OUTGOING_RECEIPT_GROUP_INDICES and OUTGOING_RECEIPT_GROUP?

jancionear (Author):

I don't really like INDICES; it misrepresents what the data really is. There is a comment pointing to ReceiptGroupsQueueData, so if someone is interested in what exactly the data is, they can quickly go there and take a look.

I'll change it to BUFFERED_RECEIPT_GROUPS_QUEUE_DATA/ITEM. I think Wac liked it as well, so that's 2 votes for this version.

Comment on lines -62 to -64

// NOTE: NEW_COLUMN = 15 will be the last unique nibble in the trie!
// Consider demultiplexing on 15 and using 2-nibble prefixes.
Contributor:

thank you

Comment on lines 272 to 279
TrieKey::OutgoingReceiptGroupsQueueData { .. } => {
col::OUTGOING_RECEIPT_GROUPS_QUEUE_DATA.len() + std::mem::size_of::<u64>()
}
TrieKey::OutgoingReceiptGroupsQueueItem { index, .. } => {
col::OUTGOING_RECEIPT_GROUPS_QUEUE_ITEM.len()
+ std::mem::size_of::<u64>()
+ std::mem::size_of_val(index)
}
Contributor:

mini nit: Maybe size_of::<ShardId>()?

jancionear (Author):

I like size_of_val more than size_of: it's impossible to make a mistake, since it always calculates the actual size of the field. I only used size_of for the shard_id because I wasn't sure how size_of_val would behave with the ShardId wrapper.

Contributor:

Ah, sorry, the formatting was messed up; I wanted to suggest size_of::<ShardId>(), but the angle brackets made ShardId disappear.
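A tiny illustration of the trade-off discussed above; the ShardId newtype here is a stand-in for the real wrapper type, not near_primitives' ShardId:

```rust
// Illustration only: a stand-in newtype.
#[derive(Clone, Copy)]
struct ShardId(u64);

fn main() {
    let index: u64 = 7;
    let shard_id = ShardId(3);

    // size_of_val measures the stored value, so it keeps working even if the
    // field's type changes later.
    assert_eq!(std::mem::size_of_val(&index), 8);

    // size_of::<T>() requires spelling out the type; for a newtype wrapping a
    // u64, both forms report 8 bytes.
    assert_eq!(std::mem::size_of::<ShardId>(), 8);
    assert_eq!(std::mem::size_of_val(&shard_id), 8);
}
```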

pub fn default_config() -> Self {
// TODO(bandwidth_scheduler) - put in runtime config
ReceiptGroupsConfig {
size_lower_bound: ByteSize::kb(90),
Contributor:

Should the lower bound be the same as the diff between two consecutive requests? Is that 100kB?

@jancionear (Author), Nov 15, 2024:

The diff between two consecutive values that can be requested is ~111 kB.
All groups will be at least as large as lower_bound, and that should be smaller than the diff between two values. I'm not sure what the ideal value is 🤔 I guess the smaller we make the groups, the better the precision of the request. I could reduce the group size to something like 50 kB.
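One hedged way to see where a spacing of roughly this size could come from (this is an assumption about the request granularity, not taken from the PR): if the requestable values were spread evenly between base_bandwidth and max_shard_bandwidth over roughly 40 steps, the spacing works out to about 110 kB:

```rust
// Back-of-the-envelope check under assumed parameters; the number of
// requestable values (40) is a guess for illustration, not the PR's constant.
fn main() {
    let base_bandwidth: u64 = 100_000; // values as in the scheduler params from the test log below
    let max_shard_bandwidth: u64 = 4_500_000;
    let assumed_num_values: u64 = 40;

    let spacing = (max_shard_bandwidth - base_bandwidth) / assumed_num_values;
    println!("approximate spacing between requestable values: {spacing} bytes");
    // Prints 110000, the same order as the ~111 kB mentioned above.
}
```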

@shreyan-gupta (Contributor) left a comment:

Looks great overall! Thank you so much!

@@ -90,6 +92,10 @@ impl ShardTries {
TrieUpdate::new(self.get_view_trie_for_shard(shard_uid, state_root))
}

pub fn get_shard_uids(&self) -> &[ShardUId] {
Contributor:

Where are we using this function? Is it not possible to use the shard layout to get the shard_uids?

jancionear (Author):

It's used by the runtime params estimator, in estimator::apply_action_receipt.
The EstimatorContext::testbed function manually creates data for a shard; there's no epoch manager or shard layout.
It would be possible to refactor the estimator to use a shard layout, but the easiest way was to get the list of shards from the list of tries per shard. AFAIU there's only one shard at the moment, so it's not a big issue.

Contributor:

Yeah, if it's only used in the estimator, let's use some other way of getting the shard ids. The estimator isn't even used currently, so any solution, including just hardcoding the shard_ids, should be fine.

I'm afraid that sometime in the future people will see this get_shard_uids function in tries and start using it instead of the proper method of getting them from the shard layout.

@@ -60,6 +58,8 @@ pub struct OutgoingReceiptBuffer<'parent> {
/// queue items. Based on that, a common push(), pop(), len(), and iter()
/// implementation is provided as trait default implementation.
pub trait TrieQueue {
type Item<'a>: BorshDeserialize + BorshSerialize;
Contributor:

Could we add a quick todo comment above to remove the 'a lifetime indicator once we remove Cow from receipt?

@@ -189,6 +252,8 @@ impl DelayedReceiptQueue {
}

impl TrieQueue for DelayedReceiptQueue {
type Item<'a> = ReceiptOrStateStoredReceipt<'a>;
Contributor:

For DelayedReceipts, we don't really store any metadata right? Could we change the type of Item to just Receipt?

@jancionear (Author), Nov 18, 2024:

The delayed receipt queue also uses state stored receipts. (See DelayedReceiptQueueWrapper). We need to update congestion info every time a receipt is added/removed from the delayed queue, and the stored metadata makes sure that the metadata stays consistent across protocol changes.

core/store/src/trie/receipts_column_helper.rs (outdated, resolved)
/// receipts.
pub struct ReceiptIterator<'a> {
/// Read-only iterator over items stored in a TrieQueue.
pub struct TrieQueueIterator<'a, Queue: TrieQueue> {
Contributor:

Nice! Love the generalization!

/// Corresponds to receipts stored in the outgoing buffer to this shard.
receiver_shard: ShardId,
/// Persistent data, stored in the trie.
data: ReceiptGroupsQueueDataV0,
Contributor:

Why aren't we storing ReceiptGroupsQueueData directly?

jancionear (Author):

It's easier to work with ReceiptGroupsQueueDataV0, as it's a normal struct with fields that can be accessed. Using the enum would require decomposing it every time or adding access methods :/
The code can be reorganised in the future if we end up needing to use DataV1.
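A sketch of the pattern being described, with assumed field names (the real ReceiptGroupsQueueDataV0 may hold different fields): the plain V0 struct is what the code works with in memory, while the versioned enum exists mainly at the (de)serialization boundary:

```rust
use borsh::{BorshDeserialize, BorshSerialize};

/// Hypothetical sketch of the versioned-data pattern under discussion.
#[derive(BorshSerialize, BorshDeserialize, Default, Debug)]
pub struct ReceiptGroupsQueueDataV0 {
    pub first_index: u64,
    pub next_available_index: u64,
    pub total_size: u64,
    pub total_gas: u128,
    pub total_receipts_num: u64,
}

#[derive(BorshSerialize, BorshDeserialize, Debug)]
pub enum ReceiptGroupsQueueData {
    V0(ReceiptGroupsQueueDataV0),
}

impl ReceiptGroupsQueueData {
    /// Working with the enum requires decomposing it (or adding accessors),
    /// which is why the in-memory struct keeps the V0 form directly.
    pub fn into_v0(self) -> ReceiptGroupsQueueDataV0 {
        match self {
            ReceiptGroupsQueueData::V0(data) => data,
        }
    }
}
```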

/// to determine the size and structure of receipts in the outgoing buffer and make
/// bandwidth requests based on them.
#[derive(Debug)]
pub struct ReceiptGroupsQueue {
Contributor:

This struct is used only within this file. Could we remove pub from all functions and declarations?

@jancionear (Author), Nov 18, 2024:

It's also used in the receipt sink. When generating bandwidth requests, ReceiptSink fetches the ReceiptGroupsQueue for some shard and generates the bandwidth requests based on the groups in the queue.
In the previous iteration of the PR I had a separate struct that served as a public interface for outgoing metadata to some shard, and the queue could be hidden in a private module, but Wac complained that it was redundant so I removed it 😅
#12375 (comment)

Contributor:

Apologies, I had done just a quick "find all references"

// Take out the last group from the queue and inspect it.
match self.pop_back(state_update)? {
Some(mut last_group) => {
if self.groups_config.should_start_new_group(&last_group, receipt_size, receipt_gas)
Contributor:

It seems like overkill to store groups_config as part of ReceiptGroupsQueue given that it's only used here. I would consider passing it through the update_on_receipt_pushed function, especially since it doesn't need to be publicly exposed.

jancionear (Author):

It's convenient to have it there, and I think it makes sense - a receipt group queue operates using some group config. It's nice for ergonomics: you don't need to pass the config every time. I like it and I'd prefer to keep it this way.
The config is not saved to the state if that's what you're worried about.

Contributor:

I agree with @shreyan-gupta here, it doesn't seem like the right place for configuration. How bad is it to pass it around?

Also how can the configuration be updated in the future? Currently it's set in the ctor.

jancionear (Author):

> I agree with @shreyan-gupta here, it doesn't seem like the right place for configuration. How bad is it to pass it around?

Eh, ok, I'll try to change it to a parameter and see how that goes. But IMO there's nothing wrong with keeping it inside the struct. I see it as the struct being configured in the constructor.

> I agree with @shreyan-gupta here, it doesn't seem like the right place for configuration. How bad is it to pass it around?
>
> Also how can the configuration be updated in the future? Currently it's set in the ctor.

The configuration is not saved. It's generated and passed to the ctor for every new block. Generating a new configuration is enough to update it; it'll be immediately applied to new groups.
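For context, the two shapes being debated, sketched with hypothetical signatures (neither block is the PR's code):

```rust
// Hypothetical sketch of both options; ReceiptGroupsConfig stands in for a
// plain value type carrying the group size/gas bounds.
struct ReceiptGroupsConfig;

// Option 1 (current PR): the config is injected in the constructor and lives
// in the struct for the duration of one block's processing.
struct QueueWithConfig {
    groups_config: ReceiptGroupsConfig,
}

impl QueueWithConfig {
    fn new(groups_config: ReceiptGroupsConfig) -> Self {
        Self { groups_config }
    }

    fn update_on_receipt_pushed(&mut self, _receipt_size: u64, _receipt_gas: u64) {
        // would consult self.groups_config here
    }
}

// Option 2 (review suggestion): keep the queue config-free and pass the
// config explicitly at each call site.
struct QueueWithoutConfig;

impl QueueWithoutConfig {
    fn update_on_receipt_pushed(
        &mut self,
        _receipt_size: u64,
        _receipt_gas: u64,
        _groups_config: &ReceiptGroupsConfig,
    ) {
        // would consult the passed-in config here
    }
}
```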

/// Returns empty metadata for protocol versions which don't support metadata.
/// Make sure to pass shard ids for every shard that has receipts in the outgoing buffer,
/// otherwise the metadata for it will be overwritten with empty metadata.
pub fn load(
Contributor:

So, question about all the load functions that we have here...

  • Can we structure the code in such a way that the load functions are only called with BandwidthScheduler feature enabled?
  • In which case, can we get rid of all logic that assumes optional loading? Examples are update_* functions where we do .entry().or_insert_with() and ReceiptGroupsQueue::load() returning Option<Self>?
  • Initialization on enabling BandwidthScheduler feature could be handled by checking if trie key exists and returning default.

It becomes hard to think about all cases where we may or may not be returning/initializing a value, considering we don't have consistent checks for BandwidthScheduler feature.

jancionear (Author):

The code works the way it does because:

  • There are already some outgoing buffers, and we need to create matching metadata for them
  • During resharding a new shard could suddenly appear, and we need to initialize the metadata for it
  • We can't really scan the trie for existing outgoing buffers because the TrieAccess trait doesn't have an iterator

The easiest way to achieve this was to initialize the metadata when a receipt for it appears.

> Can we structure the code in such a way that the load functions are only called with BandwidthScheduler feature enabled?

That's the case right now. OutgoingMetadatas::load doesn't do anything when the feature is not enabled. And nothing is pushed/popped from the metadata before the protocol change.

> In which case, can we get rid of all logic that assumes optional loading? Examples are update_* functions where we do .entry().or_insert_with() and ReceiptGroupsQueue::load() returning Option<Self>?

During resharding there could be a new shard, and the or_insert_with is meant to handle this case.

> Initialization on enabling BandwidthScheduler feature could be handled by checking if trie key exists and returning default.

That is kind of what's happening: OutgoingMetadatas::load tries to load the metadata, and if the trie doesn't contain the key, it creates a new one.

But I'm open to other designs as well, it's a bit convoluted :/
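A hedged sketch of the lazy-initialization pattern described here: metadata for a shard is created the first time a receipt for that shard shows up, which also covers shards that appear during resharding. The types below are illustrative stand-ins, not the real OutgoingMetadatas implementation:

```rust
use std::collections::HashMap;

type ShardId = u64;

#[derive(Default)]
struct ReceiptGroupsQueue {
    total_size: u64,
    total_gas: u128,
}

#[derive(Default)]
struct OutgoingMetadatas {
    metadatas: HashMap<ShardId, ReceiptGroupsQueue>,
}

impl OutgoingMetadatas {
    fn update_on_receipt_pushed(&mut self, shard: ShardId, size: u64, gas: u128) {
        // Create empty metadata on first use instead of scanning the trie
        // for existing outgoing buffers up front.
        let queue = self.metadatas.entry(shard).or_insert_with(ReceiptGroupsQueue::default);
        queue.total_size += size;
        queue.total_gas += gas;
    }
}
```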

Contributor:

> That's the case right now. OutgoingMetadatas::load doesn't do anything when the feature is not enabled. And nothing is pushed/popped from the metadata before the protocol change.

Oh I see, the calls to update_on_receipt_pushed and update_on_receipt_popped are guarded by receipt.should_update_outgoing_metadatas(), hmm... That's not at all intuitive.

I was about to ask how the initialization of OutgoingMetadatas works on protocol change, but I think I see what's going on. So, I guess we can continue to process V0 stored receipts (which don't update metadata) while creating and processing V1 stored receipts after the protocol upgrade?

jancionear and others added 7 commits November 18, 2024 14:09
Co-authored-by: Shreyan Gupta <shreyan.gupta96@gmail.com> (×7)
pub gas_lower_bound: Gas,
/// All receipt groups aim to have gas below this threshold.
pub gas_upper_bound: Gas,
}
jancionear (Author):

I spent some more time thinking about the optimal lower and upper bound sizes for receipt groups, and I think I'll:

  • Remove the lower bound
  • Reduce the upper bound to something like 100kB

This means that a new group will be started if adding a new receipt to the group would cause its size to go above 100kB.

As a result all groups will be below 100kB, except for the groups that contain a single receipt with size above 100kB.

AFAIU having groups like this will produce optimal bandwidth requests - identical to the ones that we would produce if we walked over all of the receipts individually.

The diff between two values that can be requested is ~110kB. If all groups were below 100kB we would always produce optimal requests - each consecutive group would either request a value or move to the next value, just like individual receipts.

When there are receipts larger than 100kB we will naturally skip some of the values, but this is identical to what would happen if we operated on individual receipts. The next group smaller than 100kB will either request the same value or move to the next one, just like individual receipts.

A bit handwavy, but I feel that it works. I can make some drawings if needed.

Contributor:

Yeah, makes sense. I honestly had the same thought about why we were maintaining both lower and upper bounds when we could do with just one, but I didn't raise it in the original review.

One thing that wasn't clear to me was what the expected behavior should be in the following scenario:

  • 99 receipts of 1 kB each
  • 1 receipt of 500 kB

Should the ideal/expected behavior be case A or case B?

Case A
Group 1: 99 receipts, 99 kB total
Group 2: 1 receipt, 500 kB

Case B
Group 1: 100 receipts, 599 kB total

I would imagine case A is probably more ideal than case B for bandwidth scheduling, but it's harder to "code it up" compared to a simple upper bound impl?

Contributor:

Which is why I had originally not questioned the upper and lower bounds.

jancionear (Author):

My initial design was (B). The main reason was that I wanted the guarantee that every group is at least 90 kB, to be able to guarantee that the list of groups stays small enough. This was important when all receipt groups were kept in one trie value. Now that doesn't really matter.

(B) keeps the number of groups lower, but it has serious disadvantages.
The first disadvantage is that a group can get larger than max_receipt_size, and there's no guarantee that the bandwidth scheduler will be able to grant that much bandwidth. To deal with this I initially read the first receipt and subtracted its size from the first group, but that causes witness size issues because the first receipt could be big. So instead I added the size_upper_bound to keep the size of receipt groups below max_receipt_size.
The second disadvantage is that it doesn't produce optimal bandwidth requests. Smaller groups would be more precise.

In the end I moved to (A) because it doesn't have any of these disadvantages. The number of groups can be a bit larger, but that shouldn't matter when it's a TrieQueue.
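A small sketch of the resulting behavior (assumed helper, not the PR's code): with only an upper bound, the 99×1 kB + 1×500 kB example from above splits as in case A, with the oversized receipt alone in its own group:

```rust
// A new group starts whenever appending a receipt would push the current
// group above the bound, so a single oversized receipt always ends up alone.
fn group_sizes(receipt_sizes: &[u64], upper_bound: u64) -> Vec<u64> {
    let mut groups: Vec<u64> = Vec::new();
    for &size in receipt_sizes {
        match groups.last_mut() {
            Some(last) if *last + size <= upper_bound => *last += size,
            _ => groups.push(size),
        }
    }
    groups
}

fn main() {
    // 99 receipts of 1 kB followed by one 500 kB receipt, with a 100 kB bound.
    let mut sizes = vec![1_000u64; 99];
    sizes.push(500_000);
    let groups = group_sizes(&sizes, 100_000);
    assert_eq!(groups, vec![99_000, 500_000]); // case A: [99 kB, 500 kB]
}
```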

Contributor:

Sounds good! :)

@jancionear (Author):

I added a bunch of tests.

They actually found a bug - the updated ReceiptGroupsQueueData was not saved to the trie when removing a receipt from the first group in the queue. This was because modify_first didn't write the indices when the first item was modified but not removed from the queue. That makes sense on its own (the queue indices don't change in that case), but it didn't work with the ReceiptGroupsQueue, which saves its data (containing the total gas/size) when write_indices is called.
Eh, using write_indices for writing data is a bit ugly :/
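A hedged, self-contained sketch of the failure mode (the types and methods here are stand-ins, not the real TrieQueue code):

```rust
// Stand-in types; not the real TrieQueue / ReceiptGroupsQueue code.
struct QueueData {
    first_index: u64,
    total_size: u64, // extra data that rides along with the indices record
}

struct Queue {
    data: QueueData,
    persisted: bool, // whether `data` has been written back to the trie
}

impl Queue {
    /// Buggy variant: only persist when the front element is popped, because
    /// the indices themselves don't change on a pure in-place modification.
    /// The piggybacked total_size update is then silently lost.
    fn on_front_modified_buggy(&mut self, new_total_size: u64, popped: bool) {
        self.data.total_size = new_total_size;
        if popped {
            self.data.first_index += 1;
            self.persisted = true; // write_indices would run here
        } else {
            self.persisted = false; // bug: total_size changed but is never written
        }
    }

    /// Fixed variant: persist whenever anything in `data` changed.
    fn on_front_modified_fixed(&mut self, new_total_size: u64, popped: bool) {
        self.data.total_size = new_total_size;
        if popped {
            self.data.first_index += 1;
        }
        self.persisted = true; // always write back
    }
}
```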

There's also a larger testloop test which spawns a workload that causes a lot of cross-shard receipts to be sent (and some of them buffered). The test inspects the buffered receipts, groups and generated bandwidth requests and makes sure that they look correct.
Getting it to work was a bit of a pain. The first reason is that I needed to add a lot of access keys, as one pair of (account, access_key) can only run one transaction at a time. The other reason is that it's pretty hard to achieve an interesting distribution of buffered receipts. Until now I thought about the outgoing buffer as a queue where new receipts are pushed at the end and receipts from the front are sent out. But this is actually not the case - the logic is "if I can send this out, do it, otherwise buffer the receipt". The result is that most of the receipts in the outgoing buffers are large - small receipts can usually be sent out immediately; it's the large ones that get buffered.
I tuned the test a bit and currently the buffered receipts look something like this:

Height: 10034
Current scheduler params: BandwidthSchedulerParams { base_bandwidth: 100000, max_shard_bandwidth: 4500000, max_receipt_size: 4194304, max_allowance: 4500000 }
Verifying bandwidth requests s2.v0 -> 0 
Buffered receipts: len: 27, total size: 3.9 MB, first_ten_sizes: [39055, 245311, 753, 3503, 236, 66562, 66404, 3460, 244868, 2167]
Verifying bandwidth requests s2.v0 -> 1 
Buffered receipts: len: 24, total size: 17.5 MB, first_ten_sizes: [224807, 4194304, 1727, 117005, 299615, 4194304, 106816, 413, 2262, 1296]
Verifying bandwidth requests s2.v0 -> 2 
Buffered receipts: len: 5, total size: 8.4 MB, first_ten_sizes: [236557, 1093098, 2696078, 153967, 4194304]
Verifying bandwidth requests s1.v0 -> 0 
Buffered receipts: len: 23, total size: 9.2 MB, first_ten_sizes: [291957, 154867, 1181, 928, 3473, 2499, 759, 2621, 3051, 140721]
Verifying bandwidth requests s1.v0 -> 1 
Buffered receipts: len: 1, total size: 4.2 MB, first_ten_sizes: [4194304]
Verifying bandwidth requests s1.v0 -> 2 
Buffered receipts: len: 31, total size: 6.2 MB, first_ten_sizes: [573857, 135912, 28894, 565, 2783, 134740, 159326, 219111, 630, 3012]
Verifying bandwidth requests s0.v0 -> 1 
Buffered receipts: len: 1, total size: 4.2 MB, first_ten_sizes: [4194304]

One more thing to try could be burning a lot of gas to cause a shard to become fully congested, so that small receipts would be buffered as well. But then the transactions would get rejected... Eh, maybe it's good enough.

codecov bot commented Nov 21, 2024:

Codecov Report

Attention: Patch coverage is 91.13697% with 99 lines in your changes missing coverage. Please review.

Project coverage is 70.06%. Comparing base (9e4933b) to head (78eb384).
Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| core/store/src/trie/receipts_column_helper.rs | 89.28% | 13 Missing and 8 partials ⚠️ |
| ...egration-tests/src/test_loop/utils/transactions.rs | 85.60% | 15 Missing and 4 partials ⚠️ |
| runtime/runtime/src/lib.rs | 32.14% | 18 Missing and 1 partial ⚠️ |
| core/primitives/src/bandwidth_scheduler.rs | 54.54% | 14 Missing and 1 partial ⚠️ |
| runtime/runtime/src/congestion_control.rs | 91.93% | 2 Missing and 8 partials ⚠️ |
| core/store/src/trie/outgoing_metadata.rs | 98.39% | 0 Missing and 8 partials ⚠️ |
| core/primitives/src/receipt.rs | 90.69% | 1 Missing and 3 partials ⚠️ |
| core/primitives/src/types.rs | 0.00% | 2 Missing ⚠️ |
| tools/state-viewer/src/congestion_control.rs | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #12464      +/-   ##
==========================================
+ Coverage   69.84%   70.06%   +0.21%     
==========================================
  Files         838      840       +2     
  Lines      169410   170225     +815     
  Branches   169410   170225     +815     
==========================================
+ Hits       118323   119264     +941     
+ Misses      45840    45805      -35     
+ Partials     5247     5156      -91     
| Flag | Coverage Δ |
|---|---|
| backward-compatibility | 0.16% <0.00%> (-0.01%) ⬇️ |
| db-migration | 0.16% <0.00%> (-0.01%) ⬇️ |
| genesis-check | 1.27% <0.00%> (-0.02%) ⬇️ |
| linux | 69.29% <77.17%> (+0.12%) ⬆️ |
| linux-nightly | 69.63% <90.86%> (+0.20%) ⬆️ |
| macos | 51.26% <75.73%> (+0.26%) ⬆️ |
| pytests | 1.58% <0.00%> (-0.02%) ⬇️ |
| sanity-checks | 1.39% <0.00%> (-0.02%) ⬇️ |
| unittests | 69.88% <91.13%> (+0.21%) ⬆️ |
| upgradability | 0.20% <0.00%> (-0.01%) ⬇️ |

@shreyan-gupta (Contributor) left a comment:

LGTM!

Just one minor comment for trie keys that I missed in previous reviews.

I skimmed through the tests and they look good, didn't go into too much detail.

@@ -338,6 +361,15 @@ impl TrieKey {
buf.extend(&index.to_le_bytes());
}
TrieKey::BandwidthSchedulerState => buf.push(col::BANDWIDTH_SCHEDULER_STATE),
TrieKey::BufferedReceiptGroupsQueueData { receiving_shard: receiver_shard } => {
buf.push(col::BUFFERED_RECEIPT_GROUPS_QUEUE_DATA);
buf.extend(&receiver_shard.to_le_bytes());
Contributor:

Oh I think the renaming might not have worked here and below, could you please change receiver_shard to receiving_shard?

Also, could we do something like above where we convert shard_id to u16 first?

jancionear (Author):

> Oh I think the renaming might not have worked here and below, could you please change receiver_shard to receiving_shard?

👍, fixed. I used rust-analyzer to rename the variable and it didn't work properly here.

> Also, could we do something like above where we convert shard_id to u16 first?

I don't like the idea of converting the shard id from u64 to u16 in TrieKey. It makes sense in bandwidth requests, where we care about optimizing space, but for trie keys I'd prefer to prioritize being future-proof. I don't think we gain much by optimizing it to u16 there.
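A sketch of the encoding trade-off being discussed (illustrative only, not the actual TrieKey code): keeping the shard id as u64 in the key costs 8 bytes instead of 2, but doesn't bake a 65535-shard limit into the trie layout.

```rust
fn push_shard_id_u64(buf: &mut Vec<u8>, shard_id: u64) {
    buf.extend(&shard_id.to_le_bytes()); // 8 bytes, future-proof
}

fn push_shard_id_u16(buf: &mut Vec<u8>, shard_id: u64) {
    let small: u16 = u16::try_from(shard_id).expect("shard id must fit in u16");
    buf.extend(&small.to_le_bytes()); // 2 bytes, saves space but caps shard ids
}
```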

/// Runs a transaction until completion.
/// Works in a non-blocking way which allows to run multiple transactions in parallel.
/// Is meant to use with run_until.
pub struct TransactionRunner {
Contributor:

noice!!

@jancionear (Author), Nov 21, 2024:

I'm not sure if that's the way it's supposed to be done in TestLoop, but I think it's a useful utility to have. It's better than spawning transactions and hoping for the best 😅

Contributor:

Yeah, I noticed the change making run_until take a FnMut, but I couldn't see too much of a downside other than the fact that you don't really know when the condition function is called within testloop.

For example, incrementing some counter within the condition FnMut might not be a great idea.
