
Scheduler Messages #30976

Merged: 9 commits merged into solana-labs:master on Apr 19, 2023

Conversation

apfitzge (Contributor) commented Mar 29, 2023

Problem

Banking Scheduler and Worker threads need to communicate via message passing.
The Scheduler PR and Worker PR will both use these messages, so it makes sense to split them out into a separate PR that gets merged before either the scheduler or the worker.

Summary of Changes

Messages that are passed from [scheduler to worker] and [worker to scheduler].

The message flow between Scheduler and Worker(s) is as follows:

flowchart LR

    subgraph Forwarding
        direction LR
        C[Scheduler];
        D[Worker];
        C -->|ForwardWork|D -->|FinishedForwardWork|C;
    end
    
    subgraph Consuming
        direction LR
        A[Scheduler];
        B[Worker];
        A -->|ConsumeWork|B -->|FinishedConsumeWork|A;
    end
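For reference, here is a rough sketch of the message set, reconstructed from the excerpts quoted later in this conversation; the ForwardWork / FinishedForwardWork fields and the id newtypes are assumptions, not copies of the merged scheduler_messages.rs:

```rust
use {
    crate::immutable_deserialized_packet::ImmutableDeserializedPacket,
    solana_sdk::{clock::Slot, transaction::SanitizedTransaction},
    std::sync::Arc,
};

// Assumed newtypes; the real definitions live in scheduler_messages.rs.
pub struct TransactionBatchId(u64);
pub struct TransactionId(u64);

/// Message: [Scheduler -> Worker]
/// Transactions to be consumed (i.e. executed, recorded, and committed).
pub struct ConsumeWork {
    pub batch_id: TransactionBatchId,
    pub ids: Vec<TransactionId>,
    pub transactions: Vec<SanitizedTransaction>,
    pub max_age_slots: Vec<Slot>,
}

/// Message: [Worker -> Scheduler]
/// Processed transactions, plus the indexes the scheduler should retry.
pub struct FinishedConsumeWork {
    pub work: ConsumeWork,
    pub retryable_indexes: Vec<usize>,
}

/// Message: [Scheduler -> Worker]
/// Packets to be forwarded (fields assumed).
pub struct ForwardWork {
    pub packets: Vec<Arc<ImmutableDeserializedPacket>>,
}

/// Message: [Worker -> Scheduler]
/// Forwarded packets handed back to the scheduler (fields assumed).
pub struct FinishedForwardWork {
    pub work: ForwardWork,
    pub successful: bool,
}
```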

Fixes #

apfitzge added the "work in progress (This isn't quite right yet)" label on Mar 29, 2023
apfitzge force-pushed the feature/scheduler_messages branch 2 times, most recently from 4d14eb0 to 8ddc0b1, on March 29, 2023 at 22:45
codecov bot commented Mar 30, 2023

Codecov Report

Merging #30976 (61fb9aa) into master (b53656b) will decrease coverage by 0.1%.
The diff coverage is 0.0%.

@@            Coverage Diff            @@
##           master   #30976     +/-   ##
=========================================
- Coverage    81.5%    81.5%   -0.1%     
=========================================
  Files         727      728      +1     
  Lines      205118   205128     +10     
=========================================
- Hits       167256   167241     -15     
- Misses      37862    37887     +25     

apfitzge removed the "work in progress (This isn't quite right yet)" label on Apr 11, 2023
apfitzge marked this pull request as ready for review on April 11, 2023 at 18:22
apfitzge (Contributor, Author) commented:

Left a few notes about hesitations I had while cleaning this up. Any thoughts would be appreciated 😄

This was referenced Apr 11, 2023
core/src/banking_stage/scheduler_messages.rs: four resolved review threads (three outdated)
/// Message: [Worker -> Scheduler]
/// Processed transactions.
pub struct FinishedConsumeWork {
    pub work: ConsumeWork,
Contributor:

Does this make a copy of the Vec<SanitizedTransaction> and send it back, resulting in 2x the transactions being sent back and forth between scheduler and worker? If so, that could become a hot spot, given that the number of transactions should only increase in the future.

apfitzge (Contributor, Author):

We only need references to the transactions to process them (relevant worker code), so there's no need to take them out of the ConsumeWork struct.

In the implementation (linked above), the work field here is the original ConsumeWork received by the worker, with no cloning, which I think is what you're concerned about?

To be entirely clear/pedantic: it does copy the struct when we move it into the Finished* wrapper, but that's just a shallow copy of the Vecs, so the entire struct copy is only 64 bytes. We could reduce that to 8 bytes by wrapping it in a Box, but then we'd have two pointer follows to get to the txs; not sure it'd be worth it. @ryoqun might have some thoughts here 😄
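A minimal self-contained sketch of that point, using stand-in types rather than the real solana-core definitions: moving the work into the wrapper copies only the Vec headers, never the transactions they point at.

```rust
struct Tx([u8; 1024]); // stand-in for SanitizedTransaction

struct ConsumeWork {
    batch_id: u64,
    ids: Vec<u64>,
    transactions: Vec<Tx>,
    max_age_slots: Vec<u64>,
}

struct FinishedConsumeWork {
    work: ConsumeWork,
    retryable_indexes: Vec<usize>,
}

fn main() {
    // Size of the shallow copy made when `work` is moved into the wrapper:
    // one id plus three Vec headers, independent of how many txs are inside.
    println!("ConsumeWork: {} bytes", std::mem::size_of::<ConsumeWork>());
    // Boxing would shrink the move to a single pointer, at the cost of an
    // extra indirection every time the worker reaches for the transactions.
    println!("Box<ConsumeWork>: {} bytes", std::mem::size_of::<Box<ConsumeWork>>());
}
```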

apfitzge (Contributor, Author):

🤔

So after writing the above comment, I was thinking: how can we enforce that the txs aren't cloned out by the worker?

Maybe it's a better interface to not make the fields pub, and instead add some impls for the messages:

  1. fn new() -> Self;
  2. fn transactions(&self) -> &[SanitizedTransaction];
  3. fn finish(retryable_indexes: Vec<usize>) -> FinishedConsumeWork;

and similarly for the ForwardWork message pair (the Consume side is sketched below).
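A rough sketch of what that accessor-style interface could look like (hypothetical; the private fields and exact signatures are assumed, and this is not what was merged):

```rust
use solana_sdk::transaction::SanitizedTransaction;

pub struct ConsumeWork {
    transactions: Vec<SanitizedTransaction>,
    // other fields (batch id, transaction ids, max age slots) kept private too
}

pub struct FinishedConsumeWork {
    pub work: ConsumeWork,
    pub retryable_indexes: Vec<usize>,
}

impl ConsumeWork {
    /// Workers only ever see shared references, so the transactions cannot
    /// be cloned out of the batch by accident.
    pub fn transactions(&self) -> &[SanitizedTransaction] {
        &self.transactions
    }

    /// Consumes the batch and wraps it for the trip back to the scheduler.
    pub fn finish(self, retryable_indexes: Vec<usize>) -> FinishedConsumeWork {
        FinishedConsumeWork {
            work: self,
            retryable_indexes,
        }
    }
}
```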

@taozhu-chicago, @ryoqun - do you think this is a better interface than what I've got here with the pub fields?

Contributor:

> the work field here is the original ConsumeWork received by the worker, with no cloning, which I think is what you're concerned about?

Yep, without seeing the actual implementation, I wasn't able to tell whether work: Vec<SanitizedTransaction> is moved or copied. So yeah, it would be a good idea to make the message interface enforce refs to avoid accidental copies.

apfitzge (Contributor, Author):

🤔
Thinking the interface I described above won't be sufficient to enforce that, since we need to get the values out on the finished side. Will think about a better design tonight.

Also, back to my original comment: it's probably slightly misleading. We do have to copy (move, not clone, so still not copying the underlying Vecs within the tx) the memory of the SanitizedTransaction from the scheduler's storage into the Vec when we create the batch, and back from the Vec into scheduler storage iff it is returned as retryable.

apfitzge (Contributor, Author):

Struggling to come up with a good way to enforce this in Rust. We still want the fields of FinishedConsumeWork to be pub so we can pull them out at the end, but we also want to restrict construction so it has to come from an existing ConsumeWork.

Potentially something like this:

// needed so clippy doesn't yell about using #[non_exhaustive],
// which only restricts OUTSIDE the crate...
#[allow(clippy::manual_non_exhaustive)] 
pub struct FinishedConsumeWork {
    pub work: ConsumeWork,
    pub retryable_indexes: Vec<usize>,
    _priv: (),
}

impl FinishedConsumeWork {
    fn new(work: ConsumeWork, retryable_indexes: Vec<usize>) -> Self {
        Self {
            work,
            retryable_indexes,
            _priv: (),
        }
    }
}

but I think that's fairly ugly on the receiving side.

After considering it more, I'm hesitant to put any implementation in what are supposed to be simple messages. wdyt @taozhu-chicago / @ryoqun ?

Contributor:

Yeah, that does feel like a bit of over-engineering, at least at this stage. I am cool with committing what's ready to move forward.

Member:

yeah, I think the current form is fine as-is.

core/src/banking_stage/scheduler_messages.rs: resolved review thread
tao-stones (Contributor) previously approved these changes Apr 14, 2023 and left a comment:

lgtm

    pub batch_id: TransactionBatchId,
    pub transaction_ids: Vec<TransactionId>,
    pub transactions: Vec<SanitizedTransaction>,
    pub max_age_slots: Vec<Slot>,
apfitzge (Contributor, Author):

Following a discussion with @ryoqun: we shouldn't let SanitizedTransactions pass epoch boundaries, so we can mark txs (at sanitize time) with a max age slot at which they expire.

Workers should check this before execution and quickly disregard txs if the slot of the bank the worker is using is greater than max_age_slot (rough sketch below).
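A hedged sketch of that worker-side check (the helper name and shape are assumed, not taken from this PR):

```rust
use solana_sdk::clock::Slot;

/// Indexes of transactions whose max age slot has already passed for the bank
/// the worker is holding; these can be skipped (returned as retryable or
/// dropped) without being executed.
fn expired_indexes(max_age_slots: &[Slot], bank_slot: Slot) -> Vec<usize> {
    max_age_slots
        .iter()
        .enumerate()
        .filter(|(_, max_age_slot)| bank_slot > **max_age_slot)
        .map(|(index, _)| index)
        .collect()
}
```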

}

/// Message: [Scheduler -> Worker]
/// Transactions to be consumed (executed, recorded, committed)
Member:

nit: Transactions to be consumed (i.e. executed, recorded, and committed)

apfitzge (Contributor, Author):

done: eb516c5 + 61fb9aa

/// Transactions to be consumed (executed, recorded, committed)
pub struct ConsumeWork {
    pub batch_id: TransactionBatchId,
    pub transaction_ids: Vec<TransactionId>,
Member:

naming: ids, to be consistent with ForwardWork? (the field with the identical type Vec<TransactionId> there isn't currently called packet_ids.)

apfitzge (Contributor, Author):

done: 005ebd9

I originally had ids, then moved away from it to distinguish it from batch_id (which I had originally named id)... then never changed these back! I think ids is already clear enough, so I changed it to be consistent between the messages.

@@ -0,0 +1,54 @@
use {
    crate::immutable_deserialized_packet::ImmutableDeserializedPacket,
    solana_poh::poh_recorder::Slot, solana_sdk::transaction::SanitizedTransaction, std::sync::Arc,
Member:

nit: solana_sdk::clock::Slot is more authoritative, iirc.

apfitzge (Contributor, Author):

done: efcff1c

good catch, must have chosen the wrong suggested import from rust-analyzer!
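Presumably the fixed import ends up looking something like this (exact grouping assumed), pulling Slot from solana_sdk::clock instead of through the solana_poh re-export:

```rust
use {
    crate::immutable_deserialized_packet::ImmutableDeserializedPacket,
    solana_sdk::{clock::Slot, transaction::SanitizedTransaction},
    std::sync::Arc,
};
```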

Member:

fyi, this won't happen again: #31832

ryoqun (Member) previously approved these changes Apr 19, 2023 and left a comment:

lgtm with nits

tao-stones (Contributor) left a comment:

re-approve after nits

@apfitzge apfitzge merged commit 7a393e4 into solana-labs:master Apr 19, 2023