Filter pubkey in gossip duplicateproof ingestion #29879
Conversation
…ateproof_ingestion
…thub.com:wen-coding/solana into filter_pubkey_in_gossip_duplicateproof_ingestion
…ateproof_ingestion
…thub.com:wen-coding/solana into filter_pubkey_in_gossip_duplicateproof_ingestion
self.last_root = Some(last_root);
if let Ok(bank_forks_result) = self.bank_forks.read() {
    let root_bank = bank_forks_result.root_bank();
    self.cached_staked_nodes = Some(root_bank.staked_nodes());
Since stakes only change per epoch, it is better not to take the bank_forks lock every time the root changes.
You are right, changed.
last_root: Option<Slot>,
// remember the last root slot cleaned, clear anything between last_root and last_root_cleaned.
last_root_cleaned: Option<Slot>,
Why are these Option? Can we just initialize them with an actual value, or just use 0?
Theoretically because the root could potentially be 0, but I guess that never happens in production, and it's not catastrophic if we don't handle duplicate shreds when the root is 0, so I'm changing last_root and last_root_cleaned to 0 now.
Why can't they get actual values at initialization?
bank_forks: Arc<RwLock<BankForks>>,
// Cache information from root bank so we could function correctly without reading roots.
cached_on_slot: Slot,
cached_staked_nodes: Option<Arc<HashMap<Pubkey, u64>>>,
Instead of these we can just use epoch staked nodes:
https://github.com/solana-labs/solana/blob/180ea1e0a/runtime/src/bank.rs#L7178
which are fixed for the entire epoch.
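For reference, a minimal sketch of what reading them could look like, assuming the linked epoch_staked_nodes API returns Option<Arc<HashMap<Pubkey, u64>>> (the free-standing function here is purely illustrative):

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

use solana_runtime::bank_forks::BankForks;
use solana_sdk::pubkey::Pubkey;

// Sketch: read the staked nodes for the root bank's epoch; the returned map is
// fixed for the entire epoch, so callers can cache it until the epoch changes.
fn root_epoch_staked_nodes(
    bank_forks: &Arc<RwLock<BankForks>>,
) -> Option<Arc<HashMap<Pubkey, u64>>> {
    let root_bank = bank_forks.read().unwrap().root_bank();
    root_bank.epoch_staked_nodes(root_bank.epoch())
}
```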
Done
cached_staked_nodes does not need to be an Option. If epoch-stakes are none, we already have bigger problems. Just use unwrap_or_default. Maybe error log if it is none.
Before it is initialized the first time I need to give it a value; Option seems most natural here, using None to indicate it has not been initialized yet so that I know to read the epoch info before use.
Why can't this field be initialized when initializing the DuplicateShredHandler struct?
If that is not really possible, instead of Option can we just set it to an empty hash-map and use
cached_on_slot == 0u64 && cached_staked_nodes.is_empty()
as an indication that staked-nodes have not been initialized yet?
This handler is initialized during TVU initialization, and I'm not sure there is a valid bank to read from at that point; it's also better not to obtain read locks in initialization logic anyway.
Option seems most natural, and it also makes copying from root_bank.epoch_staked_nodes more straightforward, but I can live with an empty stake map at initialization as well. Changed.
I'm not sure there is a valid bank to read from at that point,
If there is not, then we need to understand when that would be the case and why. My understanding from the BankForks api is that it always has a root-bank; and the rest here are obtained from the root_bank.
also it's better not to obtain read locks in initialization logic any way.
Why not?
@wen-coding I am still curious to know what the problem is with fully populating the struct fields at initialization.
The main concern is separation of concerns. I guess I'm extremely against long read/write access in initialization because I've been bitten by this before in an emergency: we needed a quick fix to go into production to rescue the world, but the initialization logic spent x minutes reading the disk before accepting requests, so that's x minutes of requests lost.
All initialization logic is simple at first, but as time goes by it gets more and more bloated, until one day you need to spend a lot of time deciding "A needs to be constructed before B, and B needs to be constructed before C". If you put the real dependencies in initialize(), then you can construct the objects in any order you want, and during the initialization phase you can initialize according to the real dependencies. It might not seem like a big difference, but it makes life a bit easier, because very often you can't even construct a complex object without all of its dependencies constructed.
It also helps testing, because if you put all the physical-world dependencies in a separate initialize() step, then you can easily stick in mock objects after construction without calling initialize(); your tests will not have to jump through hoops constructing the fake world just so the constructor can give you an object.
That said, it is hard to draw the line somewhere, but generally I prefer lazy initialization and keeping the constructor small.
If there are particular cases where initialization causes issues, we can address those specifically.
But I think by default it is better to initialize types correctly rather than leave them in a limbo state.
Especially, all the Option<..> fields which can be none will make the code convoluted and may easily introduce bugs.
{
    return;
}
if let Ok(bank_forks_result) = self.bank_forks.read() {
Just do
let bank_forks = self.bank_forks.read().unwrap();
We do it all over the code-base, and if the lock is poisoned it is better to just stop there than keep going.
done
if self.last_root.is_some()
    && (slot <= self.last_root.unwrap()
        || slot > self.last_root.unwrap() + self.cached_slots_in_epoch
Better to remove Option from self.last_root and remove these unwraps.
done
Some(proof_chunkmap) => {
    return proof_chunkmap.wallclock >= data.wallclock;
}
I don't think we need to bother with wallclock at all in this file.
If a node is intentionally sending multiple fake duplicate proofs, they can easily manipulate these wallclocks to work around this check.
Also, inspecting wallclock relies on a very minor implementation detail somewhere else, which is not ideal.
I don't want to put parts from different proofs together; even though they might be the same now, they might be different in the future.
And we define an order on wallclock so that our selection is kind of stable. I agree it's easy to attack, but I guess it's possible that a naive validator sends out different proofs at different wallclocks, so it's cleaner here to just stick to one of them.
I don't want to put parts from different proofs together
You don't have to do that. Just ignore it if there is already one, or replace the existing one. It doesn't matter.
Sure, I can do that, I don't think it's too important to make the order stable here.
I still see this code using wallclock in so many places.
I really don't think we should; just please ignore wallclock if it exists, and if you get two values just pick one; it doesn't matter which.
// Remove the entry with lowest stake which is also lower than stake of new pubkey.
for oldkey in chunk_map.keys() {
    let oldkey_stake =
        cached_staked_nodes.get(oldkey).copied().unwrap_or_default();
    if oldkey_stake < to_remove_stake {
        to_remove = Some(*oldkey);
        to_remove_stake = oldkey_stake;
    }
}
match to_remove {
    Some(oldkey) => {
        chunk_map.remove(&oldkey);
    }
    // If all current stakes are higher than new pubkey stake, do not insert.
    _ => return false,
}
wouldn't this become extremely slow when the cache is at capacity?
because it has to do a scan for each insert.
We can use a lazy eviction strategy as in:
https://github.com/solana-labs/solana/blob/180ea1e0a/gossip/src/crds.rs#L581-L607
Thanks for pointing that out. Changed it so that 10% of the lowest stake pubkeys are dumped every time the map is full.
Also mentioned elsewhere: let's evict entries only once the number of cached entries is >= 2x the intended capacity, then evict half of them based on stake.
…s full. - Change slot and slot_to_clean to not use Option. - other small fixes.
Pull request has been modified.
const MAX_SLOT_DISTANCE_TO_ROOT: Slot = 100;
// We limit the pubkey for each slot to be 100 for now.
// We limit the pubkey for each slot to be 100 for now, when this limit is reached,
// we drop 10% of stakes with lowest stakes.
This should be 10% of pubkeys not "stake", right?
Oops, fixed.
// The stuff we need only changes per epoch, so we only need to cache
// once per epoch, and it's okay if we use old data in a few slots.
if self.cached_staked_nodes.is_some()
    && last_root < self.cached_slots_in_epoch + self.cached_on_slot
{
    return;
}
Instead of this you can store the epoch when things were cached and just check if the epoch has changed.
Can I get the current epoch without reading the root bank? I am interested in checking whether the epoch changed without reading the bank, which involves a read lock; currently I'm using cached_on_slot rather than cached_on_epoch just because I get the slot number for free without a read lock.
Can I get current epoch without reading the root bank?
The root bank will give you the root bank epoch. Some nodes' root bank might be behind others', so it would not result in a "current" epoch.
If we want to be 100% accurate we should:
- map duplicate shreds to the "slot" number embedded inside them.
- then map slots to epochs using epoch_schedule, and use the epoch_staked_nodes corresponding with each epoch.
Something like this code:
https://github.com/solana-labs/solana/blob/b29a50a55/core/src/cluster_nodes.rs#L404-L438
That might be overkill though and take unnecessarily many resources, because here we are only using stakes to defend against spam.
Some simpler ideas would be (see the sketch after this list):
- rate limit the root bank lookup to DEFAULT_MS_PER_SLOT, like:
  https://github.com/solana-labs/solana/blob/b29a50a55/core/src/shred_fetch_stage.rs#L61-L76
- use RwLock::try_read so as not to add lock contention on bank-forks, unless the root-bank is too old.
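A rough sketch of the simpler approach, combining both ideas; the struct, its field names, and the locally defined DEFAULT_MS_PER_SLOT value are stand-ins for illustration, not the PR's actual code:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::time::{Duration, Instant};

use solana_runtime::bank_forks::BankForks;
use solana_sdk::{clock::Epoch, pubkey::Pubkey};

// Hypothetical cache holder, for illustration only.
struct EpochStakesCache {
    bank_forks: Arc<RwLock<BankForks>>,
    cached_on_epoch: Epoch,
    cached_staked_nodes: Arc<HashMap<Pubkey, u64>>,
    last_refresh: Instant,
}

impl EpochStakesCache {
    // Refresh at most once per slot duration, and never block on bank_forks:
    // if the lock is contended, keep serving the previously cached stakes.
    fn maybe_refresh(&mut self) {
        const DEFAULT_MS_PER_SLOT: u64 = 400; // assumed value for the sketch
        if self.last_refresh.elapsed() < Duration::from_millis(DEFAULT_MS_PER_SLOT) {
            return;
        }
        if let Ok(bank_forks) = self.bank_forks.try_read() {
            let root_bank = bank_forks.root_bank();
            let epoch = root_bank.epoch();
            if epoch != self.cached_on_epoch {
                if let Some(staked_nodes) = root_bank.epoch_staked_nodes(epoch) {
                    self.cached_staked_nodes = staked_nodes;
                    self.cached_on_epoch = epoch;
                }
            }
            self.last_refresh = Instant::now();
        }
    }
}
```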
In last_root < self.cached_slots_in_epoch + self.cached_on_slot, what if self.cached_on_slot was in the middle of the epoch? Then the epoch stakes won't be updated until the middle of the next epoch, no?
Changed to RwLock::try_read now; it should be okay if we can't get data for the new epoch for a few slots, we just use the old data.
{
    return;
}
let bank_forks_result = self.bank_forks.read().unwrap();
This is holding the lock on bank_forks unnecessarily long. Instead:
let root_bank = self.bank_forks.read().unwrap().root_bank();
done
return false;
Some(SlotStatus::UnfinishedProof(slot_map)) => match slot_map.get(&data.from) {
    None => {
        if slot_map.len() < MAX_PUBKEY_PER_SLOT {
Can this if check go inside check_has_enough_stake_and_cleanup?
done
// stake is too low, return false in this case.
!newkey_popped
}
_ => false,
Please avoid _ =>. Instead enumerate all possible cases.
If a new variant is added to the SlotStatus enum then we get a compiler error to implement the right handling for the new variant here; however _ => will silence that compiler error.
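To illustrate the point with a toy example (the enum and its variants below are made up for the sketch, not the PR's actual SlotStatus):

```rust
// With every variant spelled out, adding a new variant later (say, Frozen)
// becomes a compile error at this match instead of silently falling into a
// catch-all arm.
enum ProofStatus {
    Unstarted,
    Unfinished(usize),
    Complete,
}

fn should_accept(status: &ProofStatus) -> bool {
    match status {
        ProofStatus::Unstarted => true,
        ProofStatus::Unfinished(chunks_received) => *chunks_received < 3,
        ProofStatus::Complete => false,
        // no `_ =>` arm on purpose
    }
}
```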
done
…lot. - Other small fixes.
@@ -63,10 +62,17 @@ pub struct DuplicateShredHandler {
// We don't want bad guys to inflate the chunk map, so we limit the number of
// pending proofs from each pubkey to ALLOWED_SLOTS_PER_PUBKEY.
validator_pending_proof_map: HashMap<Pubkey, HashSet<Slot>>,
// remember the last root slot handled, clear anything older than last_root.
// Cache last root to reduce read lock.
last_root: Slot,
last_root has a kind of redundancy: root means last_root; if it was not last_root then it wouldn't be root.
It's named last_root because it's copied from blockstore's last_root(); I can rename it if you really want that.
last_root: 0,
last_root_cleaned: 0,
cached_on_slot: 0,
cached_staked_nodes: None,
cached_slots_in_epoch: 100,
Why are these other fields not initialized here? And where does 100 come from?
As said above, I don't want to do read operations in initialization. 100 is a random number I put in there just in case we can't read the root bank for a while; changed to 0 now.
if slot <= self.last_root
    || slot > self.last_root + self.cached_slots_in_epoch
!(self.last_root..self.last_root + self.cached_slots_in_epoch).contains(&slot)
done
fn check_has_enough_stake_and_cleanup(&mut self, slot: &Slot, newkey: &Pubkey) -> bool {
    match self.chunk_map.get_mut(slot) {
        None => true,
So if this is a new slot we always insert the entry even if we are at capacity?
Currently the capacity is calculated per slot; of course, we do limit the number of slots each pubkey can insert.
let mut heap = BinaryHeap::new();
// Push new pubkey in first so it's compared with those in the table.
heap.push(Reverse::<(u64, Pubkey)>((newkey_stake, *newkey)));
for oldkey in chunk_map.keys() {
    let oldkey_stake =
        cached_staked_nodes.get(oldkey).copied().unwrap_or_default();
    heap.push(Reverse::<(u64, Pubkey)>((oldkey_stake, *oldkey)));
}
let mut newkey_popped = false;
for _ in 0..(MAX_PUBKEY_PER_SLOT / 10) {
    if let Some(Reverse::<(u64, Pubkey)>((_, key))) = heap.pop() {
        if &key == newkey {
            newkey_popped = true;
        } else {
            chunk_map.remove(&key);
        }
    }
}
This is unnecessarily complicated. I think it would be simpler if (see the sketch below):
- always insert the entry regardless of stake.
- once the insert is done, if the number of entries is >= 2x the intended capacity, evict the half of the entries with the lowest stake from that hash-map.
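For illustration, a minimal sketch of that flow; the generic value type, the stake map parameter, and the capacity constant here are placeholders rather than the PR's actual code:

```rust
use std::collections::HashMap;

use solana_sdk::pubkey::Pubkey;

const MAX_PUBKEY_PER_SLOT: usize = 100;

// Always insert; only when the map reaches 2x the intended capacity, drop the
// lower-staked half in one batch.
fn insert_then_maybe_evict<T>(
    chunk_map: &mut HashMap<Pubkey, T>,
    stakes: &HashMap<Pubkey, u64>,
    key: Pubkey,
    value: T,
) {
    chunk_map.insert(key, value);
    if chunk_map.len() < 2 * MAX_PUBKEY_PER_SLOT {
        return;
    }
    let mut keyed: Vec<(u64, Pubkey)> = chunk_map
        .keys()
        .map(|k| (stakes.get(k).copied().unwrap_or_default(), *k))
        .collect();
    keyed.sort_unstable();
    let evict = keyed.len() / 2;
    for (_, k) in &keyed[..evict] {
        chunk_map.remove(k);
    }
}
```

This keeps inserts O(1) in the common case and amortizes the eviction scan over many inserts.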
That's a good point, changed.
const MAX_SLOT_DISTANCE_TO_ROOT: Slot = 100;
// We limit the pubkey for each slot to be 100 for now.
// We limit the pubkey for each slot to be 100 for now, when this limit is reached,
// we drop 10% of pubkeys with lowest stakes.
10% is probably too low, which would cause a high fixed cost. We can let the hash-map go to 2x capacity and then remove half of the entries.
Sure, I made it remove half of the entries; right now the "capacity" is 100, and we can change it to 200 if we want the real capacity to be 100.
- Always insert a new pubkey then remove half of pubkeys if at capacity - other small changes
I left some comments, but we can also merge this and follow up with smaller patches addressing those; whichever works better for you. So stamping the PR.
const MAX_SLOT_DISTANCE_TO_ROOT: Slot = 100;
// We limit the pubkey for each slot to be 100 for now.
// We limit the pubkey for each slot to be 100 for now, when this limit is reached,
// we drop 50% of pubkeys with lowest stakes.
const MAX_PUBKEY_PER_SLOT: usize = 100;
We probably want the pubkeys remaining after pruning 50% to still hold at least 2/3 of the stake.
Based on the mainnet stake distribution right now, it seems this number should be more like ~300 (a quick way to estimate it is sketched below).
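A throwaway sketch of that back-of-the-envelope estimate; the stake vector would come from the actual mainnet distribution, nothing here is from the PR:

```rust
// Count how many of the highest-staked pubkeys are needed before their
// combined stake reaches 2/3 of the total stake.
fn nodes_needed_for_supermajority(mut stakes: Vec<u64>) -> usize {
    stakes.sort_unstable_by(|a, b| b.cmp(a)); // highest stake first
    let total: u128 = stakes.iter().map(|&s| u128::from(s)).sum();
    let mut acc: u128 = 0;
    for (i, &stake) in stakes.iter().enumerate() {
        acc += u128::from(stake);
        if acc * 3 >= total * 2 {
            return i + 1;
        }
    }
    stakes.len()
}
```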
done
}

fn dump_pubkeys_with_low_stakes(
    cached_staked_nodes: Arc<HashMap<Pubkey, u64>>,
This can just be a &HashMap<...>; it does not need to be an Arc.
done
let mut heap = BinaryHeap::new();
for oldkey in slot_chunk_map.keys() {
    let oldkey_stake = cached_staked_nodes.get(oldkey).copied().unwrap_or_default();
    heap.push(Reverse::<(u64, Pubkey)>((oldkey_stake, *oldkey)));
}
for _ in 0..(MAX_PUBKEY_PER_SLOT / 2) {
    if let Some(Reverse::<(u64, Pubkey)>((_, key))) = heap.pop() {
        slot_chunk_map.remove(&key);
The heap would result in n log n performance here. You can get linear performance by using select_nth_unstable:
https://github.com/solana-labs/solana/blob/fd3f26380/gossip/src/crds.rs#L620-L627
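A rough sketch of what the select_nth_unstable version could look like, reusing the function's existing parameter names but with a placeholder value type; treat it as an outline rather than the final code:

```rust
use std::collections::HashMap;

use solana_sdk::pubkey::Pubkey;

const MAX_PUBKEY_PER_SLOT: usize = 100;

// Partition the keys by stake in O(n) and remove the lower-staked half,
// instead of popping them one by one from a BinaryHeap.
fn dump_pubkeys_with_low_stakes<T>(
    cached_staked_nodes: &HashMap<Pubkey, u64>,
    slot_chunk_map: &mut HashMap<Pubkey, T>,
) {
    let mut keyed: Vec<(u64, Pubkey)> = slot_chunk_map
        .keys()
        .map(|key| (cached_staked_nodes.get(key).copied().unwrap_or_default(), *key))
        .collect();
    let evict = (MAX_PUBKEY_PER_SLOT / 2).min(keyed.len());
    if evict < keyed.len() {
        // After this call, keyed[..evict] holds the `evict` lowest-staked keys.
        keyed.select_nth_unstable(evict);
    }
    for (_, key) in &keyed[..evict] {
        slot_chunk_map.remove(key);
    }
}
```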
I originally chose a heap because I read that the worst-case performance of select_nth_unstable is O(n^2), but you are right: first, we probably won't hit the worst case because the pubkey arrival order is pretty random; second, select_nth_unstable is clearer to understand, and for a capacity of ~300, O(n^2) versus O(n log n) doesn't make much difference.
Changed
- Other small fixes
…thub.com:wen-coding/solana into filter_pubkey_in_gossip_duplicateproof_ingestion
Pull request has been modified.
Let's merge this and we will make any further improvements in smaller follow-up patches.
@wen-coding I don't know why the CI fails; I retried it a couple of times and it still fails.
(cherry picked from commit 151585e)
Problem
Previously we used an LRUCache in the slot map, so someone could spam us by using many fake pubkeys. Now we try to kick out the one with the lowest stake.
Summary of Changes