Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrades dashmap dependency to v5.5.3 #33659

Merged
merged 1 commit into from
Oct 18, 2023

Conversation

brooksprumo
Copy link
Contributor

Problem

We use dashmap version 4.0.2, which was released January of 2021. There have been a lot of commits since then1. Some address bugs, some improve performance. Since we rely on dashmap in multiple places, it would be good to pull in those changes.

Equivalently, since dashmap is used in multiple places, we'd like to ensure there's adequate testing time to find any possible regressions. Since the v1.17 was just branched, bumping the version now gives the most testing/soak time.

Summary of Changes

Upgrade dashmap to v5.5.3

Footnotes

  1. https://github.com/xacrimon/dashmap/compare/v4.0.1...v.5.5.3 (note, there's no v4.0.2 tag)

@brooksprumo brooksprumo self-assigned this Oct 11, 2023
@CriesofCarrots
Copy link
Contributor

CriesofCarrots commented Oct 11, 2023

Just fyi, we tried to bump dashmap a year and a half ago, but had to revert: #23592 (edit) linked wrong revert 😅
Possibly everything is resolved as of #23765, but might be worth some extra care.

@brooksprumo
Copy link
Contributor Author

Just fyi, we tried to bump dashmap a year and a half ago, but had to revert: #22488 Possibly everything is resolved as of #23765, but might be worth some extra care.

Ooo, thanks for the heads-up!

@codecov
Copy link

codecov bot commented Oct 11, 2023

Codecov Report

Merging #33659 (fbb87b0) into master (fa21a3d) will decrease coverage by 0.1%.
Report is 2 commits behind head on master.
The diff coverage is n/a.

@@            Coverage Diff            @@
##           master   #33659     +/-   ##
=========================================
- Coverage    81.8%    81.8%   -0.1%     
=========================================
  Files         806      806             
  Lines      217477   217477             
=========================================
- Hits       178026   178003     -23     
- Misses      39451    39474     +23     

@brooksprumo
Copy link
Contributor Author

I've spun up a node with this PR to see how it fairs. I've added the following secondary index-related CLI args:

  --full-rpc-api
  --account-index program-id
  --account-index-exclude-key kinXdEcpDQeHPEuQnqmUgtYykqKGVFq6CeVX5iAHJq6

@CriesofCarrots Are these args sufficient to have triggered the old issue?

@brooksprumo
Copy link
Contributor Author

@carllin Do you remember how to reproduce #23765? Was it deterministic? Was it immediate (e.x. during index generation)?

I've been running a node with a secondary index for a few days now with this PR, without any issues. Wanted to make sure I wasn't reintroducing known issues.

@CriesofCarrots
Copy link
Contributor

Are these args sufficient to have triggered the old issue?

Based on my foggy recollection, yes, but that kin address is a mint, not a program id. This combo would be more realistic:

  --full-rpc-api
  --account-index program-id
  --account-index-exclude-key TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA

Or you could pick one of the spl-token secondary indexes to run with excluding the kin mint.

@carllin
Copy link
Contributor

carllin commented Oct 18, 2023

#33659 (comment). Ha this is probably the best way to do it. The set of changes they've made is extensive, but most of them are just introducing new APIs.

The only big one of note is this one: xacrimon/dashmap#204, but seems to have been baking for a 1.5 years now so hopefully should be ok 🤞

@carllin
Copy link
Contributor

carllin commented Oct 18, 2023

@carllin Do you remember how to reproduce #23765? Was it deterministic? Was it immediate (e.x. during index generation)?

I've been running a node with a secondary index for a few days now with this PR, without any issues. Wanted to make sure I wasn't reintroducing known issues.

Hmmm I believe it was a race condition that happened outside of index generation, but I can't say I 100% remember. Checked some discord messages and couldn't find anything either darn! I think it happened pretty fast though, otherwise I would not have been able to debug it :)

We could revert back before my fix and try unpacking an old snapshot, but might not be worth the hassle :)

@brooksprumo
Copy link
Contributor Author

I've spun up a node with this PR to see how it fairs. I've added the following secondary index-related CLI args:

  --full-rpc-api
  --account-index program-id
  --account-index-exclude-key kinXdEcpDQeHPEuQnqmUgtYykqKGVFq6CeVX5iAHJq6

This node has been running A-OK for 136 hours. I'm going to restart it with the args from #33659 (comment) and ensure it still runs fine.

@brooksprumo
Copy link
Contributor Author

I'm going to restart it with the args from #33659 (comment) and ensure it still runs fine.

This node has now been running for 7 hours. I'll keep it running too.

I'm going to mark this PR as ready for review. If anyone has additional tests they'd like me to run, or folks to review, please let me know.

@brooksprumo brooksprumo marked this pull request as ready for review October 18, 2023 19:31
@brooksprumo brooksprumo merged commit d6aba9d into solana-labs:master Oct 18, 2023
27 checks passed
@brooksprumo brooksprumo deleted the bump/dashmap branch October 18, 2023 23:00
@brooksprumo
Copy link
Contributor Author

I'm going to restart it with the args from #33659 (comment) and ensure it still runs fine.

This node has now been running for 7 hours. I'll keep it running too.

I've finally stopped this node (needed it for other dev work). It was running just fine for 168 hours.

Here's some metrics over the last 7 days. Seemed pretty stable once hitting steady state. (Note the memory steps at epoch boundaries—that's a separate issue being debugged, and is unrelated to this change.)
metrics

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants