Parallelise data extension and pre-repair sanity check #116
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master     #116      +/-   ##
==========================================
+ Coverage   80.96%   83.29%   +2.33%
==========================================
  Files           8        8
  Lines         457      497      +40
==========================================
+ Hits          370      414      +44
+ Misses         52       50       -2
+ Partials       35       33       -2
Force-pushed from 1796f12 to c2c8078.
Nice
When parallelising the crossword loop, it works fine with infectious, but errors with go-leopard and klauspost leopard. This seems like a potential issue with doing parallel decodings. Otherwise, it resulted in a 4x+ performance increase for 128x128 blocks with infectious. Branch with parallel decoding (klauspost): https://github.com/celestiaorg/rsmt2d/blob/parallelisation_klauspost/extendeddatacrossword.go#L73

klauspost leopard:

go-leopard:
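For reference, here is a minimal sketch of the errgroup pattern for parallelising a per-row loop, with a hypothetical processRow helper standing in for the real extension/repair step (the names and shapes are illustrative, not the rsmt2d API):

```go
package main

import (
	"fmt"

	"golang.org/x/sync/errgroup"
)

// processRow is a hypothetical stand-in for the per-row extension/repair work.
func processRow(row []byte) error {
	_ = row // codec work would happen here
	return nil
}

func main() {
	// One entry per row of a 128-wide extended square.
	square := make([][]byte, 128)
	for i := range square {
		square[i] = make([]byte, 32)
	}

	var g errgroup.Group
	for i := range square {
		row := square[i] // capture the per-iteration value for the goroutine
		g.Go(func() error {
			return processRow(row)
		})
	}
	if err := g.Wait(); err != nil {
		fmt.Println("row processing failed:", err)
	}
}
```

A zero-value errgroup.Group places no bound on concurrency, so a 128x128 square spawns one goroutine per row at once; this is also why the goroutine counts discussed below scale with the square size.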
This reverts commit fa3361f.
Marking this as ready for review as we can implement decoder parallelization later when the above issue is resolved.
Nice! I only had a single question, and will approve after anyone else who wants to review gets a chance.
While this implementation doesn't allow for a specific configurable number of workers, I really like the increased simplicity we get by not doing so.
On a side note, it's interesting to see that the new leopard codec seems to benefit less from parallelization than the infectious one, at least for extension.
I think it would be easy to add this later by calling SetLimit() on the errgroup and adding a ThreadCount field to the EDS struct, or something similar.
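For illustration, a hedged sketch of what that could look like; threadCount and forEachRow are made-up names for this example rather than existing rsmt2d fields or methods, while SetLimit is the real errgroup method that bounds the number of concurrently running goroutines:

```go
package main

import (
	"runtime"

	"golang.org/x/sync/errgroup"
)

// extendedDataSquare is a simplified stand-in for the EDS struct; the
// threadCount field is the suggestion from this thread, not an existing field.
type extendedDataSquare struct {
	threadCount int // 0 means "no explicit limit"
	rows        [][]byte
}

// forEachRow runs fn over every row, bounding concurrency when threadCount is set.
func (eds *extendedDataSquare) forEachRow(fn func(row []byte) error) error {
	var g errgroup.Group
	if eds.threadCount > 0 {
		g.SetLimit(eds.threadCount) // must be called before any Go()
	}
	for i := range eds.rows {
		row := eds.rows[i]
		g.Go(func() error { return fn(row) })
	}
	return g.Wait()
}

func main() {
	eds := &extendedDataSquare{
		threadCount: runtime.NumCPU(),
		rows:        make([][]byte, 256),
	}
	_ = eds.forEachRow(func(row []byte) error { return nil })
}
```

With the limit set to runtime.NumCPU(), a 256x256 square would still enqueue 512 row tasks, but only a handful would run at any one time.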
Do you think not allowing the worker count to be configurable may cause performance issues for users of the library?
Not meaningfully, no.
Hmm, a 2x performance difference between 16 and 128 goroutines seems significant; with a 256x256 EDS we would have around 512 goroutines (one per row and column). But it's not super high priority as long as it doesn't cause problems for users.
LGTM. I agree that limiting the number of goroutines via workers can be done in another PR. Just a small question below.
func (eds *ExtendedDataSquare) computeSharesRoot(shares [][]byte, axis Axis, i uint) []byte {
	tree := eds.createTreeFn(axis, i)
	for cell, d := range shares {
		tree.Push(d, SquareIndex{Cell: uint(cell), Axis: i})
I might be reading this a bit wrong, but isn't the dual axis/Axis terminology a bit confusing?
It's more descriptive than calling the variable something generic like `a`, imo.
Notes:
- cpu: 11th Gen Intel(R) Core(TM) i9-11900H @ 2.50GHz
- go-leopard + parallelisation
- go-leopard + no parallelisation
- klauspost + parallelisation
- klauspost + no parallelisation