Upsert optimization #325
Conversation
Codecov Report
@@ Coverage Diff @@
## main #325 +/- ##
===========================================
+ Coverage 99.98% 100.00% +0.01%
===========================================
Files 232 232
Lines 12574 12631 +57
===========================================
+ Hits 12572 12631 +59
+ Misses 1 0 -1
+ Partials 1 0 -1
Continue to review full report at Codecov.
Here are some comparison results for context. All benchmarks were run on my MBP with a CLI-generated 2-node FireFly environment, using PostgreSQL as the DB. Before #325, we were consistently seeing > 1 minute of latency between the last sent message and 0 pending transactions.
With #325:
Batch processing time was consistently reduced to <= 1 second, down from ~4 seconds.
🚀 🚀
// This is a performance critical function, as we stream data into the database for every message, in every batch.
//
// First attempt the operation based on the optimization passed in.
// The expectation is that this will practically hit of the time, as only recovery paths
nit: this sentence feels like it's missing a word (same with the other places it was copied)
if existing {
    if recreateDatarefs {
The comment a few lines down indicates we "Run through the ones in the message, finding ones that already exist". I don't think that's accurate anymore (we always insert all the refs if we get here, and might delete them all beforehand... but it's never a partial update).
Looks really good. Couldn't see any issues with the functionality. Left a few small notes on wording.
Also note that db migrations 44-45 were claimed by #317, so this will need to be updated to use 46.
One of the earliest findings in #316 @shorsher is that the speed at which messages can be injected into FireFly using a parallel loading test runner is significantly faster than the speed at which batches can be processed.
This is not surprising, as injection is currently parallel while batch processing is sequential. That's something we can look at parallelizing on the back-end later.
However, the most immediately actionable finding from log analysis was just how many DB operations we were doing.
On a simple laptop test, it was 3-5ms per PostgreSQL request (in a single TX), so that adds up fast!
We found, with the first example analyzed, that a single Upsert meant:
- `SELECT` message
- `UPDATE` message
- `DELETE` datarefs
- `INSERT` datarefs
- `SELECT` data
- `UPDATE` data

So the first improvement in the hardening phase is to reduce these calls, by optimizing for the case we know we're in.
These optimizations are implemented in SQL, but in a way that a similar optimization should be possible in NoSQL DBs once implemented.
- `node` hint to batches, to know which node sent them
- `message` and `data` signatures on the DB layer to take in an `UpsertOptimization` option:
  - `UpsertOptimizationSkip` - no optimization - same as today
  - `UpsertOptimizationNew` - optimize for `INSERT`
  - `UpsertOptimizationExisting` - optimize for `UPDATE`
- `allowExisting` - logic should always allow for this... given
- `allowHashUpdate` - now always false - the hash of `message` and `data` cannot change on any code path
- `UPDATE` - use `hash` as a key when updating, and check a row is updated
- `rowsUpdated` (well adopted we believe)
- `message` - the `datarefs` are fixed, so on the optimized path we don't touch them on update
- `INSERT`