batch hot paths for a very short duration #1618

Merged: 5 commits from feature/delay into master on Mar 14, 2021
Conversation

@ozkatz (Collaborator) commented Mar 12, 2021

Looking at the access pattern for critical path operations, we mostly call PostgreSQL with the same exact queries many times.

For GetObject, StatObject, ListObjects (which probably make up the majority of data lake calls), we ALWAYS start by doing the same set of roundtrips to PG:

  • Get the repository (to extract the storage namespace)
  • Resolve the ref (i.e. figure out if this is a commit/commit prefix/branch/tag)
  • Resolve the underlying commit ID (if a branch, prefix or tag)

Our access pattern is such that many requests at a given point in time are extremely likely to not only share the same repository details, but also the same branch/commit/tag, as big data systems tend to be bursty in nature.

Caching is not an option, since it sacrifices consistency (a read after a successful write might return a stale value). So instead of keeping the result around for a while, we can keep the requests around for a while.

This is what this PR does: for a given type of request (i.e. to a specific repo/branch/tag, etc.), wait a couple of milliseconds; if other identical requests arrive in that time, do a single roundtrip and return its result to all of them.
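To make that concrete, here is a minimal sketch of the mechanism in Go (simplified, with hypothetical names; the actual implementation added by this PR lives in pkg/batch/executor.go and is structured differently):

```go
package batch

import (
	"sync"
	"time"
)

// response carries the result of a single roundtrip to every caller that
// asked for the same key while the batch window was open.
type response struct {
	v   interface{}
	err error
}

// Batcher is a simplified, hypothetical version of the executor: the first
// request for a key opens a short window; identical requests arriving during
// that window share one execution of fn.
type Batcher struct {
	mu      sync.Mutex
	delay   time.Duration
	pending map[string][]chan *response
}

func NewBatcher(delay time.Duration) *Batcher {
	return &Batcher{delay: delay, pending: make(map[string][]chan *response)}
}

func (b *Batcher) Do(key string, fn func() (interface{}, error)) (interface{}, error) {
	ch := make(chan *response, 1)

	b.mu.Lock()
	waiters, exists := b.pending[key]
	b.pending[key] = append(waiters, ch)
	b.mu.Unlock()

	if !exists {
		// First request for this key: wait a couple of milliseconds, then
		// execute once and notify everyone who piled up behind us.
		go func() {
			time.Sleep(b.delay)

			b.mu.Lock()
			batch := b.pending[key]
			delete(b.pending, key)
			b.mu.Unlock()

			v, err := fn()
			for _, waiter := range batch {
				waiter <- &response{v: v, err: err}
			}
		}()
	}

	res := <-ch
	return res.v, res.err
}
```

Concurrent calls such as b.Do("GetRepository:my-repo", fetchRepo) that land inside the same window then share a single PostgreSQL roundtrip.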

Testing this on the same environment used for the sizing guide (2 x c5ad.xlarge AWS instances), I now get the following results:

  • lakectl abuse random-read on a commit ID: throughput goes up from 8-10k requests/second to 45k requests/second
  • The number of DB queries drops from ~20k/s to less than 1k/s.

I'm OK with not accepting this on the grounds that it's a premature optimization (it is!), but I feel the added complexity is relatively small and the gain is pretty big (if only to show the best possible numbers per core).

@nopcoder (Contributor) commented:
I think we can simplify it a bit more by using the singleflight package and wrapping the function call with the same delay before calling the actual function that fetches the data.
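For reference, a rough sketch of that suggestion (the DelayedGroup wrapper below is hypothetical; only golang.org/x/sync/singleflight itself is the real package):

```go
package batch

import (
	"time"

	"golang.org/x/sync/singleflight"
)

// DelayedGroup wraps singleflight: the winning call sleeps for a short window
// before fetching, so identical requests arriving during that window join the
// same flight and share the result.
type DelayedGroup struct {
	g     singleflight.Group
	delay time.Duration
}

func (d *DelayedGroup) Do(key string, fetch func() (interface{}, error)) (interface{}, error) {
	v, err, _ := d.g.Do(key, func() (interface{}, error) {
		time.Sleep(d.delay) // let identical requests pile up and join this flight
		return fetch()      // one roundtrip for the whole batch
	})
	return v, err
}
```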

@ozkatz (Collaborator, Author) commented Mar 12, 2021

> I think we can simplify it a bit more by using the singleflight package and wrapping the function call with the same delay before calling the actual function that fetches the data.

@nopcoder sounds nice! Feel free to give that a go 🙂

@nopcoder (Contributor) commented:
> I think we can simplify it a bit more by using the singleflight package and wrapping the function call with the same delay before calling the actual function that fetches the data.
>
> @nopcoder sounds nice! Feel free to give that a go 🙂

#1620

Need to set up a test env for testing the above.

@ozkatz (Collaborator, Author) commented Mar 13, 2021

> I think we can simplify it a bit more by using the singleflight package and wrapping the function call with the same delay before calling the actual function that fetches the data.
>
> @nopcoder sounds nice! Feel free to give that a go 🙂
>
> #1620
>
> Need to set up a test env for testing the above.

Used the same env to test your branch; it behaves just the same (~45k requests/second). I agree your implementation is simpler.

keys: make(map[string][]*request),
logger: logger,
}
go e.Run() // TODO(ozkatz): should probably be managed by the user (also, allow stopping it)
Review comment (Contributor):
You can move this one into Run() with a defer close, so we won't need to add Stop() and/or Close() methods to handle this resource.

// see if we have it scheduled already
if _, exists := e.keys[req.key]; !exists {
// this is a new key, let's fire a timer for it
go func(req *request) {
Review comment (Contributor):
  1. Add a WaitGroup that counts ongoing goroutines.
  2. Add a Close() method to Executor that waits for the WaitGroup to finish.
  3. Close() should also drain any pending execs and call their responseCallback(s).
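A rough sketch of those three suggestions, with the type shapes loosely reconstructed from the snippets in this PR (field and method names here are assumptions, not the actual code):

```go
package batch

import "sync"

type response struct {
	value interface{}
	err   error
}

type request struct {
	key              string
	fn               func() (interface{}, error)
	responseCallback chan *response
}

type Executor struct {
	mu   sync.Mutex
	wg   sync.WaitGroup
	keys map[string][]*request
}

// dispatch runs the per-key timer goroutine and counts it in the WaitGroup,
// so Close can wait for it (suggestion 1).
func (e *Executor) dispatch(run func()) {
	e.wg.Add(1)
	go func() {
		defer e.wg.Done()
		run()
	}()
}

// Close drains whatever is still queued, answers the waiting callers, and then
// waits for all in-flight goroutines to finish (suggestions 2 and 3).
func (e *Executor) Close() {
	e.mu.Lock()
	for key, waiters := range e.keys {
		delete(e.keys, key)
		v, err := waiters[0].fn()
		for _, w := range waiters {
			w.responseCallback <- &response{value: v, err: err}
		}
	}
	e.mu.Unlock()
	e.wg.Wait()
}
```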

// let's take all callbacks
waiters := e.keys[execKey]
delete(e.keys, execKey)
go func(key string) {
Review comment (Contributor):
Pass waiters the same way you pass the key, just to be symmetric.

Review comment (Contributor):
Or don't pass either one and pin them instead.
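For illustration, the two options side by side in a tiny self-contained example (placeholder values, not the executor code):

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	execKey := "repo/main"
	waiters := []string{"req-1", "req-2"}

	// Option 1: pass waiters the same way the key is passed, as an argument.
	wg.Add(1)
	go func(key string, batch []string) {
		defer wg.Done()
		fmt.Println("argument passing:", key, batch)
	}(execKey, waiters)
	wg.Wait()

	// Option 2: "pin" local copies before the goroutine and let the closure
	// capture them, so no arguments are needed.
	key, batch := execKey, waiters
	wg.Add(1)
	go func() {
		defer wg.Done()
		fmt.Println("closure capture:", key, batch)
	}()
	wg.Wait()
}
```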

delete(e.keys, execKey)
go func(key string) {
// execute and call all mapped callbacks
v, err := waiters[0].fn()
Review comment (Contributor):
Suggestion: you'll probably want to wrap this call in a func with recover that returns an error.
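A sketch of that suggestion: run the batched function through a wrapper so a panic comes back to the waiters as an error instead of killing the goroutine (safeCall is a hypothetical helper):

```go
package batch

import "fmt"

// safeCall executes fn and converts a panic into a returned error.
func safeCall(fn func() (interface{}, error)) (v interface{}, err error) {
	defer func() {
		if p := recover(); p != nil {
			err = fmt.Errorf("panic in batched call: %v", p)
		}
	}()
	return fn()
}
```

The v, err := waiters[0].fn() call above would then become v, err := safeCall(waiters[0].fn).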

@codecov-io commented Mar 14, 2021

Codecov Report

Merging #1618 (93337f1) into master (b833114) will increase coverage by 0.16%.
The diff coverage is 79.61%.


@@            Coverage Diff             @@
##           master    #1618      +/-   ##
==========================================
+ Coverage   39.25%   39.41%   +0.16%     
==========================================
  Files         167      168       +1     
  Lines       13563    13621      +58     
==========================================
+ Hits         5324     5369      +45     
- Misses       7474     7487      +13     
  Partials      765      765              
Impacted Files Coverage Δ
pkg/catalog/catalog.go 18.80% <0.00%> (-0.27%) ⬇️
pkg/batch/executor.go 88.88% <88.88%> (ø)
pkg/graveler/ref/manager.go 71.92% <89.28%> (+1.92%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dc0fdf0...93337f1.

@ozkatz ozkatz requested a review from nopcoder March 14, 2021 11:10
@ozkatz (Collaborator, Author) commented Mar 14, 2021

@itaiad200 please see the tests I added: I attempted to prove this method does not violate read-after-write consistency

@ozkatz ozkatz marked this pull request as ready for review March 14, 2021 12:02

@arielshaqed (Contributor) left a comment:

Thanks!

My main request is the change to the test: ensure reader1 actually starts waiting only after writer1 has written.

responseCallback chan *response
}

type Executor struct {
Review comment (Contributor):
An Executor is a Batcher, which is a somewhat odd usage of the interface name

Reply (Collaborator, Author):
I'm bad at naming, I'll admit to that. Suggestions are welcome :)

delayFn := func(dur time.Duration) {
delaysDone := atomic.AddInt32(&delays, 1)
if delaysDone == 1 {
close(waitWrite)
Review comment (Contributor):
Note that the write can occur before https://github.com/treeverse/lakeFS/pull/1618/files#diff-c9e7aae146c0798d32ade9be6fa5013612246e323b7a2796dbcdc83a0151c607R82 ever happens (because the scheduler is evil). I think you may need to wait on another channel here, one that writer1 closes after it has actually written.
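A self-contained sketch of the ordering being suggested (the real test exercises the batch executor; this only shows the channel handshake that guarantees the read starts after the write has completed):

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	var (
		wg    sync.WaitGroup
		value string // stands in for the branch state in the real test
	)
	wrote := make(chan struct{})

	wg.Add(2)
	go func() { // writer1
		defer wg.Done()
		value = "after-write" // the write
		close(wrote)          // signal: the write has completed
	}()
	go func() { // reader1
		defer wg.Done()
		<-wrote                     // only start reading after the write finished
		fmt.Println("read:", value) // must observe "after-write"
	}()
	wg.Wait()
}
```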

@ozkatz ozkatz merged commit c68123e into master Mar 14, 2021
@ozkatz ozkatz deleted the feature/delay branch March 14, 2021 15:51