Performance problem with a single stripe #38
Comments
To provide some context (I work with Matt), here's a benchmark we made to better understand the effect of changing the number of stripes. Repo: https://github.com/Simspace/resource-pool-benchmarks Example run (grouped by total pool size, higher number of updates is better): It makes sense to me that having only 1 or 2 stripes would incur a performance penalty, but I guess the question is whether this level of drop-off is expected. The benchmark appears to show that a small number of stripes is significantly worse. I did play around a bit with swapping in a TQueue for the entries list, which could perhaps reduce contention. My recollection is that it helped a little but was not a silver bullet; I don't think I ever got a full set of benchmarks on it. I can run those benchmarks if that would be helpful. Cheers |
Oh, what connection pool settings are you all using with what levels of traffic? Lumi’s business doesn’t strike me as needing to process huge volumes of API requests like say, an ad network, so wondered what kind of volume this becomes an issue at. |
Different Matt 😄 SimSpace probably handles a lot more traffic than Lumi. |
Ahh ok |
Ah, sorry for the delay, this got buried in my inbox. I think there are a bunch of ways to mitigate the problem (better caching, push vs pull, pgbouncer?), but in some areas that have a "naive" implementation we can easily trigger this issue with 1 stripe at several hundred requests per second. Around 200-300, I think, although I don't have the numbers on hand. Using more stripes certainly improves the situation, although we've found it to be less of a speedup than we anticipated. |
I found a bug in the resource-pool-benchmarks program above. It relies on the resources being destroyed by the time |
I have added a branch which fixes the bug, but also switches the focus of the benchmark from the number of stripes to the "doing tons of retries in stm" claim. stm uses the optimistic concurrency control approach to transactions, meaning it runs transactions in parallel and hopes they don't conflict. This approach works best when most threads happen to use distinct resources. If this assumption does not hold, that is, if most threads happen to touch the same TVars, then most of the transactions are going to fail their validation step at the end of the transaction and retry. If this is what is happening in this case, then the solution would be to switch from stm to something which uses pessimistic concurrency control, like MVars. I have thus changed the benchmark to pit several pool implementations against each other:
If my hypothesis is right, then
The IORef implementation wiped the floor. Note that my implementations do a much simpler job than |
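To make the optimistic versus pessimistic distinction above concrete, here is a minimal sketch of two toy pools. It is not the code from the benchmark branch and not how resource-pool is implemented; the names StmPool, ChanPool, withStm, withChan, runWorkers and demo are all hypothetical. The first pool keeps its free resources in a single TVar, so every take and put is a transaction on shared state. The second hands resources out through a Chan from base, which is built on MVars, so threads queue up instead of re-running transactions.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM
import Control.Exception (finally)
import Control.Monad (forM, replicateM_)
import Data.IORef (atomicModifyIORef', newIORef, readIORef)

-- Hypothetical toy pool, optimistic flavour: one shared TVar holds the
-- free resources, so concurrent takes/puts conflict and get re-run.
newtype StmPool a = StmPool (TVar [a])

withStm :: StmPool a -> (a -> IO b) -> IO b
withStm (StmPool tv) act = do
  x <- atomically $ do
    xs <- readTVar tv
    case xs of
      []       -> retry                       -- block until one is returned
      (y : ys) -> writeTVar tv ys >> pure y
  -- Simplification: no masking, so an async exception arriving between
  -- the take and `finally` would leak the resource.
  act x `finally` atomically (modifyTVar' tv (x :))

-- Hypothetical toy pool, pessimistic flavour: a Chan (MVar-based queue).
-- readChan blocks while the pool is empty; nothing is ever re-run.
newtype ChanPool a = ChanPool (Chan a)

newChanPool :: [a] -> IO (ChanPool a)
newChanPool xs = do
  ch <- newChan
  mapM_ (writeChan ch) xs
  pure (ChanPool ch)

withChan :: ChanPool a -> (a -> IO b) -> IO b
withChan (ChanPool ch) act = do
  x <- readChan ch
  act x `finally` writeChan ch x

-- Fork nThreads workers, each borrowing a resource nIters times and
-- bumping a shared counter; returns the total number of borrows.
runWorkers :: Int -> Int -> ((Int -> IO ()) -> IO ()) -> IO Int
runWorkers nThreads nIters withRes = do
  counter <- newIORef (0 :: Int)
  dones <- forM [1 .. nThreads] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO $ do
      replicateM_ nIters $
        withRes (\_ -> atomicModifyIORef' counter (\n -> (n + 1, ())))
      putMVar done ()
    pure done
  mapM_ takeMVar dones
  readIORef counter

-- 8 threads sharing 4 resources, 1000 borrows each, against both pools.
demo :: IO (Int, Int)
demo = do
  stmPool  <- StmPool <$> newTVarIO [1 .. 4]
  chanPool <- newChanPool [1 .. 4]
  a <- runWorkers 8 1000 (withStm stmPool)
  b <- runWorkers 8 1000 (withChan chanPool)
  pure (a, b)
```

Loading this in GHCi and running demo only checks that both pools complete all borrows. To compare throughput one would wrap each runWorkers call in a timer and build with -threaded, since the sketch itself makes no claim about which variant is faster.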
linking #42 |
Matt Russell on Twitter identified that Data.Pool is evidently very slow with a single stripe.