rowexec: investigate and improve lookup join performance #47472
Comments
Unchecked "better parallelization". It's always been out of scope and might be easier to do once #47473 is complete.
Also created a more specific issue to track join reader left semi/anti joins and linked that in the checkbox.
It turns out that lookup join performance is extremely important for TPC-E. Many of its transactions require parallel chains of point lookups, which can be expressed as multi-way lookup joins. So far, "vectorizing" these point lookups into lookup joins has improved performance significantly, but I'm sure there's room for improvement. All that is to say: once TPC-E is a little more stable, it would be another good testbed for trying out changes while working on this issue.
Closing this issue as we improved/investigated the items in the list (refer to the specific issues for more information). We didn't benchmark TPC-E, but benchmarking and investigating it would be interesting to do here.
Performance improvements
ScanRequests to GetRequests or an even cheaper existence check.
General improvements
Possibly out of scope but worth mentioning:
Explore better parallelization. A single lookup join is planned on the leaseholder for the bigger table, but its rows might have matches on different nodes. Routing rows by expected lookup range location could allow us to parallelize lookups. This might not be something we do this release, but understanding the solution space will allow us to formulate specific work items. A less invasive change is to bucket batches by expected lookup node (#34997), which would allow us to reduce the total number of round trips.
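To make the bucketing idea concrete, here is a minimal sketch in Python. It is not the join reader's code: the range map (`RANGE_STARTS`, `LEASEHOLDERS`) and the helpers `expected_node` and `bucket_by_node` are hypothetical names invented for illustration, and it assumes each lookup key maps to exactly one range with a known leaseholder.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical range map (an assumption for illustration): sorted range start
# keys and, at the same index, the node expected to hold that range's lease.
RANGE_STARTS = ["a", "g", "n", "t"]
LEASEHOLDERS = [1, 2, 3, 1]


def expected_node(key):
    """Return the node expected to serve a lookup for `key`."""
    # Find the last range whose start key is <= key; clamp keys that sort
    # before the first start key into the first range.
    idx = bisect_right(RANGE_STARTS, key) - 1
    return LEASEHOLDERS[max(idx, 0)]


def bucket_by_node(lookup_keys):
    """Group lookup keys by expected node, preserving input order per bucket."""
    buckets = defaultdict(list)
    for key in lookup_keys:
        buckets[expected_node(key)].append(key)
    return dict(buckets)


batch = ["apple", "zebra", "hat", "orange", "bear"]
buckets = bucket_by_node(batch)
```

In this toy model, the five keys fall into three buckets, so each node would receive one batched request containing only the keys it is expected to serve, rather than a mixed batch that has to be split up and fanned out afterwards.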