Consider using a disk-based hash table for hash join avoiding OOM #11607
Comments
The implementation refers to cdb. An illustration of it is at http://www.unixuser.org/~euske/doc/cdbinternals/index.html
Consider putting ...
We divide this issue into two steps, 1) The improvement of ...
Feature Request
Is your feature request related to a problem? Please describe:
Consider using a disk-based hash table for hash join avoiding OOM.
`HashJoinExecutor` uses a hash table describing the map of join keys and inner table rows.

TiDB's hash join is implemented by `innerResult` and `mvmap.MVMap`. The `innerResult` stores all the rows of the inner table, and the `mvmap.MVMap` stores the map of (join key, inner table pointer). This allows us to use these two structures to get a map of join keys and inner table rows.

When the inner table is particularly large, the `innerResult` will take up a lot of memory; when the join key is particularly large, `mvmap.MVMap` will also take up a lot of memory. In either case, OOM can occur.

Describe the feature you'd like:
- `mem-quota-query`, which sets the memory quota for a query in bytes.
- `oom-use-tmp-storage`, default is `true`. Set to `true` to enable the use of temporary disk for some executors (in this issue, hash join) when `mem-quota-query` is exceeded.
- `explain analyze`
- `SELECT * FROM information_schema.processlist;`
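To make the idea concrete, here is a minimal Go sketch of a hash join build side that keeps inner rows in memory until a byte quota is exceeded and then moves them to a temporary file. It is not TiDB's implementation and does not use the cdb format: `rowStore`, `hashJoin`, `quota`, and every method name are hypothetical, chosen only for illustration. As a simplification, only the row container (the role `innerResult` plays) spills; the key-to-row-index map (the role `mvmap.MVMap` plays) stays in memory.

```go
package main

import (
	"encoding/binary"
	"os"
)

// rowStore is a hypothetical stand-in for innerResult. It holds rows in
// memory until quota bytes are exceeded, then spills all rows to a temp file.
type rowStore struct {
	quota   int64    // byte budget, playing the role of mem-quota-query
	used    int64    // bytes currently held in memory
	n       int      // number of rows added so far
	mem     [][]byte // in-memory rows (nil after a spill)
	file    *os.File // non-nil once rows live on disk
	offsets []int64  // file offset of each spilled row, indexed by row number
	end     int64    // current end of the spill file
}

// add stores a row and returns its index, which serves as the "row pointer".
func (s *rowStore) add(row []byte) (int, error) {
	idx := s.n
	s.n++
	if s.file == nil && s.used+int64(len(row)) > s.quota {
		if err := s.spill(); err != nil {
			return 0, err
		}
	}
	if s.file != nil {
		return idx, s.appendToFile(row)
	}
	s.mem = append(s.mem, row)
	s.used += int64(len(row))
	return idx, nil
}

// spill writes every in-memory row to a new temp file and frees the memory.
func (s *rowStore) spill() error {
	f, err := os.CreateTemp("", "hashjoin-*.tmp")
	if err != nil {
		return err
	}
	s.file = f
	for _, r := range s.mem {
		if err := s.appendToFile(r); err != nil {
			return err
		}
	}
	s.mem, s.used = nil, 0
	return nil
}

// appendToFile writes a 4-byte length prefix followed by the row bytes.
func (s *rowStore) appendToFile(row []byte) error {
	var hdr [4]byte
	binary.LittleEndian.PutUint32(hdr[:], uint32(len(row)))
	if _, err := s.file.WriteAt(hdr[:], s.end); err != nil {
		return err
	}
	if _, err := s.file.WriteAt(row, s.end+4); err != nil {
		return err
	}
	s.offsets = append(s.offsets, s.end)
	s.end += 4 + int64(len(row))
	return nil
}

// get returns the row at idx, reading it from disk if the store has spilled.
func (s *rowStore) get(idx int) ([]byte, error) {
	if s.file == nil {
		return s.mem[idx], nil
	}
	var hdr [4]byte
	if _, err := s.file.ReadAt(hdr[:], s.offsets[idx]); err != nil {
		return nil, err
	}
	row := make([]byte, binary.LittleEndian.Uint32(hdr[:]))
	if _, err := s.file.ReadAt(row, s.offsets[idx]+4); err != nil {
		return nil, err
	}
	return row, nil
}

// hashJoin pairs the row store with a multi-value map: join key -> row indexes.
type hashJoin struct {
	rows  rowStore
	table map[string][]int
}

func newHashJoin(quota int64) *hashJoin {
	return &hashJoin{rows: rowStore{quota: quota}, table: map[string][]int{}}
}

// build inserts one inner-table row under its join key.
func (h *hashJoin) build(key string, row []byte) error {
	idx, err := h.rows.add(row)
	if err != nil {
		return err
	}
	h.table[key] = append(h.table[key], idx)
	return nil
}

// probe returns all inner rows whose join key equals key.
func (h *hashJoin) probe(key string) ([][]byte, error) {
	var out [][]byte
	for _, idx := range h.table[key] {
		row, err := h.rows.get(idx)
		if err != nil {
			return nil, err
		}
		out = append(out, row)
	}
	return out, nil
}

// spilled reports whether rows have been moved to temporary disk storage.
func (h *hashJoin) spilled() bool { return h.rows.file != nil }

// close removes the temp file, if one was created. Call it once when done.
func (h *hashJoin) close() error {
	if h.rows.file == nil {
		return nil
	}
	name := h.rows.file.Name()
	if err := h.rows.file.Close(); err != nil {
		return err
	}
	return os.Remove(name)
}
```

A caller would create the structure with `newHashJoin(quota)`, call `build` for each inner row, `probe` for each outer join key, and `close` at the end. In this sketch the quota is checked only against row bytes; a fuller design would also account for the memory used by the key map, which is the case the issue raises for large join keys.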
Describe alternatives you've considered:
Teachability, Documentation, Adoption, Migration Strategy:
Tasks:
- `explain analyze`: show disk usage information in explain analyze #12625
- Some tiny issues
- `SELECT * FROM information_schema.processlist;`: Show disk usage of a query in information_schema.processlist #13931
- `mem-quota-query`: change the default value of `mem-quota-query` #12937