Closed
Description
Describe
In the original logic, the hash table uses a vector-like structure to store the actual data. When constructing the hash table, about a quarter of the time may be spent repeatedly copying data, and the cost grows when more columns are built. So I changed the storage to a raw pointer to avoid the extra copy overhead. This gives good results in the hash table construction phase.
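To illustrate the difference between the two storage strategies, here is a minimal sketch. It is not the project's actual code: `Row`, `build_with_vector`, and `RawRowBuffer` are hypothetical names, and the sketch assumes the number of build-side rows is known before the build starts so the buffer can be sized once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-width build-side row, used only for illustration.
struct Row {
    int64_t key;
    int64_t payload;
};

// Vector-like storage: when the final size is not reserved up front,
// each capacity increase reallocates and copies every row stored so far.
// Returns how many times the capacity changed during the build.
std::size_t build_with_vector(const Row* input, std::size_t n, std::vector<Row>& out) {
    std::size_t reallocations = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t cap_before = out.capacity();
        out.push_back(input[i]);
        if (out.capacity() != cap_before) {
            ++reallocations;
        }
    }
    return reallocations;
}

// Raw-pointer storage: a single allocation sized up front, so appending
// a row never moves the rows that are already stored.
class RawRowBuffer {
public:
    explicit RawRowBuffer(std::size_t capacity)
        : data_(new Row[capacity]), capacity_(capacity) {}
    ~RawRowBuffer() { delete[] data_; }

    // Non-copyable: the buffer owns its allocation.
    RawRowBuffer(const RawRowBuffer&) = delete;
    RawRowBuffer& operator=(const RawRowBuffer&) = delete;

    void append(const Row& row) {
        assert(size_ < capacity_);  // the caller must not exceed the preallocated size
        data_[size_++] = row;
    }

    const Row* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    Row* data_;
    std::size_t capacity_;
    std::size_t size_ = 0;
};
```

With the vector version, the number of reallocations grows with the row count, and every reallocation copies all rows stored so far. The preallocated buffer writes each row exactly once. Calling `reserve` on the vector would avoid the reallocations as well, so this sketch only shows where the copy overhead can come from.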
Here is my test case. LINE_ORDER and LINE_ORDER_V2 are from the SSB dataset:
SELECT count(*) FROM LINE_ORDER t1 join LINE_ORDER_V2 t2 WHERE t1.LO_ORDERKEY=t2.LO_ORDERKEY;
| Type | Right Table Rows | Build Time | Probe Time | Time Cost (s) |
|---|---|---|---|---|
| After | 6001215 | 658.288ms | 1s451ms | 4.07 |
| Before | 6001215 | 1s428ms | 1s512ms | 4.69 |