When a user specifies a hash partitioning in STORE(), they presumably anticipate join queries on that partition key with other relations. Provided the local join optimization in RACO is applied and the local join is pushed into the local storage engine (i.e., Postgres), the join should be considerably faster if an index on the partition key already exists for both relations: the planner should then be able to choose either an indexed merge join or an indexed nested loop join. We could automatically create such an index whenever a DbInsert operator with hash partitioning is executed.
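To make the proposal concrete, here is a minimal sketch of the DDL that could be issued after such an insert. The helper names (`quote_ident`, `partition_index_ddl`), the `idx_<relation>_<columns>` naming scheme, and the example relation `edges` are assumptions for illustration only; none of them are existing RACO or Myria APIs. `CREATE INDEX IF NOT EXISTS` requires Postgres 9.5 or later.

```python
from typing import Sequence


def quote_ident(name: str) -> str:
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def partition_index_ddl(relation: str, partition_columns: Sequence[str]) -> str:
    """Build a CREATE INDEX statement covering the hash partition key.

    The index name is a hypothetical convention, not one used by RACO.
    """
    if not partition_columns:
        raise ValueError("a hash partitioning needs at least one column")
    index_name = "idx_{}_{}".format(relation, "_".join(partition_columns))
    columns = ", ".join(quote_ident(c) for c in partition_columns)
    return "CREATE INDEX IF NOT EXISTS {} ON {} ({})".format(
        quote_ident(index_name), quote_ident(relation), columns
    )


# Hypothetical relation partitioned on its "src" column.
ddl = partition_index_ddl("edges", ["src"])
print(ddl)
```

Using `IF NOT EXISTS` keeps the statement idempotent, so re-running the same STORE() would not fail on an index that is already present.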
The downside of this automated approach is that considerable resources may be expended during index creation, which may take a long time for large relations and slow down queries in progress. We should benchmark this overhead and automate this index optimization if the overhead seems acceptable.
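A benchmark of this overhead could start from something as simple as timing the index build on a populated relation. The sketch below is only a stand-in: it uses an in-memory SQLite database so that it runs anywhere, whereas a real measurement would run against the Postgres instances on the workers with realistic relation sizes and concurrent queries. The helper `time_statement` and the table `edges` are hypothetical names.

```python
import sqlite3
import time


def time_statement(conn, sql: str) -> float:
    """Execute one SQL statement on a DB-API connection and return the elapsed seconds."""
    start = time.perf_counter()
    cursor = conn.cursor()
    cursor.execute(sql)
    conn.commit()
    return time.perf_counter() - start


# Stand-in for a worker's local relation (assumption: SQLite instead of Postgres).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src INTEGER, dst INTEGER)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    ((i % 1000, i) for i in range(100000)),
)
conn.commit()

elapsed = time_statement(conn, "CREATE INDEX idx_edges_src ON edges (src)")
print("index build took {:.3f} s".format(elapsed))
```

Repeating the measurement at several relation sizes, and while other queries are running, would show whether the overhead grows acceptably.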
A possible refinement is to use the C locale for collation on these indexes. That massively speeds up sorting of text keys (because strings are compared bytewise instead of through locale-aware collation rules), and allows the Postgres "abbreviated keys" optimization to kick in (which had to be disabled for non-C locales because of glibc bugs).
If we did use the C locale for these indexes, we would have to ensure that they couldn't be used for user-visible comparisons, or users might get unexpected results.
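For reference, Postgres accepts a per-column collation in the index definition, so the refinement would only change the generated DDL. The sketch below shows what that statement might look like; the helper name `c_locale_index_ddl`, the `_c` index-name suffix, and the example relation are assumptions for illustration, not existing RACO code.

```python
def quote_ident(name: str) -> str:
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def c_locale_index_ddl(relation: str, column: str) -> str:
    """Build a CREATE INDEX statement that sorts a text column with the C collation.

    The "_c" suffix in the index name is a hypothetical convention.
    """
    index_name = "idx_{}_{}_c".format(relation, column)
    return 'CREATE INDEX {} ON {} ({} COLLATE "C")'.format(
        quote_ident(index_name), quote_ident(relation), quote_ident(column)
    )


# Hypothetical relation partitioned on a text column "name".
c_ddl = c_locale_index_ddl("users", "name")
print(c_ddl)
```

Because the collation is attached to the index rather than to the column, the column's default collation, and therefore the ordering users see in ordinary queries, is left unchanged.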