rowexec: speed up lookup joins when no ordering is required #48117
Labels
A-sql-execution
Relating to SQL execution.
C-performance
Perf of queries or internals. Solution not expected to change functional behavior.
Lookup joins currently always output rows based on the lookup input ordering. This can be inefficient if we don't have to do it.
The lookup joiner currently reads a batch of input rows and creates lookup spans based on those rows. These lookup spans need to be ordered properly before they are passed to the kv layer (due to scan resumption behavior and other things I can't remember). Because of this, the lookup join receives rows out of order relative to the lookup side. Once the results are retrieved, the lookup joiner iterates over the lookup rows and looks up the results once more based on the span key.
This can be bad for performance since all results need to be buffered (the first result that needs to be output can be the last result received), and in some cases when the result set is very large, spilled to disk. This ordering is not required in all cases (the optimizer passes this down through
reqOrdering
in alookupJoinNode
) so we should optimize the lookup joiner to stream results in these cases.Investigation into #39471 has also shown that a blocker to increasing the lookup batch size is that performance tanks once we spill to disk. If we have a specialized no-ordering lookup join, this won't be a concern, and we can increase the lookup batch size to a larger constant.
The text was updated successfully, but these errors were encountered: