Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: create parallel lookup join hint to let people use the unsafe parallel lookup join when required #42708

Closed
jordanlewis opened this issue Nov 22, 2019 · 6 comments
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) E-quick-win Likely to be a quick win for someone experienced. T-sql-queries SQL Queries Team

Comments

@jordanlewis
Copy link
Member

jordanlewis commented Nov 22, 2019

19.2 closed a critical stability bug that permitted lookup joins to use unlimited memory in certain cases, causing OOMs. Because of #19721, the way we had to close the bug was by banning parallel lookup joins unless the optimizer could prove that the joined relation only had one row per lookup.

This has the potential to erode performance from 19.1 baselines in cases that didn't suffer from potential OOM conditions, which is problematic. We need to restore the performance for cases where it matters for customers.

We need a hint that always turns on the parallel behavior, even if the optimizer can't prove that the join columns are a key. I propose the syntax:

SELECT * FROM t <join type> PARALLEL_LOOKUP JOIN u...

This syntax is deliberately ugly. We shouldn't document it, because it's unsafe, but I also don't think we need to stick unsafe into the name necessarily. We should remove this syntax in 20.1 once #19721 is resolved and lookup join no longer has this issue.

Jira issue: CRDB-5332

@jordanlewis jordanlewis added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Nov 22, 2019
@awoods187
Copy link
Contributor

Can we follow the other hint syntax with something like INNER PARALLEL_LOOKUP JOIN? We have so far required hints to be inside the INNER ___ JOIN space. cc @RaduBerinde

@jordanlewis
Copy link
Member Author

What you wrote is not different from what I wrote.

@jordanlewis
Copy link
Member Author

I see, Andy's right - I forgot that you need to have the join type specified up front. I wasn't suggesting that this one be special - was just proposing the name of the thing that would replace lookup/hash/merge.

@RaduBerinde
Copy link
Member

How about UNSAFE_PARALLEL_LOOKUP just to discourage usage?

@RaduBerinde RaduBerinde self-assigned this Nov 24, 2019
@yuzefovich yuzefovich added the E-quick-win Likely to be a quick win for someone experienced. label May 11, 2021
@jlinder jlinder added the T-sql-queries SQL Queries Team label Jun 16, 2021
@mgartner
Copy link
Collaborator

@yuzefovich does the work on the streamer make this suggestion obsolete?

@yuzefovich
Copy link
Member

Yes, I think we can close this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) E-quick-win Likely to be a quick win for someone experienced. T-sql-queries SQL Queries Team
Projects
Archived in project
Development

No branches or pull requests

6 participants