exec: handle NULL group by keys in hash aggregator #38900

Merged
merged 1 commit into from
Jul 17, 2019

Conversation

rafiss
Collaborator

@rafiss rafiss commented Jul 16, 2019

Previously, if an aggregation was requested with NULL group keys, an
index out of bounds error would occur in the hash aggregator. The reason
was that the hashTable data structure is shared between the hash joiner
and the hash aggregator. It was originally implemented for the hash joiner,
and in that case, NULLs are always supposed to be treated as non-equal.
However, in the hash aggregator, we want NULLs to compare as equal to each
other.

This change allows the tests in logic_test/aggregate to pass with the
local-vec configuration.

In order to exercise more of the null handling logic, the tests now set
garbage data in the corresponding position of the vector whenever an
element is null.

fixes #38750

Release note: None
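
To make the difference between the two NULL semantics concrete, here is a minimal sketch. It is not code from this PR: `countGroups` and its `[]bool` null flags are hypothetical stand-ins for the real hashTable and null bitmap, and the linear scan replaces actual hashing for brevity.

```go
package main

import "fmt"

// countGroups is a hypothetical helper, not code from this PR. It counts the
// distinct groups in a single key column, where nulls[i] marks row i as NULL.
// allowNullEquality=false mimics join semantics (a NULL matches nothing, so
// every NULL row ends up alone); true mimics GROUP BY semantics (all NULL
// rows share one group).
func countGroups(vals []int64, nulls []bool, allowNullEquality bool) int {
	var reps []int // index of the first row seen for each group
	for i := range vals {
		found := false
		for _, r := range reps {
			if nulls[i] || nulls[r] {
				found = allowNullEquality && nulls[i] && nulls[r]
			} else {
				found = vals[i] == vals[r]
			}
			if found {
				break
			}
		}
		if !found {
			reps = append(reps, i)
		}
	}
	return len(reps)
}

func main() {
	vals := []int64{1, 0, 1, 0, 2}
	nulls := []bool{false, true, false, true, false}
	fmt.Println(countGroups(vals, nulls, false)) // 4: each NULL row is its own group
	fmt.Println(countGroups(vals, nulls, true))  // 3: both NULL rows share a group
}
```

With join-style comparison the two NULL rows never match each other, which is the wrong answer for GROUP BY.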

@rafiss rafiss requested review from yuzefovich and a team July 16, 2019 16:09
@cockroach-teamcity
Member

This change is Reviewable

Member

@yuzefovich yuzefovich left a comment

Nice investigation! Your explanation makes sense to me, but I think a minor additional adjustment is needed for this PR to be bullet-proof.

Reviewed 3 of 3 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rafiss)


pkg/sql/exec/hashjoiner_tmpl.go, line 73 at r1 (raw file):

	probeVal := probeKeys[_SEL_IND]
	var unique bool
	_ASSIGN_NE(unique, buildVal, probeVal)

I think this part needs an adjustment. In theory, we can have two Vecs with nulls at the appropriate indices but with garbage in the actual values. Null bitmap takes a priority, so it would still be a correct representation. But we would get that two nulls are different because of the leftover garbage.
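
A small sketch of the concern, under assumptions: `vec` is a hypothetical simplified column (the real Vec uses a null bitmap, not a `[]bool`), and the two comparison helpers are illustrative names rather than anything generated from `_ASSIGN_NE`.

```go
package main

import "fmt"

// vec is a hypothetical, simplified column: values plus a per-row NULL flag.
// The real Vec uses a null bitmap; a []bool keeps the sketch short.
type vec struct {
	vals  []int64
	nulls []bool
}

// naiveDiffers looks only at the stored values, so leftover garbage in a
// NULL slot leaks into the result.
func naiveDiffers(a vec, i int, b vec, j int) bool {
	return a.vals[i] != b.vals[j]
}

// nullAwareDiffers consults the NULL flags first and reads the values only
// when both rows are non-NULL. Here two NULLs are treated as equal, which is
// the behavior the hash aggregator wants.
func nullAwareDiffers(a vec, i int, b vec, j int) bool {
	if a.nulls[i] || b.nulls[j] {
		return !(a.nulls[i] && b.nulls[j])
	}
	return a.vals[i] != b.vals[j]
}

func main() {
	// Both rows are NULL, but the value slots hold different garbage.
	build := vec{vals: []int64{-12345}, nulls: []bool{true}}
	probe := vec{vals: []int64{98765}, nulls: []bool{true}}
	fmt.Println(naiveDiffers(build, 0, probe, 0))     // true: garbage makes the NULLs look different
	fmt.Println(nullAwareDiffers(build, 0, probe, 0)) // false: both NULL, values ignored
}
```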

Collaborator Author

@rafiss rafiss left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/exec/hashjoiner_tmpl.go, line 73 at r1 (raw file):

Previously, yuzefovich wrote…

I think this part needs an adjustment. In theory, we can have two Vecs with nulls at the appropriate indices but with garbage in the actual values. Null bitmap takes a priority, so it would still be a correct representation. But we would get that two nulls are different because of the leftover garbage.

thanks, this makes sense... hmm, i wonder then, i think i also will need to adjust the rehash function in hashjoiner_tmpl.go so that it computes a hash consistently when something is null.

Member

@yuzefovich yuzefovich left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rafiss)


pkg/sql/exec/hashjoiner_tmpl.go, line 73 at r1 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

thanks, this makes sense... hmm, i wonder then, i think i also will need to adjust the rehash function in hashjoiner_tmpl.go so that it computes a hash consistently when something is null.

Good point. I agree, rehash needs an adjustment as well.
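
One way to picture a NULL-consistent hash is the sketch below. It is an assumption-laden illustration, not the rehash function from hashjoiner_tmpl.go: `hashKey`, the multiplier 31, and the `nullMarker` constant are all made up for this example.

```go
package main

import "fmt"

// hashKey is a hypothetical sketch of folding one key column into a running
// hash; it is not the rehash function from hashjoiner_tmpl.go. For a NULL row
// the stored value is ignored and a fixed constant is mixed in instead, so
// every NULL hashes identically regardless of what the value slot contains.
func hashKey(h uint64, val int64, isNull bool) uint64 {
	const nullMarker = 0x9e3779b97f4a7c15 // arbitrary constant chosen for this sketch
	if isNull {
		return h*31 + nullMarker
	}
	return h*31 + uint64(val)
}

func main() {
	a := hashKey(1, -12345, true) // NULL with one piece of garbage
	b := hashKey(1, 98765, true)  // NULL with different garbage
	fmt.Println(a == b)           // true: both NULLs land in the same bucket
}
```

A non-NULL value can still collide with the marker; that is harmless as long as the equality check that follows the bucket lookup is itself NULL-aware.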

Member

@yuzefovich yuzefovich left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rafiss)


pkg/sql/exec/hashjoiner_tmpl.go, line 73 at r1 (raw file):

Previously, yuzefovich wrote…

Good point. I agree, rehash needs an adjustment as well.

I think we should make a somewhat artificial test that would trigger these possible issues with nulls and with garbage in values. Maybe other places need adjustments, I'm not too familiar with this code to be honest.

Member

@jordanlewis jordanlewis left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/exec/hashjoiner_tmpl.go, line 73 at r1 (raw file):

Previously, yuzefovich wrote…

I think we should make a somewhat artificial test that would trigger these possible issues with nulls and with garbage in values. Maybe other places need adjustments, I'm not too familiar with this code to be honest.

Hmm, I might have an idea for a different approach. Could we edit _CHECK_COL_BODY to have a 4th case, for when ProbeHasNulls is true, BuildHasNulls is true, and also the allowNullEquality boolean (that could be a template variable, btw) is also true? Then, _CHECK_COL_BODY could have special case handling for the situation in which both sides are NULL. I think?
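
A rough sketch of what such an extra case could look like in plain Go. The existing cases of `_CHECK_COL_BODY` are not shown in this thread, so the branch layout here is an assumption, and `vec` and `checkCol` are hypothetical names; the real code generates its branches from a template rather than switching at runtime.

```go
package main

import "fmt"

// vec is a hypothetical, simplified column used only for this sketch.
type vec struct {
	vals     []int64
	nulls    []bool
	hasNulls bool // true if any row may be NULL
}

// checkCol reports, for each row i, whether probe row i differs from build
// row i. The branch layout is an assumption that loosely follows the comment
// above; the real code generates such branches from a template.
func checkCol(probe, build vec, allowNullEquality bool) []bool {
	differs := make([]bool, len(probe.vals))
	for i := range probe.vals {
		pNull := probe.hasNulls && probe.nulls[i]
		bNull := build.hasNulls && build.nulls[i]
		switch {
		case pNull && bNull && allowNullEquality:
			// The proposed extra case: both sides NULL compare as equal.
			differs[i] = false
		case pNull || bNull:
			// Any other NULL never matches.
			differs[i] = true
		default:
			differs[i] = probe.vals[i] != build.vals[i]
		}
	}
	return differs
}

func main() {
	probe := vec{vals: []int64{1, 111, 3}, nulls: []bool{false, true, true}, hasNulls: true}
	build := vec{vals: []int64{1, 222, 3}, nulls: []bool{false, true, false}, hasNulls: true}
	fmt.Println(checkCol(probe, build, true))  // [false false true]
	fmt.Println(checkCol(probe, build, false)) // [false true true]
}
```

Making `allowNullEquality` a template variable, as suggested, would let the join path skip the extra branch entirely instead of testing it per row.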

Member

@yuzefovich yuzefovich left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @rafiss)


pkg/sql/exec/hashjoiner_tmpl.go, line 73 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Hmm, I might have an idea for a different approach. Could we edit _CHECK_COL_BODY to have a 4th case, for when ProbeHasNulls is true, BuildHasNulls is true, and also the allowNullEquality boolean (that could be a template variable, btw) is also true? Then, _CHECK_COL_BODY could have special case handling for the situation in which both sides are NULL. I think?

I feel like rehash will still need to be taught about allowNullEquality.

@rafiss rafiss force-pushed the hash-aggregator-null-equality branch from 36a9bcb to 0890944 Compare July 17, 2019 17:43
Collaborator Author

@rafiss rafiss left a comment

please take another look!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/exec/hashjoiner_tmpl.go, line 73 at r1 (raw file):

Previously, yuzefovich wrote…

I feel like rehash will still need to be taught about allowNullEquality.

Sorry, I didn't notice this comment thread. I think I ended up doing basically as Jordan suggested, but maybe it's a little more clunky than what you had in mind. I also did fix rehash, and also edited test infrastructure so garbage is inserted whenever something is set to NULL.

@rafiss rafiss requested review from yuzefovich and a team July 17, 2019 17:47
Member

@yuzefovich yuzefovich left a comment

Thanks! I cannot say that I fully understand how the vectorized hash table works, but this change :lgtm:

Reviewed 5 of 5 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis and @rafiss)


pkg/sql/exec/utils_test.go, line 300 at r2 (raw file):

					col.Index(int(outputIdx)).Set(val)
				} else {
					panic(fmt.Sprintf("Could not generate a random value pf type %s\n.", typ.Name()))

[nit]: s/Could/could/ and s/pf/of/.

@rafiss
Collaborator Author

rafiss commented Jul 17, 2019

benchmark numbers aren't affected too much

name                                                                   old time/op    new time/op    delta
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=2048-24         562µs ± 4%     567µs ± 7%    ~     (p=0.436 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=262144-24      44.0ms ± 2%    43.0ms ± 2%  -2.30%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=4194304-24      668ms ± 1%     661ms ± 2%  -1.16%  (p=0.002 n=8+9)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=2048-24        545µs ± 1%     548µs ± 2%    ~     (p=0.173 n=8+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=262144-24     43.9ms ± 2%    42.7ms ± 4%  -2.59%  (p=0.002 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=4194304-24     1.21s ± 7%     1.15s ± 4%  -5.29%  (p=0.034 n=10+8)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=2048-24          583µs ± 1%     561µs ± 4%  -3.71%  (p=0.000 n=9+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=262144-24       45.1ms ± 1%    44.3ms ± 1%  -1.81%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=4194304-24       690ms ± 2%     679ms ± 2%  -1.61%  (p=0.002 n=10+9)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=2048-24         558µs ± 2%     530µs ± 9%  -5.07%  (p=0.001 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=262144-24      43.4ms ± 3%    41.5ms ± 2%  -4.42%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=4194304-24      1.00s ± 5%     1.02s ± 5%    ~     (p=0.280 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=2048-24          643µs ± 3%     662µs ± 1%  +2.93%  (p=0.000 n=10+9)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=262144-24       53.2ms ± 1%    54.2ms ± 1%  +2.00%  (p=0.000 n=9+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=4194304-24       811ms ± 1%     827ms ± 1%  +2.01%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=2048-24         624µs ± 4%     633µs ± 3%    ~     (p=0.190 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=262144-24      57.2ms ± 2%    54.9ms ± 1%  -3.97%  (p=0.000 n=9+9)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=4194304-24      1.38s ± 4%     1.39s ± 6%    ~     (p=0.529 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=2048-24           656µs ± 1%     665µs ± 1%  +1.35%  (p=0.002 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=262144-24        54.4ms ± 1%    56.1ms ± 3%  +3.18%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=4194304-24        824ms ± 1%     844ms ± 1%  +2.41%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=2048-24          625µs ± 2%     646µs ± 1%  +3.32%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=262144-24       53.6ms ± 2%    53.0ms ± 1%  -1.09%  (p=0.043 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=4194304-24       1.14s ± 3%     1.17s ± 3%    ~     (p=0.089 n=10+10)

name                                                                   old speed      new speed      delta
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=2048-24       233MB/s ± 4%   232MB/s ± 7%    ~     (p=0.436 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=262144-24     381MB/s ± 2%   390MB/s ± 2%  +2.35%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=4194304-24    402MB/s ± 1%   406MB/s ± 2%  +1.18%  (p=0.002 n=8+9)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=2048-24      240MB/s ± 1%   239MB/s ± 2%    ~     (p=0.173 n=8+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=262144-24    382MB/s ± 2%   393MB/s ± 3%  +2.69%  (p=0.002 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=4194304-24   221MB/s ± 8%   233MB/s ± 4%  +5.42%  (p=0.034 n=10+8)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=2048-24        225MB/s ± 1%   234MB/s ± 4%  +3.91%  (p=0.000 n=9+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=262144-24      372MB/s ± 1%   379MB/s ± 1%  +1.84%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=4194304-24     389MB/s ± 2%   396MB/s ± 2%  +1.63%  (p=0.002 n=10+9)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=2048-24       235MB/s ± 2%   248MB/s ± 9%  +5.51%  (p=0.001 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=262144-24     387MB/s ± 3%   405MB/s ± 2%  +4.61%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=4194304-24    269MB/s ± 5%   264MB/s ± 5%    ~     (p=0.280 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=2048-24        204MB/s ± 3%   198MB/s ± 1%  -2.86%  (p=0.000 n=10+9)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=262144-24      316MB/s ± 1%   309MB/s ± 1%  -1.96%  (p=0.000 n=9+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=4194304-24     331MB/s ± 1%   324MB/s ± 1%  -1.97%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=2048-24       210MB/s ± 4%   207MB/s ± 3%    ~     (p=0.190 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=262144-24     293MB/s ± 2%   306MB/s ± 1%  +4.13%  (p=0.000 n=9+9)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=4194304-24    195MB/s ± 4%   193MB/s ± 6%    ~     (p=0.529 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=2048-24         200MB/s ± 1%   197MB/s ± 1%  -1.34%  (p=0.002 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=262144-24       309MB/s ± 1%   299MB/s ± 3%  -3.07%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=4194304-24      326MB/s ± 1%   318MB/s ± 1%  -2.36%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=2048-24        210MB/s ± 2%   203MB/s ± 1%  -3.22%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=262144-24      313MB/s ± 2%   316MB/s ± 1%  +1.10%  (p=0.043 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=4194304-24     235MB/s ± 3%   230MB/s ± 3%    ~     (p=0.089 n=10+10)

name                                                                   old alloc/op   new alloc/op   delta
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=2048-24         805kB ± 0%     805kB ± 0%  +0.00%  (p=0.000 n=9+8)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=262144-24      46.4MB ± 0%    46.4MB ± 0%  +0.00%  (p=0.000 n=9+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=4194304-24      697MB ± 0%     697MB ± 0%  +0.00%  (p=0.001 n=8+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=2048-24        809kB ± 0%     809kB ± 0%  +0.00%  (p=0.000 n=10+8)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=262144-24     44.6MB ± 0%    44.6MB ± 0%  +0.00%  (p=0.001 n=10+9)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=4194304-24     668MB ± 0%     668MB ± 0%    ~     (p=0.222 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=2048-24          808kB ± 0%     808kB ± 0%  +0.00%  (p=0.000 n=10+9)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=262144-24       46.7MB ± 0%    46.7MB ± 0%  +0.00%  (p=0.003 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=4194304-24       701MB ± 0%     701MB ± 0%  +0.00%  (p=0.000 n=9+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=2048-24         812kB ± 0%     812kB ± 0%  +0.00%  (p=0.000 n=10+7)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=262144-24      44.9MB ± 0%    44.9MB ± 0%  +0.00%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=4194304-24      672MB ± 0%     672MB ± 0%    ~     (p=0.338 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=2048-24          805kB ± 0%     805kB ± 0%  -0.00%  (p=0.000 n=10+9)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=262144-24       46.4MB ± 0%    46.4MB ± 0%  -0.02%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=4194304-24       697MB ± 0%     697MB ± 0%  -0.02%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=2048-24         809kB ± 0%     809kB ± 0%  -0.00%  (p=0.000 n=9+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=262144-24      44.6MB ± 0%    44.6MB ± 0%  -0.01%  (p=0.000 n=9+9)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=4194304-24      668MB ± 0%     668MB ± 0%  -0.01%  (p=0.000 n=6+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=2048-24           808kB ± 0%     808kB ± 0%  -0.00%  (p=0.000 n=9+8)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=262144-24        46.7MB ± 0%    46.7MB ± 0%  -0.02%  (p=0.000 n=8+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=4194304-24        701MB ± 0%     701MB ± 0%  -0.02%  (p=0.000 n=10+8)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=2048-24          812kB ± 0%     812kB ± 0%  -0.00%  (p=0.000 n=9+6)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=262144-24       44.9MB ± 0%    44.9MB ± 0%  -0.01%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=4194304-24       672MB ± 0%     672MB ± 0%  -0.01%  (p=0.000 n=9+10)

name                                                                   old allocs/op  new allocs/op  delta
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=2048-24         8.28k ± 0%     8.28k ± 0%    ~     (all equal)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=262144-24       1.05M ± 0%     1.05M ± 0%    ~     (all equal)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=4194304-24      16.8M ± 0%     16.8M ± 0%    ~     (all equal)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=2048-24        6.24k ± 0%     6.24k ± 0%    ~     (all equal)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=262144-24       527k ± 0%      527k ± 0%    ~     (all equal)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=4194304-24     8.40M ± 0%     8.40M ± 0%    ~     (p=0.721 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=2048-24          8.28k ± 0%     8.28k ± 0%    ~     (all equal)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=262144-24        1.05M ± 0%     1.05M ± 0%    ~     (all equal)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=4194304-24       16.8M ± 0%     16.8M ± 0%    ~     (p=0.248 n=9+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=2048-24         6.24k ± 0%     6.24k ± 0%    ~     (all equal)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=262144-24        527k ± 0%      527k ± 0%    ~     (all equal)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=4194304-24      8.40M ± 0%     8.40M ± 0%    ~     (p=0.173 n=10+9)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=2048-24          8.28k ± 0%     8.27k ± 0%  -0.10%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=262144-24        1.05M ± 0%     1.05M ± 0%  -0.10%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=4194304-24       16.8M ± 0%     16.8M ± 0%  -0.10%  (p=0.001 n=8+9)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=2048-24         6.24k ± 0%     6.23k ± 0%  -0.10%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=262144-24        527k ± 0%      527k ± 0%  -0.10%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=4194304-24      8.41M ± 0%     8.40M ± 0%  -0.10%  (p=0.000 n=7+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=2048-24           8.28k ± 0%     8.28k ± 0%  -0.10%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=262144-24         1.05M ± 0%     1.05M ± 0%  -0.10%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=4194304-24        16.8M ± 0%     16.8M ± 0%  -0.10%  (p=0.002 n=8+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=2048-24          6.24k ± 0%     6.23k ± 0%  -0.10%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=262144-24         527k ± 0%      527k ± 0%  -0.10%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=4194304-24       8.40M ± 0%     8.40M ± 0%  -0.10%  (p=0.000 n=9+10)

Collaborator Author

@rafiss rafiss left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis and @yuzefovich)


pkg/sql/exec/utils_test.go, line 300 at r2 (raw file):

Previously, yuzefovich wrote…

[nit]: s/Could/could/ and s/pf/of/.

done

@rafiss rafiss force-pushed the hash-aggregator-null-equality branch from 0890944 to cc3b523 Compare July 17, 2019 18:26
@rafiss
Collaborator Author

rafiss commented Jul 17, 2019

thanks all
bors r+

@craig
Contributor

craig bot commented Jul 17, 2019

Build failed

@rafiss
Collaborator Author

rafiss commented Jul 17, 2019

bors r+

craig bot pushed a commit that referenced this pull request Jul 17, 2019
38900: exec: handle NULL group by keys in hash aggregator r=rafiss a=rafiss

Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
@craig
Contributor

craig bot commented Jul 17, 2019

Build succeeded

@craig craig bot merged commit cc3b523 into cockroachdb:master Jul 17, 2019
@rafiss rafiss deleted the hash-aggregator-null-equality branch July 17, 2019 21:08
Development

Successfully merging this pull request may close these issues.

exec: 'index out of range' errors for some aggregations
4 participants